Lineage Basics
Lineage tracks how data flows through your systemβfrom raw sources through transformations to final metrics. Olytix Core provides column-level lineage, showing exactly which source columns contribute to each metric.
What is Lineage?β
Lineage answers: "Where does this data come from?"
Column-Level Lineage
Track data flow from source to metric. Edges show transformation types.
Lineage Levelsβ
Olytix Core tracks lineage at multiple levels:
Model-Level Lineageβ
Which models depend on which:
stg_orders ββββββββββββββ
β
stg_customers βββββββββββΌβββΊ fct_orders
β
stg_products ββββββββββββ
Column-Level Lineageβ
Which columns flow to which:
stg_orders.order_id βββΊ fct_orders.order_id
stg_orders.amount βββΊ fct_orders.order_amount
stg_customers.name βββΊ fct_orders.customer_name
Measure/Metric Lineageβ
Which columns compose measures and metrics:
raw.orders.amount
ββββΊ stg_orders.amount
ββββΊ fct_orders.order_amount
ββββΊ orders.total_revenue (measure)
ββββΊ monthly_revenue (metric)
Lineage Edge Typesβ
Olytix Core categorizes lineage relationships:
| Edge Type | Description | Example |
|---|---|---|
DIRECT | Column passed unchanged | SELECT id FROM source |
RENAMED | Column renamed | SELECT id AS order_id |
DERIVED | Column calculated | SELECT a + b AS total |
AGGREGATED | Column aggregated | SELECT SUM(amount) |
JOINED | Column from join | LEFT JOIN ... ON |
FILTERED | Column used in filter | WHERE status = 'active' |
MEASURE_SOURCE | Column to measure | sql: amount in measure |
DIMENSION_SOURCE | Column to dimension | sql: region in dimension |
METRIC_SOURCE | Measure to metric | expression: cube.measure |
Viewing Lineageβ
CLI Commandsβ
# View lineage for a metric
olytix-core lineage metric monthly_revenue
# View lineage for a column
olytix-core lineage column fct_orders.order_amount
# View lineage for a model
olytix-core lineage model fct_orders
# Export lineage graph
olytix-core lineage --format json > lineage.json
API Endpointsβ
# Get metric lineage
curl http://localhost:8000/api/v1/lineage/metrics/monthly_revenue
# Get column lineage
curl http://localhost:8000/api/v1/lineage/columns/fct_orders/order_amount
# Get full lineage graph
curl http://localhost:8000/api/v1/lineage
Example Outputβ
Lineage: monthly_revenue
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
metric.monthly_revenue
β
βββ METRIC_SOURCE
β
cube.orders.total_revenue (measure: SUM)
β
βββ MEASURE_SOURCE
β
model.fct_orders.order_amount
β
βββ RENAMED (from: amount)
β
model.stg_orders.amount
β
βββ DIRECT
β
source.raw.orders.amount
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
How Lineage is Extractedβ
From SQLβ
Olytix Core parses SQL to extract lineage:
SELECT
o.order_id, -- DIRECT from stg_orders.order_id
o.amount AS order_amount, -- RENAMED from stg_orders.amount
o.amount * o.quantity AS total, -- DERIVED from amount, quantity
c.customer_name -- JOINED from stg_customers
FROM {{ ref('stg_orders') }} o
LEFT JOIN {{ ref('stg_customers') }} c
ON o.customer_id = c.customer_id
WHERE o.status = 'completed' -- FILTERED by status
Extracted lineage:
fct_orders.order_id β stg_orders.order_id (DIRECT)
fct_orders.order_amount β stg_orders.amount (RENAMED)
fct_orders.total β stg_orders.amount (DERIVED)
fct_orders.total β stg_orders.quantity (DERIVED)
fct_orders.customer_name β stg_customers.customer_name (JOINED)
From Cube Definitionsβ
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
measures:
- name: total_revenue
type: sum
sql: order_amount # MEASURE_SOURCE from fct_orders.order_amount
From Metric Definitionsβ
metrics:
- name: monthly_revenue
expression: orders.total_revenue # METRIC_SOURCE from cube measure
Upstream vs Downstreamβ
Upstream Lineageβ
"What does this depend on?" (looking backwards)
olytix-core lineage --upstream monthly_revenue
Upstream of: monthly_revenue
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Sources:
β’ raw.orders.amount
β’ raw.orders.quantity
Staging:
β’ stg_orders.amount
β’ stg_orders.quantity
Marts:
β’ fct_orders.order_amount
Cubes:
β’ orders.total_revenue
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Downstream Lineageβ
"What depends on this?" (looking forwards)
olytix-core lineage --downstream raw.orders.amount
Downstream of: raw.orders.amount
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Staging:
β’ stg_orders.amount
Marts:
β’ fct_orders.order_amount
β’ fct_orders.total
Cubes:
β’ orders.total_revenue
β’ orders.avg_order_value
Metrics:
β’ monthly_revenue
β’ avg_order_value
β’ revenue_growth
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Impact Analysisβ
Use lineage for impact analysis:
Before Changing a Source Columnβ
olytix-core impact raw.orders.amount
Impact Analysis: raw.orders.amount
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Changing this column will affect:
Models (3):
β’ stg_orders (DIRECT dependency)
β’ fct_orders (via stg_orders)
β’ fct_order_summary (via fct_orders)
Cubes (1):
β’ orders
- Measures: total_revenue, avg_order_value
- Dimensions: (none)
Metrics (5):
β’ monthly_revenue
β’ quarterly_revenue
β’ avg_order_value
β’ revenue_growth
β’ revenue_per_customer
Dashboards (3):
β’ Executive Dashboard
β’ Sales Performance
β’ Finance Overview
Users Affected: 156
β HIGH IMPACT: This change affects critical financial metrics.
Recommend: Review with Finance team before proceeding.
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Before Deprecating a Modelβ
olytix-core impact --deprecate stg_orders
Shows all downstream artifacts that would break.
Lineage Visualizationβ
Text Formatβ
raw.orders.amount
β
βββ stg_orders.amount (DIRECT)
β
βββ fct_orders.order_amount (RENAMED)
β β
β βββ orders.total_revenue (AGGREGATED)
β β
β βββ monthly_revenue (METRIC)
β
βββ fct_orders.total (DERIVED)
β
βββ orders.order_total (AGGREGATED)
JSON Formatβ
olytix-core lineage --format json monthly_revenue
{
"root": "metric.monthly_revenue",
"edges": [
{
"source": "metric.monthly_revenue",
"target": "cube.orders.total_revenue",
"type": "METRIC_SOURCE"
},
{
"source": "cube.orders.total_revenue",
"target": "model.fct_orders.order_amount",
"type": "MEASURE_SOURCE"
}
]
}
Graphviz DOT Formatβ
olytix-core lineage --format dot monthly_revenue > lineage.dot
dot -Tpng lineage.dot > lineage.png
Lineage in the APIβ
Query Lineageβ
{
"query": {
"metrics": ["monthly_revenue"],
"dimensions": ["orders.region"]
},
"options": {
"include_lineage": true
}
}
Response includes lineage:
{
"data": [...],
"lineage": {
"monthly_revenue": {
"sources": ["raw.orders.amount"],
"path": [
"raw.orders.amount",
"stg_orders.amount",
"fct_orders.order_amount",
"orders.total_revenue",
"monthly_revenue"
]
}
}
}
Best Practicesβ
1. Maintain Clean Lineageβ
Avoid complex SQL that obscures lineage:
-- Hard to trace
SELECT * FROM (
SELECT * FROM (
SELECT * FROM {{ ref('model') }}
)
)
-- Easy to trace
SELECT
order_id,
amount
FROM {{ ref('model') }}
2. Use Meaningful Namesβ
Column names should be traceable:
-- Unclear lineage
SELECT a AS x FROM table
-- Clear lineage
SELECT order_amount AS revenue FROM orders
3. Document Derivationsβ
For complex calculations, add comments:
SELECT
-- Revenue = amount * quantity * (1 - discount)
amount * quantity * (1 - discount_pct) AS revenue
FROM {{ ref('stg_orders') }}
4. Regular Lineage Auditsβ
Periodically review lineage for:
- Orphaned columns (no upstream)
- Dead ends (no downstream)
- Unexpected connections
Next Stepsβ
Now that you understand lineage basics: