Compilation Process

For Data Analysts

Compilation transforms your YAML and SQL files into a validated, executable project. This page explains what happens during compilation and how to troubleshoot issues.

Compilation Overview

When you run olytix-core compile, Olytix Core performs these steps:

Compilation Pipeline

Data flows from Discovery to Output:

  1. Discovery: Find all project files
  2. Parsing: Parse YAML & SQL
  3. Resolution: Resolve refs & build graph
  4. Validation: Validate definitions
  5. Lineage: Build column lineage
  6. Output: Generate manifest

Step 1: Discovery

Olytix Core scans your project directory:

Discovering project files...
✓ Found olytix-core_project.yml
✓ Found 3 source files in sources/
✓ Found 12 model files in models/
✓ Found 4 cube files in cubes/
✓ Found 2 metric files in metrics/
✓ Found 5 macro files in macros/
✓ Found 2 seed files in seeds/
✓ Found 8 test files in tests/

File Types

Extension      Type                      Location
.yml, .yaml    Sources, Cubes, Metrics   sources/, cubes/, metrics/
.sql           Models, Macros, Tests     models/, macros/, tests/
.csv           Seeds                     seeds/
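The mapping in the table can be sketched in Python as a lookup on the top-level directory and file extension. This is only an illustration of the rule, not Olytix Core's actual discovery code; the function name `classify` and the returned labels are assumptions.

```python
from pathlib import PurePosixPath
from typing import Optional

# Directory -> (accepted extensions, file type), taken from the File Types table.
# The labels on the right are illustrative names, not official identifiers.
FILE_TYPES = {
    "sources": ({".yml", ".yaml"}, "source"),
    "cubes": ({".yml", ".yaml"}, "cube"),
    "metrics": ({".yml", ".yaml"}, "metric"),
    "models": ({".sql"}, "model"),
    "macros": ({".sql"}, "macro"),
    "tests": ({".sql"}, "test"),
    "seeds": ({".csv"}, "seed"),
}


def classify(path: str) -> Optional[str]:
    """Return the file type for a project-relative path, or None if unrecognized."""
    parts = PurePosixPath(path).parts
    if not parts:
        return None
    entry = FILE_TYPES.get(parts[0])
    if entry is None:
        return None
    extensions, file_type = entry
    suffix = PurePosixPath(path).suffix.lower()
    return file_type if suffix in extensions else None


print(classify("models/marts/fct_orders.sql"))
print(classify("cubes/orders.yml"))
```

Files in a known directory with an unexpected extension (for example, a Markdown file under models/) are treated as unrecognized in this sketch.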

Step 2: Parsing

YAML Parsing

YAML files are parsed and validated against schemas:

# Input: cubes/orders.yml
cubes:
  - name: orders
    sql: "SELECT * FROM {{ ref('fct_orders') }}"
    measures:
      - name: total_revenue
        type: sum
        sql: amount

# Parsed result (conceptual)
CubeDefinition(
    name="orders",
    sql="SELECT * FROM {{ ref('fct_orders') }}",
    measures=[
        MeasureDefinition(
            name="total_revenue",
            type=MeasureType.SUM,
            sql="amount"
        )
    ]
)
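To make the conceptual result more concrete, here is a minimal Python sketch that turns an already-loaded YAML mapping into typed objects. The class names mirror the conceptual output above, but their fields, the `parse_cube` helper, and the single-member `MeasureType` enum are assumptions for illustration, not the library's real definitions.

```python
from dataclasses import dataclass, field
from enum import Enum


class MeasureType(Enum):
    # Only the type shown on this page; a real enum would have more members.
    SUM = "sum"


@dataclass
class MeasureDefinition:
    name: str
    type: MeasureType
    sql: str


@dataclass
class CubeDefinition:
    name: str
    sql: str
    measures: list = field(default_factory=list)


def parse_cube(raw: dict) -> CubeDefinition:
    """Build a CubeDefinition from one entry of the 'cubes' list.

    MeasureType(...) raises ValueError for an unknown type string, which
    stands in for schema validation in this sketch.
    """
    measures = [
        MeasureDefinition(name=m["name"], type=MeasureType(m["type"]), sql=m["sql"])
        for m in raw.get("measures", [])
    ]
    return CubeDefinition(name=raw["name"], sql=raw["sql"], measures=measures)


# The dict a YAML loader would produce for cubes/orders.yml.
raw_cube = {
    "name": "orders",
    "sql": "SELECT * FROM {{ ref('fct_orders') }}",
    "measures": [{"name": "total_revenue", "type": "sum", "sql": "amount"}],
}
cube = parse_cube(raw_cube)
print(cube.measures[0].type)
```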

SQL Parsing

SQL files with Jinja are parsed in two phases:

Phase 1: Jinja Extraction

-- Input
SELECT * FROM {{ ref('stg_orders') }}
WHERE created_at > '{{ var("start_date") }}'

Extract Jinja expressions:

  • {{ ref('stg_orders') }} → Model reference
  • {{ var("start_date") }} → Variable reference
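The extraction phase can be approximated with a regular expression. The sketch below handles only the simple single-argument `ref(...)` and `var(...)` forms shown above; a real implementation would more likely walk the Jinja parse tree, and the name `extract_jinja_calls` is hypothetical.

```python
import re

# Matches {{ ref('name') }} or {{ var("name") }}, with either quote style.
JINJA_CALL = re.compile(r"\{\{\s*(ref|var)\(\s*(['\"])(.+?)\2\s*\)\s*\}\}")


def extract_jinja_calls(sql: str) -> list:
    """Return (function, argument) pairs in the order they appear."""
    return [(m.group(1), m.group(3)) for m in JINJA_CALL.finditer(sql)]


sql = """SELECT * FROM {{ ref('stg_orders') }}
WHERE created_at > '{{ var("start_date") }}'"""

calls = extract_jinja_calls(sql)
print(calls)
```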

Phase 2: SQL AST

After the Jinja is rendered, the resulting SQL is parsed into an abstract syntax tree (AST), which is later used for lineage extraction.

Step 3: Resolution

Reference Resolution

Resolve all ref() and source() calls:

Resolving references...
  model.stg_orders
    → source('raw', 'orders') resolves to source.raw.orders
  model.fct_orders
    → ref('stg_orders') resolves to model.stg_orders
    → ref('stg_customers') resolves to model.stg_customers
  cube.orders
    → ref('fct_orders') resolves to model.fct_orders
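Resolution amounts to mapping each call to a node ID and checking that the node exists. The following Python sketch assumes the ID formats that appear in the logs on this page (`model.<name>` and `source.<source>.<table>`); the function name and exception types are illustrative only.

```python
def resolve_reference(kind: str, args: tuple, known_nodes: set) -> str:
    """Map a ref()/source() call to a node ID and confirm the node exists."""
    if kind == "ref":
        (name,) = args
        node_id = f"model.{name}"
    elif kind == "source":
        source_name, table = args
        node_id = f"source.{source_name}.{table}"
    else:
        raise ValueError(f"Unknown reference kind: {kind}")
    if node_id not in known_nodes:
        raise LookupError(f"Reference {node_id!r} not found")
    return node_id


known = {"source.raw.orders", "model.stg_orders", "model.stg_customers", "model.fct_orders"}
print(resolve_reference("ref", ("fct_orders",), known))
print(resolve_reference("source", ("raw", "orders"), known))
```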

Dependency Graph Construction

Build the DAG from resolved references:

Building dependency graph...
Adding edge: model.stg_orders → source.raw.orders
Adding edge: model.fct_orders → model.stg_orders
Adding edge: model.fct_orders → model.stg_customers
Adding edge: cube.orders → model.fct_orders
Adding edge: metric.monthly_revenue → cube.orders.total_revenue

✓ Graph built: 15 nodes, 22 edges
✓ Cycle check: No cycles detected
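As a rough illustration of what the graph step produces, the edges in the log above can be fed to Python's standard-library `graphlib` to obtain a valid build order. This is a sketch of the general technique, not a description of how Olytix Core stores its graph.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on (edges from the log above).
dependencies = {
    "model.stg_orders": {"source.raw.orders"},
    "model.fct_orders": {"model.stg_orders", "model.stg_customers"},
    "cube.orders": {"model.fct_orders"},
    "metric.monthly_revenue": {"cube.orders.total_revenue"},
}

# static_order() yields dependencies before their dependents and raises
# graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Any order in which every node comes after its dependencies is valid, so the exact sequence printed may differ between runs.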

Step 4: Validation

Schema Validation

Ensure all definitions conform to their schemas:

Validating definitions...
✓ Source 'raw' valid
✓ Model 'stg_orders' valid
✓ Model 'fct_orders' valid
✓ Cube 'orders' valid
✓ Measure 'total_revenue' valid
✓ Dimension 'order_date' valid
✓ Metric 'monthly_revenue' valid

Reference Validation

Ensure all references exist:

Validating references...
✓ ref('stg_orders') exists
✓ ref('stg_customers') exists
✓ source('raw', 'orders') exists
✓ cube measure 'orders.total_revenue' exists

Type Validation

Check type compatibility:

Validating types...
✓ Measure 'total_revenue' SQL 'amount' compatible with SUM
✓ Dimension 'order_date' SQL 'order_date' compatible with TIME
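A type check of this kind can be pictured as a lookup table from definition type to the SQL types it accepts. The table below is an assumption made for illustration: only `sum` and `time` appear on this page, and the SQL type names are placeholders rather than Olytix Core's actual type system.

```python
# Assumed compatibility table: definition type -> accepted inferred SQL types.
COMPATIBLE_TYPES = {
    "sum": {"integer", "float", "decimal"},
    "time": {"date", "timestamp"},
}


def is_compatible(definition_type: str, inferred_sql_type: str) -> bool:
    """Return True if the inferred SQL type is accepted by the definition type."""
    allowed = COMPATIBLE_TYPES.get(definition_type)
    if allowed is None:
        raise ValueError(f"Unknown definition type: {definition_type}")
    return inferred_sql_type in allowed


print(is_compatible("sum", "decimal"))    # a numeric column under SUM
print(is_compatible("sum", "string"))     # a string column under SUM
```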

Step 5: Lineage

Column-Level Lineage

Extract column lineage from SQL:

-- fct_orders.sql
SELECT
o.order_id,
o.amount AS order_amount,
c.customer_name
FROM {{ ref('stg_orders') }} o
JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.customer_id

Lineage extracted:

fct_orders.order_id      ← stg_orders.order_id (DIRECT)
fct_orders.order_amount  ← stg_orders.amount (RENAMED)
fct_orders.customer_name ← stg_customers.customer_name (DIRECT)

Measure Lineage

Track lineage from cube measures to model columns:

orders.total_revenue
  ← fct_orders.order_amount (AGGREGATED)
    ← stg_orders.amount (RENAMED)
      ← raw.orders.amount (DIRECT)
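Tracing a measure back to its raw column is a walk over column-level edges. The Python sketch below uses the chain shown above and assumes, for simplicity, that each column has exactly one upstream column; real lineage can fan out to several. The names `COLUMN_LINEAGE` and `trace` are hypothetical.

```python
# Output column -> (upstream column, transformation), from the example above.
COLUMN_LINEAGE = {
    "orders.total_revenue": ("fct_orders.order_amount", "AGGREGATED"),
    "fct_orders.order_amount": ("stg_orders.amount", "RENAMED"),
    "stg_orders.amount": ("raw.orders.amount", "DIRECT"),
}


def trace(column: str) -> list:
    """Follow upstream edges until a column with no recorded parent is reached."""
    chain = []
    seen = {column}
    while column in COLUMN_LINEAGE:
        upstream, kind = COLUMN_LINEAGE[column]
        if upstream in seen:
            raise ValueError(f"Lineage loop at {upstream}")
        chain.append((upstream, kind))
        seen.add(upstream)
        column = upstream
    return chain


for upstream, kind in trace("orders.total_revenue"):
    print(f"<- {upstream} ({kind})")
```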

Step 6: Output

Manifest Generation

Generate target/manifest.json:

{
  "metadata": {
    "project_name": "my_analytics",
    "generated_at": "2024-01-20T10:30:00Z",
    "olytix-core_version": "1.0.0"
  },
  "sources": {...},
  "models": {...},
  "cubes": {...},
  "metrics": {...},
  "lineage": {...},
  "graph": {...}
}
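Because the manifest is plain JSON, it can be inspected with any JSON library. The Python sketch below reads the metadata and counts entries per section. The top-level keys come from the example above; the assumption that each section is an object keyed by name, and the two entries under "models", are hypothetical because the page elides those contents.

```python
import json

# Stand-in for the contents of target/manifest.json. Entries inside "models"
# are hypothetical placeholders; the page does not show the real structure.
MANIFEST_TEXT = """
{
  "metadata": {
    "project_name": "my_analytics",
    "generated_at": "2024-01-20T10:30:00Z",
    "olytix-core_version": "1.0.0"
  },
  "sources": {},
  "models": {"stg_orders": {}, "fct_orders": {}},
  "cubes": {},
  "metrics": {},
  "lineage": {},
  "graph": {}
}
"""


def summarize_manifest(manifest: dict) -> dict:
    """Return the project name, version, and entry count per section."""
    metadata = manifest["metadata"]
    summary = {
        "project_name": metadata["project_name"],
        "version": metadata["olytix-core_version"],
    }
    for section in ("sources", "models", "cubes", "metrics"):
        summary[section] = len(manifest.get(section, {}))
    return summary


summary = summarize_manifest(json.loads(MANIFEST_TEXT))
print(summary)
```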

Compiled SQL

Generate target/compiled/:

target/compiled/
├── models/
│   ├── staging/
│   │   └── stg_orders.sql      # Jinja rendered
│   └── marts/
│       └── fct_orders.sql      # Jinja rendered
└── tests/
    └── assert_positive.sql     # Jinja rendered

Example compiled SQL:

-- Original: SELECT * FROM {{ ref('stg_orders') }}
-- Compiled:
SELECT * FROM "analytics"."staging"."stg_orders"
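The substitution above can be sketched as a regex replacement that swaps each `ref()` call for a fully qualified, quoted relation name. The `RELATIONS` lookup (model name to database and schema) is an assumption for illustration; the page does not say where Olytix Core gets these values.

```python
import re

REF_CALL = re.compile(r"\{\{\s*ref\(\s*(['\"])(.+?)\1\s*\)\s*\}\}")

# Assumed lookup: model name -> (database, schema).
RELATIONS = {"stg_orders": ("analytics", "staging")}


def render_refs(sql: str, relations: dict) -> str:
    """Replace every {{ ref('name') }} with "database"."schema"."name"."""

    def replace(match):
        name = match.group(2)
        database, schema = relations[name]  # KeyError if the model is unknown
        return f'"{database}"."{schema}"."{name}"'

    return REF_CALL.sub(replace, sql)


compiled = render_refs("SELECT * FROM {{ ref('stg_orders') }}", RELATIONS)
print(compiled)
```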

Running Compilation

Basic Compile

olytix-core compile

Output:

Compiling project...
✓ Loaded 3 sources with 8 tables
✓ Compiled 12 models
✓ Registered 4 cubes with 15 measures and 20 dimensions
✓ Created 8 metrics
✓ Built column-level lineage
✓ Generated manifest at target/manifest.json

Compilation completed in 1.24s

Compile with Warnings

olytix-core compile --warn-error

The --warn-error flag treats warnings as errors, which is useful in CI where a warning should fail the build.

Partial Compilation

# Compile specific model and dependencies
olytix-core compile --select +fct_orders

# Compile without lineage (faster)
olytix-core compile --skip-lineage

Common Compilation Errors

Missing Reference

Error: Reference 'stg_orderz' not found

File: models/marts/fct_orders.sql
Line: 5
SELECT * FROM {{ ref('stg_orderz') }}
^^^^^^^^^^^

Did you mean: 'stg_orders'?

Fix: Correct the typo in ref().
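A "Did you mean" hint like the one above can be produced with fuzzy string matching. The sketch below uses Python's standard-library `difflib`; whether Olytix Core uses this or another similarity measure is not stated on this page, and the 0.8 cutoff is an arbitrary choice.

```python
import difflib
from typing import Optional


def suggest(name: str, known_names: list) -> Optional[str]:
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, known_names, n=1, cutoff=0.8)
    return matches[0] if matches else None


models = ["stg_orders", "stg_customers", "fct_orders"]
print(suggest("stg_orderz", models))
```

The same helper applies to misspelled column names, such as the 'amountt' case under Schema Mismatch.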

Circular Dependency

Error: Circular dependency detected

Path:
model.a → model.b → model.c → model.a

Location:
models/marts/model_c.sql:3
SELECT * FROM {{ ref('model_a') }}

Fix: Restructure models to break the cycle.
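To see how a cycle like this can be detected and reported, here is a minimal sketch using Python's standard-library `graphlib`, whose `CycleError` carries the offending path. It illustrates the technique only; the helper name `find_cycle` is hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Optional


def find_cycle(dependencies: dict) -> Optional[list]:
    """Return one cycle as a list of nodes (first == last), or None."""
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        return exc.args[1]
    return None


# model.a depends on model.b, which depends on model.c, which depends on model.a.
cyclic = {
    "model.a": {"model.b"},
    "model.b": {"model.c"},
    "model.c": {"model.a"},
}
cycle = find_cycle(cyclic)
print(" -> ".join(cycle))
```

The node at which the reported cycle starts is not guaranteed, so the printed path may begin at any of the three models.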

Invalid Measure Type

Error: Invalid measure configuration

Cube: orders
Measure: total_revenue
Issue: Type 'sum' requires numeric SQL expression

SQL: status
Type: string (inferred)

Fix: Use a numeric column or change measure type.

Fix: Point the measure's sql at a numeric column (for example, amount), or change the measure type.

Schema Mismatch

Error: Column not found in base query

Cube: orders
Measure: total_revenue
SQL: amountt

Column 'amountt' not found in cube SQL.
Available columns: order_id, customer_id, amount, order_date

Did you mean: 'amount'?

Fix: Correct the column name typo.

YAML Syntax Error

Error: Invalid YAML syntax

File: cubes/orders.yml
Line: 15
  measures:
    - name: total_revenue
      type sum  # Missing colon!

Fix: Add colon after 'type'

Fix: Add the missing colon so the line reads type: sum.

Compilation Performance

Performance Tips

  1. Use --skip-lineage for quick validation

    olytix-core compile --skip-lineage  # Much faster
  2. Compile specific nodes

    olytix-core compile --select my_model  # Just one model
  3. Enable caching

    # olytix-core_project.yml
    compile:
      cache: true
      cache_dir: .olytix-core_cache

Compilation Metrics

Compilation Statistics:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase            │ Duration │ Items
─────────────────┼──────────┼───────
Discovery        │ 0.05s    │ 32 files
YAML Parsing     │ 0.12s    │ 18 files
SQL Parsing      │ 0.34s    │ 12 files
Reference Res.   │ 0.08s    │ 45 refs
Graph Building   │ 0.02s    │ 15 nodes
Validation       │ 0.15s    │ 32 checks
Lineage          │ 0.42s    │ 156 columns
Manifest Gen.    │ 0.06s    │ 1 file
─────────────────┼──────────┼───────
Total            │ 1.24s    │
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
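To read these numbers as a guide for where to optimize, it helps to convert durations into shares of the total. The short Python sketch below does that arithmetic on the sample figures above; the values are copied from the table, not measured.

```python
# Durations in seconds, copied from the sample statistics above.
PHASE_SECONDS = {
    "Discovery": 0.05,
    "YAML Parsing": 0.12,
    "SQL Parsing": 0.34,
    "Reference Res.": 0.08,
    "Graph Building": 0.02,
    "Validation": 0.15,
    "Lineage": 0.42,
    "Manifest Gen.": 0.06,
}

total = sum(PHASE_SECONDS.values())
# Percentage of total time per phase, rounded to the nearest whole percent.
shares = {phase: round(100 * s / total) for phase, s in PHASE_SECONDS.items()}
slowest = max(PHASE_SECONDS, key=PHASE_SECONDS.get)

print(f"Total: {total:.2f}s, slowest phase: {slowest} ({shares[slowest]}%)")
```

In this sample, lineage accounts for about a third of the run, which is consistent with the --skip-lineage tip above.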

Next Steps

Now that you understand compilation:

  1. Learn about lineage →
  2. Start defining sources →
  3. Build your first model →