Skip to main content

Models Schema

Models are SQL transformations that transform raw source data into analytics-ready datasets. This reference documents all fields available when defining models.

File Structure

Models consist of two file types:

  1. SQL files (.sql): Contain the transformation logic
  2. YAML files (.yml): Contain metadata, documentation, and tests
my-project/
└── models/
├── staging/
│ ├── stg_orders.sql
│ └── staging.yml
└── marts/
├── fct_orders.sql
├── dim_customers.sql
└── marts.yml

SQL File Structure

Model SQL files support Jinja templating with special functions:

{{ config(materialized='table') }}

SELECT
id AS order_id,
customer_id,
total_amount,
created_at AS order_date
FROM {{ source('raw', 'orders') }}
WHERE status = 'completed'

Inline Configuration

Use the config() block at the top of SQL files:

{{ config(
materialized='incremental',
unique_key='order_id',
incremental_strategy='merge',
schema='marts'
) }}

YAML Schema

Top-Level Structure

version: 2

models:
- name: model_name
# ... model configuration

Model Definition

FieldTypeRequiredDefaultDescription
namestringYes-Model name (must match SQL filename without extension)
descriptionstringNonullHuman-readable description
configobjectNo{}Model configuration options
columnslistNo[]Column definitions and tests
metaobjectNo{}Custom metadata key-value pairs

Example

version: 2

models:
- name: fct_orders
description: Fact table containing completed orders with enriched attributes
config:
materialized: table
schema: marts
columns:
- name: order_id
description: Primary key
tests:
- unique
- not_null

Configuration Options

The config object supports the following fields:

FieldTypeDefaultDescription
materializedstringviewHow the model is built: view, table, incremental, ephemeral
schemastringnullTarget schema override
tagslist[]Tags for organizing and selecting models
enabledbooleantrueWhether the model is enabled
pre_hooklist[]SQL statements to run before model creation
post_hooklist[]SQL statements to run after model creation
unique_keystring or listnullUnique key column(s) for incremental models
incremental_strategystringnullStrategy for incremental updates
on_schema_changestringignoreBehavior when schema changes

Materialization Types

TypeDescription
viewCreates a database view (default)
tableCreates a physical table, rebuilt on each run
incrementalAppends or merges new data into existing table
ephemeralInlined as a CTE, not materialized in database

Incremental Strategies

StrategyDescription
appendInsert new rows only
mergeUpsert based on unique key
delete+insertDelete matching rows, then insert

Schema Change Behavior

ValueDescription
ignoreIgnore schema changes (default)
append_new_columnsAdd new columns, keep existing
sync_all_columnsSync schema to match model
failFail if schema changes detected

Configuration Example

models:
- name: fct_orders
config:
materialized: incremental
unique_key: order_id
incremental_strategy: merge
on_schema_change: append_new_columns
schema: marts
tags:
- daily
- revenue
pre_hook:
- "{{ log('Starting fct_orders build', info=True) }}"
post_hook:
- "ANALYZE {{ this }}"

Column Definition

FieldTypeRequiredDefaultDescription
namestringYes-Column name
descriptionstringNonullHuman-readable description
data_typestringNonullSQL data type
testslistNo[]Tests to run on this column
metaobjectNo{}Custom metadata key-value pairs

Example

columns:
- name: order_id
description: Unique order identifier
data_type: INTEGER
tests:
- unique
- not_null

- name: customer_id
description: Reference to dim_customers
tests:
- not_null
- relationships:
to: ref('dim_customers')
field: customer_id

- name: order_status
description: Current status of the order
tests:
- accepted_values:
values: ['pending', 'shipped', 'delivered', 'cancelled']
meta:
business_owner: sales_ops

Column Tests

TestDescriptionParameters
uniqueValues must be unique-
not_nullValues cannot be null-
accepted_valuesValues from predefined listvalues: list of allowed values
relationshipsForeign key validationto: target ref, field: target column

Template Functions

ref()

Reference another model:

SELECT *
FROM {{ ref('stg_orders') }}

source()

Reference a source table:

SELECT *
FROM {{ source('raw', 'orders') }}

config()

Inline configuration:

{{ config(materialized='table', schema='marts') }}

Complete Example

SQL File: models/marts/fct_orders.sql

{{ config(
materialized='incremental',
unique_key='order_id',
incremental_strategy='merge',
schema='marts'
) }}

WITH orders AS (
SELECT *
FROM {{ ref('stg_orders') }}
{% if is_incremental() %}
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
),

customers AS (
SELECT *
FROM {{ ref('dim_customers') }}
)

SELECT
o.order_id,
o.customer_id,
c.customer_name,
c.customer_segment,
o.order_date,
o.total_amount,
o.status,
o.updated_at
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.customer_id

YAML File: models/marts/marts.yml

version: 2

models:
- name: fct_orders
description: |
Fact table containing all orders with customer attributes.
Updated incrementally based on the updated_at timestamp.
config:
materialized: incremental
unique_key: order_id
tags:
- revenue
- daily
meta:
owner: analytics-team
refresh_frequency: hourly
columns:
- name: order_id
description: Primary key - unique order identifier
tests:
- unique
- not_null

- name: customer_id
description: Foreign key to dim_customers
tests:
- not_null
- relationships:
to: ref('dim_customers')
field: customer_id

- name: customer_name
description: Customer full name

- name: customer_segment
description: Customer segment classification
tests:
- accepted_values:
values: ['enterprise', 'mid-market', 'smb', 'consumer']

- name: order_date
description: Date when the order was placed
tests:
- not_null

- name: total_amount
description: Total order value in USD
meta:
format: currency

- name: status
description: Current order status
tests:
- accepted_values:
values: ['pending', 'completed', 'cancelled', 'refunded']

- name: updated_at
description: Last update timestamp
tests:
- not_null

- name: dim_customers
description: Customer dimension with current attributes
columns:
- name: customer_id
description: Primary key
tests:
- unique
- not_null
- name: customer_name
description: Customer full name
- name: customer_segment
description: Customer segment classification

Best Practices

  1. Use consistent naming: Prefix staging models with stg_, fact tables with fct_, and dimensions with dim_.

  2. Document all columns: Provide descriptions for every column in your YAML files.

  3. Add tests: Include at minimum unique and not_null tests on primary keys.

  4. Use incremental where possible: For large tables, use incremental materialization to improve performance.

  5. Organize by layer: Group models in directories by transformation layer (staging/, intermediate/, marts/).

  6. Tag for selection: Use tags to organize models for selective runs (olytix-core run --select tag:daily).