Skip to main content

Cubes Schema

Cubes are the core building blocks of the Olytix Core semantic layer. They define the business logic for measures, dimensions, and relationships that power your analytics queries. This reference documents all fields available when defining cubes.

File Location

Cube definitions are stored in YAML files within the cubes/ directory:

my-project/
└── cubes/
├── orders.yml
├── customers.yml
└── products/
├── inventory.yml
└── catalog.yml

Top-Level Structure

cubes:
- name: cube_name
# ... cube configuration

Cube Definition

FieldTypeRequiredDefaultDescription
namestringYes-Unique identifier for the cube
sqlstringConditionalnullSQL query defining the cube's base data
sql_tablestringConditionalnullDirect table reference (alternative to sql)
modelstringConditionalnullReference to a Olytix Core model (alternative to sql)
extendsstringNonullName of another cube to extend
titlestringNonullDisplay title for the cube
descriptionstringNonullHuman-readable description
publicbooleanNotrueWhether the cube is exposed via API
refresh_keyobjectNonullCache refresh configuration
data_sourcestringNonullData source override
measureslistNo[]List of measure definitions
dimensionslistNo[]List of dimension definitions
joinslistNo[]List of join definitions
segmentslistNo[]List of segment definitions
pre_aggregationslistNo[]List of pre-aggregation definitions
metaobjectNo{}Custom metadata key-value pairs

Data Source Options

You must specify one of the following to define the cube's data source:

  • sql: A SQL query with optional Jinja templating
  • sql_table: Direct reference to a database table
  • model: Reference to a Olytix Core model using ref()
# Option 1: SQL query
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"

# Option 2: Direct table reference
cubes:
- name: orders
sql_table: "analytics.marts.fct_orders"

# Option 3: Model reference
cubes:
- name: orders
model: fct_orders

Measure Definition

Measures are quantitative values that can be aggregated.

FieldTypeRequiredDefaultDescription
namestringYes-Unique identifier within the cube
typestringNocountAggregation type
sqlstringConditionalnullSQL expression (required for most types)
titlestringNonullDisplay title
descriptionstringNonullHuman-readable description
formatstringNonullOutput format: currency, percent, number
filterslistNo[]Filters applied to this measure
rolling_windowobjectNonullRolling window configuration
drill_memberslistNo[]Dimensions to show in drill-down
shownbooleanNotrueWhether to expose via API
metaobjectNo{}Custom metadata

Measure Types

TypeDescriptionRequires sql
countCount of rowsNo
count_distinctCount of unique valuesYes
count_distinct_approxApproximate count distinctYes
sumSum of valuesYes
avgAverage of valuesYes
minMinimum valueYes
maxMaximum valueYes
numberCustom SQL expressionYes
running_totalCumulative sumYes
booleanBoolean aggregation (ANY/ALL)Yes

Measure Examples

measures:
- name: count
type: count
description: Total number of orders

- name: total_revenue
type: sum
sql: total_amount
format: currency
description: Sum of all order amounts
drill_members:
- order_id
- order_date
- customer_name

- name: average_order_value
type: avg
sql: total_amount
format: currency
description: Average order amount

- name: unique_customers
type: count_distinct
sql: customer_id
description: Number of unique customers

- name: completed_orders
type: count
filters:
- sql: "{CUBE}.status = 'completed'"
description: Orders with completed status

- name: revenue_margin
type: number
sql: "SUM(revenue - cost) / NULLIF(SUM(revenue), 0)"
format: percent
description: Revenue margin percentage

Rolling Window Configuration

measures:
- name: rolling_30d_revenue
type: sum
sql: total_amount
rolling_window:
trailing: 30
leading: 0
offset: start

Dimension Definition

Dimensions are attributes used for grouping and filtering.

FieldTypeRequiredDefaultDescription
namestringYes-Unique identifier within the cube
sqlstringNonullSQL expression (defaults to column name)
typestringNostringDimension type
titlestringNonullDisplay title
descriptionstringNonullHuman-readable description
primary_keybooleanNofalseWhether this is the primary key
shownbooleanNotrueWhether to expose via API
casestringNonullCase transformation: upper, lower
granularitieslistNo[]Time granularities (for time dimensions)
metaobjectNo{}Custom metadata

Dimension Types

TypeDescription
stringText values (default)
numberNumeric values
booleanTrue/false values
timeDate/time values
geoGeographic values

Time Dimension Granularities

GranularityDescription
secondSecond-level precision
minuteMinute-level precision
hourHour-level precision
dayDay-level precision
weekWeek-level precision
monthMonth-level precision
quarterQuarter-level precision
yearYear-level precision

Dimension Examples

dimensions:
- name: order_id
sql: order_id
type: number
primary_key: true
description: Unique order identifier

- name: order_date
sql: order_date
type: time
description: Date when the order was placed
granularities:
- day
- week
- month
- quarter
- year

- name: status
sql: status
type: string
description: Current order status

- name: customer_name
sql: customer_name
type: string
case: upper
description: Customer name (uppercase)

- name: is_premium
sql: "CASE WHEN total_amount > 1000 THEN true ELSE false END"
type: boolean
description: Whether order qualifies as premium

- name: region
sql: region_code
type: geo
description: Geographic region

Join Definition

Joins define relationships between cubes.

FieldTypeRequiredDefaultDescription
namestringYes-Name of the target cube
sqlstringYes-SQL ON clause
relationshipstringNobelongs_toRelationship type
descriptionstringNonullHuman-readable description

Relationship Types

TypeDescriptionSQL Equivalent
belongs_toMany-to-one relationshipLEFT JOIN
has_manyOne-to-many relationshipLEFT JOIN
has_oneOne-to-one relationshipLEFT JOIN
leftStandard left joinLEFT JOIN
innerInner joinINNER JOIN
fullFull outer joinFULL OUTER JOIN
rightRight joinRIGHT JOIN

Join Examples

joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: belongs_to
description: Customer who placed the order

- name: order_items
sql: "{orders}.order_id = {order_items}.order_id"
relationship: has_many
description: Line items in the order

Segment Definition

Segments are predefined filters for common use cases.

FieldTypeRequiredDefaultDescription
namestringYes-Unique identifier
sqlstringYes-SQL WHERE clause condition
titlestringNonullDisplay title
descriptionstringNonullHuman-readable description

Segment Examples

segments:
- name: completed
sql: "{CUBE}.status = 'completed'"
title: Completed Orders
description: Orders that have been completed

- name: high_value
sql: "{CUBE}.total_amount > 1000"
title: High Value Orders
description: Orders over $1000

- name: recent
sql: "{CUBE}.order_date >= CURRENT_DATE - INTERVAL '30 days'"
title: Recent Orders
description: Orders from the last 30 days

Pre-Aggregation Definition

Pre-aggregations are materialized aggregations for performance optimization.

FieldTypeRequiredDefaultDescription
namestringYes-Unique identifier
measureslistNo[]Measures to include
dimensionslistNo[]Dimensions to include
time_dimensionstringNonullTime dimension for partitioning
granularitystringNonullTime granularity
partition_granularitystringNonullPartition granularity
segmentslistNo[]Segments to apply
refreshobjectNo{}Refresh configuration
indexeslistNo[]Index definitions
build_range_startstringNonullStart of build range
build_range_endstringNonullEnd of build range

Refresh Configuration

FieldTypeDefaultDescription
everystring1 hourRefresh interval
incrementalbooleanfalseIncremental refresh
update_windowstringnullWindow for updates
timezonestringUTCTimezone for scheduling

Pre-Aggregation Examples

pre_aggregations:
- name: daily_orders
measures:
- count
- total_revenue
dimensions:
- status
time_dimension: order_date
granularity: day
refresh:
every: "1 hour"
timezone: "America/New_York"

- name: monthly_by_region
measures:
- count
- total_revenue
- average_order_value
dimensions:
- region
- status
time_dimension: order_date
granularity: month
partition_granularity: month
refresh:
every: "6 hours"
incremental: true
update_window: "7 days"
indexes:
- columns: [region, status]

Complete Example

cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
title: Orders
description: E-commerce order transactions
public: true
meta:
owner: analytics-team
domain: sales

measures:
- name: count
type: count
description: Total number of orders

- name: total_revenue
type: sum
sql: total_amount
format: currency
description: Sum of all order amounts
drill_members:
- order_id
- order_date
- customer_name

- name: average_order_value
type: avg
sql: total_amount
format: currency
description: Average order amount

- name: unique_customers
type: count_distinct
sql: customer_id
description: Number of unique customers

dimensions:
- name: order_id
sql: order_id
type: number
primary_key: true
description: Unique order identifier

- name: order_date
sql: order_date
type: time
description: Order creation date
granularities:
- day
- week
- month
- quarter
- year

- name: status
sql: status
type: string
description: Order status

- name: region
sql: region
type: string
description: Customer region

joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: belongs_to
description: Customer who placed the order

segments:
- name: completed
sql: "{CUBE}.status = 'completed'"
description: Completed orders only

- name: high_value
sql: "{CUBE}.total_amount > 1000"
description: Orders over $1000

pre_aggregations:
- name: daily_stats
measures:
- count
- total_revenue
dimensions:
- status
- region
time_dimension: order_date
granularity: day
refresh:
every: "1 hour"

Best Practices

  1. Define primary keys: Always mark a dimension as primary_key: true for proper join handling.

  2. Use descriptive names: Choose clear, business-friendly names for measures and dimensions.

  3. Document thoroughly: Provide descriptions for all cube members.

  4. Optimize with pre-aggregations: Create pre-aggregations for frequently queried combinations.

  5. Use segments for common filters: Define segments for filters that are used repeatedly.

  6. Set appropriate granularities: Only include granularities that users actually need.

  7. Use drill_members: Define drill-down paths for measures to enable data exploration.