Joins
Joins define relationships between cubes, enabling queries that combine measures and dimensions from multiple cubes. Olytix Core supports standard SQL join types along with semantic relationship types.
What is a Cube Join?
A join connects two cubes through a relationship, allowing you to:
- Query measures from one cube with dimensions from another
- Build a connected semantic model across your data
- Maintain referential integrity between entities
Cube Joins & Relationships
Define relationships between cubes for multi-cube queries.
Join Types
Olytix Core supports the following join types:
SQL Join Types
| Type | Description | Use Case |
|---|---|---|
left | Include all rows from source cube | Most common for analytics |
inner | Only matching rows from both cubes | When relationship is required |
full | All rows from both cubes | Complete data union |
right | All rows from target cube | Rarely used |
Semantic Relationship Types
| Type | SQL Equivalent | Cardinality | Example |
|---|---|---|---|
belongs_to | LEFT JOIN | Many-to-one | Orders belong to customers |
has_many | LEFT JOIN | One-to-many | Customers have many orders |
has_one | LEFT JOIN | One-to-one | Order has one shipment |
Basic Join Definition
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: belongs_to
Join Configuration Options
Each join supports these properties:
joins:
- name: customers # Target cube name
sql: "{orders}.customer_id = {customers}.customer_id" # ON clause
relationship: belongs_to # Join type or relationship
description: "Link to customer who placed the order"
LEFT JOIN
Left joins include all rows from the source cube, with NULL values for unmatched target cube columns:
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: left
description: "Customer information (NULL if customer deleted)"
Generated SQL:
SELECT ...
FROM fct_orders AS orders
LEFT JOIN dim_customers AS customers
ON orders.customer_id = customers.customer_id
INNER JOIN
Inner joins only include rows that have matches in both cubes:
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: inner
description: "Only orders with valid customers"
Generated SQL:
SELECT ...
FROM fct_orders AS orders
INNER JOIN dim_customers AS customers
ON orders.customer_id = customers.customer_id
Use INNER JOIN when:
- The relationship is mandatory
- You want to filter out orphan records
- Data integrity requires matching records
FULL JOIN
Full joins include all rows from both cubes:
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
joins:
- name: returns
sql: "{orders}.order_id = {returns}.order_id"
relationship: full
description: "All orders and all returns"
Generated SQL:
SELECT ...
FROM fct_orders AS orders
FULL OUTER JOIN fct_returns AS returns
ON orders.order_id = returns.order_id
Use FULL JOIN when:
- Comparing two related datasets
- Finding unmatched records on both sides
- Creating reconciliation reports
Semantic Relationships
belongs_to (Many-to-One)
Use when many records in the source cube relate to one record in the target:
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
joins:
# Many orders belong to one customer
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: belongs_to
description: "Customer who placed the order"
# Many orders belong to one product
- name: products
sql: "{orders}.product_id = {products}.product_id"
relationship: belongs_to
description: "Product that was ordered"
has_many (One-to-Many)
Use when one record in the source cube relates to many records in the target:
cubes:
- name: customers
sql: "SELECT * FROM {{ ref('dim_customers') }}"
joins:
# One customer has many orders
- name: orders
sql: "{customers}.customer_id = {orders}.customer_id"
relationship: has_many
description: "Orders placed by this customer"
# One customer has many support tickets
- name: support_tickets
sql: "{customers}.customer_id = {support_tickets}.customer_id"
relationship: has_many
description: "Support tickets from this customer"
has_one (One-to-One)
Use when one record in the source cube relates to exactly one record in the target:
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
joins:
# One order has one shipment
- name: shipments
sql: "{orders}.order_id = {shipments}.order_id"
relationship: has_one
description: "Shipment for this order"
# One order has one invoice
- name: invoices
sql: "{orders}.order_id = {invoices}.order_id"
relationship: has_one
description: "Invoice for this order"
Complex Join Conditions
Multi-Column Joins
Join on multiple columns:
joins:
- name: inventory
sql: |
{orders}.product_id = {inventory}.product_id
AND {orders}.warehouse_id = {inventory}.warehouse_id
relationship: belongs_to
description: "Inventory record for product at warehouse"
Joins with Expressions
Include expressions in join conditions:
joins:
- name: price_history
sql: |
{orders}.product_id = {price_history}.product_id
AND {orders}.order_date >= {price_history}.effective_from
AND {orders}.order_date < COALESCE({price_history}.effective_to, '9999-12-31')
relationship: belongs_to
description: "Price effective at time of order"
Self-Referential Joins
Join a cube to itself:
cubes:
- name: employees
sql: "SELECT * FROM {{ ref('dim_employees') }}"
joins:
- name: manager
sql: "{employees}.manager_id = {manager}.employee_id"
relationship: belongs_to
description: "Direct manager of this employee"
Join Chains
Olytix Core supports transitive joins through multiple cubes:
cubes:
- name: order_items
sql: "SELECT * FROM {{ ref('fct_order_items') }}"
joins:
- name: orders
sql: "{order_items}.order_id = {orders}.order_id"
relationship: belongs_to
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: belongs_to
This allows querying:
{
"measures": ["order_items.total_quantity"],
"dimensions": ["customers.region"]
}
Olytix Core automatically resolves the join path: order_items -> orders -> customers
Complete Example
Here is a comprehensive example with multiple cube relationships:
# cubes/orders.yml
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
description: "E-commerce order transactions"
measures:
- name: count
type: count
description: "Total number of orders"
- name: total_revenue
type: sum
sql: total_amount
format: currency
description: "Total revenue"
dimensions:
- name: order_id
type: number
sql: order_id
primary_key: true
- name: order_date
type: time
sql: order_date
granularities: [day, week, month, quarter, year]
- name: status
type: string
sql: status
joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: belongs_to
description: "Customer who placed the order"
- name: products
sql: "{orders}.product_id = {products}.product_id"
relationship: belongs_to
description: "Primary product in the order"
- name: shipments
sql: "{orders}.order_id = {shipments}.order_id"
relationship: has_one
description: "Shipment details"
# cubes/customers.yml
cubes:
- name: customers
sql: "SELECT * FROM {{ ref('dim_customers') }}"
description: "Customer master data"
measures:
- name: count
type: count
description: "Total customers"
dimensions:
- name: customer_id
type: number
sql: customer_id
primary_key: true
- name: name
type: string
sql: name
- name: email
type: string
sql: email
- name: region
type: string
sql: region
- name: created_at
type: time
sql: created_at
granularities: [day, month, year]
joins:
- name: orders
sql: "{customers}.customer_id = {orders}.customer_id"
relationship: has_many
description: "Orders placed by this customer"
# cubes/products.yml
cubes:
- name: products
sql: "SELECT * FROM {{ ref('dim_products') }}"
description: "Product catalog"
dimensions:
- name: product_id
type: number
sql: product_id
primary_key: true
- name: name
type: string
sql: name
- name: category
type: string
sql: category
- name: price
type: number
sql: price
# cubes/shipments.yml
cubes:
- name: shipments
sql: "SELECT * FROM {{ ref('fct_shipments') }}"
description: "Order shipment tracking"
dimensions:
- name: shipment_id
type: number
sql: shipment_id
primary_key: true
- name: carrier
type: string
sql: carrier
- name: shipped_at
type: time
sql: shipped_at
granularities: [day, week, month]
- name: delivered_at
type: time
sql: delivered_at
granularities: [day, week, month]
Querying with Joins
Once joins are defined, you can query across cubes:
{
"measures": ["orders.total_revenue"],
"dimensions": ["customers.region", "products.category"],
"filters": [
{
"dimension": "orders.order_date",
"operator": "gte",
"values": ["2024-01-01"]
}
]
}
Best Practices
Define Joins on the "Many" Side
For belongs_to relationships, define the join on the cube with many records:
# Good: Join defined on orders (many) pointing to customers (one)
cubes:
- name: orders
joins:
- name: customers
relationship: belongs_to
Use Primary Keys
Ensure both cubes have primary key dimensions for efficient joins:
dimensions:
- name: customer_id
type: number
sql: customer_id
primary_key: true
Document Relationships
Include descriptions explaining the relationship:
joins:
- name: customers
sql: "{orders}.customer_id = {customers}.customer_id"
relationship: belongs_to
description: |
Links to the customer who placed the order.
NULL if customer record has been deleted.
Avoid Circular Joins
Do not create circular join paths:
# Avoid this pattern
cubes:
- name: orders
joins:
- name: customers # orders -> customers
- name: customers
joins:
- name: orders # customers -> orders (creates cycle)
Instead, define the relationship from one direction only, or use separate cubes for different query patterns.
Consider Join Performance
For large tables, ensure join columns are indexed in your database:
-- In your database
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_orders_product_id ON orders(product_id);
Next Steps
- Pre-aggregations - Optimize multi-cube queries
- Cube Fundamentals - Review cube basics
- Measures - Define aggregations for joined queries