Dimensions
Dimensions are the attributes used to group, filter, and slice your data. They provide the categorical context for your measures and enable multidimensional analysis.
What is a Dimension?
A dimension is an attribute that describes your data and can be used to:
- Group aggregations (e.g., revenue by region)
- Filter results (e.g., orders from 2024)
- Drill down into details (e.g., from country to city)
┌─────────────────────────────────────────────────────────────┐
│ Dimension Grouping │
├─────────────────────────────────────────────────────────────┤
│ │
│ Raw Data Grouped Result │
│ ┌────────────────────┐ ┌────────────────────────┐ │
│ │ region │ amount │ │ region │ total │ │
│ │ US │ 100 │ │ US │ 350 │ │
│ │ EU │ 200 │ ───► │ EU │ 400 │ │
│ │ US │ 250 │ │ APAC │ 150 │ │
│ │ EU │ 200 │ └────────────────────────┘ │
│ │ APAC │ 150 │ GROUP BY region │
│ └────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Dimension Types
Olytix Core supports five dimension types:
| Type | Description | Example |
|---|---|---|
string | Text values | Status, region, category |
number | Numeric values | Customer ID, quantity tier |
time | Date and timestamp values | Order date, created_at |
boolean | True/false values | Is active, has discount |
geo | Geographic data | Location, coordinates |
Basic Dimension Definition
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
dimensions:
- name: status
type: string
sql: status
description: "Order status"
Dimension Configuration Options
Each dimension supports these properties:
dimensions:
- name: region
type: string
sql: region # Column or expression
title: "Geographic Region" # Display name
description: "Sales region" # Documentation
primary_key: false # Is this the primary key?
shown: true # Whether to expose in API
case: upper # Case transformation (upper, lower, original)
meta: # Custom metadata
category: geography
STRING Dimensions
String dimensions represent categorical text values:
dimensions:
# Simple string column
- name: status
type: string
sql: status
description: "Order status (pending, completed, cancelled)"
# With case transformation
- name: region
type: string
sql: region
case: upper
description: "Geographic region (uppercase)"
# Computed string dimension
- name: order_size
type: string
sql: |
CASE
WHEN total_amount < 50 THEN 'Small'
WHEN total_amount < 200 THEN 'Medium'
WHEN total_amount < 500 THEN 'Large'
ELSE 'Enterprise'
END
description: "Order size category based on amount"
# String from concatenation
- name: full_name
type: string
sql: "first_name || ' ' || last_name"
description: "Customer full name"
NUMBER Dimensions
Number dimensions represent numeric identifiers or categorical numbers:
dimensions:
# Primary key
- name: order_id
type: number
sql: order_id
primary_key: true
description: "Unique order identifier"
# Foreign key
- name: customer_id
type: number
sql: customer_id
description: "Customer identifier"
# Numeric category
- name: quantity_tier
type: number
sql: |
CASE
WHEN quantity <= 1 THEN 1
WHEN quantity <= 5 THEN 2
WHEN quantity <= 10 THEN 3
ELSE 4
END
description: "Quantity tier (1-4)"
# Extracted number
- name: order_year
type: number
sql: "EXTRACT(YEAR FROM order_date)"
description: "Year of the order"
TIME Dimensions
Time dimensions represent dates and timestamps with granularity support:
dimensions:
# Basic time dimension
- name: order_date
type: time
sql: order_date
description: "Date the order was placed"
# Time dimension with granularities
- name: created_at
type: time
sql: created_at
granularities:
- second
- minute
- hour
- day
- week
- month
- quarter
- year
description: "Order creation timestamp"
# Commonly used granularities
- name: order_date
type: time
sql: order_date
granularities:
- day
- week
- month
- quarter
- year
description: "Order date with standard granularities"
Time Granularities
Olytix Core supports these granularity levels for time dimensions:
| Granularity | Description | Example Output |
|---|---|---|
second | Truncate to second | 2024-03-15 10:30:45 |
minute | Truncate to minute | 2024-03-15 10:30:00 |
hour | Truncate to hour | 2024-03-15 10:00:00 |
day | Truncate to day | 2024-03-15 |
week | Truncate to week start | 2024-03-11 |
month | Truncate to month | 2024-03-01 |
quarter | Truncate to quarter | 2024-01-01 |
year | Truncate to year | 2024-01-01 |
Querying Time Dimensions
When querying, specify the granularity using dot notation:
{
"measures": ["orders.total_revenue"],
"dimensions": ["orders.order_date.month"]
}
This generates SQL like:
SELECT
DATE_TRUNC('month', order_date) AS order_date_month,
SUM(total_amount) AS total_revenue
FROM fct_orders
GROUP BY DATE_TRUNC('month', order_date)
BOOLEAN Dimensions
Boolean dimensions represent true/false conditions:
dimensions:
# Direct boolean column
- name: is_active
type: boolean
sql: is_active
description: "Whether the customer is active"
# Computed boolean
- name: has_discount
type: boolean
sql: "discount_amount > 0"
description: "Whether the order has a discount"
# Complex condition
- name: is_high_value_customer
type: boolean
sql: |
EXISTS (
SELECT 1 FROM customer_stats cs
WHERE cs.customer_id = orders.customer_id
AND cs.lifetime_value > 1000
)
description: "Customer lifetime value exceeds $1000"
# Boolean from status
- name: is_completed
type: boolean
sql: "status = 'completed'"
description: "Whether the order is completed"
GEO Dimensions
Geo dimensions represent geographic data:
dimensions:
# Country-level
- name: country
type: geo
sql: country_code
description: "ISO country code"
# Coordinates
- name: location
type: geo
sql: "ST_POINT(longitude, latitude)"
description: "Order delivery location"
# Region
- name: shipping_region
type: geo
sql: shipping_region
description: "Shipping region"
Primary Keys
Every cube should have a primary key dimension for proper joins and lineage:
dimensions:
- name: order_id
type: number
sql: order_id
primary_key: true
description: "Unique order identifier"
Primary keys are used for:
- Efficient join operations
- Row-level deduplication
- Drill-down queries
- Lineage tracking
Computed Dimensions
Create dimensions from SQL expressions:
dimensions:
# Date extraction
- name: day_of_week
type: string
sql: "TO_CHAR(order_date, 'Day')"
description: "Day of week name"
- name: is_weekend
type: boolean
sql: "EXTRACT(DOW FROM order_date) IN (0, 6)"
description: "Whether order was placed on weekend"
# String manipulation
- name: email_domain
type: string
sql: "SPLIT_PART(email, '@', 2)"
description: "Customer email domain"
# Bucketing
- name: age_group
type: string
sql: |
CASE
WHEN age < 18 THEN 'Under 18'
WHEN age < 25 THEN '18-24'
WHEN age < 35 THEN '25-34'
WHEN age < 45 THEN '35-44'
WHEN age < 55 THEN '45-54'
ELSE '55+'
END
description: "Customer age group"
# Conditional logic
- name: customer_segment
type: string
sql: |
CASE
WHEN lifetime_value >= 10000 THEN 'Enterprise'
WHEN lifetime_value >= 1000 THEN 'Premium'
WHEN lifetime_value >= 100 THEN 'Standard'
ELSE 'Basic'
END
description: "Customer segment based on lifetime value"
Visibility
Control which dimensions are exposed in the API:
dimensions:
# Visible dimension (default)
- name: status
type: string
sql: status
shown: true
# Hidden dimension (used internally for joins/calculations)
- name: internal_flag
type: string
sql: internal_flag
shown: false
Hidden dimensions can still be used in:
- Join conditions
- Measure filters
- Other dimension calculations
Complete Example
Here is a comprehensive orders cube with various dimension types:
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
description: "E-commerce order transactions"
dimensions:
# Primary key
- name: order_id
type: number
sql: order_id
primary_key: true
description: "Unique order identifier"
# Time dimensions
- name: order_date
type: time
sql: order_date
granularities:
- day
- week
- month
- quarter
- year
description: "Date the order was placed"
- name: shipped_at
type: time
sql: shipped_at
granularities:
- day
- week
- month
description: "Shipment date"
# String dimensions
- name: status
type: string
sql: status
description: "Order status"
- name: region
type: string
sql: region
case: upper
description: "Geographic region"
- name: payment_method
type: string
sql: payment_method
description: "Payment method used"
# Number dimensions
- name: customer_id
type: number
sql: customer_id
description: "Customer identifier"
# Boolean dimensions
- name: has_discount
type: boolean
sql: "discount_amount > 0"
description: "Whether the order has a discount"
- name: is_first_order
type: boolean
sql: is_first_order
description: "Customer's first order"
# Computed dimensions
- name: order_size
type: string
sql: |
CASE
WHEN total_amount < 50 THEN 'Small'
WHEN total_amount < 200 THEN 'Medium'
WHEN total_amount < 500 THEN 'Large'
ELSE 'Enterprise'
END
description: "Order size category"
- name: day_of_week
type: string
sql: "TO_CHAR(order_date, 'Day')"
description: "Day of week the order was placed"
- name: is_weekend
type: boolean
sql: "EXTRACT(DOW FROM order_date) IN (0, 6)"
description: "Order placed on weekend"
- name: days_to_ship
type: number
sql: "DATE_PART('day', shipped_at - order_date)"
description: "Days between order and shipment"
measures:
- name: count
type: count
description: "Total number of orders"
- name: total_revenue
type: sum
sql: total_amount
format: currency
description: "Total revenue"
Best Practices
Always Define Primary Keys
dimensions:
- name: order_id
type: number
sql: order_id
primary_key: true
Use Appropriate Granularities
Only include granularities that make sense for your data:
# For daily data, don't include hour/minute/second
- name: order_date
type: time
sql: order_date
granularities:
- day
- week
- month
- quarter
- year
Document Expected Values
- name: status
type: string
sql: status
description: |
Order status. Possible values:
- pending: Order received, not processed
- processing: Order being prepared
- shipped: Order shipped
- completed: Order delivered
- cancelled: Order cancelled
- refunded: Order refunded
Use Computed Dimensions for Categories
# Instead of raw numeric values, create meaningful categories
- name: customer_tier
type: string
sql: |
CASE
WHEN lifetime_orders >= 50 THEN 'Platinum'
WHEN lifetime_orders >= 20 THEN 'Gold'
WHEN lifetime_orders >= 5 THEN 'Silver'
ELSE 'Bronze'
END
description: "Customer loyalty tier"
Keep Dimension Names Consistent
Use the same dimension names across related cubes:
# orders cube
- name: customer_id
type: number
sql: customer_id
# customers cube
- name: customer_id
type: number
sql: customer_id
primary_key: true
Next Steps
- Joins - Connect cubes using dimensions
- Measures - Define aggregations over dimensions
- Pre-aggregations - Optimize queries with time dimensions