Skip to main content

Dimensions

For Data Analysts

Dimensions are the attributes used to group, filter, and slice your data. They provide the categorical context for your measures and enable multidimensional analysis.

What is a Dimension?

A dimension is an attribute that describes your data and can be used to:

  • Group aggregations (e.g., revenue by region)
  • Filter results (e.g., orders from 2024)
  • Drill down into details (e.g., from country to city)
┌─────────────────────────────────────────────────────────────┐
│ Dimension Grouping │
├─────────────────────────────────────────────────────────────┤
│ │
│ Raw Data Grouped Result │
│ ┌────────────────────┐ ┌────────────────────────┐ │
│ │ region │ amount │ │ region │ total │ │
│ │ US │ 100 │ │ US │ 350 │ │
│ │ EU │ 200 │ ───► │ EU │ 400 │ │
│ │ US │ 250 │ │ APAC │ 150 │ │
│ │ EU │ 200 │ └────────────────────────┘ │
│ │ APAC │ 150 │ GROUP BY region │
│ └────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘

Dimension Types

Olytix Core supports five dimension types:

TypeDescriptionExample
stringText valuesStatus, region, category
numberNumeric valuesCustomer ID, quantity tier
timeDate and timestamp valuesOrder date, created_at
booleanTrue/false valuesIs active, has discount
geoGeographic dataLocation, coordinates

Basic Dimension Definition

cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"

dimensions:
- name: status
type: string
sql: status
description: "Order status"

Dimension Configuration Options

Each dimension supports these properties:

dimensions:
- name: region
type: string
sql: region # Column or expression
title: "Geographic Region" # Display name
description: "Sales region" # Documentation
primary_key: false # Is this the primary key?
shown: true # Whether to expose in API
case: upper # Case transformation (upper, lower, original)
meta: # Custom metadata
category: geography

STRING Dimensions

String dimensions represent categorical text values:

dimensions:
# Simple string column
- name: status
type: string
sql: status
description: "Order status (pending, completed, cancelled)"

# With case transformation
- name: region
type: string
sql: region
case: upper
description: "Geographic region (uppercase)"

# Computed string dimension
- name: order_size
type: string
sql: |
CASE
WHEN total_amount < 50 THEN 'Small'
WHEN total_amount < 200 THEN 'Medium'
WHEN total_amount < 500 THEN 'Large'
ELSE 'Enterprise'
END
description: "Order size category based on amount"

# String from concatenation
- name: full_name
type: string
sql: "first_name || ' ' || last_name"
description: "Customer full name"

NUMBER Dimensions

Number dimensions represent numeric identifiers or categorical numbers:

dimensions:
# Primary key
- name: order_id
type: number
sql: order_id
primary_key: true
description: "Unique order identifier"

# Foreign key
- name: customer_id
type: number
sql: customer_id
description: "Customer identifier"

# Numeric category
- name: quantity_tier
type: number
sql: |
CASE
WHEN quantity <= 1 THEN 1
WHEN quantity <= 5 THEN 2
WHEN quantity <= 10 THEN 3
ELSE 4
END
description: "Quantity tier (1-4)"

# Extracted number
- name: order_year
type: number
sql: "EXTRACT(YEAR FROM order_date)"
description: "Year of the order"

TIME Dimensions

Time dimensions represent dates and timestamps with granularity support:

dimensions:
# Basic time dimension
- name: order_date
type: time
sql: order_date
description: "Date the order was placed"

# Time dimension with granularities
- name: created_at
type: time
sql: created_at
granularities:
- second
- minute
- hour
- day
- week
- month
- quarter
- year
description: "Order creation timestamp"

# Commonly used granularities
- name: order_date
type: time
sql: order_date
granularities:
- day
- week
- month
- quarter
- year
description: "Order date with standard granularities"

Time Granularities

Olytix Core supports these granularity levels for time dimensions:

GranularityDescriptionExample Output
secondTruncate to second2024-03-15 10:30:45
minuteTruncate to minute2024-03-15 10:30:00
hourTruncate to hour2024-03-15 10:00:00
dayTruncate to day2024-03-15
weekTruncate to week start2024-03-11
monthTruncate to month2024-03-01
quarterTruncate to quarter2024-01-01
yearTruncate to year2024-01-01

Querying Time Dimensions

When querying, specify the granularity using dot notation:

{
"measures": ["orders.total_revenue"],
"dimensions": ["orders.order_date.month"]
}

This generates SQL like:

SELECT
DATE_TRUNC('month', order_date) AS order_date_month,
SUM(total_amount) AS total_revenue
FROM fct_orders
GROUP BY DATE_TRUNC('month', order_date)

BOOLEAN Dimensions

Boolean dimensions represent true/false conditions:

dimensions:
# Direct boolean column
- name: is_active
type: boolean
sql: is_active
description: "Whether the customer is active"

# Computed boolean
- name: has_discount
type: boolean
sql: "discount_amount > 0"
description: "Whether the order has a discount"

# Complex condition
- name: is_high_value_customer
type: boolean
sql: |
EXISTS (
SELECT 1 FROM customer_stats cs
WHERE cs.customer_id = orders.customer_id
AND cs.lifetime_value > 1000
)
description: "Customer lifetime value exceeds $1000"

# Boolean from status
- name: is_completed
type: boolean
sql: "status = 'completed'"
description: "Whether the order is completed"

GEO Dimensions

Geo dimensions represent geographic data:

dimensions:
# Country-level
- name: country
type: geo
sql: country_code
description: "ISO country code"

# Coordinates
- name: location
type: geo
sql: "ST_POINT(longitude, latitude)"
description: "Order delivery location"

# Region
- name: shipping_region
type: geo
sql: shipping_region
description: "Shipping region"

Primary Keys

Every cube should have a primary key dimension for proper joins and lineage:

dimensions:
- name: order_id
type: number
sql: order_id
primary_key: true
description: "Unique order identifier"

Primary keys are used for:

  • Efficient join operations
  • Row-level deduplication
  • Drill-down queries
  • Lineage tracking

Computed Dimensions

Create dimensions from SQL expressions:

dimensions:
# Date extraction
- name: day_of_week
type: string
sql: "TO_CHAR(order_date, 'Day')"
description: "Day of week name"

- name: is_weekend
type: boolean
sql: "EXTRACT(DOW FROM order_date) IN (0, 6)"
description: "Whether order was placed on weekend"

# String manipulation
- name: email_domain
type: string
sql: "SPLIT_PART(email, '@', 2)"
description: "Customer email domain"

# Bucketing
- name: age_group
type: string
sql: |
CASE
WHEN age < 18 THEN 'Under 18'
WHEN age < 25 THEN '18-24'
WHEN age < 35 THEN '25-34'
WHEN age < 45 THEN '35-44'
WHEN age < 55 THEN '45-54'
ELSE '55+'
END
description: "Customer age group"

# Conditional logic
- name: customer_segment
type: string
sql: |
CASE
WHEN lifetime_value >= 10000 THEN 'Enterprise'
WHEN lifetime_value >= 1000 THEN 'Premium'
WHEN lifetime_value >= 100 THEN 'Standard'
ELSE 'Basic'
END
description: "Customer segment based on lifetime value"

Visibility

Control which dimensions are exposed in the API:

dimensions:
# Visible dimension (default)
- name: status
type: string
sql: status
shown: true

# Hidden dimension (used internally for joins/calculations)
- name: internal_flag
type: string
sql: internal_flag
shown: false

Hidden dimensions can still be used in:

  • Join conditions
  • Measure filters
  • Other dimension calculations

Complete Example

Here is a comprehensive orders cube with various dimension types:

cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
description: "E-commerce order transactions"

dimensions:
# Primary key
- name: order_id
type: number
sql: order_id
primary_key: true
description: "Unique order identifier"

# Time dimensions
- name: order_date
type: time
sql: order_date
granularities:
- day
- week
- month
- quarter
- year
description: "Date the order was placed"

- name: shipped_at
type: time
sql: shipped_at
granularities:
- day
- week
- month
description: "Shipment date"

# String dimensions
- name: status
type: string
sql: status
description: "Order status"

- name: region
type: string
sql: region
case: upper
description: "Geographic region"

- name: payment_method
type: string
sql: payment_method
description: "Payment method used"

# Number dimensions
- name: customer_id
type: number
sql: customer_id
description: "Customer identifier"

# Boolean dimensions
- name: has_discount
type: boolean
sql: "discount_amount > 0"
description: "Whether the order has a discount"

- name: is_first_order
type: boolean
sql: is_first_order
description: "Customer's first order"

# Computed dimensions
- name: order_size
type: string
sql: |
CASE
WHEN total_amount < 50 THEN 'Small'
WHEN total_amount < 200 THEN 'Medium'
WHEN total_amount < 500 THEN 'Large'
ELSE 'Enterprise'
END
description: "Order size category"

- name: day_of_week
type: string
sql: "TO_CHAR(order_date, 'Day')"
description: "Day of week the order was placed"

- name: is_weekend
type: boolean
sql: "EXTRACT(DOW FROM order_date) IN (0, 6)"
description: "Order placed on weekend"

- name: days_to_ship
type: number
sql: "DATE_PART('day', shipped_at - order_date)"
description: "Days between order and shipment"

measures:
- name: count
type: count
description: "Total number of orders"

- name: total_revenue
type: sum
sql: total_amount
format: currency
description: "Total revenue"

Best Practices

Always Define Primary Keys

dimensions:
- name: order_id
type: number
sql: order_id
primary_key: true

Use Appropriate Granularities

Only include granularities that make sense for your data:

# For daily data, don't include hour/minute/second
- name: order_date
type: time
sql: order_date
granularities:
- day
- week
- month
- quarter
- year

Document Expected Values

- name: status
type: string
sql: status
description: |
Order status. Possible values:
- pending: Order received, not processed
- processing: Order being prepared
- shipped: Order shipped
- completed: Order delivered
- cancelled: Order cancelled
- refunded: Order refunded

Use Computed Dimensions for Categories

# Instead of raw numeric values, create meaningful categories
- name: customer_tier
type: string
sql: |
CASE
WHEN lifetime_orders >= 50 THEN 'Platinum'
WHEN lifetime_orders >= 20 THEN 'Gold'
WHEN lifetime_orders >= 5 THEN 'Silver'
ELSE 'Bronze'
END
description: "Customer loyalty tier"

Keep Dimension Names Consistent

Use the same dimension names across related cubes:

# orders cube
- name: customer_id
type: number
sql: customer_id

# customers cube
- name: customer_id
type: number
sql: customer_id
primary_key: true

Next Steps