Architecture Overview

For Data Analysts 8 min read

Olytix Core is built on a modern, cloud-native architecture designed for scalability, performance, and extensibility. This page provides a technical overview of the key components and how they interact.

What you'll learn
  • Core architectural components and their roles
  • How semantic queries are translated to SQL
  • The query optimization pipeline
  • Warehouse adapter interface

High-Level Architecture

Olytix Core Architecture

BI Tools · Power BI · Python SDK · React Apps · REST Clients
▼
API Gateway (Entry Point): FastAPI + Authentication + Rate Limiting
  • REST API: HTTP/JSON
  • GraphQL: Flexible queries
  • DAX / XMLA: Power BI protocol
  • Auth: JWT + API keys
▼
Core Services (Processing): Business Logic & Processing
  • Compiler: YAML → SQL
  • Query Planner: Semantic → SQL
  • Lineage Service: Column tracking
  • Metadata: Catalog & Registry
  • Security Engine: RLS & Masking
  • Pre-aggregation: Performance cache
▼
Resilience Layer (Protection): Fault Tolerance & Performance
  • Circuit Breaker: Fail-fast on errors (CLOSED → OPEN → HALF)
  • Bulkhead: Tenant isolation (A, B, C)
  • Retry Policy: Exponential backoff (1s, 2s, 4s, 8s)
  • Rate Limiting: Request throttling
▼
DataFusion Query Engine (Execution): Apache Arrow-based SQL Processing
  • SQL Parser (parse & tokenize) → Logical Plan (AST building) → Optimizer (cost-based) → Physical Plan (execution ready)
  • Apache Arrow: Zero-copy columnar in-memory format
▼
Data Warehouse Adapters (Data Sources): Pluggable Connectivity Layer
  • PostgreSQL: GA
  • Snowflake: GA
  • BigQuery: GA
  • Databricks: GA
  • Redshift: GA
  • DuckDB: GA

Technology Stack
  • FastAPI
  • Python 3.11+
  • Apache Arrow
  • DataFusion
  • Strawberry GraphQL
  • Redis
  • Prometheus
  • OpenTelemetry

Performance Targets
  • API Response (p95): < 200ms
  • Query Execution (p95): < 5s
  • Full Compilation: < 60s
  • Cache Hit Rate: > 80%
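
The Retry Policy entry in the Resilience Layer lists an exponential backoff of 1s, 2s, 4s, 8s. The following is a minimal sketch of that schedule, not Olytix Core's implementation: `backoff_delays` and `retry` are hypothetical names, and the `sleep` parameter is injectable only so the example can run without waiting.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 1.0, factor: float = 2.0, attempts: int = 4) -> Iterator[float]:
    """Yield base, base*factor, ... (1s, 2s, 4s, 8s with the defaults)."""
    delay = base
    for _ in range(attempts):
        yield delay
        delay *= factor


def retry(op: Callable[[], T], sleep: Callable[[float], None] = time.sleep) -> T:
    """Call op; on failure wait for the next delay and try again.

    Once the delay schedule is exhausted, the last exception is re-raised.
    """
    delays = backoff_delays()
    while True:
        try:
            return op()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                raise  # no retries left: surface the original error
            sleep(delay)
```

A production policy would typically retry only on errors known to be transient and add random jitter to the delays so that many clients do not retry in lockstep.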

Core Components

API Layer

The API layer provides multiple interfaces for consuming the semantic layer:

| Interface | Use Case | Protocol |
| --- | --- | --- |
| REST API | General integrations, BI tools | HTTP/JSON |
| GraphQL API | Flexible queries, frontend apps | GraphQL |
| DAX API | Power BI native integration | XMLA/DAX |

Semantic Layer

The semantic layer is the heart of Olytix Core:

  • Cubes: Define analytical entities with measures and dimensions
  • Measures: Aggregation expressions (SUM, COUNT, AVG, etc.)
  • Dimensions: Categorical and temporal attributes
  • Metrics: Business KPIs composed from measures
  • Joins: Relationships between cubes
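
To make the relationships between these concepts concrete, here is a sketch of how they might be represented as plain Python data structures. Every class and field name below is hypothetical and chosen for illustration; it is not the Olytix Core model schema. The table and column names are borrowed from the SQL example later on this page.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Measure:
    name: str
    agg: str      # aggregation function, e.g. "SUM", "COUNT", "AVG"
    column: str   # source column the aggregation applies to

    def sql(self, alias: str) -> str:
        """Render the aggregation against a table alias."""
        return f"{self.agg}({alias}.{self.column})"


@dataclass(frozen=True)
class Dimension:
    name: str
    column: str
    kind: str = "categorical"  # or "time" for temporal attributes


@dataclass(frozen=True)
class Join:
    target: str  # name of the cube being joined
    on: str      # join condition


@dataclass
class Cube:
    name: str
    table: str
    measures: dict[str, Measure] = field(default_factory=dict)
    dimensions: dict[str, Dimension] = field(default_factory=dict)
    joins: list[Join] = field(default_factory=list)


@dataclass(frozen=True)
class Metric:
    name: str
    measures: tuple[str, ...]  # names of the measures this KPI is composed from


orders = Cube(
    name="orders",
    table="fct_orders",
    measures={"revenue": Measure("revenue", "SUM", "total_amount")},
    dimensions={"order_date": Dimension("order_date", "order_date", kind="time")},
    joins=[Join(target="customer", on="o.customer_id = c.customer_id")],
)
total_revenue = Metric("total_revenue", measures=("revenue",))
```

In this sketch a cube owns its measures, dimensions, and joins, while a metric only refers to measures by name, which mirrors the description of metrics as KPIs composed from measures.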

Query Engine

Built on Apache DataFusion and Apache Arrow:

  • Query Planner: Translates semantic queries to optimized SQL
  • Optimizer: Automatically applies predicate pushdown, join elimination, pre-aggregation matching, and subquery flattening
  • Executor: Manages query execution and result streaming
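
Result streaming means the caller receives results in batches as they become available rather than as one large payload. The sketch below illustrates the idea with an async generator. It is an assumption-laden stand-in: plain Python lists take the place of Arrow record batches, and `stream_batches` and `count_rows` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, Sequence


async def stream_batches(rows: Sequence[tuple], batch_size: int) -> AsyncIterator[list]:
    """Yield rows in fixed-size batches instead of one large result."""
    for start in range(0, len(rows), batch_size):
        await asyncio.sleep(0)  # hand control back to the event loop between batches
        yield list(rows[start:start + batch_size])


async def count_rows(batches: AsyncIterator[list]) -> int:
    """Example consumer: handles each batch as it arrives."""
    total = 0
    async for batch in batches:
        total += len(batch)
    return total
```

Because the consumer only ever holds one batch at a time, memory use stays bounded by the batch size rather than by the size of the full result.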

Deep Dive: Query Execution

Semantic Query to SQL

The Query Planner translates semantic queries into optimized SQL:

Semantic Query (Input)
{
  "metrics": ["total_revenue"],
  "dimensions": ["order_date.month", "customer.region"],
  "filters": [{"dimension": "order_date.year", "operator": "equals", "value": 2024}]
}

Optimized SQL (Output)
SELECT
  DATE_TRUNC('month', o.order_date) AS "order_date.month",
  c.region AS "customer.region",
  SUM(o.total_amount) AS "total_revenue"
FROM fct_orders o
JOIN dim_customers c ON o.customer_id = c.customer_id
WHERE EXTRACT(YEAR FROM o.order_date) = 2024
GROUP BY 1, 2
ORDER BY 1, 2
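
The translation above can be sketched as a small function. This is a deliberately simplified illustration, not the Query Planner itself: the lookup tables (`METRICS`, `DIMENSIONS`, `OPERATORS`) and the fixed `FROM_CLAUSE` are hypothetical stand-ins for the compiled semantic model, and join resolution is skipped entirely.

```python
# Hypothetical lookup tables standing in for the compiled semantic model.
METRICS = {"total_revenue": "SUM(o.total_amount)"}
DIMENSIONS = {
    "order_date.month": "DATE_TRUNC('month', o.order_date)",
    "order_date.year": "EXTRACT(YEAR FROM o.order_date)",
    "customer.region": "c.region",
}
OPERATORS = {"equals": "="}
# Assumption: joins are already resolved; a real planner derives them from the model.
FROM_CLAUSE = "FROM fct_orders o\nJOIN dim_customers c ON o.customer_id = c.customer_id"


def literal(value) -> str:
    """Render a filter value as a SQL literal (numbers bare, strings quoted)."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def to_sql(query: dict) -> str:
    dims = query.get("dimensions", [])
    # Dimensions come first so GROUP BY / ORDER BY can refer to them by position.
    select = [f'{DIMENSIONS[d]} AS "{d}"' for d in dims]
    select += [f'{METRICS[m]} AS "{m}"' for m in query["metrics"]]
    lines = ["SELECT", ",\n".join(f"  {s}" for s in select), FROM_CLAUSE]

    conditions = [
        f"{DIMENSIONS[flt['dimension']]} {OPERATORS[flt['operator']]} {literal(flt['value'])}"
        for flt in query.get("filters", [])
    ]
    if conditions:
        lines.append("WHERE " + " AND ".join(conditions))

    if dims:
        positions = ", ".join(str(i) for i in range(1, len(dims) + 1))
        lines += [f"GROUP BY {positions}", f"ORDER BY {positions}"]
    return "\n".join(lines)
```

Calling `to_sql` with the semantic query shown above produces SQL with the same clauses as the output shown above. A real planner would bind filter values as parameters rather than inlining them.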

Query Optimization

The optimizer applies several techniques automatically:

| Technique | Description | Benefit |
| --- | --- | --- |
| Predicate Pushdown | Filters pushed to warehouse level | Reduced data transfer |
| Join Elimination | Removes unnecessary joins | Faster execution |
| Pre-aggregation Matching | Uses cached aggregates when available | Sub-second responses |
| Subquery Flattening | Simplifies nested queries | Cleaner execution plans |
Performance Tip

Enable pre-aggregations for frequently queried measure/dimension combinations to achieve sub-second response times on large datasets.
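
One way to picture pre-aggregation matching: a cached aggregate can answer a query when it contains every metric the query asks for and every dimension the query groups or filters by. The sketch below implements only that subset check. `PreAggregation` and `match_preaggregation` are hypothetical names, and the sketch ignores refinements such as rolling a daily aggregate up to months or handling non-additive measures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PreAggregation:
    table: str
    measures: frozenset
    dimensions: frozenset


def match_preaggregation(query: dict, candidates: list) -> Optional[PreAggregation]:
    """Return a pre-aggregation that covers the query, or None if none does."""
    needed_measures = set(query["metrics"])
    needed_dims = set(query.get("dimensions", []))
    # Filtered dimensions must also be present, or the filter cannot be applied.
    needed_dims |= {flt["dimension"] for flt in query.get("filters", [])}

    usable = [
        p for p in candidates
        if needed_measures <= p.measures and needed_dims <= p.dimensions
    ]
    # Prefer the narrowest match: fewer dimensions usually means fewer rows to scan.
    return min(usable, key=lambda p: len(p.dimensions), default=None)
```

When no candidate covers the query, the function returns `None`, which corresponds to falling back to the base tables.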


Adapter Interface

Olytix Core uses a pluggable adapter architecture for warehouse connectivity:

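src/olytix_core/adapters/base.py
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator
import pyarrow as pa

class WarehouseAdapter(ABC):
    """Abstract interface for warehouse implementations."""

    @abstractmethod
    async def execute(self, sql: str) -> pa.RecordBatch: ...
    @abstractmethod
    async def execute_iter(self, sql: str) -> AsyncIterator[pa.RecordBatch]: ...
    @abstractmethod
    async def get_schema(self, table: str) -> dict[str, str]: ...
    @abstractmethod
    def get_dialect(self) -> "SQLDialect": ...  # SQLDialect: project type, not shown here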

Supported Warehouses

| Warehouse | Adapter | Status |
| --- | --- | --- |
| PostgreSQL | postgresql | Production |
| Snowflake | snowflake | Production |
| BigQuery | bigquery | Production |
| DuckDB | duckdb | Beta |

This abstraction allows Olytix Core to support multiple data warehouses while maintaining a consistent query execution model based on Apache Arrow.
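
A pluggable design like this usually needs some way to look up an adapter by name, such as the `postgresql` or `duckdb` identifiers in the table above. The sketch below shows one common pattern, a decorator-based registry. It is an assumption about how such a lookup could work, not a description of Olytix Core's code: `register_adapter`, `create_adapter`, and the stand-in `DuckDBAdapter` class are all hypothetical.

```python
from typing import Callable

# Hypothetical registry mapping an adapter name to the class that implements it.
_ADAPTERS: dict = {}


def register_adapter(name: str) -> Callable[[type], type]:
    """Class decorator that makes an adapter available under the given name."""
    def decorator(cls: type) -> type:
        _ADAPTERS[name] = cls
        return cls
    return decorator


def create_adapter(name: str, **config):
    """Instantiate the adapter registered under name with its connection config."""
    try:
        cls = _ADAPTERS[name]
    except KeyError:
        raise ValueError(
            f"Unknown adapter {name!r}; available: {sorted(_ADAPTERS)}"
        ) from None
    return cls(**config)


@register_adapter("duckdb")
class DuckDBAdapter:
    # Stand-in only: a real adapter would subclass WarehouseAdapter
    # and implement execute, execute_iter, get_schema, and get_dialect.
    def __init__(self, path: str = ":memory:") -> None:
        self.path = path
```

With a registry like this, supporting a new warehouse means adding one decorated class, and the rest of the system keeps selecting adapters by name.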


Data Flow

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │────▶│     API     │────▶│    Query    │
│   Request   │     │    Layer    │     │   Planner   │
└─────────────┘     └─────────────┘     └──────┬──────┘
                                               │
                                               ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Response   │◀────│   Result    │◀────│  Warehouse  │
│   (Arrow)   │     │  Processor  │     │   Adapter   │
└─────────────┘     └─────────────┘     └─────────────┘