Snowflake Integration
Snowflake is a cloud-native data warehouse that provides automatic scaling, near-zero maintenance, and pay-per-use pricing. Olytix Core integrates with Snowflake using the official Python connector.
Prerequisites
- Snowflake account with ACCOUNTADMIN or appropriate role access
- Network access from Olytix Core to Snowflake (or PrivateLink/VPN configured)
- User account with access to required databases and schemas
Installation
Install the Snowflake connector:
pip install olytix-core[snowflake]
# Or install the connector directly
pip install snowflake-connector-python
Configuration
Basic Configuration
Add the warehouse configuration to your olytix-core_project.yml:
name: my_analytics
version: 1.0.0
warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
password: ${OLYTIX_SNOWFLAKE_PASSWORD}
database: ANALYTICS
schema: PUBLIC
warehouse: COMPUTE_WH
role: ANALYTICS_ROLE
Environment Variables
Set your credentials using environment variables:
export OLYTIX_SNOWFLAKE_USER=analytics_user
export OLYTIX_SNOWFLAKE_PASSWORD=your_secure_password
Account Identifier Format
The account identifier varies by cloud provider and region:
| Provider | Format | Example |
|---|---|---|
| AWS | <account>.<region> | xy12345.us-east-1 |
| AWS (legacy) | <account>.<region>.aws | xy12345.us-east-1.aws |
| Azure | <account>.<region>.azure | xy12345.west-us-2.azure |
| GCP | <account>.<region>.gcp | xy12345.us-central1.gcp |
Complete Configuration Options
warehouse:
type: snowflake
# Account settings
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
password: ${OLYTIX_SNOWFLAKE_PASSWORD}
# Database settings
database: ANALYTICS
schema: PUBLIC
# Compute settings
warehouse: COMPUTE_WH
role: ANALYTICS_ROLE
# Connection settings
connection_timeout: 30 # seconds
query_timeout: 300 # seconds
# Optional settings
extra:
client_session_keep_alive: true
client_prefetch_threads: 4
Authentication Methods
Password Authentication
Standard username/password authentication:
warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
password: ${OLYTIX_SNOWFLAKE_PASSWORD}
Key Pair Authentication
More secure authentication using RSA key pairs:
warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
extra:
private_key_path: /path/to/rsa_key.p8
private_key_passphrase: ${OLYTIX_SNOWFLAKE_KEY_PASSPHRASE}
Generate a key pair:
# Generate private key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8
# Generate public key
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
# Set the public key in Snowflake
# ALTER USER analytics_user SET RSA_PUBLIC_KEY='<public_key_content>';
External Browser Authentication
For interactive use (development only):
warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
extra:
authenticator: externalbrowser
OAuth Authentication
For enterprise SSO integration:
warehouse:
type: snowflake
account: xy12345.us-east-1
extra:
authenticator: oauth
token: ${OLYTIX_SNOWFLAKE_OAUTH_TOKEN}
Warehouse and Role Management
Warehouse Configuration
Olytix Core can automatically manage Snowflake warehouse state:
warehouse:
type: snowflake
warehouse: COMPUTE_WH
extra:
auto_resume: true # Resume suspended warehouse
auto_suspend: true # Suspend after query completion
Multiple Warehouses
Use different warehouses for different workloads:
# Default warehouse for queries
warehouse:
type: snowflake
warehouse: QUERY_WH
# Override for heavy transformations
warehouse_overrides:
models:
- name: "*_heavy"
warehouse: TRANSFORM_WH_XL
Role-Based Access
Configure role for appropriate permissions:
warehouse:
type: snowflake
role: ANALYTICS_ROLE
Recommended role hierarchy:
-- Create roles
CREATE ROLE OLYTIX_READ;
CREATE ROLE OLYTIX_TRANSFORM;
CREATE ROLE OLYTIX_ADMIN;
-- Grant permissions
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE OLYTIX_READ;
GRANT USAGE ON DATABASE ANALYTICS TO ROLE OLYTIX_READ;
GRANT SELECT ON ALL TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE OLYTIX_READ;
GRANT ROLE OLYTIX_READ TO ROLE OLYTIX_TRANSFORM;
GRANT CREATE TABLE ON SCHEMA ANALYTICS.MARTS TO ROLE OLYTIX_TRANSFORM;
GRANT ROLE OLYTIX_TRANSFORM TO ROLE OLYTIX_ADMIN;
Testing the Connection
Verify your connection is working:
olytix-core test-connection
Expected output:
Testing connection to snowflake://analytics_user@xy12345.us-east-1/ANALYTICS
✓ Connection successful
✓ Warehouse COMPUTE_WH available
✓ Role ANALYTICS_ROLE active
✓ Schema 'PUBLIC' accessible
✓ Found 42 tables
Using Python:
from olytix-core.engine.connectors.snowflake import SnowflakeConnector
from olytix-core.engine.connectors.base import ConnectionConfig, ConnectorType
config = ConnectionConfig(
type=ConnectorType.SNOWFLAKE,
account="xy12345.us-east-1",
username="analytics_user",
password="your_password",
database="ANALYTICS",
schema="PUBLIC",
warehouse="COMPUTE_WH",
role="ANALYTICS_ROLE",
)
async def test():
connector = SnowflakeConnector(config)
async with connector:
result = await connector.execute("SELECT CURRENT_VERSION()")
print(f"Snowflake version: {result.data[0]}")
tables = await connector.get_tables()
print(f"Available tables: {tables}")
Defining Sources
Once connected, define your data sources:
# sources/snowflake_raw.yml
version: 2
sources:
- name: raw
database: RAW_DATA
schema: ECOMMERCE
tables:
- name: ORDERS
description: Raw e-commerce orders
columns:
- name: ORDER_ID
description: Order identifier
- name: CUSTOMER_ID
description: Customer reference
- name: ORDER_TOTAL
description: Order total in USD
- name: ORDER_DATE
description: Order timestamp
- name: CUSTOMERS
description: Customer master data
freshness:
warn_after: { count: 24, period: hour }
error_after: { count: 48, period: hour }
Snowflake-Specific Features
Query Metadata
Olytix Core captures Snowflake query metadata:
result = await connector.execute("SELECT * FROM orders LIMIT 100")
print(f"Query ID: {result.metadata.get('query_id')}")
print(f"Execution time: {result.execution_time_ms}ms")
Query History
Access Snowflake query history for debugging:
history = await connector.get_query_history(
limit=50,
start_time="2024-01-01T00:00:00Z"
)
for query in history:
print(f"{query['QUERY_ID']}: {query['EXECUTION_TIME']}ms")
Warehouse Management
Programmatically manage warehouses:
# Switch warehouse
await connector.set_warehouse("TRANSFORM_WH_XL")
# Suspend warehouse when done
await connector.suspend_warehouse("TRANSFORM_WH_XL")
# Resume warehouse
await connector.resume_warehouse("COMPUTE_WH")
Clustering
Create clustered tables for optimal query performance:
await connector.create_table(
table="orders_clustered",
columns={
"order_id": "NUMBER",
"customer_id": "NUMBER",
"order_date": "TIMESTAMP_NTZ",
"amount": "NUMBER(10,2)",
},
cluster_by=["order_date", "customer_id"]
)
Performance Optimization
Warehouse Sizing
Choose appropriate warehouse size based on workload:
| Size | Use Case | Credits/Hour |
|---|---|---|
| X-Small | Development, small queries | 1 |
| Small | Light analytics | 2 |
| Medium | Standard workloads | 4 |
| Large | Complex queries | 8 |
| X-Large+ | Heavy transformations | 16+ |
Query Optimization
# Enable result caching
warehouse:
type: snowflake
extra:
use_cached_result: true
-- Add search optimization for frequent filters
ALTER TABLE orders ADD SEARCH OPTIMIZATION ON EQUALITY(customer_id);
-- Create materialized views for common aggregations
CREATE MATERIALIZED VIEW mv_daily_revenue AS
SELECT
DATE_TRUNC('day', order_date) as date,
SUM(order_total) as revenue
FROM orders
GROUP BY 1;
Pre-aggregation Tables
Olytix Core can create pre-aggregation tables in Snowflake:
# cubes/orders.yml
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"
pre_aggregations:
- name: daily_revenue
measures: [total_revenue, order_count]
dimensions: [region]
time_dimension: order_date
granularity: day
refresh_key:
every: "1 hour"
build_range_start: "2023-01-01"
Cost Management
Monitor Credit Usage
-- Query credit usage
SELECT
DATE_TRUNC('day', start_time) as date,
warehouse_name,
SUM(credits_used) as total_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -30, CURRENT_DATE())
GROUP BY 1, 2
ORDER BY 1 DESC, 3 DESC;
Auto-Suspend Configuration
-- Set auto-suspend to 1 minute
ALTER WAREHOUSE COMPUTE_WH SET AUTO_SUSPEND = 60;
Resource Monitors
-- Create a resource monitor
CREATE RESOURCE MONITOR olytix-core_monitor
WITH CREDIT_QUOTA = 1000
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
TRIGGERS
ON 75 PERCENT DO NOTIFY
ON 90 PERCENT DO SUSPEND
ON 100 PERCENT DO SUSPEND_IMMEDIATE;
-- Assign to warehouse
ALTER WAREHOUSE COMPUTE_WH SET RESOURCE_MONITOR = olytix-core_monitor;
Troubleshooting
Account Not Found
Error: Account 'xy12345' not found
Solutions:
- Verify account identifier format includes region
- Check for typos in account name
- Verify network connectivity to Snowflake
Warehouse Suspended
Error: Warehouse 'COMPUTE_WH' is suspended
Solutions:
- Enable
auto_resume: truein configuration - Manually resume:
ALTER WAREHOUSE COMPUTE_WH RESUME; - Check resource monitor limits
Insufficient Privileges
Error: Insufficient privileges to operate on schema 'ANALYTICS.PUBLIC'
Solutions:
- Verify role has required permissions
- Check role is assigned to user
- Ensure database and schema grants are in place
Query Timeout
Error: Query exceeded timeout of 300 seconds
Solutions:
- Increase
query_timeoutin configuration - Use larger warehouse for complex queries
- Optimize query with clustering or materialized views
- Consider pre-aggregations