Skip to main content

Snowflake Integration

For Data Analysts

Snowflake is a cloud-native data warehouse that provides automatic scaling, near-zero maintenance, and pay-per-use pricing. Olytix Core integrates with Snowflake using the official Python connector.

Prerequisites

  • Snowflake account with ACCOUNTADMIN or appropriate role access
  • Network access from Olytix Core to Snowflake (or PrivateLink/VPN configured)
  • User account with access to required databases and schemas

Installation

Install the Snowflake connector:

pip install olytix-core[snowflake]

# Or install the connector directly
pip install snowflake-connector-python

Configuration

Basic Configuration

Add the warehouse configuration to your olytix-core_project.yml:

name: my_analytics
version: 1.0.0

warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
password: ${OLYTIX_SNOWFLAKE_PASSWORD}
database: ANALYTICS
schema: PUBLIC
warehouse: COMPUTE_WH
role: ANALYTICS_ROLE

Environment Variables

Set your credentials using environment variables:

export OLYTIX_SNOWFLAKE_USER=analytics_user
export OLYTIX_SNOWFLAKE_PASSWORD=your_secure_password

Account Identifier Format

The account identifier varies by cloud provider and region:

ProviderFormatExample
AWS<account>.<region>xy12345.us-east-1
AWS (legacy)<account>.<region>.awsxy12345.us-east-1.aws
Azure<account>.<region>.azurexy12345.west-us-2.azure
GCP<account>.<region>.gcpxy12345.us-central1.gcp

Complete Configuration Options

warehouse:
type: snowflake

# Account settings
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
password: ${OLYTIX_SNOWFLAKE_PASSWORD}

# Database settings
database: ANALYTICS
schema: PUBLIC

# Compute settings
warehouse: COMPUTE_WH
role: ANALYTICS_ROLE

# Connection settings
connection_timeout: 30 # seconds
query_timeout: 300 # seconds

# Optional settings
extra:
client_session_keep_alive: true
client_prefetch_threads: 4

Authentication Methods

Password Authentication

Standard username/password authentication:

warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
password: ${OLYTIX_SNOWFLAKE_PASSWORD}

Key Pair Authentication

More secure authentication using RSA key pairs:

warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
extra:
private_key_path: /path/to/rsa_key.p8
private_key_passphrase: ${OLYTIX_SNOWFLAKE_KEY_PASSPHRASE}

Generate a key pair:

# Generate private key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8

# Generate public key
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub

# Set the public key in Snowflake
# ALTER USER analytics_user SET RSA_PUBLIC_KEY='<public_key_content>';

External Browser Authentication

For interactive use (development only):

warehouse:
type: snowflake
account: xy12345.us-east-1
user: ${OLYTIX_SNOWFLAKE_USER}
extra:
authenticator: externalbrowser

OAuth Authentication

For enterprise SSO integration:

warehouse:
type: snowflake
account: xy12345.us-east-1
extra:
authenticator: oauth
token: ${OLYTIX_SNOWFLAKE_OAUTH_TOKEN}

Warehouse and Role Management

Warehouse Configuration

Olytix Core can automatically manage Snowflake warehouse state:

warehouse:
type: snowflake
warehouse: COMPUTE_WH
extra:
auto_resume: true # Resume suspended warehouse
auto_suspend: true # Suspend after query completion

Multiple Warehouses

Use different warehouses for different workloads:

# Default warehouse for queries
warehouse:
type: snowflake
warehouse: QUERY_WH

# Override for heavy transformations
warehouse_overrides:
models:
- name: "*_heavy"
warehouse: TRANSFORM_WH_XL

Role-Based Access

Configure role for appropriate permissions:

warehouse:
type: snowflake
role: ANALYTICS_ROLE

Recommended role hierarchy:

-- Create roles
CREATE ROLE OLYTIX_READ;
CREATE ROLE OLYTIX_TRANSFORM;
CREATE ROLE OLYTIX_ADMIN;

-- Grant permissions
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE OLYTIX_READ;
GRANT USAGE ON DATABASE ANALYTICS TO ROLE OLYTIX_READ;
GRANT SELECT ON ALL TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE OLYTIX_READ;

GRANT ROLE OLYTIX_READ TO ROLE OLYTIX_TRANSFORM;
GRANT CREATE TABLE ON SCHEMA ANALYTICS.MARTS TO ROLE OLYTIX_TRANSFORM;

GRANT ROLE OLYTIX_TRANSFORM TO ROLE OLYTIX_ADMIN;

Testing the Connection

Verify your connection is working:

olytix-core test-connection

Expected output:

Testing connection to snowflake://analytics_user@xy12345.us-east-1/ANALYTICS
✓ Connection successful
✓ Warehouse COMPUTE_WH available
✓ Role ANALYTICS_ROLE active
✓ Schema 'PUBLIC' accessible
✓ Found 42 tables

Using Python:

from olytix-core.engine.connectors.snowflake import SnowflakeConnector
from olytix-core.engine.connectors.base import ConnectionConfig, ConnectorType

config = ConnectionConfig(
type=ConnectorType.SNOWFLAKE,
account="xy12345.us-east-1",
username="analytics_user",
password="your_password",
database="ANALYTICS",
schema="PUBLIC",
warehouse="COMPUTE_WH",
role="ANALYTICS_ROLE",
)

async def test():
connector = SnowflakeConnector(config)
async with connector:
result = await connector.execute("SELECT CURRENT_VERSION()")
print(f"Snowflake version: {result.data[0]}")

tables = await connector.get_tables()
print(f"Available tables: {tables}")

Defining Sources

Once connected, define your data sources:

# sources/snowflake_raw.yml
version: 2

sources:
- name: raw
database: RAW_DATA
schema: ECOMMERCE
tables:
- name: ORDERS
description: Raw e-commerce orders
columns:
- name: ORDER_ID
description: Order identifier
- name: CUSTOMER_ID
description: Customer reference
- name: ORDER_TOTAL
description: Order total in USD
- name: ORDER_DATE
description: Order timestamp

- name: CUSTOMERS
description: Customer master data
freshness:
warn_after: { count: 24, period: hour }
error_after: { count: 48, period: hour }

Snowflake-Specific Features

Query Metadata

Olytix Core captures Snowflake query metadata:

result = await connector.execute("SELECT * FROM orders LIMIT 100")

print(f"Query ID: {result.metadata.get('query_id')}")
print(f"Execution time: {result.execution_time_ms}ms")

Query History

Access Snowflake query history for debugging:

history = await connector.get_query_history(
limit=50,
start_time="2024-01-01T00:00:00Z"
)

for query in history:
print(f"{query['QUERY_ID']}: {query['EXECUTION_TIME']}ms")

Warehouse Management

Programmatically manage warehouses:

# Switch warehouse
await connector.set_warehouse("TRANSFORM_WH_XL")

# Suspend warehouse when done
await connector.suspend_warehouse("TRANSFORM_WH_XL")

# Resume warehouse
await connector.resume_warehouse("COMPUTE_WH")

Clustering

Create clustered tables for optimal query performance:

await connector.create_table(
table="orders_clustered",
columns={
"order_id": "NUMBER",
"customer_id": "NUMBER",
"order_date": "TIMESTAMP_NTZ",
"amount": "NUMBER(10,2)",
},
cluster_by=["order_date", "customer_id"]
)

Performance Optimization

Warehouse Sizing

Choose appropriate warehouse size based on workload:

SizeUse CaseCredits/Hour
X-SmallDevelopment, small queries1
SmallLight analytics2
MediumStandard workloads4
LargeComplex queries8
X-Large+Heavy transformations16+

Query Optimization

# Enable result caching
warehouse:
type: snowflake
extra:
use_cached_result: true
-- Add search optimization for frequent filters
ALTER TABLE orders ADD SEARCH OPTIMIZATION ON EQUALITY(customer_id);

-- Create materialized views for common aggregations
CREATE MATERIALIZED VIEW mv_daily_revenue AS
SELECT
DATE_TRUNC('day', order_date) as date,
SUM(order_total) as revenue
FROM orders
GROUP BY 1;

Pre-aggregation Tables

Olytix Core can create pre-aggregation tables in Snowflake:

# cubes/orders.yml
cubes:
- name: orders
sql: "SELECT * FROM {{ ref('fct_orders') }}"

pre_aggregations:
- name: daily_revenue
measures: [total_revenue, order_count]
dimensions: [region]
time_dimension: order_date
granularity: day
refresh_key:
every: "1 hour"
build_range_start: "2023-01-01"

Cost Management

Monitor Credit Usage

-- Query credit usage
SELECT
DATE_TRUNC('day', start_time) as date,
warehouse_name,
SUM(credits_used) as total_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -30, CURRENT_DATE())
GROUP BY 1, 2
ORDER BY 1 DESC, 3 DESC;

Auto-Suspend Configuration

-- Set auto-suspend to 1 minute
ALTER WAREHOUSE COMPUTE_WH SET AUTO_SUSPEND = 60;

Resource Monitors

-- Create a resource monitor
CREATE RESOURCE MONITOR olytix-core_monitor
WITH CREDIT_QUOTA = 1000
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
TRIGGERS
ON 75 PERCENT DO NOTIFY
ON 90 PERCENT DO SUSPEND
ON 100 PERCENT DO SUSPEND_IMMEDIATE;

-- Assign to warehouse
ALTER WAREHOUSE COMPUTE_WH SET RESOURCE_MONITOR = olytix-core_monitor;

Troubleshooting

Account Not Found

Error: Account 'xy12345' not found

Solutions:

  1. Verify account identifier format includes region
  2. Check for typos in account name
  3. Verify network connectivity to Snowflake

Warehouse Suspended

Error: Warehouse 'COMPUTE_WH' is suspended

Solutions:

  1. Enable auto_resume: true in configuration
  2. Manually resume: ALTER WAREHOUSE COMPUTE_WH RESUME;
  3. Check resource monitor limits

Insufficient Privileges

Error: Insufficient privileges to operate on schema 'ANALYTICS.PUBLIC'

Solutions:

  1. Verify role has required permissions
  2. Check role is assigned to user
  3. Ensure database and schema grants are in place

Query Timeout

Error: Query exceeded timeout of 300 seconds

Solutions:

  1. Increase query_timeout in configuration
  2. Use larger warehouse for complex queries
  3. Optimize query with clustering or materialized views
  4. Consider pre-aggregations

Next Steps