Skip to main content

AI/ML Integration

For Business Users

AI and machine learning require high-quality, well-documented data. Olytix Core provides the governance, consistency, and access controls needed to safely power AI initiatives while maintaining data trust.

The AI Data Challenge

AI projects often fail due to data issues:

Common AI Project Failures
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Data Quality (45%)
├── Inconsistent definitions
├── Missing values
├── Outdated data
└── No documentation

Data Access (30%)
├── Can't find the right data
├── No access permissions
├── Data silos
└── Security concerns

Data Governance (25%)
├── No lineage for model inputs
├── Can't explain predictions
├── Compliance violations
└── No version control
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Olytix Core AI Solution

Governed Data for AI

┌─────────────────────────────────────────────────────────────────────┐
│ AI-Ready Data Architecture │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Raw Data Olytix Core Semantic AI/ML Applications │
│ Sources Layer (Governed Access) │
│ │
│ ┌─────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ CRM │────►│ │────►│ Natural Language Query │ │
│ └─────────┘ │ │ └─────────────────────────┘ │
│ │ Metrics │ │
│ ┌─────────┐ │ Cubes │ ┌─────────────────────────┐ │
│ │ ERP │────►│ Lineage │────►│ Predictive Models │ │
│ └─────────┘ │ Security │ └─────────────────────────┘ │
│ │ │ │
│ ┌─────────┐ │ │ ┌─────────────────────────┐ │
│ │ Product │────►│ │────►│ Recommendation Engine │ │
│ └─────────┘ └─────────────┘ └─────────────────────────┘ │
│ │ │
│ ▼ │
│ Complete Audit Trail │
│ Model Input Lineage │
│ Explainable AI Ready │
└─────────────────────────────────────────────────────────────────────┘

Natural Language Querying

Ask Questions in Plain English

Olytix Core's AI integration enables natural language data access:

User: "What was our revenue last quarter compared to the same quarter last year?"

Olytix Core AI Translation:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Understanding:
• Metric: revenue (net_revenue)
• Time Period: Q4 2024
• Comparison: Q4 2023 (same quarter prior year)

Generated Query:
{
"metrics": ["net_revenue"],
"dimensions": ["orders.order_date.quarter"],
"time_intelligence": {
"compare_to": "same_period_prior_year"
},
"filters": [
{"dimension": "orders.order_date.quarter", "operator": "equals", "value": "2024-Q4"}
]
}

Result:
Q4 2024: $4.2M
Q4 2023: $3.8M
Change: +10.5%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Semantic Understanding

The AI understands your business terminology:

# Configure business terminology mappings
ai:
terminology:
- term: "sales"
maps_to: "net_revenue"
context: "Usually refers to net revenue unless gross is specified"

- term: "customers"
maps_to: "customers.active_count"
context: "Default to active customers"

- term: "last quarter"
maps_to: "prior_quarter"
context: "Most recent completed quarter"

- term: "growth"
maps_to: "period_over_period_change"
context: "Compare to prior period"

Finding Relevant Metrics

User: "I need data about customer satisfaction"

Olytix Core Semantic Search Results:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Highly Relevant (>90% match):
├── nps_score (Net Promoter Score)
│ Description: "Customer satisfaction score from -100 to 100"
│ Cube: customers
│ Certified: ✓

├── customer_satisfaction_rating
│ Description: "Average support ticket satisfaction (1-5)"
│ Cube: support
│ Certified: ✓

└── csat_score
Description: "Post-interaction satisfaction percentage"
Cube: interactions
Certified: ✓

Related Metrics:
├── customer_churn_rate (Customer retention indicator)
├── support_ticket_count (Volume of issues)
└── avg_resolution_time (Service quality)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Embedding-Based Discovery

# Configure embeddings for semantic search
embeddings:
model: "text-embedding-3-small"
index:
- type: metrics
fields: [name, description, calculation_notes]

- type: dimensions
fields: [name, description, example_values]

- type: cubes
fields: [name, description, business_context]

search:
min_relevance: 0.7
max_results: 10
boost_certified: 1.2

Machine Learning Features

Feature Store Integration

from olytix-core import Olytix CoreClient

client = Olytix CoreClient("http://localhost:8000")

# Get features for ML model
def get_customer_features(customer_ids: list):
"""Get features for churn prediction model."""
return client.query(
measures=[
"customers.total_revenue",
"customers.order_count",
"customers.avg_order_value",
"customers.days_since_last_order",
"customers.support_ticket_count",
"customers.nps_score"
],
dimensions=[
"customers.customer_id",
"customers.segment",
"customers.tenure_months"
],
filters=[
{"dimension": "customers.customer_id", "operator": "inList", "value": customer_ids}
]
).to_dataframe()

# Use in ML pipeline
features = get_customer_features(customer_list)
predictions = model.predict(features)

Model Input Documentation

Every ML model input is documented:

Model: Customer Churn Prediction v2.3
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Input Features (from Olytix Core):
────────────────────────────────────────────────────────────
Feature │ Source Metric │ Lineage
─────────────────────────┼──────────────────────────┼─────────
total_revenue │ customers.total_revenue │ ✓ Traced
order_count │ customers.order_count │ ✓ Traced
days_since_last_order │ customers.recency │ ✓ Traced
support_tickets │ support.ticket_count │ ✓ Traced
nps_score │ customers.nps_score │ ✓ Traced
segment │ customers.segment │ ✓ Traced

Data Quality:
────────────────────────────────────────────────────────────
• All features from certified metrics
• Complete lineage from source to model
• 99.8% feature coverage (0.2% null values)
• Last training data: 2024-01-15

Governance:
────────────────────────────────────────────────────────────
• Model owner: Data Science Team
• Last audit: 2024-01-10
• Bias testing: Passed
• Explainability: SHAP values documented
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

LLM Integration

Retrieval-Augmented Generation (RAG)

# Configure RAG for business intelligence
rag:
knowledge_base:
sources:
- type: metric_definitions
include_lineage: true
include_examples: true

- type: business_glossary
include_context: true

- type: query_history
filter: successful_only
max_age_days: 90

retrieval:
model: "text-embedding-3-small"
top_k: 5
min_relevance: 0.75

generation:
model: "gpt-4"
temperature: 0.1
system_prompt: |
You are a business intelligence assistant with access to
the company's semantic layer. Use the provided context
to answer questions accurately. Always cite the metrics
and their definitions.

Conversational Analytics

User: "Why did revenue drop in December?"

Olytix Core AI Assistant:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Let me analyze the December revenue decline.

Revenue Analysis - December 2024:
• Total Revenue: $3.8M (↓15% vs November)
• Expected: $4.5M based on seasonal pattern

Contributing Factors:
1. Order Volume: ↓12% (3,200 vs 3,650 orders)
2. Average Order Value: ↓4% ($1,188 vs $1,233)
3. Refund Rate: ↑2pp (5% vs 3%)

Breakdown by Segment:
• Enterprise: $2.1M (↓8%) - 2 large renewals delayed to January
• Mid-Market: $1.2M (↓18%) - Holiday slowdown
• SMB: $0.5M (↓25%) - Typical seasonal pattern

Comparison to Prior Year December:
• 2024: $3.8M
• 2023: $3.5M
• YoY Growth: +8.6% (still positive)

Recommendation:
The December decline appears to be primarily seasonal, with
some impact from delayed enterprise renewals. The YoY comparison
shows healthy growth. Monitor January for enterprise renewals.

Sources: net_revenue, order_count, avg_order_value, refund_rate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

AI Governance

Model Input Lineage

# Track which data feeds which models
model_registry:
- name: churn_prediction_v2
inputs:
metrics:
- customers.total_revenue
- customers.order_count
- customers.nps_score
lineage: required

outputs:
- name: churn_probability
type: float
range: [0, 1]

governance:
owner: data-science-team
review_cycle: quarterly
bias_testing: required
explainability: shap

AI Access Controls

security:
ai_access:
# Which AI systems can access what
natural_language_query:
allowed_users: all
restricted_metrics:
- compensation_data
- pii_metrics
audit: true

ml_feature_access:
allowed_services:
- churn-prediction-service
- recommendation-engine
require_model_registration: true
audit: true

llm_context:
allowed_data:
- metric_definitions
- business_glossary
- anonymized_examples
prohibited:
- raw_customer_data
- financial_details

Explainability

Prediction Explanation - Customer: ACME Corp
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Churn Prediction: 72% (High Risk)

Top Contributing Factors:
────────────────────────────────────────────────────────────
Factor │ Value │ Impact │ Source
──────────────────────────┼────────────┼───────────┼──────────
Days since last order │ 45 days │ +0.25 │ customers.recency
Support tickets (90d) │ 8 tickets │ +0.18 │ support.count
NPS score │ 6 (passive)│ +0.12 │ customers.nps
Login frequency trend │ ↓ 40% │ +0.10 │ usage.logins
Contract renewal │ 30 days │ +0.07 │ contracts.renewal
──────────────────────────┼────────────┼───────────┼──────────
│ │ Total: 0.72│

Data Sources (All Traced):
• customers cube → dim_customers → CRM sync
• support cube → fct_tickets → Zendesk API
• usage cube → fct_events → Product analytics

Governance:
• Model version: v2.3
• Training date: 2024-01-01
• Bias audit: Passed (2024-01-10)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Implementation Examples

Python SDK for AI

from olytix-core import Olytix CoreClient
from olytix-core.ai import NaturalLanguageQuery, SemanticSearch

client = Olytix CoreClient("http://localhost:8000")

# Natural language query
nlq = NaturalLanguageQuery(client)
result = nlq.query("Show me top 10 customers by revenue this year")

# Semantic search for metrics
search = SemanticSearch(client)
metrics = search.find("customer satisfaction")

# Get embeddings for custom use
embeddings = client.get_embeddings(
texts=["revenue", "customer count", "churn rate"],
model="text-embedding-3-small"
)

REST API for AI

# Natural language query
curl -X POST http://localhost:8000/api/v1/ai/query \
-H "Content-Type: application/json" \
-d '{
"question": "What is the monthly revenue trend?",
"context": {
"time_range": "last 12 months"
}
}'

# Semantic search
curl -X POST http://localhost:8000/api/v1/ai/search \
-H "Content-Type: application/json" \
-d '{
"query": "customer satisfaction metrics",
"types": ["metrics", "dimensions"],
"limit": 10
}'

Best Practices

AI Data Quality

  1. Use certified metrics for ML features
  2. Document feature definitions clearly
  3. Track model input lineage end-to-end
  4. Version your feature sets with models

AI Governance

  1. Register all models that consume Olytix Core data
  2. Audit AI data access regularly
  3. Test for bias using documented methods
  4. Maintain explainability for all predictions

Next Steps

Ready to power AI with Olytix Core?

  1. Set up natural language queries →
  2. Configure semantic search →
  3. Implement embeddings →

AI Success

The best AI models are built on trusted, well-documented data. Invest in data quality and governance before scaling AI initiatives.