AI/ML Integration
AI and machine learning require high-quality, well-documented data. Olytix Core provides the governance, consistency, and access controls needed to safely power AI initiatives while maintaining data trust.
The AI Data Challenge
AI projects often fail not because of modeling problems, but because of data issues:
Common AI Project Failures
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Data Quality (45%)
├── Inconsistent definitions
├── Missing values
├── Outdated data
└── No documentation
Data Access (30%)
├── Can't find the right data
├── No access permissions
├── Data silos
└── Security concerns
Data Governance (25%)
├── No lineage for model inputs
├── Can't explain predictions
├── Compliance violations
└── No version control
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Olytix Core AI Solution
Governed Data for AI
┌─────────────────────────────────────────────────────────────────────┐
│ AI-Ready Data Architecture │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Raw Data Olytix Core Semantic AI/ML Applications │
│ Sources Layer (Governed Access) │
│ │
│ ┌─────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ CRM │────►│ │────►│ Natural Language Query │ │
│ └─────────┘ │ │ └─────────────────────────┘ │
│ │ Metrics │ │
│ ┌─────────┐ │ Cubes │ ┌─────────────────────────┐ │
│ │ ERP │────►│ Lineage │────►│ Predictive Models │ │
│ └─────────┘ │ Security │ └─────────────────────────┘ │
│ │ │ │
│ ┌─────────┐ │ │ ┌─────────────────────────┐ │
│ │ Product │────►│ │────►│ Recommendation Engine │ │
│ └─────────┘ └─────────────┘ └─────────────────────────┘ │
│ │ │
│ ▼ │
│ Complete Audit Trail │
│ Model Input Lineage │
│ Explainable AI Ready │
└─────────────────────────────────────────────────────────────────────┘
Natural Language Querying
Ask Questions in Plain English
Olytix Core's AI integration enables natural language data access:
User: "What was our revenue last quarter compared to the same quarter last year?"
Olytix Core AI Translation:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Understanding:
• Metric: revenue (net_revenue)
• Time Period: Q4 2024
• Comparison: Q4 2023 (same quarter prior year)
Generated Query:
{
  "metrics": ["net_revenue"],
  "dimensions": ["orders.order_date.quarter"],
  "time_intelligence": {
    "compare_to": "same_period_prior_year"
  },
  "filters": [
    {"dimension": "orders.order_date.quarter", "operator": "equals", "value": "2024-Q4"}
  ]
}
Result:
Q4 2024: $4.2M
Q4 2023: $3.8M
Change: +10.5%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
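The `same_period_prior_year` comparison in the result above is plain percent change; a minimal sketch (the function is illustrative, not part of the Olytix Core API):

```python
def period_over_period_change(current: float, prior: float) -> float:
    """Percent change of the current period versus the prior period."""
    if prior == 0:
        raise ValueError("prior period value must be non-zero")
    return (current - prior) / prior * 100

# Q4 2024 vs. Q4 2023, in millions of dollars
change = period_over_period_change(4.2, 3.8)
print(f"{change:+.1f}%")  # +10.5%
```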
Semantic Understanding
The AI understands your business terminology:
# Configure business terminology mappings
ai:
  terminology:
    - term: "sales"
      maps_to: "net_revenue"
      context: "Usually refers to net revenue unless gross is specified"
    - term: "customers"
      maps_to: "customers.active_count"
      context: "Default to active customers"
    - term: "last quarter"
      maps_to: "prior_quarter"
      context: "Most recent completed quarter"
    - term: "growth"
      maps_to: "period_over_period_change"
      context: "Compare to prior period"
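Resolution against these mappings can be pictured as a lookup over the question text. A toy sketch (the real translation layer also uses the `context` hints and an LLM; `resolve_terms` is a hypothetical helper, not an Olytix Core API):

```python
# Mirror of the terminology config above, as a plain dict
TERMINOLOGY = {
    "sales": "net_revenue",
    "customers": "customers.active_count",
    "last quarter": "prior_quarter",
    "growth": "period_over_period_change",
}

def resolve_terms(question: str) -> dict:
    """Return the canonical metric/concept for each business term found."""
    lowered = question.lower()
    return {
        term: canonical
        for term, canonical in TERMINOLOGY.items()
        if term in lowered
    }

print(resolve_terms("What was sales growth last quarter?"))
# {'sales': 'net_revenue', 'last quarter': 'prior_quarter', 'growth': 'period_over_period_change'}
```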
Semantic Search
Finding Relevant Metrics
User: "I need data about customer satisfaction"
Olytix Core Semantic Search Results:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Highly Relevant (>90% match):
├── nps_score (Net Promoter Score)
│ Description: "Customer satisfaction score from -100 to 100"
│ Cube: customers
│ Certified: ✓
│
├── customer_satisfaction_rating
│ Description: "Average support ticket satisfaction (1-5)"
│ Cube: support
│ Certified: ✓
│
└── csat_score
Description: "Post-interaction satisfaction percentage"
Cube: interactions
Certified: ✓
Related Metrics:
├── customer_churn_rate (Customer retention indicator)
├── support_ticket_count (Volume of issues)
└── avg_resolution_time (Service quality)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Embedding-Based Discovery
# Configure embeddings for semantic search
embeddings:
  model: "text-embedding-3-small"
  index:
    - type: metrics
      fields: [name, description, calculation_notes]
    - type: dimensions
      fields: [name, description, example_values]
    - type: cubes
      fields: [name, description, business_context]
  search:
    min_relevance: 0.7
    max_results: 10
    boost_certified: 1.2
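Conceptually, this search ranks indexed items by vector similarity to the query, applies the `boost_certified` multiplier, and drops anything below `min_relevance`. A self-contained sketch with toy vectors (real embeddings would come from the configured `text-embedding-3-small` model, and `rank_metrics` is illustrative):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank_metrics(query_vec, indexed, min_relevance=0.7,
                 boost_certified=1.2, max_results=10):
    """Score indexed items against a query embedding, boosting certified ones."""
    scored = []
    for item in indexed:
        score = cosine_similarity(query_vec, item["embedding"])
        if score < min_relevance:
            continue  # below the relevance floor
        if item.get("certified"):
            score *= boost_certified  # certified metrics rank higher
        scored.append((score, item["name"]))
    scored.sort(reverse=True)
    return [name for _, name in scored[:max_results]]
```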
Machine Learning Features
Feature Store Integration
from olytix_core import OlytixCoreClient

client = OlytixCoreClient("http://localhost:8000")

# Get features for ML model
def get_customer_features(customer_ids: list):
    """Get features for churn prediction model."""
    return client.query(
        measures=[
            "customers.total_revenue",
            "customers.order_count",
            "customers.avg_order_value",
            "customers.days_since_last_order",
            "customers.support_ticket_count",
            "customers.nps_score"
        ],
        dimensions=[
            "customers.customer_id",
            "customers.segment",
            "customers.tenure_months"
        ],
        filters=[
            {"dimension": "customers.customer_id", "operator": "inList", "value": customer_ids}
        ]
    ).to_dataframe()

# Use in ML pipeline
features = get_customer_features(customer_list)
predictions = model.predict(features)
Model Input Documentation
Every ML model input is documented:
Model: Customer Churn Prediction v2.3
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Input Features (from Olytix Core):
────────────────────────────────────────────────────────────
Feature │ Source Metric │ Lineage
─────────────────────────┼──────────────────────────┼─────────
total_revenue │ customers.total_revenue │ ✓ Traced
order_count │ customers.order_count │ ✓ Traced
days_since_last_order │ customers.recency │ ✓ Traced
support_tickets │ support.ticket_count │ ✓ Traced
nps_score │ customers.nps_score │ ✓ Traced
segment │ customers.segment │ ✓ Traced
Data Quality:
────────────────────────────────────────────────────────────
• All features from certified metrics
• Complete lineage from source to model
• 99.8% feature coverage (0.2% null values)
• Last training data: 2024-01-15
Governance:
────────────────────────────────────────────────────────────
• Model owner: Data Science Team
• Last audit: 2024-01-10
• Bias testing: Passed
• Explainability: SHAP values documented
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
LLM Integration
Retrieval-Augmented Generation (RAG)
# Configure RAG for business intelligence
rag:
  knowledge_base:
    sources:
      - type: metric_definitions
        include_lineage: true
        include_examples: true
      - type: business_glossary
        include_context: true
      - type: query_history
        filter: successful_only
        max_age_days: 90
  retrieval:
    model: "text-embedding-3-small"
    top_k: 5
    min_relevance: 0.75
  generation:
    model: "gpt-4"
    temperature: 0.1
    system_prompt: |
      You are a business intelligence assistant with access to
      the company's semantic layer. Use the provided context
      to answer questions accurately. Always cite the metrics
      and their definitions.
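The retrieval settings above amount to: score knowledge-base snippets, keep the top_k above min_relevance, and prepend them to the question as context for the generation model. A minimal sketch of that assembly step (prompt wording and function name are illustrative; retrieval scores would come from embedding similarity):

```python
def build_rag_prompt(question, documents, top_k=5, min_relevance=0.75):
    """Assemble an LLM prompt from the highest-relevance snippets.

    `documents` is a list of (relevance, text) pairs that a retrieval
    step has already scored.
    """
    relevant = sorted(
        (d for d in documents if d[0] >= min_relevance), reverse=True
    )[:top_k]
    context = "\n".join(f"- {text}" for _, text in relevant)
    return (
        "Use the provided context to answer accurately. "
        "Always cite the metrics and their definitions.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```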
Conversational Analytics
User: "Why did revenue drop in December?"
Olytix Core AI Assistant:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Let me analyze the December revenue decline.
Revenue Analysis - December 2024:
• Total Revenue: $3.8M (↓15% vs November)
• Expected: $4.5M based on seasonal pattern
Contributing Factors:
1. Order Volume: ↓12% (3,200 vs 3,650 orders)
2. Average Order Value: ↓4% ($1,188 vs $1,233)
3. Refund Rate: ↑2pp (5% vs 3%)
Breakdown by Segment:
• Enterprise: $2.1M (↓8%) - 2 large renewals delayed to January
• Mid-Market: $1.2M (↓18%) - Holiday slowdown
• SMB: $0.5M (↓25%) - Typical seasonal pattern
Comparison to Prior Year December:
• 2024: $3.8M
• 2023: $3.5M
• YoY Growth: +8.6% (still positive)
Recommendation:
The December decline appears to be primarily seasonal, with
some impact from delayed enterprise renewals. The YoY comparison
shows healthy growth. Monitor January for enterprise renewals.
Sources: net_revenue, order_count, avg_order_value, refund_rate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
AI Governance
Model Input Lineage
# Track which data feeds which models
model_registry:
  - name: churn_prediction_v2
    inputs:
      metrics:
        - customers.total_revenue
        - customers.order_count
        - customers.nps_score
      lineage: required
    outputs:
      - name: churn_probability
        type: float
        range: [0, 1]
    governance:
      owner: data-science-team
      review_cycle: quarterly
      bias_testing: required
      explainability: shap
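A registry like this lends itself to automated checks at registration time. A hypothetical validator sketch over the same field names (the rules shown are illustrative, not Olytix Core's built-in policy):

```python
def validate_model_registration(entry: dict) -> list:
    """Return a list of problems with a model_registry entry; empty means valid."""
    problems = []
    if not entry.get("name"):
        problems.append("model must have a name")
    inputs = entry.get("inputs", {})
    if not inputs.get("metrics"):
        problems.append("model must declare its input metrics")
    if inputs.get("lineage") != "required":
        problems.append("input lineage must be required")
    governance = entry.get("governance", {})
    for field in ("owner", "review_cycle", "bias_testing"):
        if field not in governance:
            problems.append(f"governance.{field} is missing")
    return problems
```

Running such a check in CI before a model is deployed keeps undocumented inputs from ever reaching production.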
AI Access Controls
security:
  ai_access:
    # Which AI systems can access what
    natural_language_query:
      allowed_users: all
      restricted_metrics:
        - compensation_data
        - pii_metrics
      audit: true
    ml_feature_access:
      allowed_services:
        - churn-prediction-service
        - recommendation-engine
      require_model_registration: true
      audit: true
    llm_context:
      allowed_data:
        - metric_definitions
        - business_glossary
        - anonymized_examples
      prohibited:
        - raw_customer_data
        - financial_details
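The `restricted_metrics` policy amounts to intersecting the requested metrics with a deny list and rejecting the query on any overlap. An illustrative sketch (actual enforcement happens server-side in Olytix Core; this helper is hypothetical):

```python
RESTRICTED = {"compensation_data", "pii_metrics"}

def check_nlq_access(requested_metrics, restricted=RESTRICTED):
    """Reject a natural-language query that touches restricted metrics."""
    blocked = sorted(set(requested_metrics) & set(restricted))
    if blocked:
        raise PermissionError(
            f"restricted metrics requested: {', '.join(blocked)}"
        )
    return True
```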
Explainability
Prediction Explanation - Customer: ACME Corp
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Churn Prediction: 72% (High Risk)
Top Contributing Factors:
────────────────────────────────────────────────────────────
Factor │ Value │ Impact │ Source
──────────────────────────┼────────────┼───────────┼──────────
Days since last order │ 45 days │ +0.25 │ customers.recency
Support tickets (90d) │ 8 tickets │ +0.18 │ support.count
NPS score │ 6 (passive)│ +0.12 │ customers.nps
Login frequency trend │ ↓ 40% │ +0.10 │ usage.logins
Contract renewal │ 30 days │ +0.07 │ contracts.renewal
──────────────────────────┼────────────┼───────────┼──────────
│ │ Total: 0.72│
Data Sources (All Traced):
• customers cube → dim_customers → CRM sync
• support cube → fct_tickets → Zendesk API
• usage cube → fct_events → Product analytics
Governance:
• Model version: v2.3
• Training date: 2024-01-01
• Bias audit: Passed (2024-01-10)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
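The per-factor impacts in the table are additive, which is what makes the prediction auditable: they sum to the 0.72 churn probability. (In standard SHAP usage, values sum to the prediction minus a base value; the base is omitted here for simplicity.) A quick check:

```python
# Factor impacts from the explanation above; in practice these would be
# SHAP values computed against the model, not hand-entered numbers.
contributions = {
    "days_since_last_order": 0.25,
    "support_tickets_90d": 0.18,
    "nps_score": 0.12,
    "login_frequency_trend": 0.10,
    "contract_renewal_window": 0.07,
}

churn_probability = round(sum(contributions.values()), 2)
print(churn_probability)  # 0.72
```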
Implementation Examples
Python SDK for AI
from olytix_core import OlytixCoreClient
from olytix_core.ai import NaturalLanguageQuery, SemanticSearch

client = OlytixCoreClient("http://localhost:8000")

# Natural language query
nlq = NaturalLanguageQuery(client)
result = nlq.query("Show me top 10 customers by revenue this year")

# Semantic search for metrics
search = SemanticSearch(client)
metrics = search.find("customer satisfaction")

# Get embeddings for custom use
embeddings = client.get_embeddings(
    texts=["revenue", "customer count", "churn rate"],
    model="text-embedding-3-small"
)
REST API for AI
# Natural language query
curl -X POST http://localhost:8000/api/v1/ai/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the monthly revenue trend?",
    "context": {
      "time_range": "last 12 months"
    }
  }'

# Semantic search
curl -X POST http://localhost:8000/api/v1/ai/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "customer satisfaction metrics",
    "types": ["metrics", "dimensions"],
    "limit": 10
  }'
Best Practices
AI Data Quality
- Use certified metrics for ML features
- Document feature definitions clearly
- Track model input lineage end-to-end
- Version your feature sets with models
AI Governance
- Register all models that consume Olytix Core data
- Audit AI data access regularly
- Test for bias using documented methods
- Maintain explainability for all predictions
Next Steps
Ready to power AI with Olytix Core?
AI Success
The best AI models are built on trusted, well-documented data. Invest in data quality and governance before scaling AI initiatives.