Anomaly Detection
Olytix Core's anomaly detection system proactively monitors your metrics and alerts you when unexpected changes occur, helping you catch issues before they impact your business.
Overview
┌─────────────────────────────────────────────────────────────────────┐
│ ANOMALY DETECTION PIPELINE │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ MONITORS │───▶│ DETECTORS │───▶│ SCORING │ │
│ │ │ │ │ │ │ │
│ │ Define what │ │ Z-Score │ │ Severity │ │
│ │ to monitor │ │ IQR │ │ Business │ │
│ │ │ │ MAD │ │ Impact │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ STORAGE │◀───│ ALERTS │◀───│ ANALYSIS │ │
│ │ │ │ │ │ │ │
│ │ Historical │ │ Grouping │ │ Root Cause │ │
│ │ Anomalies │ │ Delivery │ │ Correlation │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Key Features
Detection Algorithms
Olytix Core supports multiple statistical detection methods:
| Algorithm | Best For | Description |
|---|---|---|
| Z-Score | Normal distributions | Measures standard deviations from mean |
| IQR | Skewed data | Uses interquartile range for outlier detection |
| MAD | Robust detection | Median Absolute Deviation, resistant to outliers |
Severity Scoring
Each anomaly receives a severity score based on:
- Statistical Significance - How far from expected values
- Historical Context - Comparison to past anomalies
- Duration - How long the anomaly persists
- Trend - Whether the anomaly is growing or stabilizing
Severity Levels:
├── CRITICAL (0.8 - 1.0): Immediate attention required
├── HIGH (0.6 - 0.8): Significant deviation
├── MEDIUM (0.4 - 0.6): Notable but not urgent
├── LOW (0.2 - 0.4): Minor deviation
└── INFO (0.0 - 0.2): Informational only
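The bands above can be expressed as a small lookup. This helper is purely illustrative (Olytix Core computes severity internally); boundary scores are treated as belonging to the higher band:

```python
def severity_level(score: float) -> str:
    """Map a severity score in [0.0, 1.0] onto the documented bands."""
    if score >= 0.8:
        return "CRITICAL"
    if score >= 0.6:
        return "HIGH"
    if score >= 0.4:
        return "MEDIUM"
    if score >= 0.2:
        return "LOW"
    return "INFO"
```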
Business Impact Assessment
Anomalies are assessed for business impact:
ImpactAssessment:
├── revenue_impact: "$45,000 potential revenue at risk"
├── affected_segments: ["Enterprise", "North America"]
├── customer_impact: "~2,500 customers affected"
└── urgency: "High - revenue-critical metric"
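The structure above maps naturally onto a small data model. The field names here mirror the example, but they are an assumption; the actual `ImpactAssessment` class shipped with Olytix Core may differ:

```python
from dataclasses import dataclass


@dataclass
class ImpactAssessment:
    """Hypothetical model mirroring the fields shown above."""
    revenue_impact: str
    affected_segments: list[str]
    customer_impact: str
    urgency: str


impact = ImpactAssessment(
    revenue_impact="$45,000 potential revenue at risk",
    affected_segments=["Enterprise", "North America"],
    customer_impact="~2,500 customers affected",
    urgency="High - revenue-critical metric",
)
```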
Usage
Creating a Monitor
from olytix_core.anomaly.service import AnomalyService
from olytix_core.anomaly.monitors.models import AnomalyMonitor, DetectorType
service = AnomalyService()
# Create a monitor for revenue
monitor = AnomalyMonitor(
name="Daily Revenue Monitor",
cube_name="Orders",
measure_name="revenue",
detector_type=DetectorType.Z_SCORE,
sensitivity=2.5, # Standard deviations threshold
dimensions=["region", "product_category"],
schedule="0 9 * * *", # Daily at 9 AM
enabled=True
)
registered = await service.register_monitor(monitor)
Running Detection
# Run detection for a specific monitor
# (DetectionContext import path assumed; adjust to your installation)
from olytix_core.anomaly.monitors.models import DetectionContext
result = await service.run_detection(
monitor_id=monitor.id,
data=current_values, # List of metric values
context=DetectionContext(
dimension_values={"region": "North America"},
related_metrics={"orders_count": order_counts}
)
)
# Result includes:
# - List of detected anomalies
# - Severity scores
# - Recommended actions
Configuring Alerts
from olytix_core.anomaly.alerts.models import AlertConfig, AlertChannel
# Configure alert delivery
config = AlertConfig(
monitor_id=monitor.id,
channels=[
AlertChannel(type="email", target="data-team@company.com"),
AlertChannel(type="slack", target="#alerts-channel"),
AlertChannel(type="webhook", target="https://api.company.com/alerts")
],
min_severity="HIGH",
grouping_window_minutes=15,
escalation_rules=[
{"after_minutes": 30, "notify": "manager@company.com"},
{"after_minutes": 60, "notify": "director@company.com"}
]
)
Root Cause Analysis
# Analyze potential root causes
analysis = await service.analyze_root_cause(
anomaly_id=anomaly.id,
dimension_drill_down=True,
related_metrics=True
)
# Returns:
# RootCauseAnalysis(
# primary_cause="North America region showing 45% drop",
# contributing_factors=[
# "Enterprise segment down 60%",
# "Product category 'Software' down 35%"
# ],
# correlated_anomalies=[
# "Orders.count also anomalous (correlation: 0.89)"
# ],
# recommendations=[
# "Investigate North America Enterprise Software sales"
# ]
# )
API Endpoints
Create Monitor
POST /api/v1/anomaly/monitors
Content-Type: application/json
{
"name": "Revenue Monitor",
"cube_name": "Orders",
"measure_name": "revenue",
"detector_type": "z_score",
"sensitivity": 2.5,
"dimensions": ["region"],
"schedule": "0 * * * *",
"alert_config": {
"channels": [
{"type": "email", "target": "alerts@company.com"}
],
"min_severity": "HIGH"
}
}
List Anomalies
GET /api/v1/anomaly/detections?
monitor_id=<uuid>&
start_date=2024-01-01&
end_date=2024-01-31&
min_severity=MEDIUM
Get Root Cause Analysis
GET /api/v1/anomaly/detections/<detection_id>/root-cause
Detection Algorithms
Z-Score Detection
Best for normally distributed metrics:
Z-Score = (value - mean) / standard_deviation
Anomaly if: |Z-Score| > sensitivity_threshold
Configuration:
sensitivity: Number of standard deviations (default: 2.5)
training_size: Historical values for baseline (default: 100)
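The formula can be sketched in a few lines of standard-library Python. This is an illustration of the math, not Olytix Core's detector:

```python
import statistics


def is_zscore_anomaly(value: float, baseline: list[float], sensitivity: float = 2.5) -> bool:
    """Flag `value` when its z-score against the training window exceeds `sensitivity`."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return False  # flat baseline: z-score is undefined, so nothing is flagged
    return abs(value - mean) / stdev > sensitivity
```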
IQR Detection
Best for skewed distributions:
IQR = Q3 - Q1 (75th percentile - 25th percentile)
Lower Bound = Q1 - (sensitivity * IQR)
Upper Bound = Q3 + (sensitivity * IQR)
Anomaly if: value < Lower Bound OR value > Upper Bound
Configuration:
sensitivity: IQR multiplier (default: 1.5)
training_size: Historical values for quartile calculation
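A stdlib sketch of the same bounds check, using `statistics.quantiles` for Q1 and Q3 (illustrative only, not the service's implementation):

```python
import statistics


def is_iqr_anomaly(value: float, baseline: list[float], sensitivity: float = 1.5) -> bool:
    """Flag `value` when it falls outside the IQR fences of the training window."""
    q1, _, q3 = statistics.quantiles(baseline, n=4)  # quartile cut points
    iqr = q3 - q1
    lower = q1 - sensitivity * iqr
    upper = q3 + sensitivity * iqr
    return value < lower or value > upper
```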
MAD Detection
Most robust to existing outliers:
MAD = median(|values - median(values)|)
Modified Z-Score = 0.6745 * (value - median) / MAD
Anomaly if: |Modified Z-Score| > sensitivity_threshold
Configuration:
sensitivity: Modified Z-Score threshold (default: 3.0)
training_size: Historical values for median calculation
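The same test in stdlib Python; note that a stray outlier in the baseline barely moves the median or the MAD, which is what makes this detector robust (illustrative sketch only):

```python
import statistics


def is_mad_anomaly(value: float, baseline: list[float], sensitivity: float = 3.0) -> bool:
    """Modified z-score test; medians make it robust to outliers in `baseline`."""
    med = statistics.median(baseline)
    mad = statistics.median([abs(v - med) for v in baseline])
    if mad == 0:
        return value != med  # degenerate window: most values identical, flag any deviation
    return abs(0.6745 * (value - med) / mad) > sensitivity
```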
Alert Management
Alert Grouping
Related alerts are grouped to prevent alert fatigue:
# Grouping strategies:
# - By monitor: Group all alerts from same monitor
# - By dimension: Group alerts with same dimension values
# - By time: Group alerts within time window
# - By severity: Group alerts of similar severity
from olytix_core.anomaly.alerts.grouping import AlertGrouper  # import path assumed

grouper = AlertGrouper(
strategy="dimension",
window_minutes=15,
min_group_size=2
)
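A minimal sketch of the time-window strategy, assuming each alert is a dict carrying a `triggered_at` timestamp (the real `AlertGrouper` logic is internal to Olytix Core):

```python
from datetime import datetime, timedelta


def group_by_time(alerts: list[dict], window_minutes: int = 15) -> list[list[dict]]:
    """Bucket alerts so each group spans at most `window_minutes` from its first alert."""
    groups: list[list[dict]] = []
    current: list[dict] = []
    for alert in sorted(alerts, key=lambda a: a["triggered_at"]):
        if current and alert["triggered_at"] - current[0]["triggered_at"] > timedelta(minutes=window_minutes):
            groups.append(current)  # window exceeded: close the current group
            current = []
        current.append(alert)
    if current:
        groups.append(current)
    return groups
```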
Alert Delivery
Delivery Channels:
├── Email: HTML-formatted alert with charts
├── Slack: Interactive message with actions
├── Webhook: JSON payload for custom integrations
├── PagerDuty: For critical alerts
└── SMS: For high-priority notifications
Alert States
Alert Lifecycle:
TRIGGERED → ACKNOWLEDGED → INVESTIGATING → RESOLVED
    ↓
ESCALATED
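The lifecycle can be captured as a transition table. One assumption here: ESCALATED branches off an unacknowledged TRIGGERED alert (consistent with the time-based `escalation_rules` shown earlier) and returns to the main flow via ACKNOWLEDGED; the actual state machine lives inside Olytix Core's alert service:

```python
# Illustrative transition table for the documented lifecycle.
TRANSITIONS: dict[str, set[str]] = {
    "TRIGGERED": {"ACKNOWLEDGED", "ESCALATED"},
    "ESCALATED": {"ACKNOWLEDGED"},
    "ACKNOWLEDGED": {"INVESTIGATING"},
    "INVESTIGATING": {"RESOLVED"},
    "RESOLVED": set(),  # terminal state
}


def advance(state: str, new_state: str) -> str:
    """Validate and apply a lifecycle transition."""
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state
```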
Correlation Analysis
Find related metrics affected by the same issue:
correlations = await service.find_correlations(
anomaly_id=anomaly.id,
metrics_to_check=["Orders.count", "Orders.avg_value", "Customers.new_signups"],
time_window_hours=24
)
# Returns metrics with similar anomalous behavior:
# [
# CorrelatedMetric(name="Orders.count", correlation=0.92, lag_hours=0),
# CorrelatedMetric(name="Customers.new_signups", correlation=0.78, lag_hours=2)
# ]
Best Practices
Choosing a Detector
- Z-Score: Use for metrics with stable, normal distributions (e.g., daily page views)
- IQR: Use for metrics with outliers or skewed distributions (e.g., order values)
- MAD: Use when you suspect historical data contains anomalies
Setting Sensitivity
- Lower `sensitivity` value (tighter threshold): More alerts, fewer missed anomalies
- Higher `sensitivity` value (looser threshold): Fewer alerts, may miss smaller anomalies
Recommendations:
- Start with default values
- Monitor false positive rate
- Adjust based on business criticality
Monitor Design
- One metric per monitor: Easier to tune and understand
- Include relevant dimensions: Enable drill-down analysis
- Set appropriate schedules: Match your data update frequency
- Configure escalation: Ensure critical issues get attention
Example: Complete Setup
from olytix_core.anomaly.service import AnomalyService
from olytix_core.anomaly.monitors.models import AnomalyMonitor, DetectorType
from olytix_core.anomaly.alerts.models import AlertConfig, AlertChannel
# Initialize service
service = AnomalyService()
# Create revenue monitor with Z-Score
revenue_monitor = await service.register_monitor(AnomalyMonitor(
name="Hourly Revenue",
cube_name="Orders",
measure_name="revenue",
detector_type=DetectorType.Z_SCORE,
sensitivity=2.5,
dimensions=["region", "product_category"],
schedule="0 * * * *" # Every hour
))
# Configure alerts
await service.configure_alerts(AlertConfig(
monitor_id=revenue_monitor.id,
channels=[
AlertChannel(type="slack", target="#revenue-alerts"),
AlertChannel(type="email", target="finance@company.com")
],
min_severity="MEDIUM",
grouping_window_minutes=30
))
# The service will now automatically:
# 1. Run detection every hour
# 2. Score anomalies by severity
# 3. Assess business impact
# 4. Group and deliver alerts
# 5. Store historical anomalies for analysis
Next Steps
- Query Assistant - Investigate anomalies with natural language
- Data Profiling - Understand your data distributions
- Monitoring & Logging - Operational monitoring