Data Governance

For Everyone

Olytix Core's data governance features help organizations maintain data quality, establish clear ownership, and ensure compliance through certification workflows, business glossaries, and comprehensive audit capabilities.

Overview

┌─────────────────────────────────────────────────────────────────────┐
│                      DATA GOVERNANCE FRAMEWORK                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│      ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│      │ CERTIFICATION│    │ GLOSSARY     │    │ OWNERSHIP    │       │
│      │              │    │              │    │              │       │
│      │ Draft        │    │ Terms        │    │ Owners       │       │
│      │ Review       │    │ Definitions  │    │ Stewards     │       │
│      │ Certified    │    │ Synonyms     │    │ Teams        │       │
│      │ Deprecated   │    │ Relationships│    │ Contacts     │       │
│      └──────────────┘    └──────────────┘    └──────────────┘       │
│             │                   │                   │               │
│             └───────────────────┼───────────────────┘               │
│                                 ▼                                   │
│    ┌──────────────────────────────────────────────────────────┐     │
│    │                    ARTIFACT METADATA                     │     │
│    │                                                          │     │
│    │     Cubes • Measures • Dimensions • Metrics • Models     │     │
│    │                                                          │     │
│    │  Status • Owner • Tags • Description • Version • Links   │     │
│    └──────────────────────────────────────────────────────────┘     │
│                                 │                                   │
│                                 ▼                                   │
│      ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│      │ AUDIT        │    │ SEARCH &     │    │ COMPLIANCE   │       │
│      │              │    │ DISCOVERY    │    │              │       │
│      │ Access logs  │    │ Full-text    │    │ Policies     │       │
│      │ Changes      │    │ Filters      │    │ Reports      │       │
│      │ Exports      │    │ Browse       │    │ Attestation  │       │
│      └──────────────┘    └──────────────┘    └──────────────┘       │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Key Features

Certification Workflow

Establish trust in your data through a formal certification process:

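| Status         | Description                           | Badge  |
|----------------|---------------------------------------|--------|
| Draft          | In development, not ready for use     | Gray   |
| Pending Review | Submitted for certification           | Yellow |
| Certified      | Approved, trusted for production use  | Green  |
| Deprecated     | Being phased out, use alternatives    | Orange |
| Archived       | No longer available                   | Red    |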
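
The status lifecycle above can be read as a small state machine. The sketch below is only an illustration, not Olytix Core's implementation: `Status`, `ALLOWED`, and `can_transition` are hypothetical names, and the permitted transitions are an assumption inferred from the status descriptions.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    PENDING_REVIEW = "pending_review"
    CERTIFIED = "certified"
    DEPRECATED = "deprecated"
    ARCHIVED = "archived"


# Assumed transitions; the real workflow may permit others.
ALLOWED = {
    Status.DRAFT: {Status.PENDING_REVIEW},
    Status.PENDING_REVIEW: {Status.CERTIFIED, Status.DRAFT},  # approve or send back
    Status.CERTIFIED: {Status.DEPRECATED},
    Status.DEPRECATED: {Status.ARCHIVED},
    Status.ARCHIVED: set(),  # terminal state
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if moving from `current` to `target` is permitted."""
    return target in ALLOWED[current]
```

Keeping the transitions in one table makes it easy to reject shortcuts, such as certifying a draft that was never submitted for review.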

Business Glossary

Maintain a single source of truth for business terminology:

  • Term definitions - Clear, authoritative definitions
  • Synonyms - Alternative names and abbreviations
  • Related terms - Connections between concepts
  • Artifact links - Map terms to measures/dimensions
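
To illustrate how synonyms can point back to a single authoritative term, here is a minimal in-memory sketch. `build_synonym_index` is a hypothetical helper, not part of the Olytix Core API, and case-insensitive matching is an assumption.

```python
def build_synonym_index(terms: dict[str, list[str]]) -> dict[str, str]:
    """Map every term name and synonym (lowercased) to its canonical term."""
    index: dict[str, str] = {}
    for name, synonyms in terms.items():
        for label in [name, *synonyms]:
            index[label.lower()] = name
    return index


glossary = {"Revenue": ["Sales", "Income", "Turnover"]}
index = build_synonym_index(glossary)

# Look up a synonym; unknown labels return None.
canonical = index.get("turnover")
unknown = index.get("profit")
```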

Ownership Model

Assign clear responsibility for data assets:

  • Data Owners - Accountable for data quality
  • Data Stewards - Day-to-day management
  • Technical Owners - Implementation responsibility
  • Teams - Group ownership for shared assets

Audit & Compliance

Track all changes and access:

  • Change history - Who changed what, when
  • Access logs - Who queried what data
  • Export tracking - Data leaving the platform
  • Compliance reports - Regulatory reporting

Usage

Certification Management

# Assumes the import package is olytix_core (hyphens are not valid in Python module names)
from olytix_core.governance.service import GovernanceService
from olytix_core.governance.models import CertificationStatus

service = GovernanceService()

# Submit artifact for certification
await service.submit_for_certification(
    artifact_type="measure",
    artifact_id="Orders.revenue",
    submitted_by="analyst_123",
    notes="Validated against finance system, matches within 0.1%"
)

# Review and certify (requires reviewer role)
await service.certify_artifact(
    artifact_type="measure",
    artifact_id="Orders.revenue",
    certified_by="data_steward_456",
    certification_notes="Approved after validation review",
    valid_until="2025-01-15"  # Optional expiration
)

# Deprecate an artifact
await service.deprecate_artifact(
    artifact_type="measure",
    artifact_id="Orders.legacy_revenue",
    deprecated_by="data_steward_456",
    reason="Replaced by Orders.revenue which includes all revenue types",
    replacement="Orders.revenue",
    sunset_date="2024-06-01"
)

Certification Status

# Get certification status
status = await service.get_certification_status(
    artifact_type="measure",
    artifact_id="Orders.revenue"
)

# CertificationInfo:
# ├── status: CertificationStatus.CERTIFIED
# ├── certified_by: "data_steward_456"
# ├── certified_at: "2024-01-15T10:30:00Z"
# ├── valid_until: "2025-01-15"
# ├── certification_notes: "Approved after validation review"
# ├── version: 3
# └── history: [
#       {"status": "draft", "at": "2024-01-01", "by": "analyst_123"},
#       {"status": "pending_review", "at": "2024-01-10", "by": "analyst_123"},
#       {"status": "certified", "at": "2024-01-15", "by": "data_steward_456"}
#     ]

# List all certified artifacts
certified = await service.list_artifacts(
    status=CertificationStatus.CERTIFIED
)

Business Glossary

# Assumes the import package is olytix_core (hyphens are not valid in Python module names)
from olytix_core.governance.glossary.models import GlossaryTerm

# Create a glossary term
term = await service.create_glossary_term(
    GlossaryTerm(
        name="Revenue",
        definition="Total income generated from sales of goods and services before any deductions",
        synonyms=["Sales", "Income", "Turnover"],
        category="Finance",
        examples=["Product sales revenue", "Service revenue", "Subscription revenue"],
        related_terms=["Gross Revenue", "Net Revenue", "ARR"],
        created_by="data_steward_456"
    )
)

# Link term to artifacts
await service.link_term_to_artifact(
    term_id=term.id,
    artifact_type="measure",
    artifact_id="Orders.revenue",
    relationship="defines"  # defines, relates_to, synonym_of
)

# Search glossary
results = await service.search_glossary(
    query="revenue",
    category="Finance",
    include_synonyms=True
)

Glossary Term Structure

GlossaryTerm:
├── id: "term_123"
├── name: "Revenue"
├── definition: "Total income generated from..."
├── synonyms: ["Sales", "Income", "Turnover"]
├── category: "Finance"
├── subcategory: "Income Statement"
├── examples: ["Product sales revenue", ...]
├── related_terms: ["Gross Revenue", "Net Revenue"]
├── formula: null  # Optional for calculated terms
├── owner: "finance_team"
├── status: "approved"
│
├── linked_artifacts:
│   ├── measures: ["Orders.revenue", "Orders.gross_revenue"]
│   ├── dimensions: []
│   └── metrics: ["mrr", "arr"]
│
├── created_by: "data_steward_456"
├── created_at: "2024-01-10T09:00:00Z"
├── updated_at: "2024-01-15T14:30:00Z"
└── version: 2

Ownership Management

# Assumes the import package is olytix_core (hyphens are not valid in Python module names)
from olytix_core.governance.ownership.models import OwnershipAssignment, OwnerRole

# Assign owner to a cube
await service.assign_owner(
    OwnershipAssignment(
        artifact_type="cube",
        artifact_id="Orders",
        owner_type="user",
        owner_id="finance_manager_789",
        role=OwnerRole.DATA_OWNER,
        assigned_by="admin_001"
    )
)

# Assign steward
await service.assign_owner(
    OwnershipAssignment(
        artifact_type="cube",
        artifact_id="Orders",
        owner_type="user",
        owner_id="analyst_123",
        role=OwnerRole.DATA_STEWARD,
        assigned_by="finance_manager_789"
    )
)

# Assign team ownership
await service.assign_owner(
    OwnershipAssignment(
        artifact_type="cube",
        artifact_id="Orders",
        owner_type="team",
        owner_id="analytics_team",
        role=OwnerRole.TECHNICAL_OWNER,
        assigned_by="admin_001"
    )
)

# Get ownership info
ownership = await service.get_ownership(
    artifact_type="cube",
    artifact_id="Orders"
)

# OwnershipInfo:
# ├── data_owner: {"type": "user", "id": "finance_manager_789", "name": "Jane Smith"}
# ├── data_stewards: [{"type": "user", "id": "analyst_123", "name": "John Doe"}]
# ├── technical_owners: [{"type": "team", "id": "analytics_team", "name": "Analytics Team"}]
# └── contact_email: "analytics@company.com"

Artifact Metadata

# Update artifact metadata
await service.update_artifact_metadata(
    artifact_type="measure",
    artifact_id="Orders.revenue",
    metadata={
        "description": "Total order revenue including taxes, excluding returns",
        "tags": ["finance", "core-metric", "certified"],
        "documentation_url": "https://wiki.company.com/metrics/revenue",
        "refresh_frequency": "hourly",
        "data_classification": "internal",
        "pii_flag": False
    }
)

# Get full artifact details
details = await service.get_artifact_details(
    artifact_type="measure",
    artifact_id="Orders.revenue"
)

# ArtifactDetails:
# ├── type: "measure"
# ├── id: "Orders.revenue"
# ├── name: "Revenue"
# ├── description: "Total order revenue..."
# ├── definition: "SUM(order_items.price * order_items.quantity)"
# │
# ├── certification:
# │   ├── status: "certified"
# │   ├── certified_by: "data_steward_456"
# │   └── valid_until: "2025-01-15"
# │
# ├── ownership:
# │   ├── data_owner: "finance_manager_789"
# │   └── stewards: ["analyst_123"]
# │
# ├── glossary_terms: ["Revenue", "Sales"]
# ├── tags: ["finance", "core-metric", "certified"]
# ├── data_classification: "internal"
# │
# ├── lineage:
# │   ├── sources: ["raw.orders", "raw.order_items"]
# │   └── derived_from: ["stg_orders.total_amount"]
# │
# └── usage:
#     ├── query_count_30d: 1250
#     ├── unique_users_30d: 45
#     └── last_queried: "2024-01-15T14:30:00Z"

API Endpoints

Certification

# Submit for certification
POST /api/v1/governance/certification/submit
{
  "artifact_type": "measure",
  "artifact_id": "Orders.revenue",
  "notes": "Validated against finance system"
}

# Certify artifact
POST /api/v1/governance/certification/certify
{
  "artifact_type": "measure",
  "artifact_id": "Orders.revenue",
  "notes": "Approved",
  "valid_until": "2025-01-15"
}

# Get certification status
GET /api/v1/governance/certification/status?
    artifact_type=measure&
    artifact_id=Orders.revenue

# List artifacts by status
GET /api/v1/governance/certification/list?
    status=certified&
    artifact_type=measure
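
The GET requests above are shown with their query parameters split across lines for readability; on the wire they form a single URL. The sketch below assembles one with the standard library. `BASE_URL` is a hypothetical host and `governance_url` is an illustrative helper, not part of any Olytix Core client.

```python
from urllib.parse import urlencode

BASE_URL = "https://olytix.example.com"  # hypothetical host, replace with your deployment


def governance_url(path: str, **params: str) -> str:
    """Join a governance API path with URL-encoded query parameters."""
    query = urlencode(params)
    url = f"{BASE_URL}/api/v1/governance/{path}"
    return f"{url}?{query}" if query else url


status_url = governance_url(
    "certification/status",
    artifact_type="measure",
    artifact_id="Orders.revenue",
)
```

Using `urlencode` rather than string concatenation keeps artifact IDs that contain spaces or reserved characters safe to send.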

Glossary

# Create term
POST /api/v1/governance/glossary/terms
{
  "name": "Revenue",
  "definition": "Total income...",
  "synonyms": ["Sales", "Income"],
  "category": "Finance"
}

# Search glossary
GET /api/v1/governance/glossary/search?
    query=revenue&
    category=Finance

# Link term to artifact
POST /api/v1/governance/glossary/terms/<term_id>/links
{
  "artifact_type": "measure",
  "artifact_id": "Orders.revenue",
  "relationship": "defines"
}

# Get term
GET /api/v1/governance/glossary/terms/<term_id>

# List all terms
GET /api/v1/governance/glossary/terms?
    category=Finance&
    status=approved

Ownership

# Assign owner
POST /api/v1/governance/ownership/assign
{
  "artifact_type": "cube",
  "artifact_id": "Orders",
  "owner_type": "user",
  "owner_id": "user_123",
  "role": "data_owner"
}

# Get ownership
GET /api/v1/governance/ownership?
    artifact_type=cube&
    artifact_id=Orders

# List artifacts by owner
GET /api/v1/governance/ownership/by-owner?
    owner_id=user_123

Artifact Metadata

# Update metadata
PATCH /api/v1/governance/artifacts/<type>/<id>/metadata
{
  "description": "...",
  "tags": ["finance", "core"],
  "data_classification": "internal"
}

# Get artifact details
GET /api/v1/governance/artifacts/<type>/<id>

# Search artifacts
GET /api/v1/governance/artifacts/search?
    query=revenue&
    tags=finance&
    certification_status=certified

Data Catalog

Browse and discover data assets:

Catalog View

# Browse catalog
catalog = await service.browse_catalog(
    artifact_types=["cube", "metric"],
    filters={
        "certification_status": "certified",
        "tags": ["core-metric"],
        "owner_team": "analytics_team"
    },
    sort_by="popularity",  # popularity, recent, alphabetical
    limit=50
)

# CatalogResult:
# ├── total_count: 125
# ├── items: [
# │     {
# │       "type": "cube",
# │       "id": "Orders",
# │       "name": "Orders",
# │       "description": "All customer orders...",
# │       "certification_status": "certified",
# │       "owner": "finance_team",
# │       "tags": ["finance", "core"],
# │       "usage_score": 95,
# │       "measures_count": 12,
# │       "dimensions_count": 8
# │     },
# │     ...
# │   ]
# └── facets: {
#       "certification_status": {"certified": 85, "draft": 30, "deprecated": 10},
#       "owner_team": {"analytics": 45, "finance": 50, "marketing": 30},
#       "tags": {"finance": 60, "core": 40, "marketing": 35}
#     }
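
The `facets` section of the result counts how many catalog items carry each value. As a rough illustration of how such counts can be derived, the sketch below tallies them from a list of items. `compute_facets` and the sample items are hypothetical, and the way Olytix Core actually computes facets is not documented here.

```python
from collections import Counter


def compute_facets(items: list[dict], fields: list[str]) -> dict[str, dict[str, int]]:
    """Count how many items carry each value of the given fields.

    List-valued fields such as "tags" contribute one count per element.
    """
    facets = {field: Counter() for field in fields}
    for item in items:
        for field in fields:
            value = item.get(field)
            if value is None:
                continue
            values = value if isinstance(value, list) else [value]
            facets[field].update(values)
    return {field: dict(counts) for field, counts in facets.items()}


items = [
    {"id": "Orders", "certification_status": "certified", "tags": ["finance", "core"]},
    {"id": "Leads", "certification_status": "draft", "tags": ["marketing"]},
    {"id": "Invoices", "certification_status": "certified", "tags": ["finance"]},
]
facets = compute_facets(items, ["certification_status", "tags"])
```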

Search & Discovery

# Full-text search
results = await service.search_artifacts(
    query="customer lifetime value",
    artifact_types=["measure", "metric"],
    filters={
        "certification_status": ["certified", "pending_review"]
    },
    include_glossary=True
)

# SearchResults:
# ├── artifacts: [
# │     {
# │       "type": "metric",
# │       "id": "customer_ltv",
# │       "name": "Customer Lifetime Value",
# │       "relevance_score": 0.95,
# │       "highlights": ["...customer lifetime value calculation..."]
# │     }
# │   ]
# ├── glossary_matches: [
# │     {
# │       "term": "Lifetime Value (LTV)",
# │       "definition": "Predicted total revenue from a customer..."
# │     }
# │   ]
# └── suggested_terms: ["CLV", "Customer Value", "LTV"]

Audit & Compliance

Access Audit

# Query audit log
audit_log = await service.query_audit_log(
    artifact_type="cube",
    artifact_id="Orders",
    event_types=["query", "export"],
    date_range=("2024-01-01", "2024-01-31"),
    user_id=None  # All users
)

# AuditLog:
# ├── total_events: 1250
# └── events: [
#       {
#         "timestamp": "2024-01-15T14:30:00Z",
#         "event_type": "query",
#         "user_id": "analyst_123",
#         "artifact_type": "cube",
#         "artifact_id": "Orders",
#         "details": {
#           "measures": ["revenue", "count"],
#           "dimensions": ["region"],
#           "row_count": 15
#         },
#         "ip_address": "10.0.1.50"
#       },
#       ...
#     ]
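
Once audit events are retrieved, a common follow-up is to summarize them per user. The sketch below works on plain dictionaries shaped like the events above; `parse_ts` and `events_per_user` are hypothetical helpers, and the half-open date window is a design choice made for this example.

```python
from collections import Counter
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, treating a trailing 'Z' as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def events_per_user(events, event_types, start, end):
    """Count events per user for the given types within [start, end)."""
    counts = Counter()
    for event in events:
        if event["event_type"] not in event_types:
            continue
        if start <= parse_ts(event["timestamp"]) < end:
            counts[event["user_id"]] += 1
    return counts


events = [
    {"timestamp": "2024-01-15T14:30:00Z", "event_type": "query", "user_id": "analyst_123"},
    {"timestamp": "2024-01-20T09:00:00Z", "event_type": "export", "user_id": "analyst_123"},
    {"timestamp": "2024-01-22T11:00:00Z", "event_type": "change", "user_id": "admin_001"},
    {"timestamp": "2024-02-02T08:00:00Z", "event_type": "query", "user_id": "analyst_456"},
]
summary = events_per_user(
    events,
    event_types={"query", "export"},
    start=parse_ts("2024-01-01T00:00:00Z"),
    end=parse_ts("2024-02-01T00:00:00Z"),
)
```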

Change History

# Get change history
history = await service.get_change_history(
    artifact_type="measure",
    artifact_id="Orders.revenue",
    limit=20
)

# ChangeHistory:
# └── changes: [
#       {
#         "timestamp": "2024-01-15T10:00:00Z",
#         "change_type": "definition_updated",
#         "changed_by": "analyst_123",
#         "before": {"sql": "SUM(price)"},
#         "after": {"sql": "SUM(price * quantity)"},
#         "reason": "Include quantity in revenue calculation"
#       },
#       {
#         "timestamp": "2024-01-10T09:00:00Z",
#         "change_type": "created",
#         "changed_by": "analyst_123"
#       }
#     ]
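
Each change entry carries `before` and `after` snapshots. A field-level diff makes such entries easier to review; the helper below is a hypothetical sketch operating on plain dictionaries, not an Olytix Core function.

```python
def diff_fields(before: dict, after: dict) -> dict:
    """Return {field: (old, new)} for every field whose value changed.

    Fields present on only one side are reported with None for the missing value.
    """
    changed = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changed[key] = (old, new)
    return changed


change = {
    "before": {"sql": "SUM(price)"},
    "after": {"sql": "SUM(price * quantity)"},
}
delta = diff_fields(change["before"], change["after"])
```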

Compliance Reports

# Generate compliance report
report = await service.generate_compliance_report(
    report_type="data_inventory",
    include_sections=[
        "artifact_summary",
        "certification_status",
        "ownership_coverage",
        "access_patterns",
        "data_classification"
    ],
    format="pdf"
)

# Report includes:
# - Total artifacts by type and status
# - Certification coverage percentage
# - Ownership assignment coverage
# - Access frequency by user/team
# - Data classification distribution
# - PII flag summary
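
To make the coverage figures concrete, the sketch below computes certification and ownership coverage from a list of artifact records. The record shape (`status`, `data_owner`) and the `coverage` helper are assumptions for illustration; the actual report may define coverage differently.

```python
def coverage(artifacts: list[dict]) -> dict[str, float]:
    """Compute certification and ownership coverage as percentages."""
    total = len(artifacts)
    if total == 0:
        return {"certified_pct": 0.0, "owned_pct": 0.0}
    certified = sum(1 for a in artifacts if a.get("status") == "certified")
    owned = sum(1 for a in artifacts if a.get("data_owner"))
    return {
        "certified_pct": round(100 * certified / total, 1),
        "owned_pct": round(100 * owned / total, 1),
    }


artifacts = [
    {"id": "Orders.revenue", "status": "certified", "data_owner": "finance_manager_789"},
    {"id": "Orders.count", "status": "certified", "data_owner": None},
    {"id": "Orders.margin", "status": "certified", "data_owner": "finance_manager_789"},
    {"id": "Orders.legacy_revenue", "status": "deprecated", "data_owner": None},
]
result = coverage(artifacts)
```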

Best Practices

Certification

  1. Define clear criteria - Document what "certified" means
  2. Regular reviews - Re-certify periodically (e.g., annually)
  3. Track expiration - Don't let certifications lapse
  4. Deprecation process - Clear timeline and alternatives
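
Tracking expiration can be as simple as a periodic check against each artifact's `valid_until` date. The sketch below is one way to do that; `expiring_soon` is a hypothetical helper and the 30-day warning window is an arbitrary assumption.

```python
from datetime import date, timedelta


def expiring_soon(certifications: dict[str, str], today: date, within_days: int = 30) -> list[str]:
    """Return artifact IDs whose valid_until date falls on or before the warning cutoff.

    Already-expired certifications are included so they are not overlooked.
    """
    cutoff = today + timedelta(days=within_days)
    return sorted(
        artifact_id
        for artifact_id, valid_until in certifications.items()
        if date.fromisoformat(valid_until) <= cutoff
    )


certifications = {"Orders.revenue": "2025-01-15", "Orders.count": "2025-06-30"}
due = expiring_soon(certifications, today=date(2025, 1, 1))
```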

Glossary

  1. Start with core terms - Focus on high-impact definitions
  2. Involve stakeholders - Get business input on definitions
  3. Link to artifacts - Connect terms to measures/dimensions
  4. Review regularly - Keep definitions current

Ownership

  1. Every asset needs an owner - No orphaned data
  2. Clear escalation - Owner → Steward → Technical Owner
  3. Document responsibilities - What each role does
  4. Regular review - Update when people change roles
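
The escalation order in item 2 can be expressed as a simple lookup. This sketch assumes ownership is available as a mapping from role name to a list of assignee IDs; `ESCALATION_ORDER` and `escalation_chain` are hypothetical names.

```python
# Escalation order from the list above: Owner, then Steward, then Technical Owner.
ESCALATION_ORDER = ("data_owner", "data_steward", "technical_owner")


def escalation_chain(ownership: dict[str, list[str]]) -> list[str]:
    """List contacts in escalation order, skipping roles with no assignee."""
    chain: list[str] = []
    for role in ESCALATION_ORDER:
        chain.extend(ownership.get(role, []))
    return chain


ownership = {
    "technical_owner": ["analytics_team"],
    "data_owner": ["finance_manager_789"],
    "data_steward": ["analyst_123"],
}
chain = escalation_chain(ownership)
```

An empty chain signals an orphaned asset, which ties back to the first item in the list.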

Compliance

  1. Enable audit logging - Track all access
  2. Regular reports - Monthly compliance reviews
  3. Data classification - Tag all sensitive data
  4. Retention policies - Define and enforce
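
Enforcing a retention policy means dropping audit events older than a cutoff. The sketch below shows the idea on in-memory event dictionaries; `prune_expired` is a hypothetical helper, and a real deployment would more likely delete rows in the audit store.

```python
from datetime import datetime, timedelta, timezone


def prune_expired(events: list[dict], now: datetime, retention_days: int = 365) -> list[dict]:
    """Keep only events whose timestamp is on or after the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return [
        event
        for event in events
        if datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00")) >= cutoff
    ]


events = [
    {"timestamp": "2024-01-15T14:30:00Z", "event_type": "query"},
    {"timestamp": "2023-01-15T14:30:00Z", "event_type": "export"},
]
kept = prune_expired(events, now=datetime(2024, 6, 1, tzinfo=timezone.utc))
```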

Configuration

# Governance configuration
governance:
  certification:
    enabled: true
    default_validity_days: 365
    require_review_notes: true
    auto_deprecate_after_days: 90  # After expiration

  glossary:
    enabled: true
    require_approval: true
    sync_from_external: null  # Optional external glossary URL

  ownership:
    require_owner: true
    require_steward: false
    allow_team_ownership: true

  audit:
    enabled: true
    retention_days: 365
    log_queries: true
    log_exports: true
    log_changes: true

  compliance:
    data_classification_required: true
    pii_flagging_required: true
    report_schedule: "0 0 1 * *"  # Monthly
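
As a rough illustration of how the two certification timing settings might combine, the sketch below derives an expiry date and an auto-deprecation date from a certification date. This reading of `default_validity_days` and `auto_deprecate_after_days` is an assumption based on the inline comment, and `certification_dates` is a hypothetical helper.

```python
from datetime import date, timedelta


def certification_dates(certified_on: date, validity_days: int = 365, grace_days: int = 90):
    """Return (expires_on, auto_deprecate_on) for a certification.

    Assumes expiry is certified_on + validity_days and auto-deprecation
    happens grace_days after expiry.
    """
    expires_on = certified_on + timedelta(days=validity_days)
    return expires_on, expires_on + timedelta(days=grace_days)


expires_on, deprecate_on = certification_dates(date(2024, 1, 15))
```

Because the validity period is counted in days rather than calendar years, a certification issued in a leap year expires one calendar day earlier than its anniversary.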

Next Steps