Column Masking
Column masking protects sensitive data by replacing actual values with masked alternatives based on user permissions. Users without proper access see masked data while authorized users see the original values.
Overview
Column masking in Olytix Core:
- Protects PII, financial data, and other sensitive information
- Applies transparently at query time
- Supports multiple masking functions for different data types
- Works alongside RLS for comprehensive data protection
How Masking Works
┌─────────────────────────────────────────────────────────────────┐
│ User Query │
│ SELECT customer_name, email, phone, revenue FROM customers │
└───────────────────────── ──┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Masking Policy Engine │
│ User: analyst@company.com │
│ Roles: [analyst] (not in exempt_roles: [admin, compliance]) │
│ │
│ Policies: │
│ - email: partial (j***@company.com) │
│ - phone: partial (***-***-1234) │
│ - revenue: numeric_range (1000-2000) │
└───────────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Query Result │
│ customer_name | email | phone | revenue │
│ John Smith | j***@company.com | ***-***-1234 | 1000-2000 │
│ Jane Doe | j***@example.org | ***-***-5678 | 2000-3000 │
└─────────────────────────────────────────────────────────────────┘
Defining Masking Policies
YAML Configuration
# security/masking/email_masking.yml
name: email_masking
description: Mask email addresses for non-privileged users
cube: customers
column: email
enabled: true
priority: 10
# Masking configuration
mask_function: partial
mask_parameters:
pattern: email # j***@domain.com
# Users exempt from masking
exempt_roles:
- admin
- compliance
- data_steward
exempt_groups:
- pii_access
Masking Functions
Olytix Core provides multiple masking functions:
| Function | Description | Example Input | Example Output |
|---|---|---|---|
null | Replace with NULL | john@email.com | NULL |
redact | Replace with redacted text | john@email.com | ***REDACTED*** |
hash | SHA256 hash | john@email.com | a1b2c3d4... |
partial | Show partial value | john@email.com | j***@email.com |
truncate | Truncate to N chars | john@email.com | john@emai |
first_n | Show first N chars | john@email.com | joh*** |
last_n | Show last N chars | john@email.com | ***com |
constant | Replace with constant | john@email.com | [MASKED] |
random | Replace with random | john@email.com | f7a3b2c1... |
date_truncate | Truncate date precision | 2024-03-15 | 2024-03-01 |
numeric_range | Show value as range | 1523 | 1500-2000 |
Masking Function Examples
Email Masking
name: email_partial
cube: customers
column: email
mask_function: partial
mask_parameters:
pattern: email # j***@domain.com
exempt_roles:
- admin
Result: john.smith@company.com becomes j***@company.com
Phone Number Masking
name: phone_partial
cube: customers
column: phone
mask_function: partial
mask_parameters:
pattern: phone # ***-***-1234
exempt_roles:
- support
Result: 555-123-4567 becomes ***-***-4567
Credit Card Masking
name: credit_card_mask
cube: payments
column: card_number
mask_function: partial
mask_parameters:
pattern: credit_card # ****-****-****-1234
exempt_roles:
- payment_admin
Result: 4532-1234-5678-9012 becomes ****-****-****-9012
SSN Masking
name: ssn_mask
cube: employees
column: ssn
mask_function: partial
mask_parameters:
pattern: ssn # ***-**-1234
exempt_roles:
- hr_admin
Result: 123-45-6789 becomes ***-**-6789
Full Redaction
name: salary_redact
cube: employees
column: salary
mask_function: redact
mask_parameters:
text: "[CONFIDENTIAL]"
exempt_roles:
- hr_manager
- executive
Result: 150000 becomes [CONFIDENTIAL]
Numeric Range
name: salary_range
cube: employees
column: salary
mask_function: numeric_range
mask_parameters:
step: 10000 # Show as ranges of 10K
exempt_roles:
- hr_manager
Result: 147500 becomes 140000-150000
Date Truncation
name: birthdate_truncate
cube: customers
column: birth_date
mask_function: date_truncate
mask_parameters:
granularity: year # Show only year
exempt_roles:
- compliance
Result: 1985-03-15 becomes 1985-01-01
Hashing
name: customer_id_hash
cube: analytics
column: customer_id
mask_function: hash
exempt_roles:
- data_engineer
Result: CUST12345 becomes a1b2c3d4e5f6...
First N Characters
name: name_first_n
cube: customers
column: full_name
mask_function: first_n
mask_parameters:
n: 2
mask_rest: "***"
exempt_roles:
- support
Result: John Smith becomes Jo***
Last N Characters
name: account_last_n
cube: accounts
column: account_number
mask_function: last_n
mask_parameters:
n: 4
mask_prefix: "****"
exempt_roles:
- finance
Result: 1234567890 becomes ****7890
Wildcard Column Matching
Apply masking to multiple columns:
name: mask_all_emails
cube: "*" # All cubes
column: "*email*" # Any column containing 'email'
mask_function: partial
mask_parameters:
pattern: email
exempt_roles:
- admin
name: mask_pii_columns
cube: customers
column: "pii_*" # Columns starting with 'pii_'
mask_function: redact
exempt_roles:
- compliance
Cube-Level Configuration
Configure masking at the cube level:
# cubes/customers.yml
name: customers
sql_table: public.customers
dimensions:
- name: email
type: string
sql: email
meta:
sensitive: true
masking_policy: email_masking
- name: phone
type: string
sql: phone
meta:
sensitive: true
masking_policy: phone_masking
- name: ssn
type: string
sql: ssn
meta:
sensitive: true
pii_category: "ssn"
security:
masking:
enabled: true
policies:
- email_masking
- phone_masking
- ssn_masking
API Configuration
Create Policy
curl -X POST http://localhost:8000/api/v1/security/masking-policies \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "email_masking",
"description": "Mask customer emails",
"cube": "customers",
"column": "email",
"mask_function": "partial",
"mask_parameters": {
"pattern": "email"
},
"exempt_roles": ["admin", "support"],
"enabled": true
}'
List Policies
curl http://localhost:8000/api/v1/security/masking-policies \
-H "Authorization: Bearer $ADMIN_TOKEN"
Check Column Masking
curl http://localhost:8000/api/v1/security/masking-policies/check \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"cube": "customers",
"columns": ["email", "phone", "name"]
}'
Response:
{
"columns": {
"email": {
"is_masked": true,
"policy_name": "email_masking",
"mask_function": "partial"
},
"phone": {
"is_masked": true,
"policy_name": "phone_masking",
"mask_function": "partial"
},
"name": {
"is_masked": false,
"policy_name": null,
"mask_function": null
}
}
}
Query Response with Masking
When masking is applied, the response includes metadata:
{
"data": [
{
"customer_name": "John Smith",
"email": "j***@company.com",
"phone": "***-***-4567",
"revenue": "10000-20000"
}
],
"security": {
"masking": {
"policies_applied": ["email_masking", "phone_masking", "revenue_range"],
"columns_masked": ["email", "phone", "revenue"],
"columns_unmasked": ["customer_name"],
"evaluation_time_ms": 0.8
}
}
}
Testing Masking Policies
Dry Run
curl -X POST http://localhost:8000/api/v1/security/masking-policies/evaluate \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"cube": "customers",
"columns": ["email", "phone", "name", "ssn"],
"user_context": {
"user_id": "test_user",
"roles": ["analyst"]
}
}'
CLI Testing
olytix-core security test-masking \
--cube customers \
--columns email,phone,ssn \
--roles analyst
Combining with RLS
Masking works together with RLS:
# RLS Policy - filter rows
name: regional_access
cube: customers
filter_sql: "region = $user.region"
roles:
- analyst
# Masking Policy - mask columns
name: email_masking
cube: customers
column: email
mask_function: partial
exempt_roles:
- admin
Result:
- RLS filters rows to user's region
- Masking hides email details in filtered results
- User sees only authorized rows with masked sensitive columns
SQL Dialect Support
Masking functions generate dialect-specific SQL:
PostgreSQL
-- partial email
CONCAT(LEFT(email, 1), '***', SUBSTRING(email, POSITION('@' IN email)))
-- hash
ENCODE(SHA256((customer_id)::BYTEA), 'hex')
-- date_truncate
DATE_TRUNC('month', birth_date)
MySQL
-- partial email
CONCAT(LEFT(email, 1), '***', SUBSTRING(email, LOCATE('@', email)))
-- hash
SHA2(customer_id, 256)
-- date_truncate
DATE_FORMAT(birth_date, '%Y-%m-01')
Common Patterns
PII Protection
# Comprehensive PII masking
policies:
- name: mask_emails
column: "*email*"
mask_function: partial
mask_parameters:
pattern: email
- name: mask_phones
column: "*phone*"
mask_function: partial
mask_parameters:
pattern: phone
- name: mask_ssn
column: "*ssn*"
mask_function: partial
mask_parameters:
pattern: ssn
- name: mask_addresses
column: "*address*"
mask_function: redact
Financial Data Protection
policies:
- name: mask_credit_cards
column: "*card*"
mask_function: partial
mask_parameters:
pattern: credit_card
- name: mask_account_numbers
column: "*account*"
mask_function: last_n
mask_parameters:
n: 4
- name: mask_amounts
column: "*amount*"
mask_function: numeric_range
mask_parameters:
step: 1000
Role-Based Exemptions
# Different roles see different levels of detail
name: salary_tiered
cube: employees
column: salary
# Base policy - show ranges
mask_function: numeric_range
mask_parameters:
step: 25000
# Exempt roles see actual values
exempt_roles:
- hr_admin
- executive
- payroll
Best Practices
- Mask by default - Start with masking enabled, exempt specific roles
- Use appropriate functions - Choose masking that preserves data utility
- Document exemptions - Clearly document why roles are exempt
- Test thoroughly - Verify masking works as expected
- Audit access - Log when exempt users access sensitive data
- Review regularly - Periodically audit masking policies
Compliance Considerations
| Regulation | Relevant Masking Features |
|---|---|
| GDPR | Email/phone masking, right to be forgotten |
| HIPAA | Patient identifier masking, SSN protection |
| PCI DSS | Credit card masking (show last 4 only) |
| CCPA | Consumer data masking, data minimization |
Troubleshooting
Data Still Visible
- Check if user has exempt role
- Verify policy is enabled
- Confirm column name matches pattern
Wrong Masking Applied
- Check policy priority
- Verify column pattern matching
- Review mask_parameters
Performance Issues
- Limit wildcard column patterns
- Cache policy evaluations
- Consider pre-masking in ETL for large datasets