4 Core Concepts

Atlas is a unified optimization framework for data-driven budget allocation across diverse models and scenarios. This document covers its four core concepts: data, models, optimization, and constraints.

Overview

Atlas operates on four core concepts that work together to solve complex optimization problems:

  1. Models - Predictive models that estimate outcomes

  2. Optimization - Algorithms and strategies for finding optimal solutions

  3. Constraints - Business rules and limitations that solutions must satisfy

  4. Data - Multi-dimensional data structures and management


Data

Data management in Atlas handles the complex, multi-dimensional nature of many problems a business might face. The framework uses Xarray as its foundation for powerful and flexible data structures. See Data for more detail.

Data Architecture

Atlas is built on Xarray, which provides:

  • Multi-dimensional arrays with labeled axes

  • Automatic alignment across different data sources

  • Broadcasting for operations across dimensions

  • Missing data handling with interpolation options

  • Efficient computation with NumPy backend
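
The first three properties in the list above can be seen in a few lines of plain Xarray. This is a minimal sketch that uses only `xarray` and `numpy`, not any Atlas API; the channel names and numbers are made up for illustration.

```python
import numpy as np
import xarray as xr

# Spend per channel (1D) and a seasonal multiplier per month (1D)
base_spend = xr.DataArray(
    [100.0, 200.0],
    dims=["channel"],
    coords={"channel": ["digital", "tv"]},
)
seasonality = xr.DataArray(
    [1.0, 1.5, 0.5],
    dims=["month"],
    coords={"month": ["jan", "feb", "mar"]},
)

# Broadcasting: dimensions are matched by name, giving a channel x month grid
plan = base_spend * seasonality

# Alignment: arithmetic keeps only the labels present in both operands
actuals = xr.DataArray(
    [90.0, 50.0],
    dims=["channel"],
    coords={"channel": ["digital", "radio"]},
)
gap = base_spend - actuals  # only 'digital' appears in both arrays

print(plan.dims)                                    # ('channel', 'month')
print(plan.sel(channel="tv", month="feb").item())   # 300.0
print(list(gap["channel"].values))
```

Because labels rather than positions drive the arithmetic, channels that exist in only one operand are dropped from the result instead of being silently mismatched.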

Core Data Structures

Budget Allocations

Budget data is represented as Xarray DataArrays or Datasets with labeled dimensions:

import xarray as xr
import numpy as np

# Simple budget allocation (2D: channels × time)
budget = xr.DataArray(
    data=np.array([
        [100_000, 120_000, 110_000, 130_000],  # Digital by month
        [200_000, 180_000, 190_000, 210_000],  # TV by month
        [50_000, 60_000, 55_000, 65_000]       # Radio by month
    ]),
    dims=['channel', 'month'],
    coords={
        'channel': ['digital', 'tv', 'radio'],
        'month': ['jan', 'feb', 'mar', 'apr']
    }
)

# Complex budget allocation (4D: channels × regions × time × products)
complex_budget = xr.Dataset({
    'budget': xr.DataArray(
        data=np.random.rand(3, 2, 12, 4) * 1_000_000,
        dims=['channel', 'region', 'month', 'product'],
        coords={
            'channel': ['digital', 'tv', 'radio'],
            'region': ['north', 'south'],
            'month': range(1, 13),
            'product': ['product_a', 'product_b', 'product_c', 'product_d']
        }
    )
})

Model Outputs

Model predictions follow the same structure for consistency:

# Revenue predictions with confidence intervals
revenue_predictions = xr.Dataset({
    'revenue': xr.DataArray(
        data=predicted_revenue,
        dims=['channel', 'month'],
        coords={'channel': channels, 'month': months}
    ),
    'revenue_lower': xr.DataArray(
        data=revenue_lower_bound,
        dims=['channel', 'month'],
        coords={'channel': channels, 'month': months}
    ),
    'revenue_upper': xr.DataArray(
        data=revenue_upper_bound,
        dims=['channel', 'month'],
        coords={'channel': channels, 'month': months}
    )
})

Historical Data

Historical performance data for model training and validation:

historical_data = xr.Dataset({
    'spend': xr.DataArray(
        data=historical_spend_data,
        dims=['date', 'channel'],
        coords={'date': date_range, 'channel': channels}
    ),
    'impressions': xr.DataArray(
        data=impressions_data,
        dims=['date', 'channel'],
        coords={'date': date_range, 'channel': channels}
    ),
    'conversions': xr.DataArray(
        data=conversion_data,
        dims=['date', 'channel'],
        coords={'date': date_range, 'channel': channels}
    ),
    'revenue': xr.DataArray(
        data=revenue_data,
        dims=['date'],
        coords={'date': date_range}
    )
})

Data Operations

Budget Aggregation

# Total budget by channel across all time periods
total_by_channel = budget.sum(dim='month')

# Monthly totals across all channels
monthly_totals = budget.sum(dim='channel')

# Budget allocation shares: each channel's fraction of the monthly total
# (multiply by 100 for percentages)
budget_percentages = budget / budget.sum(dim='channel')
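
As a quick check of the share calculation, the sketch below builds a small budget array with made-up numbers and confirms that the per-month shares add up to one. It relies only on `xarray` and `numpy`.

```python
import numpy as np
import xarray as xr

budget = xr.DataArray(
    np.array([[100.0, 150.0], [300.0, 250.0], [100.0, 100.0]]),
    dims=["channel", "month"],
    coords={"channel": ["digital", "tv", "radio"], "month": ["jan", "feb"]},
)

monthly_totals = budget.sum(dim="channel")   # 500 in each month
shares = budget / monthly_totals             # fraction of each month's total
percentages = 100 * shares

# Shares within a month always sum to 1
share_sums = shares.sum(dim="channel")
print(percentages.sel(channel="tv").values)
```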

Data Alignment

Xarray automatically handles alignment across different data sources:

# Dividing aligns budget and impressions on their shared coordinate labels
cost_per_impression = budget / impressions

# Mask zero budgets to NaN so the division never divides by zero
roi = revenue / budget.where(budget > 0)

Resampling and Interpolation

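# Resample daily data to monthly averages (use .sum() for monthly totals);
# newer pandas versions spell the month-end frequency 'ME' instead of 'M'
monthly_data = daily_budget.resample(date='M').mean()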

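# Interpolate missing values; use_coordinate=False treats months as equally
# spaced, which also works when 'month' holds string labels such as 'jan'
complete_budget = budget.interpolate_na(dim='month', method='linear', use_coordinate=False)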

# Forward-fill gaps along a dimension (suited to step-like or categorical series)
filled_budget = budget.ffill(dim='month')
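
The following sketch shows resampling and interpolation on tiny synthetic series so the results can be checked by hand. It assumes only `xarray`, `pandas`, and `numpy`; the dates and spend values are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Daily spend of 100 per day across January and February 2024
dates = pd.date_range("2024-01-01", "2024-02-29", freq="D")
daily_spend = xr.DataArray(
    np.full(len(dates), 100.0), dims=["date"], coords={"date": dates}
)

# 'MS' (month start) labels each bin by the first day of its month
monthly_total = daily_spend.resample(date="MS").sum()

# Linear interpolation over a numeric month index fills the interior gap
gappy = xr.DataArray(
    [100.0, np.nan, 300.0], dims=["month"], coords={"month": [1, 2, 3]}
)
filled = gappy.interpolate_na(dim="month", method="linear")

print(monthly_total.values)  # 31 and 29 days at 100 per day
print(filled.values)
```

Summing is usually the right reduction for spend, while averaging suits rates such as cost per impression.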

Data Validation

Atlas includes comprehensive data validation to ensure data quality:

from atlas.data import DataValidator

validator = DataValidator()

# Validate data structure
validation_result = validator.validate_structure(
    data=budget_data,
    expected_dims=['channel', 'month'],
    required_coords=['channel', 'month']
)

# Check for data quality issues
quality_report = validator.check_data_quality(
    data=budget_data,
    checks=['missing_values', 'negative_values', 'outliers', 'consistency']
)

# Validate business rules
business_validation = validator.validate_business_rules(
    data=budget_data,
    rules={
        'positive_budgets': 'budget >= 0',
        'reasonable_scale': 'budget <= 10_000_000',
        'minimum_spend': 'budget.sum() >= 100_000'
    }
)

Data Transformation

Atlas provides utilities for common data transformations:

Unit Conversion

from atlas.data import UnitConverter

converter = UnitConverter()

# Convert between currencies
budget_usd = converter.convert_currency(
    budget_local,
    from_currency='EUR',
    to_currency='USD',
    exchange_rates=exchange_rate_data
)

# Normalize spend by market size
normalized_budget = converter.normalize_by_market_size(
    budget=budget_data,
    market_sizes=market_size_by_region
)

Time Aggregation

from atlas.data import TimeAggregator

aggregator = TimeAggregator()

# Aggregate daily data to weekly
weekly_budget = aggregator.aggregate_time(
    data=daily_budget,
    target_frequency='W',
    method='sum'
)

# Create rolling windows
rolling_budget = aggregator.rolling_window(
    data=budget_data,
    window=4,  # 4-week rolling average
    method='mean'
)

Feature Engineering

from atlas.data import FeatureEngineer

engineer = FeatureEngineer()

# Create lagged features
lagged_features = engineer.create_lags(
    data=budget_data,
    lags=[1, 2, 4],  # 1-week, 2-week, 4-week lags
    variables=['budget', 'impressions']
)

# Calculate moving averages
moving_averages = engineer.moving_average(
    data=budget_data,
    windows=[4, 8, 12],  # 4, 8, 12-week averages
    center=True
)

# Create interaction terms
interactions = engineer.create_interactions(
    data=budget_data,
    interaction_pairs=[('digital', 'tv'), ('radio', 'print')]
)
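
For readers who want to see what lagged features and moving averages look like on labeled data, here is a minimal sketch in plain Xarray. It does not use `FeatureEngineer`; the variable names `spend_lag1`, `spend_lag2`, and `rolling_mean` are chosen for this illustration only.

```python
import numpy as np
import xarray as xr

spend = xr.DataArray(
    [10.0, 20.0, 30.0, 40.0],
    dims=["week"],
    coords={"week": [1, 2, 3, 4]},
    name="spend",
)

lagged = xr.Dataset(
    {
        "spend": spend,
        "spend_lag1": spend.shift(week=1),  # value from the previous week
        "spend_lag2": spend.shift(week=2),  # value from two weeks earlier
    }
)

# Trailing 2-week average; the first week has no full window and becomes NaN
rolling_mean = spend.rolling(week=2).mean()

print(lagged["spend_lag1"].values)
print(rolling_mean.values)
```

Shifted positions with no earlier observation are filled with NaN, so downstream models need to drop or impute the first few rows.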

Data Import/Export

Atlas supports various data formats and sources:

File Formats

from atlas.data import DataLoader, DataExporter

loader = DataLoader()

# Load from CSV
budget_data = loader.from_csv(
    'budget_data.csv',
    index_cols=['date'],
    parse_dates=['date']
)

# Load from Excel with multiple sheets
excel_data = loader.from_excel(
    'marketing_data.xlsx',
    sheets=['budget', 'performance', 'costs'],
    header_row=2
)

# Load from Parquet (efficient for large datasets)
large_dataset = loader.from_parquet('historical_data.parquet')

# Export results
exporter = DataExporter()
exporter.to_excel(
    optimization_results,
    'optimization_results.xlsx',
    include_metadata=True
)

Database Connections

from atlas.data import DatabaseConnector

db = DatabaseConnector(
    connection_string="postgresql://user:pass@localhost/marketing"
)

# Load budget data from database
budget_query = """
    SELECT date, channel, spend, impressions, conversions
    FROM marketing_spend
    WHERE date >= %s AND date <= %s
"""

budget_data = db.query_to_xarray(
    query=budget_query,
    params=[start_date, end_date],
    index_cols=['date'],
    data_vars=['spend', 'impressions', 'conversions']
)

API Integration

from atlas.data import APIConnector

api = APIConnector(
    base_url="https://api.marketing-platform.com",
    auth_token="your-token"
)

# Fetch performance data
performance_data = api.get_performance_data(
    start_date=start_date,
    end_date=end_date,
    metrics=['impressions', 'clicks', 'conversions'],
    dimensions=['channel', 'campaign']
)

Data Performance

Atlas includes optimizations for handling large datasets:

Lazy Loading

# Load large datasets lazily (don't load into memory until needed)
large_dataset = xr.open_dataset(
    'huge_marketing_data.nc',
    chunks={'date': 1000, 'channel': 10}  # Dask chunks
)

# Computations are lazy until explicitly computed
result = large_dataset.groupby('channel').mean()
computed_result = result.compute()  # Now actually compute

Efficient Storage

# Save with compression for smaller file sizes
budget_data.to_netcdf(
    'budget_data.nc',
    encoding={'budget': {'zlib': True, 'complevel': 9}}
)

# Use efficient data types
optimized_data = budget_data.assign(
    budget=budget_data['budget'].astype('float32')  # Reduce precision if appropriate
)
# Note: xarray has no 'category' dtype; string coordinates such as 'channel' keep their dtype

Models

Models in Atlas are the predictive engines that estimate outcomes (revenue, awareness, conversions, etc.) based on decisions a business makes. Atlas is designed to be model-agnostic, supporting any type of predictive model. See Models for more detail.

Model Types

Atlas supports integration with various model types:

Machine Learning Models

  • Scikit-learn-compatible models: Random Forest, XGBoost (via its scikit-learn wrapper), neural networks

  • Deep learning frameworks: TensorFlow, PyTorch models

  • Time series models: ARIMA, Prophet, seasonal decomposition

  • Custom ML pipelines: End-to-end prediction workflows

Statistical Models

  • Regression models: Linear, logistic, mixed effects

  • Bayesian models: PyMC, Stan implementations

  • Econometric models: Marketing mix models, attribution models

External Services

  • APIs: Third-party prediction services, cloud ML platforms

  • Legacy systems: Existing business intelligence tools

  • Excel models: Business rule-based spreadsheet models

Custom Models

  • Business rules: Custom logic implementations

  • Hybrid models: Combinations of multiple approaches

  • Rule engines: Decision trees and business logic

Model Integration

Atlas provides multiple integration patterns to accommodate different model architectures:

1. Direct Python Integration

from atlas.models import BaseModel

class MyRevenueModel(BaseModel):
    def __init__(self, trained_model):
        self.model = trained_model
    
    def predict(self, budget_allocation):
        """
        Predict revenue from budget allocation.
        
        Args:
            budget_allocation: xarray.Dataset with budget by channel/time
            
        Returns:
            xarray.Dataset with predicted outcomes
        """
        # Transform budget to model features
        features = self.prepare_features(budget_allocation)
        
        # Generate predictions
        predictions = self.model.predict(features)
        
        # Return structured results
        return self.format_predictions(predictions)
    
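    def prepare_features(self, budget_allocation):
        # Feature engineering logic goes here (left as a stub)
        raise NotImplementedError
    
    def format_predictions(self, raw_predictions):
        # Format output as an xarray Dataset (left as a stub)
        raise NotImplementedError
    
    def validate_input(self, budget_allocation):
        # Required by BaseModel (see Model Requirements): check the input
        raise NotImplementedError
    
    def get_feature_names(self):
        # Required by BaseModel (see Model Requirements): list expected inputs
        raise NotImplementedError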
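
To make the interface concrete without depending on Atlas, the sketch below implements the same three methods on a standalone toy class. Everything here is an assumption for illustration: the class name `SaturatingRevenueModel`, the square-root response curve, and the coefficients are invented, and the class deliberately does not inherit from `BaseModel` so that it runs on its own.

```python
import math


class SaturatingRevenueModel:
    """Toy stand-in for a BaseModel subclass (no Atlas import needed)."""

    def __init__(self, coefficients):
        # coefficients: revenue scale per channel (illustrative values)
        self.coefficients = dict(coefficients)

    def get_feature_names(self):
        return sorted(self.coefficients)

    def validate_input(self, budget):
        missing = [ch for ch in self.get_feature_names() if ch not in budget]
        negative = [ch for ch, value in budget.items() if value < 0]
        return not missing and not negative

    def predict(self, budget):
        if not self.validate_input(budget):
            raise ValueError("budget is missing channels or has negative spend")
        # Square-root response: each extra dollar returns less than the last
        return sum(
            coef * math.sqrt(budget[ch])
            for ch, coef in self.coefficients.items()
        )


model = SaturatingRevenueModel({"digital": 3.0, "tv": 4.0})
revenue = model.predict({"digital": 36.0, "tv": 64.0})
print(revenue)  # 3 * 6 + 4 * 8 = 50.0
```

A real integration would return an xarray Dataset rather than a float, as the docstring above describes.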

2. Docker Container Integration

from atlas.models import DockerModel

# For models deployed as containerized services
model = DockerModel(
    image="mycompany/mmm-model:v1.2",
    port=8080,
    health_endpoint="/health",
    prediction_endpoint="/predict"
)

3. API Integration

from atlas.models import APIModel

# For external prediction services
model = APIModel(
    endpoint="https://api.mycompany.com/revenue-model/predict",
    auth_token="your-auth-token",
    timeout=30,
    retry_config={'max_retries': 3, 'backoff_factor': 1.5}
)
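
The exact meaning of `max_retries` and `backoff_factor` is not spelled out here. One plausible reading, shown purely as an assumption, is a geometric backoff in which the wait before retry k is `base_delay * backoff_factor ** k`. The helper `call_with_retries` and the `base_delay` parameter are hypothetical and not part of Atlas.

```python
import time


def call_with_retries(func, max_retries=3, backoff_factor=1.5,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(); on an exception wait and retry, growing the wait each time."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * backoff_factor ** attempt)


# Demo: a flaky call that fails twice, then succeeds
calls = {"count": 0}


def flaky_predict():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"revenue": 123.0}


waits = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky_predict, sleep=waits.append)
print(waits)   # [1.0, 1.5]
print(result)
```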

Model Requirements

All models must implement the BaseModel interface with these key methods:

Required Methods

  • predict(optimization_levers): Core prediction method

  • validate_input(optimization_levers): Input validation

  • get_feature_names(): Return expected input features

Optional Methods

  • predict_confidence(optimization_levers): Uncertainty estimates

  • explain_prediction(optimization_levers): Model interpretability

  • health_check(): Model availability status

Model Validation

Atlas includes comprehensive validation to ensure model reliability:

from atlas.validation import ModelValidator

validator = ModelValidator()

# Validate model interface compliance
is_valid, errors = validator.validate_interface(model)

# Test model predictions
test_results = validator.test_predictions(
    model=model,
    test_cases=sample_budgets,
    expected_properties=['positive_revenue', 'monotonic_response']
)

# Performance benchmarking
benchmarks = validator.benchmark_performance(
    model=model,
    budget_samples=performance_test_data
)

For a technical guide to implementing a model yourself, see Model Integration.


Optimization

Optimization in Atlas finds the best levers to pull given your model predictions, business constraints, and objectives. The framework supports multiple optimization approaches and algorithms. See Optimization for more detail.

Optimization Backends

Atlas provides several optimization engines, each suited for different problem types:

SciPy Optimizer

Best for: Continuous variables, gradient-based optimization, well-behaved objective functions

from atlas.optimizers import ScipyOptimizer

optimizer = ScipyOptimizer(
    model=revenue_model,
    method='trust-constr',  # One of 'L-BFGS-B', 'SLSQP', 'trust-constr'
    config={
        'maxiter': 1000,
        'tol': 1e-8,
        'ftol': 1e-9  # SciPy option for L-BFGS-B and SLSQP; trust-constr uses gtol/xtol
    }
)

Supported methods:

  • L-BFGS-B: Bound-constrained optimization (box bounds only), good for smooth functions

  • SLSQP: Sequential quadratic programming, handles constraints well

  • trust-constr: Trust region, most robust for complex constraints
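
To show what a constrained budget problem looks like at the SciPy level, here is a small standalone sketch that calls `scipy.optimize.minimize` directly rather than through `ScipyOptimizer`. The square-root response curve and the coefficients are assumptions made for this example; the problem was chosen so the answer can be verified by hand.

```python
import numpy as np
from scipy.optimize import minimize

coef = np.array([3.0, 4.0])   # illustrative response strength per channel
total_budget = 100.0          # in thousands


def negative_revenue(x):
    # minimize() minimizes, so negate the revenue we want to maximize
    return -np.sum(coef * np.sqrt(x))


result = minimize(
    negative_revenue,
    x0=np.array([50.0, 50.0]),
    method="SLSQP",
    bounds=[(1e-6, total_budget)] * 2,
    # Equality constraint: the whole budget must be spent
    constraints=[{"type": "eq", "fun": lambda x: np.sum(x) - total_budget}],
    options={"ftol": 1e-9},
)
allocation = result.x
print(allocation.round(2), -result.fun)
```

For this response curve the optimum allocates spend in proportion to the squared coefficients, so the solver should land close to 36 and 64 with a revenue near 50.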

Optuna Optimizer

Best for: Hyperparameter tuning, discrete variables, black-box optimization, parallel execution

from atlas.optimizers import OptunaOptimizer

optimizer = OptunaOptimizer(
    model=mmm_model,
    config={
        'n_trials': 1000,
        'sampler': 'TPE',  # Tree-structured Parzen Estimator
        'pruner': 'MedianPruner',
        'n_jobs': -1  # Use all CPU cores
    }
)

Key features:

  • Bayesian optimization with TPE sampler

  • Early stopping with pruning

  • Parallel execution support

  • Built-in hyperparameter suggestions

CVXPY Optimizer

Best for: Convex optimization, linear/quadratic programming, mathematical guarantees

from atlas.optimizers import CVXPYOptimizer

optimizer = CVXPYOptimizer(
    model=linear_model,
    solver='ECOS',  # ECOS, SCS, MOSEK
    config={
        'verbose': True,
        'max_iters': 10000,
        'abstol': 1e-8
    }
)

Supported problem types:

  • Linear programming (LP)

  • Quadratic programming (QP)

  • Second-order cone programming (SOCP)

  • Semidefinite programming (SDP), which requires an SDP-capable solver such as SCS or MOSEK (ECOS does not handle SDPs)

Multi-Objective Optimization

Atlas supports optimizing multiple objectives simultaneously:

from atlas.optimizers import MultiObjectiveOptimizer

optimizer = MultiObjectiveOptimizer(
    objectives={
        'revenue': {
            'model': revenue_model,
            'weight': 0.5,
            'direction': 'maximize'
        },
        'awareness': {
            'model': awareness_model,
            'weight': 0.3,
            'direction': 'maximize'
        },
        'cost_efficiency': {
            'model': cost_model,
            'weight': 0.2,
            'direction': 'minimize'
        }
    },
    method='weighted_sum'  # weighted_sum, pareto_frontier, lexicographic
)

# Generate Pareto frontier
pareto_solutions = optimizer.generate_pareto_frontier(
    n_points=20,
    initial_budget=base_allocation
)
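
The weighted-sum method can be sketched in a few lines: each objective is multiplied by its weight, minimized objectives are negated, and the results are added into one score. This is an illustration of the general technique, not the Atlas implementation; the function `weighted_sum_score` and the plan outcomes are hypothetical, and the outcomes are assumed to be pre-scaled to a common 0 to 1 range.

```python
def weighted_sum_score(outcomes, objectives):
    """Collapse several objective values into one number to maximize."""
    score = 0.0
    for name, spec in objectives.items():
        sign = 1.0 if spec["direction"] == "maximize" else -1.0
        score += sign * spec["weight"] * outcomes[name]
    return score


objectives = {
    "revenue": {"weight": 0.5, "direction": "maximize"},
    "awareness": {"weight": 0.3, "direction": "maximize"},
    "cost_efficiency": {"weight": 0.2, "direction": "minimize"},
}

# Hypothetical outcomes for two candidate plans, already on a 0-1 scale
plan_a = {"revenue": 0.8, "awareness": 0.4, "cost_efficiency": 0.5}
plan_b = {"revenue": 0.6, "awareness": 0.9, "cost_efficiency": 0.2}

score_a = weighted_sum_score(plan_a, objectives)
score_b = weighted_sum_score(plan_b, objectives)
print(score_a, score_b)
```

Without a common scale, the objective with the largest raw magnitude would dominate the score regardless of its weight.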

Optimization Strategies

Single-Point Optimization

Find one optimal solution:

result = optimizer.optimize(
    initial_budget=starting_allocation,
    constraints=business_constraints
)

print(f"Optimal allocation: {result.optimal_budget}")
print(f"Expected return: {result.optimal_value}")
print(f"Optimization time: {result.optimization_time}")

Multi-Start Optimization

Improve robustness by trying multiple starting points:

result = optimizer.multistart_optimize(
    n_starts=10,
    initial_budget_range={
        'digital': (100_000, 500_000),
        'tv': (200_000, 800_000),
        'radio': (50_000, 300_000)
    },
    constraints=constraints
)
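
The idea behind multi-start can be demonstrated with SciPy alone. The sketch below is an assumption-laden toy: the one-dimensional objective is invented so that it has two local minima, and the starting points are a fixed grid rather than random draws.

```python
import numpy as np
from scipy.optimize import minimize


def objective(x):
    # Two local minima: a shallow one near x = 1 and the global one near x = -1
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]


starts = np.linspace(-2.0, 2.0, 5)  # deterministic starting points
runs = [
    minimize(objective, x0=[start], method="L-BFGS-B", bounds=[(-2.0, 2.0)])
    for start in starts
]

# Keep the run with the lowest objective value
best = min(runs, key=lambda run: run.fun)
print(best.x[0], best.fun)
```

A single run started on the wrong side can stop in the shallow minimum; taking the best of several runs guards against that.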

Scenario Optimization

Optimize across multiple scenarios:

scenarios = {
    'optimistic': {'market_growth': 1.1, 'competition': 0.9},
    'base': {'market_growth': 1.0, 'competition': 1.0},
    'pessimistic': {'market_growth': 0.9, 'competition': 1.1}
}

robust_result = optimizer.scenario_optimize(
    scenarios=scenarios,
    risk_preference='robust',  # robust, risk_neutral, risk_seeking
    confidence_level=0.95
)
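
One common reading of a "robust" risk preference is maximin: pick the plan whose worst-case outcome across scenarios is best. The sketch below contrasts that with a risk-neutral choice based on the average. The response function, the returns of 2.6 and 3.0, and the candidate plans are all invented for illustration, and Atlas may define these preferences differently.

```python
scenarios = {
    "optimistic": {"market_growth": 1.1, "competition": 0.9},
    "base": {"market_growth": 1.0, "competition": 1.0},
    "pessimistic": {"market_growth": 0.9, "competition": 1.1},
}


def revenue(plan, scenario):
    # Hypothetical response: digital returns a steady 2.6 per unit, while
    # TV returns 3.0 per unit scaled by market growth and competition
    tv_return = 3.0 * scenario["market_growth"] / scenario["competition"]
    return 2.6 * plan["digital"] + tv_return * plan["tv"]


plans = {
    "tv_heavy": {"digital": 20.0, "tv": 80.0},
    "balanced": {"digital": 50.0, "tv": 50.0},
    "digital_heavy": {"digital": 80.0, "tv": 20.0},
}

outcomes = {
    name: [revenue(plan, s) for s in scenarios.values()]
    for name, plan in plans.items()
}

# Robust (maximin): best worst case. Risk neutral: best average.
robust_choice = max(outcomes, key=lambda name: min(outcomes[name]))
neutral_choice = max(outcomes, key=lambda name: sum(outcomes[name]) / 3)
print(robust_choice, neutral_choice)
```

Here the two preferences disagree: the TV-heavy plan has the higher average, but the digital-heavy plan loses the least in the pessimistic scenario.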

Performance Optimization

Atlas includes several features to improve optimization performance:

Caching

from atlas.caching import ModelCache

cached_model = ModelCache(
    base_model=expensive_model,
    cache_size=1000,
    ttl=3600  # Cache for 1 hour
)
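
The effect of caching model calls can be illustrated with the standard library. This sketch uses `functools.lru_cache`, which bounds the cache by size but has no time-to-live, so it only approximates the `cache_size` half of the configuration above. The function names are hypothetical.

```python
from functools import lru_cache

calls = {"count": 0}


def expensive_predict(digital, tv):
    """Stand-in for a slow model call; counts how often it really runs."""
    calls["count"] += 1
    return 3.0 * digital ** 0.5 + 4.0 * tv ** 0.5


@lru_cache(maxsize=1000)
def cached_predict(digital, tv):
    # Arguments must be hashable, so pass scalars or tuples, not arrays
    return expensive_predict(digital, tv)


first = cached_predict(36.0, 64.0)
second = cached_predict(36.0, 64.0)  # served from the cache
print(first, second, calls["count"])
```

Caching pays off most with optimizers that revisit the same allocation, for example when several starts converge to one point.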

Parallel Execution

# Automatic parallelization for applicable optimizers
optimizer = OptunaOptimizer(
    model=model,
    config={'n_jobs': -1}  # Use all available cores
)

Warm Starting

# Use previous solution as starting point
result = optimizer.optimize(
    initial_budget=previous_optimal_budget,
    warm_start=True
)

Constraints

Constraints in Atlas define the business rules and limitations that any optimal solution must satisfy. They ensure that optimization results are feasible and aligned with business requirements. See Constraints for more detail.

Constraint Types

Atlas supports various types of constraints to model real-world business requirements:

Budget Constraints

Total Budget Limits:

constraints = {
    'total_budget': 1_000_000,  # Cannot exceed $1M total spend
    'min_total_budget': 800_000  # Must spend at least $800K
}

Channel-Specific Budgets:

constraints = {
    'bounds': {
        'digital': (100_000, 500_000),  # Digital: $100K-$500K
        'tv': (200_000, 800_000),       # TV: $200K-$800K
        'radio': (0, 300_000),          # Radio: $0-$300K
        'print': (50_000, 200_000)      # Print: $50K-$200K
    }
}

Percentage Allocations:

constraints = {
    'percentage_bounds': {
        'digital': (0.3, 0.6),    # 30-60% of total budget
        'traditional': (0.2, 0.5) # 20-50% for traditional channels
    }
}
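
To make the semantics of absolute and percentage bounds concrete, here is a small standalone checker. It is a sketch under the assumption that bounds are inclusive and that percentage bounds are measured against the total of the allocation; `check_bounds` is a hypothetical helper, not an Atlas function.

```python
def check_bounds(allocation, bounds, percentage_bounds):
    """Return a list of human-readable violations (empty if none)."""
    total = sum(allocation.values())
    violations = []
    for channel, (low, high) in bounds.items():
        spend = allocation.get(channel, 0.0)
        if not low <= spend <= high:
            violations.append(f"{channel}: {spend} outside [{low}, {high}]")
    for channel, (low, high) in percentage_bounds.items():
        share = allocation.get(channel, 0.0) / total if total else 0.0
        if not low <= share <= high:
            violations.append(
                f"{channel}: share {share:.2f} outside [{low}, {high}]"
            )
    return violations


bounds = {"digital": (100_000, 500_000), "tv": (200_000, 800_000)}
percentage_bounds = {"digital": (0.3, 0.6)}

ok = check_bounds({"digital": 400_000, "tv": 600_000}, bounds, percentage_bounds)
bad = check_bounds({"digital": 200_000, "tv": 800_000}, bounds, percentage_bounds)
print(ok)
print(bad)
```

The second allocation satisfies both absolute bounds yet fails the percentage rule, because digital is only 20% of the total.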

Business Rule Constraints

Minimum Spend Requirements:

constraints = {
    'min_spend_rules': {
        'digital': 150_000,      # Must spend at least $150K on digital
        'brand_building': 300_000 # Must allocate $300K to brand activities
    }
}

Channel Dependencies:

constraints = {
    'dependency_rules': [
        {
            'type': 'if_then',
            'condition': 'tv_budget > 500000',
            'requirement': 'digital_budget >= 200000'
        },
        {
            'type': 'mutual_exclusive',
            'channels': ['premium_tv', 'basic_tv'],
            'max_active': 1
        }
    ]
}

Geographic Constraints:

constraints = {
    'geographic_rules': {
        'north_region': {'min_budget': 200_000, 'max_budget': 600_000},
        'south_region': {'min_budget': 150_000, 'max_budget': 500_000},
        'total_regional_balance': 0.1  # Regions within 10% of each other
    }
}

Temporal Constraints

Seasonal Restrictions:

constraints = {
    'seasonal_rules': {
        'q4_boost': {
            'months': [10, 11, 12],
            'min_increase': 0.2  # 20% increase in Q4
        },
        'summer_reduction': {
            'months': [6, 7, 8],
            'max_spend_ratio': 0.8  # Cap summer spend at 80% of baseline (a 20% reduction)
        }
    }
}

Budget Smoothing:

constraints = {
    'smoothing_rules': {
        'max_month_to_month_change': 0.15,  # Max 15% change between months
        'max_quarter_variance': 0.25        # Quarters within 25% of average
    }
}

Performance Constraints

ROI Requirements:

constraints = {
    'performance_targets': {
        'min_roi': 3.0,           # Minimum 3:1 ROI
        'min_channel_roi': {
            'digital': 4.0,       # Higher requirement for digital
            'traditional': 2.5    # Lower requirement for traditional
        }
    }
}

Market Share Limits:

def market_share_constraint(budget_allocation):
    """Custom constraint function for market share limits."""
    total_market_spend = 50_000_000
    our_spend = budget_allocation.sum()
    market_share = our_spend / total_market_spend
    return 0.25 - market_share  # 'ineq' requires a result >= 0 (max 25% market share)

constraints = {
    'custom_constraints': [
        {'type': 'ineq', 'fun': market_share_constraint}
    ]
}
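
The `{'type': 'ineq', 'fun': ...}` form mirrors SciPy's constraint dictionaries, where an inequality constraint is satisfied when the function returns a non-negative value. Assuming Atlas follows that convention, the sign of the return value can be checked on two sample spends. The simplified function below takes a total spend directly so that it runs without any budget object.

```python
def market_share_constraint(total_spend, total_market_spend=50_000_000):
    """Non-negative exactly when our share of market spend is at most 25%."""
    return 0.25 - total_spend / total_market_spend


within_limit = market_share_constraint(10_000_000)  # share 0.20, about +0.05
over_limit = market_share_constraint(15_000_000)    # share 0.30, about -0.05

print(within_limit >= 0)  # True: constraint satisfied
print(over_limit >= 0)    # False: constraint violated
```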

Custom Constraints

Atlas supports custom constraint functions for complex business rules:

def cross_channel_synergy_constraint(budget):
    """Ensure minimum synergy between digital and TV."""
    digital_spend = budget['digital'].sum()
    tv_spend = budget['tv'].sum()
    
    # Require balanced investment for synergy
    ratio = digital_spend / (tv_spend + 1e-6)
    return 2.0 - ratio  # Digital spend should not exceed 2x TV spend

def brand_safety_constraint(budget):
    """Ensure brand safety requirements are met."""
    premium_channels = ['tv', 'premium_digital', 'print']
    premium_spend = sum(budget[ch].sum() for ch in premium_channels if ch in budget)
    total_spend = budget.sum()
    
    premium_ratio = premium_spend / total_spend
    return premium_ratio - 0.4  # At least 40% in premium channels

constraints = {
    'custom_constraints': [
        {'type': 'ineq', 'fun': cross_channel_synergy_constraint},
        {'type': 'ineq', 'fun': brand_safety_constraint}
    ]
}

Constraint Validation

Atlas provides tools to validate and debug constraints:

from atlas.constraints import ConstraintValidator

validator = ConstraintValidator()

# Check constraint feasibility
is_feasible, violations = validator.check_feasibility(
    constraints=constraints,
    budget_bounds=channel_bounds
)

if not is_feasible:
    print("Constraint violations found:")
    for violation in violations:
        print(f"- {violation['constraint']}: {violation['description']}")

# Test constraint with sample budget
test_budget = {'digital': 300_000, 'tv': 500_000, 'radio': 200_000}
constraint_results = validator.evaluate_constraints(
    budget=test_budget,
    constraints=constraints
)
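
A simple necessary check for feasibility is that the channel bounds can reach the required total at all: the lower bounds must not already exceed the maximum budget, and the upper bounds must be able to reach the minimum. The sketch below shows that check in isolation; `bounds_are_feasible` is a hypothetical helper and says nothing about how `ConstraintValidator` works internally.

```python
def bounds_are_feasible(bounds, min_total, max_total):
    """Check that some allocation within the bounds can hit the total range."""
    lowest = sum(low for low, _ in bounds.values())
    highest = sum(high for _, high in bounds.values())
    return lowest <= max_total and highest >= min_total


channel_bounds = {
    "digital": (100_000, 500_000),
    "tv": (200_000, 800_000),
    "radio": (0, 300_000),
    "print": (50_000, 200_000),
}

# Lower bounds sum to 350K and upper bounds to 1.8M
feasible = bounds_are_feasible(channel_bounds, 800_000, 1_000_000)
infeasible = bounds_are_feasible(channel_bounds, 200_000, 300_000)
print(feasible, infeasible)
```

Passing this check does not guarantee that every other rule can be met, but failing it means no optimizer run can succeed.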

Constraint Relaxation

When constraints are too restrictive, Atlas provides relaxation strategies:

from atlas.constraints import ConstraintRelaxation

relaxer = ConstraintRelaxation()

# Automatic constraint relaxation
relaxed_constraints = relaxer.relax_constraints(
    original_constraints=strict_constraints,
    relaxation_method='penalty',  # penalty, elastic, priority
    relaxation_factor=0.05  # Allow 5% violation
)

# Priority-based relaxation
relaxed_constraints = relaxer.priority_relaxation(
    constraints=constraints,
    priorities={
        'total_budget': 'hard',      # Never relax
        'channel_bounds': 'medium',   # Moderate relaxation allowed
        'roi_targets': 'soft'        # Can be relaxed significantly
    }
)
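
The penalty approach to relaxation can be sketched generically: a constraint stops being a hard limit and instead subtracts from the objective in proportion to how far it is violated. The function `penalized_objective`, the plans, and the penalty weights below are illustrative assumptions, not Atlas defaults.

```python
def penalized_objective(revenue, violations, penalty_weight):
    """Soft-constraint score: revenue minus a charge per unit of violation."""
    return revenue - penalty_weight * sum(max(0.0, v) for v in violations)


# Plan A respects every constraint; plan B earns more but overshoots by 5 units
plan_a = {"revenue": 100.0, "violations": [0.0]}
plan_b = {"revenue": 120.0, "violations": [5.0]}


def best_plan(penalty_weight):
    scores = {
        "a": penalized_objective(plan_a["revenue"], plan_a["violations"], penalty_weight),
        "b": penalized_objective(plan_b["revenue"], plan_b["violations"], penalty_weight),
    }
    return max(scores, key=scores.get)


lenient = best_plan(penalty_weight=2.0)   # B scores 110 against 100
strict = best_plan(penalty_weight=10.0)   # B scores 70 against 100
print(lenient, strict)
```

Raising the penalty weight moves the solution back toward strict feasibility, which is why the weight acts as the dial between hard and soft constraints.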

Integration Example

Here’s how all four concepts work together in a complete optimization workflow:

from atlas import (
    ModelFactory, OptimizerFactory, 
    ConstraintBuilder, DataLoader
)

# 1. DATA: Load and prepare multi-dimensional data
loader = DataLoader()
historical_data = loader.from_database(
    query="SELECT * FROM marketing_performance",
    dims=['date', 'channel', 'region']
)

# 2. MODEL: Create and validate model
model = ModelFactory.create(
    model_type='xgboost',
    training_data=historical_data,
    target_variable='revenue'
)

# 3. CONSTRAINTS: Define business rules
constraints = (
    ConstraintBuilder()
    .total_budget(max_budget=5_000_000)
    .channel_bounds(digital=(500_000, 2_000_000))
    .roi_threshold(min_roi=2.5)
    .build()
)

# 4. OPTIMIZATION: Find optimal allocation
optimizer = OptimizerFactory.create(
    optimizer_type='optuna',
    model=model,
    n_trials=1000
)

result = optimizer.optimize(
    initial_budget=current_allocation,
    constraints=constraints,
    objectives=['revenue', 'brand_awareness']
)

print(f"Optimal allocation: {result.optimal_budget}")
print(f"Expected outcomes: {result.predictions}")
print(f"Constraint satisfaction: {result.constraint_status}")

Together, these four concepts (data, models, optimization, and constraints) form the basis of every Atlas optimization workflow.