Configuration Guide

This guide covers all aspects of configuring the Optimizer Framework, from basic settings to advanced customization options.

Table of Contents

Configuration Overview
Configuration File Formats
Environment Variables
Model Configuration
Optimizer Configuration
Constraint Configuration
Logging Configuration
API Configuration
Performance Tuning
Best Practices

Configuration Overview

The Optimizer Framework uses a hierarchical configuration system:

Default Configuration (built-in)
Configuration Files (YAML/JSON/TOML)
Environment Variables
Command Line Arguments
Runtime Configuration

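The levels are listed from lowest to highest precedence. Each level overrides the ones listed before it, so runtime configuration wins over everything else and the built-in defaults apply only where no other level sets a value.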

Configuration Loading Order

import sys

from optimizer_framework.config import ConfigurationManager

# Load configuration with precedence
config = ConfigurationManager.load_config(
    default_config="config/defaults.yaml",
    user_config="config/production.yaml",
    env_prefix="OPTIMIZER_",
    cli_args=sys.argv[1:]
)
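
The precedence rule can be illustrated with a small recursive merge. This is a minimal sketch, not the framework's own implementation: `deep_merge` and `resolve_layers` are hypothetical helpers, and the sample layers are made up for the example.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale by the higher layer.
            merged[key] = value
    return merged


def resolve_layers(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge layers given in order from lowest to highest precedence."""
    result: Dict[str, Any] = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical layers: defaults, a config file, environment, command line.
defaults = {"app": {"debug": False, "log_level": "INFO"}, "api": {"port": 8000}}
file_config = {"app": {"log_level": "WARNING"}}
env_config = {"api": {"port": 9000}}
cli_config = {"app": {"debug": True}}

effective = resolve_layers(defaults, file_config, env_config, cli_config)
print(effective)
```

Merging nested sections key by key, rather than replacing them whole, lets a higher layer change a single setting such as `app.log_level` without restating the rest of the `app` section.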

Configuration File Formats

YAML Configuration (Recommended)

# config/optimizer.yaml
version: "1.0"

# Application settings
app:
  name: "Optimizer Framework"
  environment: "production"
  debug: false
  log_level: "INFO"

# Model configuration
models:
  default_timeout: 300
  cache_enabled: true
  cache_ttl: 3600
  registry:
    type: "local"
    path: "/app/models"
  
  # Model-specific settings
  revenue_model:
    type: "docker"
    image: "revenue-model:latest"
    port: 8001
    health_check_interval: 30
    timeout: 600
    
  awareness_model:
    type: "python"
    module: "models.awareness"
    class: "AwarenessModel"
    params:
      saturation_alpha: 2.5
      carryover_rate: 0.3

# Optimizer configuration
optimizer:
  default_type: "optuna"
  max_iterations: 1000
  convergence_tolerance: 1.0e-6  # decimal point keeps YAML 1.1 parsers (e.g. PyYAML) from reading a string
  
  scipy:
    method: "SLSQP"
    options:
      ftol: 1.0e-6
      maxiter: 1000
      disp: false
  
  optuna:
    n_trials: 2000
    n_jobs: -1
    sampler:
      type: "TPE"
      n_startup_trials: 10
      n_ei_candidates: 24
    pruner:
      type: "MedianPruner"
      n_startup_trials: 5

# Database configuration
database:
  url: "${DATABASE_URL}"
  pool_size: 20
  max_overflow: 0
  pool_pre_ping: true
  echo: false

# Cache configuration
cache:
  type: "redis"
  redis:
    host: "${REDIS_HOST:localhost}"
    port: ${REDIS_PORT:6379}
    db: 0
    password: "${REDIS_PASSWORD:}"
    socket_timeout: 5
    decode_responses: true
  
  memory:
    max_size: 1000
    ttl: 3600

# API configuration
api:
  host: "0.0.0.0"
  port: 8000
  workers: ${API_WORKERS:4}
  cors:
    enabled: true
    origins:
      - "https://app.example.com"
      - "https://localhost:3000"
    allow_credentials: true
    allow_methods: ["GET", "POST", "PUT", "DELETE"]
    allow_headers: ["*"]
  
  rate_limiting:
    enabled: true
    default_limit: "100/minute"
    endpoints:
      "/api/optimize": "10/minute"
      "/api/models": "1000/hour"
  
  authentication:
    enabled: true
    type: "jwt"
    secret_key: "${JWT_SECRET_KEY}"
    algorithm: "HS256"
    token_expiration: 3600

# Monitoring configuration
monitoring:
  metrics:
    enabled: true
    endpoint: "/metrics"
    include_histogram: true
  
  tracing:
    enabled: false
    jaeger:
      agent_host: "localhost"
      agent_port: 6831
      service_name: "optimizer-framework"
  
  health_check:
    endpoint: "/health"
    detailed_endpoint: "/health/detailed"
    checks:
      - database
      - cache
      - models

# Feature flags
features:
  async_optimization: true
  model_versioning: true
  auto_scaling: false
  experimental:
    neural_optimizer: false
    quantum_annealing: false
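
The YAML above uses `${NAME}` and `${NAME:default}` placeholders for values that come from the environment. How the framework expands them is not shown here, so the following is only a sketch of one plausible approach. `expand_placeholders` is a hypothetical helper, and the choice to raise an error for an unset variable with no default is an assumption.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} or ${NAME:default}; the default may be empty.
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::(?P<default>[^}]*))?\}"
)


def expand_placeholders(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} and ${NAME:default} in `text` with values from `env`."""
    env = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        name = match.group("name")
        default = match.group("default")
        if name in env:
            return env[name]
        if default is not None:
            # Covers the empty default in "${REDIS_PASSWORD:}" as well.
            return default
        raise KeyError(f"Environment variable {name} is not set and has no default")

    return _PLACEHOLDER.sub(substitute, text)


print(expand_placeholders("redis://${REDIS_HOST:localhost}:${REDIS_PORT:6379}", {}))
```

Expanded values are always strings, so numeric settings such as `port` or `workers` still need to be converted after substitution.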

JSON Configuration

{
  "version": "1.0",
  "app": {
    "name": "Optimizer Framework",
    "environment": "production",
    "debug": false,
    "log_level": "INFO"
  },
  "models": {
    "default_timeout": 300,
    "cache_enabled": true,
    "registry": {
      "type": "local",
      "path": "/app/models"
    }
  },
  "optimizer": {
    "default_type": "scipy",
    "scipy": {
      "method": "SLSQP",
      "options": {
        "ftol": 1e-6,
        "maxiter": 1000
      }
    }
  }
}

TOML Configuration

# config/optimizer.toml
version = "1.0"

[app]
name = "Optimizer Framework"
environment = "production"
debug = false
log_level = "INFO"

[models]
default_timeout = 300
cache_enabled = true

[models.registry]
type = "local"
path = "/app/models"

[optimizer]
default_type = "scipy"

[optimizer.scipy]
method = "SLSQP"

[optimizer.scipy.options]
ftol = 1e-6
maxiter = 1000

Environment Variables

Standard Environment Variables

# Application
export OPTIMIZER_ENV=production
export OPTIMIZER_DEBUG=false
export OPTIMIZER_LOG_LEVEL=INFO

# Database
export DATABASE_URL=postgresql://user:pass@localhost:5432/optimizer
export DATABASE_POOL_SIZE=20

# Redis Cache
export REDIS_HOST=localhost
export REDIS_PORT=6379
export REDIS_PASSWORD=secret

# API
export API_HOST=0.0.0.0
export API_PORT=8000
export API_WORKERS=4

# Security
export JWT_SECRET_KEY=your-secret-key
export CORS_ORIGINS=https://app.example.com,https://localhost:3000

# AWS (if using)
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_REGION=us-east-1

# Feature Flags
export FEATURE_ASYNC_OPTIMIZATION=true
export FEATURE_MODEL_VERSIONING=true

Loading Environment Variables

from dotenv import load_dotenv

from optimizer_framework.config import EnvConfig

# Load variables from a .env file first, so they are visible to the lookups below
load_dotenv()

# Load with prefix
env_config = EnvConfig(prefix="OPTIMIZER_")

# Access values with type conversion
debug = env_config.get_bool("DEBUG", default=False)
port = env_config.get_int("API_PORT", default=8000)
origins = env_config.get_list("CORS_ORIGINS", default=["*"])
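
Environment variables are always strings, so typed accessors have to parse them. The sketch below shows one way such accessors could behave. `PrefixedEnv` is a hypothetical stand-in, not the framework's `EnvConfig`, and the accepted boolean spellings and the comma separator are assumptions.

```python
import os
from typing import List, Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


class PrefixedEnv:
    """Hypothetical stand-in for EnvConfig, for illustration only."""

    def __init__(self, prefix: str = "", environ: Optional[Mapping[str, str]] = None):
        self.prefix = prefix
        self.environ = os.environ if environ is None else environ

    def _raw(self, name: str) -> Optional[str]:
        # Look up the variable under its prefixed name, e.g. OPTIMIZER_DEBUG.
        return self.environ.get(self.prefix + name)

    def get_bool(self, name: str, default: bool = False) -> bool:
        raw = self._raw(name)
        if raw is None:
            return default
        value = raw.strip().lower()
        if value in _TRUE:
            return True
        if value in _FALSE:
            return False
        raise ValueError(f"{self.prefix}{name}: cannot interpret {raw!r} as a boolean")

    def get_int(self, name: str, default: int = 0) -> int:
        raw = self._raw(name)
        return default if raw is None else int(raw)

    def get_list(self, name: str, default: Optional[List[str]] = None, sep: str = ",") -> List[str]:
        raw = self._raw(name)
        if raw is None:
            return list(default or [])
        # Split on the separator and drop empty items and surrounding spaces.
        return [item.strip() for item in raw.split(sep) if item.strip()]
```

Rejecting unrecognized boolean values, instead of treating them as false, surfaces typos such as `OPTIMIZER_DEBUG=ture` at startup.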

Model Configuration

Model Registry Configuration

# models/registry.yaml
models:
  - name: "revenue_model"
    version: "2.0.0"
    type: "sklearn"
    path: "models/revenue_model_v2.pkl"
    features:
      - tv_spend
      - digital_spend
      - radio_spend
    metadata:
      trained_date: "2024-01-15"
      performance:
        rmse: 0.05
        mape: 0.03
    
  - name: "awareness_model"
    version: "1.5.0"
    type: "tensorflow"
    path: "models/awareness_model"
    serving_config:
      batch_size: 32
      input_shape: [null, 10]  # YAML null; a bare None would be read as the string "None"
    
  - name: "attribution_model"
    version: "3.1.0"
    type: "api"
    endpoint: "https://api.attribution.com/v1/predict"
    auth:
      type: "bearer"
      token: "${ATTRIBUTION_API_TOKEN}"
    timeout: 30
    retry:
      max_attempts: 3
      backoff: "exponential"
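
The `retry` block for the API-backed model names a maximum number of attempts and an exponential backoff. As a rough sketch of what that policy could mean in practice, the hypothetical `call_with_retry` helper below doubles the wait after each failure. The base delay of 0.5 seconds is an assumption and does not appear in the registry file.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on any exception with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so that tests can record the delays instead of actually waiting.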

Model-Specific Configuration

# config/models/revenue_model.py
MODEL_CONFIG = {
    "name": "Revenue Model",
    "version": "2.0.0",
    "type": "regression",
    
    # Model parameters
    "parameters": {
        "saturation": {
            "tv": {"alpha": 2.5, "gamma": 0.8},
            "digital": {"alpha": 2.0, "gamma": 0.9},
            "radio": {"alpha": 2.3, "gamma": 0.7}
        },
        "carryover": {
            "tv": 0.3,
            "digital": 0.1,
            "radio": 0.2
        },
        "base_sales": 500000,
        "trend": 0.02,
        "seasonality": {
            "amplitude": 50000,
            "frequency": 52  # Weekly
        }
    },
    
    # Data requirements
    "input_schema": {
        "required_features": ["tv", "digital", "radio"],
        "optional_features": ["competitor_spend", "macro_index"],
        "time_granularity": "weekly",
        "min_history": 52  # weeks
    },
    
    # Performance settings
    "performance": {
        "cache_predictions": True,
        "batch_size": 1000,
        "use_gpu": False,
        "parallel_workers": 4
    }
}
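
The `input_schema` section lends itself to a check before a model is run. The sketch below assumes input data arrives as a mapping from feature name to a sequence of weekly values. `validate_inputs` is a hypothetical helper, not part of the framework.

```python
from typing import Dict, List, Sequence

# Subset of the input_schema shown above.
INPUT_SCHEMA = {
    "required_features": ["tv", "digital", "radio"],
    "optional_features": ["competitor_spend", "macro_index"],
    "min_history": 52,  # weeks
}


def validate_inputs(data: Dict[str, Sequence[float]], schema: dict) -> List[str]:
    """Return a list of problems; an empty list means the data fits the schema."""
    problems: List[str] = []
    required = schema["required_features"]
    allowed = set(required) | set(schema.get("optional_features", []))
    min_history = schema["min_history"]

    for feature in required:
        if feature not in data:
            problems.append(f"missing required feature: {feature}")
        elif len(data[feature]) < min_history:
            problems.append(
                f"{feature}: {len(data[feature])} periods, need at least {min_history}"
            )
    for feature in data:
        if feature not in allowed:
            problems.append(f"unexpected feature: {feature}")
    return problems
```

Returning every problem at once, rather than raising on the first, gives the caller a complete picture of what needs fixing in the input.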

Optimizer Configuration

Optimizer Profiles

# config/optimizer_profiles.yaml
profiles:
  fast:
    type: "scipy"
    method: "L-BFGS-B"
    options:
      ftol: 1.0e-4
      maxiter: 100
      maxfun: 150
    
  accurate:
    type: "scipy"
    method: "trust-constr"
    options:
      xtol: 1.0e-8
      gtol: 1.0e-8
      maxiter: 1000
      initial_trust_radius: 1.0
    
  exploration:
    type: "optuna"
    n_trials: 5000
    sampler:
      type: "TPE"
      multivariate: true
      constant_liar: true
    pruner:
      type: "HyperbandPruner"
      min_resource: 10
      reduction_factor: 3
    
  production:
    type: "hybrid"
    stages:
      - profile: "exploration"
        n_trials: 100
      - profile: "accurate"
        warm_start: true

Dynamic Optimizer Selection

# config/optimizer_selector.py
def select_optimizer_config(problem_characteristics):
    """Select optimizer configuration based on problem."""
    
    n_variables = problem_characteristics['n_variables']
    constraint_type = problem_characteristics['constraint_type']
    objective_smoothness = problem_characteristics['smoothness']
    
    if n_variables < 10 and objective_smoothness == 'smooth':
        return load_profile('fast')
    elif constraint_type == 'nonlinear':
        return load_profile('accurate')
    elif n_variables > 100:
        return load_profile('exploration')
    else:
        return load_profile('production')
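
The selector relies on a `load_profile` function that is not defined in this guide. The sketch below is one possible shape for it, under the assumption that profiles are already parsed into a dictionary and that a hybrid profile's stages refer to other profiles by name. The abbreviated `PROFILES` data and the stage override behavior are illustrative, not the framework's documented semantics.

```python
from copy import deepcopy
from typing import Any, Dict

# Abbreviated, hypothetical copy of config/optimizer_profiles.yaml after parsing.
PROFILES: Dict[str, Dict[str, Any]] = {
    "fast": {"type": "scipy", "method": "L-BFGS-B", "options": {"maxiter": 100}},
    "accurate": {"type": "scipy", "method": "trust-constr", "options": {"maxiter": 1000}},
    "exploration": {"type": "optuna", "n_trials": 5000},
    "production": {
        "type": "hybrid",
        "stages": [
            {"profile": "exploration", "n_trials": 100},
            {"profile": "accurate", "warm_start": True},
        ],
    },
}


def load_profile(name: str, profiles: Dict[str, Dict[str, Any]] = PROFILES) -> Dict[str, Any]:
    """Return a copy of the named profile with hybrid stages resolved."""
    if name not in profiles:
        raise KeyError(f"Unknown optimizer profile: {name}")
    profile = deepcopy(profiles[name])
    if profile.get("type") == "hybrid":
        resolved = []
        for stage in profile["stages"]:
            overrides = dict(stage)
            # Start from the referenced profile, then apply stage-level keys on top.
            base = load_profile(overrides.pop("profile"), profiles)
            base.update(overrides)
            resolved.append(base)
        profile["stages"] = resolved
    return profile


print(load_profile("production")["stages"][0])
```

With this resolution, the first stage of `production` runs the `exploration` settings with only 100 trials, while the stored `exploration` profile keeps its own value.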

Constraint Configuration

Constraint Templates

# config/constraints.yaml
constraint_templates:
  standard_marketing:
    budget_constraints:
      total:
        min: 100000
        max: 10000000
      
      channel_bounds:
        tv:
          min: 0
          max: 5000000
        digital:
          min: 10000
          max: 8000000
        radio:
          min: 0
          max: 1000000
    
    business_rules:
      - name: "digital_minimum"
        type: "percentage"
        channel: "digital"
        min_percentage: 0.3
        
      - name: "traditional_maximum"
        type: "percentage"
        channels: ["tv", "radio", "print"]
        max_percentage: 0.5
        
      - name: "channel_ratio"
        type: "ratio"
        numerator: "tv"
        denominator: "digital"
        min_ratio: 0.5
        max_ratio: 2.0
    
    custom_constraints:
      - name: "market_share"
        function: "constraints.market_share_limit"
        params:
          max_share: 0.3
          market_size: 10000000
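
To make the `percentage` and `ratio` rule types concrete, the following sketch checks a budget allocation against them. It assumes that a percentage rule compares a channel's (or channel group's) share of the total budget with the configured bound, which is a reading of the template rather than documented behavior. `check_rules` is a hypothetical helper.

```python
from typing import Dict, List


def check_rules(allocation: Dict[str, float], rules: List[dict]) -> List[str]:
    """Return the names of the business rules that `allocation` violates."""
    total = sum(allocation.values())
    if total <= 0:
        raise ValueError("allocation must have a positive total")

    violations: List[str] = []
    for rule in rules:
        if rule["type"] == "percentage":
            # A rule names either one channel or a group of channels.
            channels = rule.get("channels") or [rule["channel"]]
            share = sum(allocation.get(c, 0.0) for c in channels) / total
            too_low = "min_percentage" in rule and share < rule["min_percentage"]
            too_high = "max_percentage" in rule and share > rule["max_percentage"]
            if too_low or too_high:
                violations.append(rule["name"])
        elif rule["type"] == "ratio":
            denominator = allocation.get(rule["denominator"], 0.0)
            numerator = allocation.get(rule["numerator"], 0.0)
            # A zero denominator cannot satisfy a bounded ratio.
            if denominator == 0 or not (
                rule["min_ratio"] <= numerator / denominator <= rule["max_ratio"]
            ):
                violations.append(rule["name"])
    return violations


# The business_rules from the standard_marketing template above.
RULES = [
    {"name": "digital_minimum", "type": "percentage", "channel": "digital", "min_percentage": 0.3},
    {"name": "traditional_maximum", "type": "percentage",
     "channels": ["tv", "radio", "print"], "max_percentage": 0.5},
    {"name": "channel_ratio", "type": "ratio", "numerator": "tv",
     "denominator": "digital", "min_ratio": 0.5, "max_ratio": 2.0},
]

print(check_rules({"tv": 700000, "digital": 200000, "radio": 100000}, RULES))
```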

Dynamic Constraints

# config/dynamic_constraints.py
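import copy
from datetime import datetime

def get_seasonal_constraints(base_constraints, current_date):
    """Adjust constraints based on seasonality."""
    # Deep copy so that edits to nested dicts and lists do not modify base_constraints
    constraints = copy.deepcopy(base_constraints)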
    
    # Q4 holiday season
    if current_date.month in [10, 11, 12]:
        # Increase budgets
        constraints['budget_constraints']['total']['max'] *= 1.5
        constraints['business_rules'].append({
            'name': 'holiday_tv_minimum',
            'type': 'percentage',
            'channel': 'tv',
            'min_percentage': 0.25
        })
    
    # Summer season
    elif current_date.month in [6, 7, 8]:
        # Focus on digital
        constraints['business_rules'].append({
            'name': 'summer_digital_focus',
            'type': 'percentage',
            'channel': 'digital',
            'min_percentage': 0.5
        })
    
    return constraints

Logging Configuration

Structured Logging

# config/logging.yaml
version: 1
disable_existing_loggers: false

formatters:
  standard:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
  
  json:
    class: pythonjsonlogger.jsonlogger.JsonFormatter
    format: '%(asctime)s %(name)s %(levelname)s %(message)s'
  
  detailed:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(funcName)s:%(lineno)d - %(message)s'

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: standard
    stream: ext://sys.stdout
  
  file:
    class: logging.handlers.RotatingFileHandler
    level: DEBUG
    formatter: detailed
    filename: logs/optimizer.log
    maxBytes: 10485760  # 10MB
    backupCount: 5
  
  error_file:
    class: logging.handlers.RotatingFileHandler
    level: ERROR
    formatter: json
    filename: logs/errors.log
    maxBytes: 10485760
    backupCount: 5

loggers:
  optimizer_framework:
    level: INFO
    handlers: [console, file]
    propagate: false
  
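  # Child loggers set propagate: false so that records are not written a
  # second time by the handlers on the parent optimizer_framework logger.
  optimizer_framework.models:
    level: DEBUG
    handlers: [file]
    propagate: false
  
  optimizer_framework.api:
    level: INFO
    handlers: [console, file]
    propagate: false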

root:
  level: INFO
  handlers: [console]
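
The layout above follows the dictionary schema of Python's standard `logging.config` module, so once the YAML is parsed into a dict it can be applied with `logging.config.dictConfig`. The sketch below uses a reduced, inline dict with only the console handler so that it runs without a YAML parser or a `logs/` directory. The reduction is for illustration only.

```python
import logging
import logging.config

# Reduced version of config/logging.yaml, written as the dict a YAML parser
# would produce.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {
            "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "standard",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        "optimizer_framework": {
            "level": "INFO",
            "handlers": ["console"],
            "propagate": False,
        },
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

logging.config.dictConfig(LOGGING_CONFIG)

# Child loggers inherit level and handlers from optimizer_framework.
logger = logging.getLogger("optimizer_framework.models")
logger.info("Logging configured")
```

When the full file is used, the `logs/` directory has to exist before `dictConfig` is called, because `RotatingFileHandler` opens its file on creation.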

Log Aggregation

# config/log_aggregation.py
import logging
from logging.handlers import SysLogHandler

def setup_centralized_logging(app_config):
    """Configure centralized logging."""
    
    if app_config.get('logging.centralized.enabled'):
        # Syslog handler
        syslog = SysLogHandler(
            address=(
                app_config['logging.centralized.host'],
                app_config['logging.centralized.port']
            )
        )
        syslog.setFormatter(
            logging.Formatter('optimizer-framework: %(message)s')
        )
        
        # Add to root logger
        logging.getLogger().addHandler(syslog)
        
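    # Elasticsearch handler
    if app_config.get('logging.elasticsearch.enabled'):
        from elasticsearch import Elasticsearch
        from pythonjsonlogger import jsonlogger
        
        # ElasticsearchHandler is not imported in this snippet; it is assumed
        # to be a logging.Handler subclass provided by the application or a
        # third-party package.
        es_handler = ElasticsearchHandler(
            hosts=[app_config['logging.elasticsearch.host']],
            index_name='optimizer-logs'
        )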
        es_handler.setFormatter(jsonlogger.JsonFormatter())
        
        logging.getLogger('optimizer_framework').addHandler(es_handler)

API Configuration

Rate Limiting Configuration

# config/rate_limiting.py
from slowapi import Limiter
from slowapi.util import get_remote_address

RATE_LIMIT_CONFIG = {
    "default": "100/minute",
    "endpoints": {
        "/api/optimize": {
            "limit": "10/minute",
            "key_func": get_remote_address,
            "error_message": "Optimization rate limit exceeded"
        },
        "/api/models/predict": {
            "limit": "1000/hour",
            "key_func": lambda req: f"{get_remote_address(req)}:{req.headers.get('X-API-Key')}"
        },
        "/api/admin/*": {
            "limit": "30/minute",
            "key_func": lambda req: req.headers.get('X-API-Key', 'anonymous')
        }
    },
    "storage": {
        "type": "redis",
        "redis_url": "redis://localhost:6379/1"
    }
}
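
The limit strings and wildcard endpoint patterns above need to be interpreted somewhere. The sketch below shows one way to do that with the standard library only. `parse_limit` and `limit_for_path` are hypothetical helpers and are not part of slowapi, which has its own parsing.

```python
from fnmatch import fnmatchcase
from typing import Dict, Tuple

_PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_limit(limit: str) -> Tuple[int, int]:
    """Split a string such as "10/minute" into (max_requests, window_seconds)."""
    count, _, period = limit.partition("/")
    return int(count), _PERIOD_SECONDS[period.strip().lower()]


def limit_for_path(path: str, endpoints: Dict[str, str], default: str) -> str:
    """Return the limit for `path`, preferring an exact match over a wildcard."""
    if path in endpoints:
        return endpoints[path]
    for pattern, limit in endpoints.items():
        # fnmatch-style "*" also matches "/", so "/api/admin/*" covers
        # nested paths such as "/api/admin/users/42".
        if fnmatchcase(path, pattern):
            return limit
    return default


# Limits taken from RATE_LIMIT_CONFIG above, reduced to plain strings.
ENDPOINT_LIMITS = {
    "/api/optimize": "10/minute",
    "/api/models/predict": "1000/hour",
    "/api/admin/*": "30/minute",
}

print(parse_limit(limit_for_path("/api/admin/users", ENDPOINT_LIMITS, "100/minute")))
```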

CORS Configuration

# config/cors.py
from fastapi.middleware.cors import CORSMiddleware

CORS_CONFIG = {
    "allow_origins": [
        "https://app.example.com",
        "https://staging.example.com",
        "http://localhost:3000"  # Development
    ],
    "allow_credentials": True,
    "allow_methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
    "allow_headers": [
        "Accept",
        "Accept-Language",
        "Content-Type",
        "Authorization",
        "X-API-Key",
        "X-Request-ID"
    ],
    "expose_headers": [
        "X-Rate-Limit-Limit",
        "X-Rate-Limit-Remaining",
        "X-Rate-Limit-Reset"
    ],
    "max_age": 86400  # 24 hours
}

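def configure_cors(app, config):
    """Add CORS middleware when `api.cors.enabled` is set in `config`."""
    # FastAPI applications have no built-in `config` attribute, so the
    # loaded configuration is passed in explicitly.
    if config.get('api.cors.enabled'):
        app.add_middleware(
            CORSMiddleware,
            **CORS_CONFIG
        )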

Performance Tuning

Cache Configuration

# config/cache.yaml
cache:
  # Model prediction cache
  predictions:
    enabled: true
    backend: "redis"
    ttl: 3600  # 1 hour
    max_size: 10000
    eviction_policy: "lru"
    key_prefix: "pred:"
    
  # Optimization results cache
  optimizations:
    enabled: true
    backend: "redis"
    ttl: 86400  # 24 hours
    max_size: 1000
    key_prefix: "opt:"
    
  # API response cache
  api_responses:
    enabled: true
    backend: "memory"
    ttl: 300  # 5 minutes
    max_size: 5000
    
  # Configuration
  redis:
    connection_pool:
      max_connections: 50
      max_idle_time: 300
      retry_on_timeout: true

Worker Configuration

# config/workers.yaml
workers:
  # API workers
  api:
    count: ${API_WORKERS:4}
    worker_class: "uvicorn.workers.UvicornWorker"
    timeout: 300
    graceful_timeout: 30
    max_requests: 1000
    max_requests_jitter: 50
    
  # Background workers
  background:
    count: 2
    queues:
      - optimization
      - model_training
      - reporting
    concurrency: 10
    
  # Optimization workers
  optimization:
    pool_size: ${CPU_COUNT:4}
    process_based: true
    memory_limit: "2GB"
    timeout: 3600

Best Practices

Configuration Management

Use Environment Variables for Secrets

import os

# Good
database_url = os.environ['DATABASE_URL']

# Bad
database_url = "postgresql://user:password@localhost/db"

Validate Configuration on Startup

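# Pydantic v1 API. In Pydantic v2, BaseSettings lives in the separate
# pydantic-settings package and @validator is replaced by @field_validator.
from pydantic import BaseSettings, validator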

class AppConfig(BaseSettings):
    database_url: str
    redis_url: str
    jwt_secret: str
    
    @validator('database_url')
    def validate_db_url(cls, v):
        if not v.startswith(('postgresql://', 'mysql://')):
            raise ValueError('Invalid database URL')
        return v
    
    class Config:
        env_prefix = 'OPTIMIZER_'

Use Configuration Profiles

# config/profiles.py
PROFILES = {
    'development': {
        'debug': True,
        'log_level': 'DEBUG',
        'cache_enabled': False
    },
    'staging': {
        'debug': False,
        'log_level': 'INFO',
        'cache_enabled': True
    },
    'production': {
        'debug': False,
        'log_level': 'WARNING',
        'cache_enabled': True
    }
}
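
A profile still has to be chosen at startup. One common approach is to key the choice on the `OPTIMIZER_ENV` variable shown earlier. The sketch below assumes that approach, along with a fallback to `development` when the variable is unset. `active_profile` is a hypothetical helper.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Same profiles as in config/profiles.py above.
PROFILES: Dict[str, Dict[str, Any]] = {
    "development": {"debug": True, "log_level": "DEBUG", "cache_enabled": False},
    "staging": {"debug": False, "log_level": "INFO", "cache_enabled": True},
    "production": {"debug": False, "log_level": "WARNING", "cache_enabled": True},
}


def active_profile(environ: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Return a copy of the profile named by OPTIMIZER_ENV."""
    environ = os.environ if environ is None else environ
    # Assumption: fall back to the development profile when nothing is set.
    name = environ.get("OPTIMIZER_ENV", "development")
    if name not in PROFILES:
        raise ValueError(f"Unknown profile {name!r}; expected one of {sorted(PROFILES)}")
    return dict(PROFILES[name])


print(active_profile({"OPTIMIZER_ENV": "production"}))
```

Failing on an unknown profile name, rather than silently using a default, avoids running production with development settings because of a typo.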

Implement Configuration Hot-Reload

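import logging

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

from optimizer_framework.config import ConfigurationManager

logger = logging.getLogger(__name__)

class ConfigReloader(FileSystemEventHandler):
    def on_modified(self, event):
        # Skip directory events; reload only when a YAML file changes
        if not event.is_directory and event.src_path.endswith('.yaml'):
            logger.info(f"Reloading configuration: {event.src_path}")
            ConfigurationManager.reload()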

Document All Configuration Options

# Each configuration option should have:
setting_name:
  value: "default_value"
  description: "What this setting controls"
  type: "string"
  required: false
  env_var: "OPTIMIZER_SETTING_NAME"
  example: "example_value"

Security Considerations

Never commit secrets to version control
Use secret management services in production
Rotate secrets regularly
Implement least privilege access
Audit configuration changes

Configuration Testing

# tests/test_config.py
import pytest
from optimizer_framework.config import ConfigurationManager

def test_load_config():
    """Test configuration loading."""
    config = ConfigurationManager.load_config('tests/fixtures/test_config.yaml')
    assert config['app']['name'] == 'Test Optimizer'
    assert config['database']['pool_size'] == 10

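def test_env_override(monkeypatch):
    """Test environment variable override."""
    # monkeypatch restores the environment after the test, so the override
    # does not leak into other tests
    monkeypatch.setenv('OPTIMIZER_DATABASE_POOL_SIZE', '20')
    config = ConfigurationManager.load_config()
    assert config['database']['pool_size'] == 20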

def test_config_validation():
    """Test configuration validation."""
    with pytest.raises(ValueError):
        ConfigurationManager.load_config('tests/fixtures/invalid_config.yaml')

Remember: good configuration management keeps a system flexible, secure, and maintainable.