Configuration Guide

This guide covers all aspects of configuring the Optimizer Framework, from basic settings to advanced customization options.

Table of Contents

  1. Configuration Overview

  2. Configuration File Formats

  3. Environment Variables

  4. Model Configuration

  5. Optimizer Configuration

  6. Constraint Configuration

  7. Logging Configuration

  8. API Configuration

  9. Performance Tuning

  10. Best Practices

Configuration Overview

The Optimizer Framework uses a hierarchical configuration system:

1. Default Configuration (built-in)
2. Configuration Files (YAML/JSON/TOML)
3. Environment Variables
4. Command Line Arguments
5. Runtime Configuration

Each level overrides the previous, allowing flexible configuration management.

Configuration Loading Order

from optimizer_framework.config import ConfigurationManager

# Load configuration with precedence
config = ConfigurationManager.load_config(
    default_config="config/defaults.yaml",
    user_config="config/production.yaml",
    env_prefix="OPTIMIZER_",
    cli_args=sys.argv[1:]
)

Configuration File Formats

JSON Configuration

{
  "version": "1.0",
  "app": {
    "name": "Optimizer Framework",
    "environment": "production",
    "debug": false,
    "log_level": "INFO"
  },
  "models": {
    "default_timeout": 300,
    "cache_enabled": true,
    "registry": {
      "type": "local",
      "path": "/app/models"
    }
  },
  "optimizer": {
    "default_type": "scipy",
    "scipy": {
      "method": "SLSQP",
      "options": {
        "ftol": 1e-6,
        "maxiter": 1000
      }
    }
  }
}

TOML Configuration

# config/optimizer.toml
version = "1.0"

[app]
name = "Optimizer Framework"
environment = "production"
debug = false
log_level = "INFO"

[models]
default_timeout = 300
cache_enabled = true

[models.registry]
type = "local"
path = "/app/models"

[optimizer]
default_type = "scipy"

[optimizer.scipy]
method = "SLSQP"

[optimizer.scipy.options]
ftol = 1e-6
maxiter = 1000

Environment Variables

Standard Environment Variables

# Application
export OPTIMIZER_ENV=production
export OPTIMIZER_DEBUG=false
export OPTIMIZER_LOG_LEVEL=INFO

# Database
export DATABASE_URL=postgresql://user:pass@localhost:5432/optimizer
export DATABASE_POOL_SIZE=20

# Redis Cache
export REDIS_HOST=localhost
export REDIS_PORT=6379
export REDIS_PASSWORD=secret

# API
export API_HOST=0.0.0.0
export API_PORT=8000
export API_WORKERS=4

# Security
export JWT_SECRET_KEY=your-secret-key
export CORS_ORIGINS=https://app.example.com,https://localhost:3000

# AWS (if using)
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_REGION=us-east-1

# Feature Flags
export FEATURE_ASYNC_OPTIMIZATION=true
export FEATURE_MODEL_VERSIONING=true

Loading Environment Variables

from optimizer_framework.config import EnvConfig

# Load with prefix
env_config = EnvConfig(prefix="OPTIMIZER_")

# Access values with type conversion
debug = env_config.get_bool("DEBUG", default=False)
port = env_config.get_int("API_PORT", default=8000)
origins = env_config.get_list("CORS_ORIGINS", default=["*"])

# Load from .env file
from dotenv import load_dotenv
load_dotenv()

Model Configuration

Model Registry Configuration

# models/registry.yaml
models:
  - name: "revenue_model"
    version: "2.0.0"
    type: "sklearn"
    path: "models/revenue_model_v2.pkl"
    features:
      - tv_spend
      - digital_spend
      - radio_spend
    metadata:
      trained_date: "2024-01-15"
      performance:
        rmse: 0.05
        mape: 0.03
    
  - name: "awareness_model"
    version: "1.5.0"
    type: "tensorflow"
    path: "models/awareness_model"
    serving_config:
      batch_size: 32
      input_shape: [None, 10]
    
  - name: "attribution_model"
    version: "3.1.0"
    type: "api"
    endpoint: "https://api.attribution.com/v1/predict"
    auth:
      type: "bearer"
      token: "${ATTRIBUTION_API_TOKEN}"
    timeout: 30
    retry:
      max_attempts: 3
      backoff: "exponential"

Model-Specific Configuration

# config/models/revenue_model.py
MODEL_CONFIG = {
    "name": "Revenue Model",
    "version": "2.0.0",
    "type": "regression",
    
    # Model parameters
    "parameters": {
        "saturation": {
            "tv": {"alpha": 2.5, "gamma": 0.8},
            "digital": {"alpha": 2.0, "gamma": 0.9},
            "radio": {"alpha": 2.3, "gamma": 0.7}
        },
        "carryover": {
            "tv": 0.3,
            "digital": 0.1,
            "radio": 0.2
        },
        "base_sales": 500000,
        "trend": 0.02,
        "seasonality": {
            "amplitude": 50000,
            "frequency": 52  # Weekly
        }
    },
    
    # Data requirements
    "input_schema": {
        "required_features": ["tv", "digital", "radio"],
        "optional_features": ["competitor_spend", "macro_index"],
        "time_granularity": "weekly",
        "min_history": 52  # weeks
    },
    
    # Performance settings
    "performance": {
        "cache_predictions": True,
        "batch_size": 1000,
        "use_gpu": False,
        "parallel_workers": 4
    }
}

Optimizer Configuration

Optimizer Profiles

# config/optimizer_profiles.yaml
profiles:
  fast:
    type: "scipy"
    method: "L-BFGS-B"
    options:
      ftol: 1e-4
      maxiter: 100
      maxfun: 150
    
  accurate:
    type: "scipy"
    method: "trust-constr"
    options:
      xtol: 1e-8
      gtol: 1e-8
      maxiter: 1000
      initial_trust_radius: 1.0
    
  exploration:
    type: "optuna"
    n_trials: 5000
    sampler:
      type: "TPE"
      multivariate: true
      constant_liar: true
    pruner:
      type: "HyperbandPruner"
      min_resource: 10
      reduction_factor: 3
    
  production:
    type: "hybrid"
    stages:
      - profile: "exploration"
        n_trials: 100
      - profile: "accurate"
        warm_start: true

Dynamic Optimizer Selection

# config/optimizer_selector.py
def select_optimizer_config(problem_characteristics):
    """Select optimizer configuration based on problem."""
    
    n_variables = problem_characteristics['n_variables']
    constraint_type = problem_characteristics['constraint_type']
    objective_smoothness = problem_characteristics['smoothness']
    
    if n_variables < 10 and objective_smoothness == 'smooth':
        return load_profile('fast')
    elif constraint_type == 'nonlinear':
        return load_profile('accurate')
    elif n_variables > 100:
        return load_profile('exploration')
    else:
        return load_profile('production')

Constraint Configuration

Constraint Templates

# config/constraints.yaml
constraint_templates:
  standard_marketing:
    budget_constraints:
      total:
        min: 100000
        max: 10000000
      
      channel_bounds:
        tv:
          min: 0
          max: 5000000
        digital:
          min: 10000
          max: 8000000
        radio:
          min: 0
          max: 1000000
    
    business_rules:
      - name: "digital_minimum"
        type: "percentage"
        channel: "digital"
        min_percentage: 0.3
        
      - name: "traditional_maximum"
        type: "percentage"
        channels: ["tv", "radio", "print"]
        max_percentage: 0.5
        
      - name: "channel_ratio"
        type: "ratio"
        numerator: "tv"
        denominator: "digital"
        min_ratio: 0.5
        max_ratio: 2.0
    
    custom_constraints:
      - name: "market_share"
        function: "constraints.market_share_limit"
        params:
          max_share: 0.3
          market_size: 10000000

Dynamic Constraints

# config/dynamic_constraints.py
from datetime import datetime

def get_seasonal_constraints(base_constraints, current_date):
    """Adjust constraints based on seasonality."""
    constraints = base_constraints.copy()
    
    # Q4 holiday season
    if current_date.month in [10, 11, 12]:
        # Increase budgets
        constraints['budget_constraints']['total']['max'] *= 1.5
        constraints['business_rules'].append({
            'name': 'holiday_tv_minimum',
            'type': 'percentage',
            'channel': 'tv',
            'min_percentage': 0.25
        })
    
    # Summer season
    elif current_date.month in [6, 7, 8]:
        # Focus on digital
        constraints['business_rules'].append({
            'name': 'summer_digital_focus',
            'type': 'percentage',
            'channel': 'digital',
            'min_percentage': 0.5
        })
    
    return constraints

Logging Configuration

Structured Logging

# config/logging.yaml
version: 1
disable_existing_loggers: false

formatters:
  standard:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
  
  json:
    class: pythonjsonlogger.jsonlogger.JsonFormatter
    format: '%(asctime)s %(name)s %(levelname)s %(message)s'
  
  detailed:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(funcName)s:%(lineno)d - %(message)s'

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: standard
    stream: ext://sys.stdout
  
  file:
    class: logging.handlers.RotatingFileHandler
    level: DEBUG
    formatter: detailed
    filename: logs/optimizer.log
    maxBytes: 10485760  # 10MB
    backupCount: 5
  
  error_file:
    class: logging.handlers.RotatingFileHandler
    level: ERROR
    formatter: json
    filename: logs/errors.log
    maxBytes: 10485760
    backupCount: 5

loggers:
  optimizer_framework:
    level: INFO
    handlers: [console, file]
    propagate: false
  
  optimizer_framework.models:
    level: DEBUG
    handlers: [file]
  
  optimizer_framework.api:
    level: INFO
    handlers: [console, file]

root:
  level: INFO
  handlers: [console]

Log Aggregation

# config/log_aggregation.py
import logging
from logging.handlers import SysLogHandler

def setup_centralized_logging(app_config):
    """Configure centralized logging."""
    
    if app_config.get('logging.centralized.enabled'):
        # Syslog handler
        syslog = SysLogHandler(
            address=(
                app_config['logging.centralized.host'],
                app_config['logging.centralized.port']
            )
        )
        syslog.setFormatter(
            logging.Formatter('optimizer-framework: %(message)s')
        )
        
        # Add to root logger
        logging.getLogger().addHandler(syslog)
        
    # Elasticsearch handler
    if app_config.get('logging.elasticsearch.enabled'):
        from elasticsearch import Elasticsearch
        from pythonjsonlogger import jsonlogger
        
        es_handler = ElasticsearchHandler(
            hosts=[app_config['logging.elasticsearch.host']],
            index_name='optimizer-logs'
        )
        es_handler.setFormatter(jsonlogger.JsonFormatter())
        
        logging.getLogger('optimizer_framework').addHandler(es_handler)

API Configuration

Rate Limiting Configuration

# config/rate_limiting.py
from slowapi import Limiter
from slowapi.util import get_remote_address

RATE_LIMIT_CONFIG = {
    "default": "100/minute",
    "endpoints": {
        "/api/optimize": {
            "limit": "10/minute",
            "key_func": get_remote_address,
            "error_message": "Optimization rate limit exceeded"
        },
        "/api/models/predict": {
            "limit": "1000/hour",
            "key_func": lambda req: f"{get_remote_address(req)}:{req.headers.get('X-API-Key')}"
        },
        "/api/admin/*": {
            "limit": "30/minute",
            "key_func": lambda req: req.headers.get('X-API-Key', 'anonymous')
        }
    },
    "storage": {
        "type": "redis",
        "redis_url": "redis://localhost:6379/1"
    }
}

CORS Configuration

# config/cors.py
from fastapi.middleware.cors import CORSMiddleware

CORS_CONFIG = {
    "allow_origins": [
        "https://app.example.com",
        "https://staging.example.com",
        "http://localhost:3000"  # Development
    ],
    "allow_credentials": True,
    "allow_methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
    "allow_headers": [
        "Accept",
        "Accept-Language",
        "Content-Type",
        "Authorization",
        "X-API-Key",
        "X-Request-ID"
    ],
    "expose_headers": [
        "X-Rate-Limit-Limit",
        "X-Rate-Limit-Remaining",
        "X-Rate-Limit-Reset"
    ],
    "max_age": 86400  # 24 hours
}

def configure_cors(app):
    """Configure CORS for the application."""
    if app.config.get('api.cors.enabled'):
        app.add_middleware(
            CORSMiddleware,
            **CORS_CONFIG
        )

Performance Tuning

Cache Configuration

# config/cache.yaml
cache:
  # Model prediction cache
  predictions:
    enabled: true
    backend: "redis"
    ttl: 3600  # 1 hour
    max_size: 10000
    eviction_policy: "lru"
    key_prefix: "pred:"
    
  # Optimization results cache
  optimizations:
    enabled: true
    backend: "redis"
    ttl: 86400  # 24 hours
    max_size: 1000
    key_prefix: "opt:"
    
  # API response cache
  api_responses:
    enabled: true
    backend: "memory"
    ttl: 300  # 5 minutes
    max_size: 5000
    
  # Configuration
  redis:
    connection_pool:
      max_connections: 50
      max_idle_time: 300
      retry_on_timeout: true

Worker Configuration

# config/workers.yaml
workers:
  # API workers
  api:
    count: ${WORKERS:4}
    worker_class: "uvicorn.workers.UvicornWorker"
    timeout: 300
    graceful_timeout: 30
    max_requests: 1000
    max_requests_jitter: 50
    
  # Background workers
  background:
    count: 2
    queues:
      - optimization
      - model_training
      - reporting
    concurrency: 10
    
  # Optimization workers
  optimization:
    pool_size: ${CPU_COUNT:4}
    process_based: true
    memory_limit: "2GB"
    timeout: 3600

Best Practices

Configuration Management

  1. Use Environment Variables for Secrets

    # Good
    database_url = os.environ['DATABASE_URL']
    
    # Bad
    database_url = "postgresql://user:password@localhost/db"
    
  2. Validate Configuration on Startup

    from pydantic import BaseSettings, validator
    
    class AppConfig(BaseSettings):
        database_url: str
        redis_url: str
        jwt_secret: str
        
        @validator('database_url')
        def validate_db_url(cls, v):
            if not v.startswith(('postgresql://', 'mysql://')):
                raise ValueError('Invalid database URL')
            return v
        
        class Config:
            env_prefix = 'OPTIMIZER_'
    
  3. Use Configuration Profiles

    # config/profiles.py
    PROFILES = {
        'development': {
            'debug': True,
            'log_level': 'DEBUG',
            'cache_enabled': False
        },
        'staging': {
            'debug': False,
            'log_level': 'INFO',
            'cache_enabled': True
        },
        'production': {
            'debug': False,
            'log_level': 'WARNING',
            'cache_enabled': True
        }
    }
    
  4. Implement Configuration Hot-Reload

    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler
    
    class ConfigReloader(FileSystemEventHandler):
        def on_modified(self, event):
            if event.src_path.endswith('.yaml'):
                logger.info(f"Reloading configuration: {event.src_path}")
                ConfigurationManager.reload()
    
  5. Document All Configuration Options

    # Each configuration option should have:
    setting_name:
      value: "default_value"
      description: "What this setting controls"
      type: "string"
      required: false
      env_var: "OPTIMIZER_SETTING_NAME"
      example: "example_value"
    

Security Considerations

  1. Never commit secrets to version control

  2. Use secret management services in production

  3. Rotate secrets regularly

  4. Implement least privilege access

  5. Audit configuration changes

Configuration Testing

# tests/test_config.py
import pytest
from optimizer_framework.config import ConfigurationManager

def test_load_config():
    """Test configuration loading."""
    config = ConfigurationManager.load_config('tests/fixtures/test_config.yaml')
    assert config['app']['name'] == 'Test Optimizer'
    assert config['database']['pool_size'] == 10

def test_env_override():
    """Test environment variable override."""
    os.environ['OPTIMIZER_DATABASE_POOL_SIZE'] = '20'
    config = ConfigurationManager.load_config()
    assert config['database']['pool_size'] == 20

def test_config_validation():
    """Test configuration validation."""
    with pytest.raises(ValueError):
        ConfigurationManager.load_config('tests/fixtures/invalid_config.yaml')

Remember: Good configuration management is key to maintaining a flexible, secure, and maintainable system!