Configuration Guide
This guide covers all aspects of configuring the Optimizer Framework, from basic settings to advanced customization options.
Table of Contents
Configuration Overview
The Optimizer Framework uses a hierarchical configuration system:
1. Default Configuration (built-in)
2. Configuration Files (YAML/JSON/TOML)
3. Environment Variables
4. Command Line Arguments
5. Runtime Configuration
Each level overrides the previous, allowing flexible configuration management.
Configuration Loading Order
from optimizer_framework.config import ConfigurationManager
# Load configuration with precedence
config = ConfigurationManager.load_config(
default_config="config/defaults.yaml",
user_config="config/production.yaml",
env_prefix="OPTIMIZER_",
cli_args=sys.argv[1:]
)
Configuration File Formats
YAML Configuration (Recommended)
# config/optimizer.yaml
version: "1.0"
# Application settings
app:
name: "Optimizer Framework"
environment: "production"
debug: false
log_level: "INFO"
# Model configuration
models:
default_timeout: 300
cache_enabled: true
cache_ttl: 3600
registry:
type: "local"
path: "/app/models"
# Model-specific settings
revenue_model:
type: "docker"
image: "revenue-model:latest"
port: 8001
health_check_interval: 30
timeout: 600
awareness_model:
type: "python"
module: "models.awareness"
class: "AwarenessModel"
params:
saturation_alpha: 2.5
carryover_rate: 0.3
# Optimizer configuration
optimizer:
default_type: "optuna"
max_iterations: 1000
convergence_tolerance: 1e-6
scipy:
method: "SLSQP"
options:
ftol: 1e-6
maxiter: 1000
disp: false
optuna:
n_trials: 2000
n_jobs: -1
sampler:
type: "TPE"
n_startup_trials: 10
n_ei_candidates: 24
pruner:
type: "MedianPruner"
n_startup_trials: 5
# Database configuration
database:
url: "${DATABASE_URL}"
pool_size: 20
max_overflow: 0
pool_pre_ping: true
echo: false
# Cache configuration
cache:
type: "redis"
redis:
host: "${REDIS_HOST:localhost}"
port: ${REDIS_PORT:6379}
db: 0
password: "${REDIS_PASSWORD:}"
socket_timeout: 5
decode_responses: true
memory:
max_size: 1000
ttl: 3600
# API configuration
api:
host: "0.0.0.0"
port: 8000
workers: ${API_WORKERS:4}
cors:
enabled: true
origins:
- "https://app.example.com"
- "https://localhost:3000"
allow_credentials: true
allow_methods: ["GET", "POST", "PUT", "DELETE"]
allow_headers: ["*"]
rate_limiting:
enabled: true
default_limit: "100/minute"
endpoints:
"/api/optimize": "10/minute"
"/api/models": "1000/hour"
authentication:
enabled: true
type: "jwt"
secret_key: "${JWT_SECRET_KEY}"
algorithm: "HS256"
token_expiration: 3600
# Monitoring configuration
monitoring:
metrics:
enabled: true
endpoint: "/metrics"
include_histogram: true
tracing:
enabled: false
jaeger:
agent_host: "localhost"
agent_port: 6831
service_name: "optimizer-framework"
health_check:
endpoint: "/health"
detailed_endpoint: "/health/detailed"
checks:
- database
- cache
- models
# Feature flags
features:
async_optimization: true
model_versioning: true
auto_scaling: false
experimental:
neural_optimizer: false
quantum_annealing: false
JSON Configuration
{
"version": "1.0",
"app": {
"name": "Optimizer Framework",
"environment": "production",
"debug": false,
"log_level": "INFO"
},
"models": {
"default_timeout": 300,
"cache_enabled": true,
"registry": {
"type": "local",
"path": "/app/models"
}
},
"optimizer": {
"default_type": "scipy",
"scipy": {
"method": "SLSQP",
"options": {
"ftol": 1e-6,
"maxiter": 1000
}
}
}
}
TOML Configuration
# config/optimizer.toml
version = "1.0"
[app]
name = "Optimizer Framework"
environment = "production"
debug = false
log_level = "INFO"
[models]
default_timeout = 300
cache_enabled = true
[models.registry]
type = "local"
path = "/app/models"
[optimizer]
default_type = "scipy"
[optimizer.scipy]
method = "SLSQP"
[optimizer.scipy.options]
ftol = 1e-6
maxiter = 1000
Environment Variables
Standard Environment Variables
# Application
export OPTIMIZER_ENV=production
export OPTIMIZER_DEBUG=false
export OPTIMIZER_LOG_LEVEL=INFO
# Database
export DATABASE_URL=postgresql://user:pass@localhost:5432/optimizer
export DATABASE_POOL_SIZE=20
# Redis Cache
export REDIS_HOST=localhost
export REDIS_PORT=6379
export REDIS_PASSWORD=secret
# API
export API_HOST=0.0.0.0
export API_PORT=8000
export API_WORKERS=4
# Security
export JWT_SECRET_KEY=your-secret-key
export CORS_ORIGINS=https://app.example.com,https://localhost:3000
# AWS (if using)
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_REGION=us-east-1
# Feature Flags
export FEATURE_ASYNC_OPTIMIZATION=true
export FEATURE_MODEL_VERSIONING=true
Loading Environment Variables
from optimizer_framework.config import EnvConfig
# Load with prefix
env_config = EnvConfig(prefix="OPTIMIZER_")
# Access values with type conversion
debug = env_config.get_bool("DEBUG", default=False)
port = env_config.get_int("API_PORT", default=8000)
origins = env_config.get_list("CORS_ORIGINS", default=["*"])
# Load from .env file
from dotenv import load_dotenv
load_dotenv()
Model Configuration
Model Registry Configuration
# models/registry.yaml
models:
- name: "revenue_model"
version: "2.0.0"
type: "sklearn"
path: "models/revenue_model_v2.pkl"
features:
- tv_spend
- digital_spend
- radio_spend
metadata:
trained_date: "2024-01-15"
performance:
rmse: 0.05
mape: 0.03
- name: "awareness_model"
version: "1.5.0"
type: "tensorflow"
path: "models/awareness_model"
serving_config:
batch_size: 32
input_shape: [None, 10]
- name: "attribution_model"
version: "3.1.0"
type: "api"
endpoint: "https://api.attribution.com/v1/predict"
auth:
type: "bearer"
token: "${ATTRIBUTION_API_TOKEN}"
timeout: 30
retry:
max_attempts: 3
backoff: "exponential"
Model-Specific Configuration
# config/models/revenue_model.py
MODEL_CONFIG = {
"name": "Revenue Model",
"version": "2.0.0",
"type": "regression",
# Model parameters
"parameters": {
"saturation": {
"tv": {"alpha": 2.5, "gamma": 0.8},
"digital": {"alpha": 2.0, "gamma": 0.9},
"radio": {"alpha": 2.3, "gamma": 0.7}
},
"carryover": {
"tv": 0.3,
"digital": 0.1,
"radio": 0.2
},
"base_sales": 500000,
"trend": 0.02,
"seasonality": {
"amplitude": 50000,
"frequency": 52 # Weekly
}
},
# Data requirements
"input_schema": {
"required_features": ["tv", "digital", "radio"],
"optional_features": ["competitor_spend", "macro_index"],
"time_granularity": "weekly",
"min_history": 52 # weeks
},
# Performance settings
"performance": {
"cache_predictions": True,
"batch_size": 1000,
"use_gpu": False,
"parallel_workers": 4
}
}
Optimizer Configuration
Optimizer Profiles
# config/optimizer_profiles.yaml
profiles:
fast:
type: "scipy"
method: "L-BFGS-B"
options:
ftol: 1e-4
maxiter: 100
maxfun: 150
accurate:
type: "scipy"
method: "trust-constr"
options:
xtol: 1e-8
gtol: 1e-8
maxiter: 1000
initial_trust_radius: 1.0
exploration:
type: "optuna"
n_trials: 5000
sampler:
type: "TPE"
multivariate: true
constant_liar: true
pruner:
type: "HyperbandPruner"
min_resource: 10
reduction_factor: 3
production:
type: "hybrid"
stages:
- profile: "exploration"
n_trials: 100
- profile: "accurate"
warm_start: true
Dynamic Optimizer Selection
# config/optimizer_selector.py
def select_optimizer_config(problem_characteristics):
"""Select optimizer configuration based on problem."""
n_variables = problem_characteristics['n_variables']
constraint_type = problem_characteristics['constraint_type']
objective_smoothness = problem_characteristics['smoothness']
if n_variables < 10 and objective_smoothness == 'smooth':
return load_profile('fast')
elif constraint_type == 'nonlinear':
return load_profile('accurate')
elif n_variables > 100:
return load_profile('exploration')
else:
return load_profile('production')
Constraint Configuration
Constraint Templates
# config/constraints.yaml
constraint_templates:
standard_marketing:
budget_constraints:
total:
min: 100000
max: 10000000
channel_bounds:
tv:
min: 0
max: 5000000
digital:
min: 10000
max: 8000000
radio:
min: 0
max: 1000000
business_rules:
- name: "digital_minimum"
type: "percentage"
channel: "digital"
min_percentage: 0.3
- name: "traditional_maximum"
type: "percentage"
channels: ["tv", "radio", "print"]
max_percentage: 0.5
- name: "channel_ratio"
type: "ratio"
numerator: "tv"
denominator: "digital"
min_ratio: 0.5
max_ratio: 2.0
custom_constraints:
- name: "market_share"
function: "constraints.market_share_limit"
params:
max_share: 0.3
market_size: 10000000
Dynamic Constraints
# config/dynamic_constraints.py
from datetime import datetime
def get_seasonal_constraints(base_constraints, current_date):
"""Adjust constraints based on seasonality."""
constraints = base_constraints.copy()
# Q4 holiday season
if current_date.month in [10, 11, 12]:
# Increase budgets
constraints['budget_constraints']['total']['max'] *= 1.5
constraints['business_rules'].append({
'name': 'holiday_tv_minimum',
'type': 'percentage',
'channel': 'tv',
'min_percentage': 0.25
})
# Summer season
elif current_date.month in [6, 7, 8]:
# Focus on digital
constraints['business_rules'].append({
'name': 'summer_digital_focus',
'type': 'percentage',
'channel': 'digital',
'min_percentage': 0.5
})
return constraints
Logging Configuration
Structured Logging
# config/logging.yaml
version: 1
disable_existing_loggers: false
formatters:
standard:
format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
json:
class: pythonjsonlogger.jsonlogger.JsonFormatter
format: '%(asctime)s %(name)s %(levelname)s %(message)s'
detailed:
format: '%(asctime)s - %(name)s - %(levelname)s - %(funcName)s:%(lineno)d - %(message)s'
handlers:
console:
class: logging.StreamHandler
level: INFO
formatter: standard
stream: ext://sys.stdout
file:
class: logging.handlers.RotatingFileHandler
level: DEBUG
formatter: detailed
filename: logs/optimizer.log
maxBytes: 10485760 # 10MB
backupCount: 5
error_file:
class: logging.handlers.RotatingFileHandler
level: ERROR
formatter: json
filename: logs/errors.log
maxBytes: 10485760
backupCount: 5
loggers:
optimizer_framework:
level: INFO
handlers: [console, file]
propagate: false
optimizer_framework.models:
level: DEBUG
handlers: [file]
optimizer_framework.api:
level: INFO
handlers: [console, file]
root:
level: INFO
handlers: [console]
Log Aggregation
# config/log_aggregation.py
import logging
from logging.handlers import SysLogHandler
def setup_centralized_logging(app_config):
"""Configure centralized logging."""
if app_config.get('logging.centralized.enabled'):
# Syslog handler
syslog = SysLogHandler(
address=(
app_config['logging.centralized.host'],
app_config['logging.centralized.port']
)
)
syslog.setFormatter(
logging.Formatter('optimizer-framework: %(message)s')
)
# Add to root logger
logging.getLogger().addHandler(syslog)
# Elasticsearch handler
if app_config.get('logging.elasticsearch.enabled'):
from elasticsearch import Elasticsearch
from pythonjsonlogger import jsonlogger
es_handler = ElasticsearchHandler(
hosts=[app_config['logging.elasticsearch.host']],
index_name='optimizer-logs'
)
es_handler.setFormatter(jsonlogger.JsonFormatter())
logging.getLogger('optimizer_framework').addHandler(es_handler)
API Configuration
Rate Limiting Configuration
# config/rate_limiting.py
from slowapi import Limiter
from slowapi.util import get_remote_address
RATE_LIMIT_CONFIG = {
"default": "100/minute",
"endpoints": {
"/api/optimize": {
"limit": "10/minute",
"key_func": get_remote_address,
"error_message": "Optimization rate limit exceeded"
},
"/api/models/predict": {
"limit": "1000/hour",
"key_func": lambda req: f"{get_remote_address(req)}:{req.headers.get('X-API-Key')}"
},
"/api/admin/*": {
"limit": "30/minute",
"key_func": lambda req: req.headers.get('X-API-Key', 'anonymous')
}
},
"storage": {
"type": "redis",
"redis_url": "redis://localhost:6379/1"
}
}
CORS Configuration
# config/cors.py
from fastapi.middleware.cors import CORSMiddleware
CORS_CONFIG = {
"allow_origins": [
"https://app.example.com",
"https://staging.example.com",
"http://localhost:3000" # Development
],
"allow_credentials": True,
"allow_methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
"allow_headers": [
"Accept",
"Accept-Language",
"Content-Type",
"Authorization",
"X-API-Key",
"X-Request-ID"
],
"expose_headers": [
"X-Rate-Limit-Limit",
"X-Rate-Limit-Remaining",
"X-Rate-Limit-Reset"
],
"max_age": 86400 # 24 hours
}
def configure_cors(app):
"""Configure CORS for the application."""
if app.config.get('api.cors.enabled'):
app.add_middleware(
CORSMiddleware,
**CORS_CONFIG
)
Performance Tuning
Cache Configuration
# config/cache.yaml
cache:
# Model prediction cache
predictions:
enabled: true
backend: "redis"
ttl: 3600 # 1 hour
max_size: 10000
eviction_policy: "lru"
key_prefix: "pred:"
# Optimization results cache
optimizations:
enabled: true
backend: "redis"
ttl: 86400 # 24 hours
max_size: 1000
key_prefix: "opt:"
# API response cache
api_responses:
enabled: true
backend: "memory"
ttl: 300 # 5 minutes
max_size: 5000
# Configuration
redis:
connection_pool:
max_connections: 50
max_idle_time: 300
retry_on_timeout: true
Worker Configuration
# config/workers.yaml
workers:
# API workers
api:
count: ${WORKERS:4}
worker_class: "uvicorn.workers.UvicornWorker"
timeout: 300
graceful_timeout: 30
max_requests: 1000
max_requests_jitter: 50
# Background workers
background:
count: 2
queues:
- optimization
- model_training
- reporting
concurrency: 10
# Optimization workers
optimization:
pool_size: ${CPU_COUNT:4}
process_based: true
memory_limit: "2GB"
timeout: 3600
Best Practices
Configuration Management
Use Environment Variables for Secrets
# Good database_url = os.environ['DATABASE_URL'] # Bad database_url = "postgresql://user:password@localhost/db"
Validate Configuration on Startup
from pydantic import BaseSettings, validator class AppConfig(BaseSettings): database_url: str redis_url: str jwt_secret: str @validator('database_url') def validate_db_url(cls, v): if not v.startswith(('postgresql://', 'mysql://')): raise ValueError('Invalid database URL') return v class Config: env_prefix = 'OPTIMIZER_'
Use Configuration Profiles
# config/profiles.py PROFILES = { 'development': { 'debug': True, 'log_level': 'DEBUG', 'cache_enabled': False }, 'staging': { 'debug': False, 'log_level': 'INFO', 'cache_enabled': True }, 'production': { 'debug': False, 'log_level': 'WARNING', 'cache_enabled': True } }
Implement Configuration Hot-Reload
from watchdog.observers import Observer from watchdog.events import FileSystemEventHandler class ConfigReloader(FileSystemEventHandler): def on_modified(self, event): if event.src_path.endswith('.yaml'): logger.info(f"Reloading configuration: {event.src_path}") ConfigurationManager.reload()
Document All Configuration Options
# Each configuration option should have: setting_name: value: "default_value" description: "What this setting controls" type: "string" required: false env_var: "OPTIMIZER_SETTING_NAME" example: "example_value"
Security Considerations
Never commit secrets to version control
Use secret management services in production
Rotate secrets regularly
Implement least privilege access
Audit configuration changes
Configuration Testing
# tests/test_config.py
import pytest
from optimizer_framework.config import ConfigurationManager
def test_load_config():
"""Test configuration loading."""
config = ConfigurationManager.load_config('tests/fixtures/test_config.yaml')
assert config['app']['name'] == 'Test Optimizer'
assert config['database']['pool_size'] == 10
def test_env_override():
"""Test environment variable override."""
os.environ['OPTIMIZER_DATABASE_POOL_SIZE'] = '20'
config = ConfigurationManager.load_config()
assert config['database']['pool_size'] == 20
def test_config_validation():
"""Test configuration validation."""
with pytest.raises(ValueError):
ConfigurationManager.load_config('tests/fixtures/invalid_config.yaml')
Remember: Good configuration management is key to maintaining a flexible, secure, and maintainable system!