Terraform provisions the infrastructure · Helm packages the application · Kubernetes runs the workloads · Same code, any cloud
What it does: Declares the entire cloud infrastructure in code files. Run one command and everything exists. Run another and everything is gone.
Directory: terraform/ (9 modules, 3+ environment configs)
Key commands:
terraform plan - Preview what will change (safe, read-only)
terraform apply - Create or update all infrastructure
terraform destroy - Tear everything down cleanly
State: Stored in Azure Blob Storage, S3, or GCS; encrypted, locked, and versioned. Zero secrets in state files (uses Managed Identity references).
terraform plan - Shows exactly what will be created, modified, or destroyed before making any changes.
Like a construction blueprint review: you see every change before the builders start. No surprises.
Output example: "Plan: 23 to add, 2 to change, 0 to destroy"
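The summary line above is easy to check automatically, for example to stop a pipeline when a plan would destroy anything. The following is a minimal sketch, not part of the platform: the names PlanSummary, parse_plan_summary, and is_destructive are hypothetical, and it assumes the plan text was captured without color codes (for instance with terraform plan -no-color).

```python
import re
from typing import NamedTuple, Optional


class PlanSummary(NamedTuple):
    add: int
    change: int
    destroy: int


# Matches the counts in a line such as "Plan: 23 to add, 2 to change, 0 to destroy".
_SUMMARY_RE = re.compile(r"(\d+) to add, (\d+) to change, (\d+) to destroy")


def parse_plan_summary(output: str) -> Optional[PlanSummary]:
    """Return the counts from the plan summary, or None if no summary line is present."""
    match = _SUMMARY_RE.search(output)
    if match is None:
        return None
    return PlanSummary(*(int(group) for group in match.groups()))


def is_destructive(summary: PlanSummary) -> bool:
    """True when applying the plan would destroy at least one resource."""
    return summary.destroy > 0
```

A plan with nothing to do prints no summary line, so callers should treat None as "no changes" rather than as an error.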
terraform apply - Creates all infrastructure: Kubernetes cluster, databases, search service, vault, networking, DNS.
Then helm install deploys the application workloads into the cluster.
Full stack from zero: One terraform apply + helm install = entire platform running.
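The two-step bring-up can be sketched as an ordered command list that stops at the first failure. This is an illustration under assumptions: bootstrap_commands, run_all, and shell_runner are hypothetical helpers, the terraform init step is added because a fresh checkout needs it before apply, and the release name and chart path are taken from the helm upgrade example in this document.

```python
import subprocess
from typing import Callable, List, Sequence

Command = List[str]


def bootstrap_commands(env_dir: str, release: str, chart: str) -> List[Command]:
    """Commands for a from-zero deployment, in the order they must run."""
    return [
        # init downloads providers and configures the state backend (assumed step).
        ["terraform", f"-chdir={env_dir}", "init"],
        ["terraform", f"-chdir={env_dir}", "apply"],
        ["helm", "install", release, chart],
    ]


def run_all(commands: Sequence[Command], run: Callable[[Command], int]) -> int:
    """Run commands in order; return how many succeeded before the first failure."""
    completed = 0
    for command in commands:
        if run(command) != 0:
            break
        completed += 1
    return completed


def shell_runner(command: Command) -> int:
    """Execute a command for real and return its exit code."""
    return subprocess.run(command).returncode
```

Passing the runner as a parameter keeps the ordering logic testable without touching a real cloud account; shell_runner is what a real deployment would pass in.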
helm upgrade cerebro-app ./helm/cerebro-app - Rolling update, zero downtime.
What can be upgraded:
Application code (new Docker image)
Configuration (env vars, secrets, replicas)
Infrastructure (terraform plan detects drift, terraform apply reconciles it)
Plugins (hot-reload via MCP engine API)
Rollback: helm rollback cerebro-app 1 - reverts the release to revision 1. Omit the revision number to roll back to the previous release.
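The upgrade and rollback commands can be combined so that a failed upgrade reverts automatically. The sketch below is an assumption about how one might script this, not the platform's own tooling; upgrade_or_rollback and shell_runner are hypothetical names. It relies on helm rollback with no revision argument, which returns to the previous release.

```python
import subprocess
from typing import Callable, List

Command = List[str]


def upgrade_or_rollback(release: str, chart: str, run: Callable[[Command], int]) -> bool:
    """Try a Helm upgrade; on failure roll back to the previous release.

    Returns True if the upgrade succeeded, False if a rollback was attempted.
    """
    # --wait makes helm block until the new pods are ready, so a bad image
    # shows up as a non-zero exit code instead of a silent failure.
    if run(["helm", "upgrade", release, chart, "--wait"]) == 0:
        return True
    # No revision argument: helm rolls back to the previous release.
    run(["helm", "rollback", release])
    return False


def shell_runner(command: Command) -> int:
    """Execute a command for real and return its exit code."""
    return subprocess.run(command).returncode
```

Helm 3 offers a similar behavior built in through helm upgrade --atomic; the explicit version here only makes the two steps visible.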
terraform destroy - Removes ALL infrastructure cleanly. Every resource, every service, every DNS record.
Useful for: dev/test environments, cost optimization, migration between clouds.
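Safety: Requires explicit confirmation. The state file records what was created, so destroy removes exactly the resources Terraform manages. Anything created outside Terraform is not tracked and must be cleaned up separately.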
Each module abstracts one cloud service into a cloud-agnostic interface:
Identity: Azure Managed Identity / AWS IAM Roles / GCP Workload Identity
Kubernetes: AKS / EKS / GKE (same K8s API, different provisioning)
Database: Cosmos DB / DynamoDB / Firestore
Search: Azure AI Search / OpenSearch / Vertex AI Search
LLM: Azure OpenAI / AWS Bedrock / Vertex AI
Vault: Azure Key Vault / AWS Secrets Manager / GCP Secret Manager
Registry: ACR / ECR / Artifact Registry
Networking: VNet / VPC / Ingress controllers
DNS: Azure DNS / Route 53 / Cloud DNS
The app code never references cloud-specific APIs. Only the Terraform modules change per cloud.
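One way application code can stay free of cloud-specific calls is to depend on a small interface and let deployment configuration choose the backend. The sketch below illustrates that idea only; SecretStore, InMemorySecretStore, search_endpoint, and the key SEARCH_ENDPOINT are hypothetical names, not taken from the Cerebro codebase.

```python
from typing import Dict, Protocol


class SecretStore(Protocol):
    """Hypothetical interface the application codes against."""

    def get_secret(self, name: str) -> str:
        ...


class InMemorySecretStore:
    """Stand-in backend for local runs and tests."""

    def __init__(self, secrets: Dict[str, str]) -> None:
        self._secrets = dict(secrets)

    def get_secret(self, name: str) -> str:
        return self._secrets[name]


def search_endpoint(store: SecretStore) -> str:
    """Application code: it sees only the interface, never a cloud SDK."""
    return store.get_secret("SEARCH_ENDPOINT")
```

A Key Vault, Secrets Manager, or Secret Manager backend would be another class with the same get_secret method, selected at startup, so search_endpoint would not change between clouds.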
Helm packages Kubernetes manifests into versioned, configurable releases:
cerebro-app: The main platform (Flask app, ingress, TLS, HPA autoscaling, service account with workload identity).
mcp-engine: Each MCP engine gets its own Helm release in its own namespace. Deploy new engines by adding a release.
monitoring: Full observability stack (Prometheus, Grafana, Loki, Promtail, Alertmanager, dashboards).
Values files: values-azure.yaml, values-aws.yaml, values-local.yaml. Same chart, different cloud configs.
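The "same chart, different cloud configs" idea rests on how Helm layers a values file over the chart defaults: nested maps are merged key by key, and scalars and lists from the override replace the default. The sketch below imitates that behavior in plain Python to show the effect. The keys and the registry host are hypothetical, and Helm's special handling of null values is not modeled.

```python
from typing import Any, Dict


def merge_values(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return base overlaid with override, merging nested dicts recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and lists from the override replace the default outright.
            merged[key] = value
    return merged


# Hypothetical chart defaults and a hypothetical values-azure.yaml overlay.
chart_defaults = {
    "replicaCount": 2,
    "image": {"repository": "cerebro-app", "tag": "latest"},
}
azure_values = {
    "image": {"repository": "example.azurecr.io/cerebro-app"},
    "replicaCount": 3,
}

effective = merge_values(chart_defaults, azure_values)
```

Here the image tag survives from the defaults while the repository and replica count come from the Azure overlay, which is why each per-cloud file only needs to list what differs.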
Runs the main Cerebro app: the Flask web server, OIDC auth, agent loop, Visual Designer, and all API endpoints.
Pods: cerebro-app (4 Gunicorn workers), ingress-nginx controller
TLS: cert-manager auto-provisions Let's Encrypt certificates
Secrets: Cosmos endpoint, search keys, LLM keys, SSO secret β all injected via Helm values
Each MCP engine runs in its own Kubernetes namespace for isolation. Deploy additional engines (mcp2, mcp-healthcare, mcp-finance) as separate Helm releases.
Ports: 5000 (dashboard/API), 5001 (MCP transport, SSE/HTTP)
Plugins: Mounted as volumes or uploaded via API. Each plugin is a ZIP with manifest + Python code.
Scaling: Each engine scales independently based on load.
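Because each plugin arrives as a ZIP containing a manifest and Python code, an upload can be sanity-checked before it is loaded. The following is a minimal sketch under assumptions: the manifest filename manifest.json and the function check_plugin_archive are hypothetical, and the real engine may apply different or stricter rules.

```python
import io
import zipfile
from typing import List

# Assumed manifest filename; the actual plugin format may use a different name.
MANIFEST_NAME = "manifest.json"


def check_plugin_archive(data: bytes) -> List[str]:
    """Return a list of problems found in a plugin ZIP; an empty list means it looks valid."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a valid ZIP archive"]
    with archive:
        names = archive.namelist()
    problems = []
    if MANIFEST_NAME not in names:
        problems.append(f"missing {MANIFEST_NAME}")
    if not any(name.endswith(".py") for name in names):
        problems.append("no Python source files")
    return problems
```

The check only inspects the archive listing and never extracts or imports anything, so it is safe to run on untrusted uploads before any further processing.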
Full observability stack running alongside the application:
Prometheus: Scrapes metrics from all pods via ServiceMonitors
Grafana: Dashboards for platform, MCP, RAG, security, users, Loki logs
Loki + Promtail: Aggregates all pod logs, searchable with LogQL
Alertmanager: Fires alerts on high error rates, slow tools, pod crashes, disk/memory pressure
Fully wired and running. Deploy with: cd terraform/environments/azure && terraform apply
AKS (D4ds_v5 nodes), Cosmos DB, Azure AI Search, Azure OpenAI (GPT-4o), Key Vault, ACR, Application Gateway
Security: Managed Identity for all service-to-service auth, with zero API keys stored in pods.
Same Terraform modules, AWS provider. Deploy with: cd terraform/environments/aws && terraform apply
EKS, DynamoDB, OpenSearch, Bedrock (Claude/Titan), Secrets Manager, ECR
Security: IAM Roles for Service Accounts (IRSA), the same zero-key pattern as Azure.
Same Terraform modules, GCP provider. Deploy with: cd terraform/environments/gcp && terraform apply
GKE, Firestore, Vertex AI Search, Vertex AI (Gemini), Secret Manager, Artifact Registry
Security: Workload Identity Federation, the same zero-key pattern.
For development, demos, and air-gapped deployments. Start with: docker-compose up
Docker Compose, SQLite (no Cosmos needed), Ollama (free local LLM), file-based vault
No cloud account required. Everything runs on a single machine.