🌐 Loom: One Command to Deploy, Upgrade, or Tear Down on Any Cloud

Terraform provisions the infrastructure · Helm packages the application · Kubernetes runs the workloads · Same code, any cloud

Architecture diagram (layers, top to bottom):

🏗️ Terraform (Infrastructure as Code): 📋 Plan (terraform plan) · 🚀 Apply (terraform apply) · ⬆️ Upgrade (helm upgrade) · 💥 Teardown (terraform destroy)

Terraform modules (cloud-agnostic building blocks): Identity · Kubernetes · Database · Search · LLM · Vault · Registry · Networking · DNS

Helm charts (application packaging and deployment):

⎈ cerebro-app: Deployment · Service · Ingress · HPA · ConfigMaps · Secrets · ServiceAccount · RBAC; values-azure.yaml · values-aws.yaml · values-local.yaml

⎈ mcp-engine: Deployment · Service · Namespace · HPA; plugin volume mounts; liveness/readiness probes; one Helm release per engine (mcp1, mcp2, mcp-healthcare...)

⎈ monitoring: Prometheus · Grafana · Loki · Promtail; ServiceMonitors · AlertRules · Dashboards; kube-prometheus-stack + loki + promtail charts

Kubernetes cluster (runtime workloads):

📦 default namespace: cerebro-app (Flask · Gunicorn · 4 workers); ingress controller (nginx, TLS termination); Service cerebro-app:5080; Ingress app.brainzbytes.com (Let's Encrypt TLS); Secrets (env vars); ConfigMaps; cert-manager with contextweaver-tls (Let's Encrypt auto-renew)

📦 mcp1 namespace: mcp-engine (FastMCP · Dashboard · Plugins); mcp2 (future) for additional engines; Services mcp1:5000 (dashboard) and mcp1:5001 (MCP transport); Engine Secrets; Plugin Volumes; ServiceMonitor → Prometheus scraping

📦 monitoring namespace: Prometheus, Grafana, Loki, Promtail, Alertmanager, node-exporter, kube-state-metrics; dashboards: Platform Overview · MCP Engine · RAG · Security · User Activity · Loki Logs; alert rules: High error rate · Slow tools · Pod crashes · Disk full · Memory pressure

Cloud provider (one terraform apply away):

☁️ Azure (current, production): AKS · Cosmos DB · AI Search · Azure OpenAI · Key Vault · ACR; Managed Identity, zero keys in pods

☁️ AWS (planned): EKS · DynamoDB · OpenSearch · Bedrock · Secrets Manager · ECR; IAM Roles for Service Accounts

☁️ GCP (planned): GKE · Firestore · Vertex Search · Vertex AI · Secret Mgr · Artifact Reg; Workload Identity Federation

🏠 Local / On-Prem (available): Docker · SQLite · Ollama · file vault · docker-compose; air-gapped, no cloud needed

🏗️ Terraform: Infrastructure as Code

What it does: Declares the entire cloud infrastructure in code files. Run one command → everything exists. Run another → everything is gone.

Directory: terraform/ (9 modules, 3+ environment configs)

Key commands:

terraform plan: Preview what will change (safe, read-only)

terraform apply: Create or update all infrastructure

terraform destroy: Tear everything down cleanly

State: Stored in Azure Blob / S3 / GCS with encryption, locking, and versioning. No secrets are written to state files; resources authenticate through Managed Identity references instead of keys.

📋 Plan Phase

terraform plan: Shows exactly what will be created, modified, or destroyed before any change is made.

Like a construction blueprint review: you see every change before the builders start. No surprises.

Output example: "Plan: 23 to add, 2 to change, 0 to destroy"
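
The summary line above is easy to check in a pipeline before anyone runs apply. The following is a minimal sketch, assuming the plan output has been captured as text; the function names and the "destroy nothing" policy are hypothetical and not part of Loom.

```python
import re
from typing import Dict, Optional

# Matches the summary line printed at the end of `terraform plan`.
# The lazy `.*?` tolerates extra leading counters that some versions print.
PLAN_SUMMARY = re.compile(
    r"Plan: .*?(?P<add>\d+) to add, (?P<change>\d+) to change, (?P<destroy>\d+) to destroy"
)


def parse_plan_summary(output: str) -> Optional[Dict[str, int]]:
    """Return the add/change/destroy counts, or None if no summary line is found."""
    match = PLAN_SUMMARY.search(output)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def is_safe_to_auto_apply(output: str) -> bool:
    """Hypothetical CI policy: only auto-apply plans that destroy nothing."""
    summary = parse_plan_summary(output)
    return summary is not None and summary["destroy"] == 0


if __name__ == "__main__":
    sample = 'Plan: 23 to add, 2 to change, 0 to destroy.'
    print(parse_plan_summary(sample))  # {'add': 23, 'change': 2, 'destroy': 0}
```

Scraping text is fragile; saving the plan with terraform plan -out and reading it back with terraform show -json is likely a sturdier basis for real automation.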

🚀 Apply Phase

terraform apply: Creates all infrastructure: Kubernetes cluster, databases, search service, vault, networking, DNS.

Then helm install deploys the application workloads into the cluster.

Full stack from zero: One terraform apply + helm install = entire platform running.
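
The from-zero sequence can be sketched as an ordered command list. This is an illustration only: the environment directory and chart path come from this page, while the location of the values file inside the chart directory and the explicit terraform init step (needed on a fresh checkout) are assumptions.

```python
import shlex
import subprocess
from typing import List

TERRAFORM_DIR = "terraform/environments/{cloud}"  # path as shown on this page
CHART_PATH = "./helm/cerebro-app"                 # path as shown on this page


def deploy_commands(cloud: str) -> List[List[str]]:
    """Return the commands for a from-zero deployment, in execution order."""
    tf_dir = TERRAFORM_DIR.format(cloud=cloud)
    return [
        # `-chdir` is a global Terraform option and must precede the subcommand.
        ["terraform", f"-chdir={tf_dir}", "init"],
        ["terraform", f"-chdir={tf_dir}", "apply"],
        # Assumption: values files live next to the chart.
        ["helm", "install", "cerebro-app", CHART_PATH,
         "-f", f"{CHART_PATH}/values-{cloud}.yaml"],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$ " + shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    run(deploy_commands("azure"))  # dry run: prints the three commands
```

Keeping the list separate from the runner makes the order reviewable before anything touches a cloud account.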

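⬆️ Upgrade Phase

helm upgrade cerebro-app ./helm/cerebro-app: Rolling update with zero downtime, as long as readiness probes keep traffic off new pods until they are ready.

What can be upgraded:

→ Application code (new Docker image)

→ Configuration (env vars, secrets, replicas)

→ Infrastructure (Terraform detects drift on the next terraform plan and reconciles it on terraform apply)

→ Plugins (hot-reload via MCP engine API)

Rollback: helm rollback cerebro-app reverts to the previous revision. Pass a revision number (for example, helm rollback cerebro-app 1) to return to a specific one.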
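
A small sketch of how the upgrade and rollback commands might be assembled. Note that helm rollback takes an optional revision: without it Helm returns to the previous revision, and with it Helm returns to that exact revision (helm history lists them). The image.tag value key is a common chart convention and is assumed here, not confirmed for the cerebro-app chart.

```python
from typing import List, Optional


def upgrade_command(release: str, chart: str, image_tag: Optional[str] = None) -> List[str]:
    """Build a `helm upgrade` that waits for the new pods to become ready."""
    cmd = ["helm", "upgrade", release, chart, "--wait"]
    if image_tag is not None:
        # Assumption: the chart exposes the image tag as `image.tag`.
        cmd += ["--set", f"image.tag={image_tag}"]
    return cmd


def rollback_command(release: str, revision: Optional[int] = None) -> List[str]:
    """Build a `helm rollback`; omit the revision to go back one step."""
    cmd = ["helm", "rollback", release]
    if revision is not None:
        cmd.append(str(revision))
    return cmd


if __name__ == "__main__":
    print(upgrade_command("cerebro-app", "./helm/cerebro-app", image_tag="1.4.2"))
    print(rollback_command("cerebro-app"))     # previous revision
    print(rollback_command("cerebro-app", 1))  # revision 1 specifically
```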

💥 Teardown Phase

terraform destroy: Removes ALL infrastructure cleanly. Every resource, every service, every DNS record.

Useful for: dev/test environments, cost optimization, migration between clouds.

Safety: Requires explicit confirmation. The state file records what was created, so destroy is precise and leaves no orphaned Terraform-managed resources.
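
Teams sometimes add a second guard in front of Terraform's own prompt. The sketch below is a hypothetical wrapper, not part of Loom: it only builds the destroy command when the operator has typed the environment name, and it deliberately leaves Terraform's interactive confirmation in place.

```python
from typing import List


def destroy_command(environment: str, typed_confirmation: str) -> List[str]:
    """Build `terraform destroy` only if the operator typed the environment name."""
    if typed_confirmation != environment:
        raise ValueError(
            f"confirmation {typed_confirmation!r} does not match environment {environment!r}"
        )
    # No auto-approve flag: Terraform will still ask for its own confirmation.
    return ["terraform", f"-chdir=terraform/environments/{environment}", "destroy"]


if __name__ == "__main__":
    print(destroy_command("azure", "azure"))
```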

🧱 Terraform Modules

Each module abstracts one cloud service into a cloud-agnostic interface:

🆔 Identity: Azure Managed Identity / AWS IAM Roles / GCP Workload Identity

☸️ Kubernetes: AKS / EKS / GKE (same K8s API, different provisioning)

💾 Database: Cosmos DB / DynamoDB / Firestore

🔍 Search: Azure AI Search / OpenSearch / Vertex AI Search

🤖 LLM: Azure OpenAI / AWS Bedrock / Vertex AI

🔐 Vault: Azure Key Vault / AWS Secrets Manager / GCP Secret Manager

📦 Registry: ACR / ECR / Artifact Registry

🌐 Networking: VNet / VPC / Ingress controllers

🌍 DNS: Azure DNS / Route 53 / Cloud DNS

The app code never references cloud-specific APIs; only the Terraform modules change per cloud.
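
The module list above amounts to a lookup table from a cloud-neutral module name to the service that backs it on each provider. A minimal sketch of that table follows; the dictionary keys and the helper name are hypothetical, and the per-cloud networking entries are an assumption (the page lists only "VNet / VPC / Ingress controllers").

```python
from typing import Dict

# Module name -> backing service per cloud, as listed on this page.
MODULES: Dict[str, Dict[str, str]] = {
    "identity":   {"azure": "Managed Identity", "aws": "IAM Roles", "gcp": "Workload Identity"},
    "kubernetes": {"azure": "AKS", "aws": "EKS", "gcp": "GKE"},
    "database":   {"azure": "Cosmos DB", "aws": "DynamoDB", "gcp": "Firestore"},
    "search":     {"azure": "Azure AI Search", "aws": "OpenSearch", "gcp": "Vertex AI Search"},
    "llm":        {"azure": "Azure OpenAI", "aws": "AWS Bedrock", "gcp": "Vertex AI"},
    "vault":      {"azure": "Azure Key Vault", "aws": "AWS Secrets Manager", "gcp": "GCP Secret Manager"},
    "registry":   {"azure": "ACR", "aws": "ECR", "gcp": "Artifact Registry"},
    # Assumption: VNet on Azure, VPC on AWS and GCP.
    "networking": {"azure": "VNet", "aws": "VPC", "gcp": "VPC"},
    "dns":        {"azure": "Azure DNS", "aws": "Route 53", "gcp": "Cloud DNS"},
}


def services_for(cloud: str) -> Dict[str, str]:
    """Resolve every module to its concrete service on the given cloud."""
    try:
        return {module: by_cloud[cloud] for module, by_cloud in MODULES.items()}
    except KeyError:
        raise ValueError(f"unsupported cloud: {cloud!r}") from None


if __name__ == "__main__":
    for module, service in services_for("aws").items():
        print(f"{module:<11} {service}")
```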

⎈ Helm Charts

Helm packages Kubernetes manifests into versioned, configurable releases:

cerebro-app: The main platform (Flask app, ingress, TLS, HPA autoscaling, service account with workload identity).

mcp-engine: Each MCP engine gets its own Helm release in its own namespace. Deploy new engines by adding a release.

monitoring: Full observability stack (Prometheus, Grafana, Loki, Promtail, Alertmanager, dashboards).

Values files: values-azure.yaml, values-aws.yaml, values-local.yaml (same chart, different cloud configs).
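
"Same chart, different cloud configs" works because Helm overlays a values file on the chart defaults, merging nested maps key by key. The sketch below imitates that overlay in simplified form (it ignores Helm details such as removing a key by setting it to null). All value keys and the registry hostname are hypothetical, not taken from the real charts.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_values(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay `override` on `base`: nested dicts merge, everything else is replaced."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical chart defaults and per-environment overrides.
CHART_DEFAULTS = {
    "replicaCount": 2,
    "image": {"repository": "cerebro-app", "tag": "latest"},
    "ingress": {"enabled": True},
}
VALUES_AZURE = {"image": {"repository": "example.azurecr.io/cerebro-app"}}
VALUES_LOCAL = {"replicaCount": 1, "ingress": {"enabled": False}}

if __name__ == "__main__":
    print(merge_values(CHART_DEFAULTS, VALUES_AZURE))
    print(merge_values(CHART_DEFAULTS, VALUES_LOCAL))
```

Only the keys that differ per environment need to appear in each values file; everything else falls through to the chart defaults.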

📦 default namespace

Runs the main Cerebro app: the Flask web server, OIDC auth, agent loop, Visual Designer, all API endpoints.

Pods: cerebro-app (4 Gunicorn workers), ingress-nginx controller

TLS: cert-manager auto-provisions Let's Encrypt certificates

Secrets: Cosmos endpoint, search keys, LLM keys, SSO secret (all injected via Helm values)

📦 mcp1 namespace

Each MCP engine runs in its own Kubernetes namespace for isolation. Deploy additional engines (mcp2, mcp-healthcare, mcp-finance) as separate Helm releases.

Ports: 5000 (dashboard/API), 5001 (MCP transport over SSE/HTTP)

Plugins: Mounted as volumes or uploaded via API. Each plugin is a ZIP with manifest + Python code.

Scaling: Each engine scales independently based on load.
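
One release per engine, one namespace per release, can be sketched as follows. The chart path ./helm/mcp-engine is an assumption modeled on the cerebro-app path, and the endpoint helper assumes the Service is named after the engine (as in mcp1:5000 and mcp1:5001) and that the cluster uses the default cluster.local DNS domain.

```python
from typing import Dict, List

CHART = "./helm/mcp-engine"  # assumed chart path


def engine_release_command(engine: str, create_namespace: bool = False) -> List[str]:
    """Install or upgrade one engine as its own release in its own namespace."""
    cmd = ["helm", "upgrade", "--install", engine, CHART, "--namespace", engine]
    if create_namespace:
        # Leave this off if the chart templates its own Namespace resource.
        cmd.append("--create-namespace")
    return cmd


def engine_endpoints(engine: str) -> Dict[str, str]:
    """In-cluster URLs, using the <service>.<namespace>.svc.cluster.local form."""
    host = f"{engine}.{engine}.svc.cluster.local"
    return {
        "dashboard": f"http://{host}:5000",
        "mcp_transport": f"http://{host}:5001",
    }


if __name__ == "__main__":
    for name in ["mcp1", "mcp2", "mcp-healthcare"]:
        print(" ".join(engine_release_command(name)))
        print(engine_endpoints(name))
```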

📦 monitoring namespace

Full observability stack running alongside the application:

Prometheus: Scrapes metrics from all pods via ServiceMonitors

Grafana: Dashboards for platform, MCP, RAG, security, users, Loki logs

Loki + Promtail: Aggregates all pod logs, searchable with LogQL

Alertmanager: Routes the alerts that Prometheus rules raise for high error rates, slow tools, pod crashes, and disk/memory pressure
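
As an illustration of the LogQL search mentioned above, the sketch below builds a query for error lines in one namespace and the matching Loki HTTP API URL. It assumes Promtail attaches a namespace label to pod logs and that Loki is reachable at the hypothetical base URL shown; neither is confirmed by this page.

```python
from urllib.parse import urlencode


def logql_error_query(namespace: str, needle: str = "error") -> str:
    """LogQL: select a namespace's log streams, keep lines containing `needle`."""
    return f'{{namespace="{namespace}"}} |= "{needle}"'


def loki_query_url(base_url: str, query: str, limit: int = 100) -> str:
    """Build a URL for Loki's range-query endpoint."""
    params = urlencode({"query": query, "limit": limit})
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


if __name__ == "__main__":
    query = logql_error_query("mcp1")
    print(query)
    # Hypothetical in-cluster address; 3100 is Loki's default HTTP port.
    print(loki_query_url("http://loki.monitoring.svc.cluster.local:3100", query))
```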

☁️ Azure: Current Production

Fully wired and running. cd terraform/environments/azure && terraform apply

AKS (D4ds_v5 nodes), Cosmos DB, Azure AI Search, Azure OpenAI (GPT-4o), Key Vault, ACR, Application Gateway

Security: Managed Identity for all service-to-service auth, with zero API keys stored in pods.

☁️ AWS: Planned

Same Terraform modules, AWS provider. cd terraform/environments/aws && terraform apply

EKS, DynamoDB, OpenSearch, Bedrock (Claude/Titan), Secrets Manager, ECR

Security: IAM Roles for Service Accounts (IRSA), the same zero-key pattern as Azure.

☁️ GCP: Planned

Same Terraform modules, GCP provider. cd terraform/environments/gcp && terraform apply

GKE, Firestore, Vertex AI Search, Vertex AI (Gemini), Secret Manager, Artifact Registry

Security: Workload Identity Federation, the same zero-key pattern.

🏠 Local / On-Premises

For development, demos, and air-gapped deployments. docker-compose up

Docker Compose, SQLite (no Cosmos needed), Ollama (free local LLM), file-based vault

No cloud account required. Everything runs on a single machine.