System Architecture
High-level overview of the Chelar platform architecture.
Chelar runs on a single Hetzner AX42 bare metal server (AMD Ryzen 5 3600, 64 GB RAM, 2x 512 GB NVMe, ~EUR 47/mo).
The full architecture details are in ARCHITECTURE_NOMAD.md at the repo root. This page is a summary for quick reference.
Core Components
| Component | Role | Technology |
|---|---|---|
| Go API | Control plane — tenant CRUD, Nomad orchestration, billing | Go 1.22+, chi, sqlc + pgx/v5 |
| Dashboard | User-facing UI — auth, onboarding, channels, settings | Next.js 15, NextAuth, Tailwind v4 |
| Nomad | Container orchestration — schedules tenant jobs | HashiCorp Nomad (single binary) |
| Caddy | Reverse proxy — wildcard TLS, per-tenant routing, auth gating | Caddy v2, Cloudflare DNS challenge |
| JuiceFS | Shared storage — tenant data dirs, encrypted at rest | S3 backend + Postgres metadata |
| PostgreSQL | Platform database — tenants, billing, sessions | Supabase-hosted PostgreSQL 16 |
Request Flow
User → WhatsApp/Telegram → Caddy → tenant container → AI provider
(the Go API manages lifecycle out of band)

The Go API is NOT in the message data path. Messages flow directly from Caddy to the tenant container; the API only manages lifecycle, configuration, and billing.
Gateway-per-Tenant Model
Each tenant gets a dedicated container running an AI assistant runtime:
| Runtime | Language | Memory | Use Case |
|---|---|---|---|
| OpenClaw | Node.js | 1024 MB | Full-featured, plugins, Pro tier |
| ZeroClaw | Rust | 256 MB | Lightweight, secure, Free tier default |
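The tier-to-runtime mapping in the table above can be sketched as a small lookup. The runtime names and memory limits come from the table; the `runtimeForTier` helper itself is an illustrative assumption, not the actual platform API.

```go
package main

import "fmt"

// RuntimeSpec mirrors the runtime table: name and memory limit per tier.
type RuntimeSpec struct {
	Name     string
	MemoryMB int
}

// runtimeForTier is a hypothetical helper: Pro tenants get the full-featured
// Node.js runtime; everyone else defaults to the lightweight Rust runtime.
func runtimeForTier(tier string) RuntimeSpec {
	switch tier {
	case "pro":
		return RuntimeSpec{Name: "OpenClaw", MemoryMB: 1024}
	default:
		return RuntimeSpec{Name: "ZeroClaw", MemoryMB: 256}
	}
}

func main() {
	fmt.Println(runtimeForTier("free").Name) // ZeroClaw
	fmt.Println(runtimeForTier("pro").Name)  // OpenClaw
}
```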
The Go API provisions the container via Nomad, creates the JuiceFS data directory, and configures the Caddy route.
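The three provisioning steps can be sketched in Go. Everything here is a minimal sketch under assumptions: `ProvisionTenant`, the `/jfs/tenants` mount path, the `tenant-<id>` job naming, and the subdomain routing scheme are all hypothetical names for illustration, not the real Go API.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Tenant holds the minimum inputs the provisioning flow needs (hypothetical).
type Tenant struct {
	ID     string
	Domain string
}

// ProvisionTenant returns the three steps described above as plain strings,
// in order: JuiceFS data dir, Nomad job registration, Caddy route.
func ProvisionTenant(t Tenant) []string {
	dataDir := filepath.Join("/jfs/tenants", t.ID) // assumed JuiceFS mount point
	jobID := "tenant-" + t.ID                      // assumed per-tenant Nomad job ID
	route := t.ID + "." + t.Domain                 // wildcard-subdomain routing
	return []string{
		"mkdir " + dataDir,        // 1. create the JuiceFS data directory
		"nomad register " + jobID, // 2. schedule the tenant container via Nomad
		"caddy route " + route,    // 3. configure the per-tenant Caddy route
	}
}

func main() {
	for _, step := range ProvisionTenant(Tenant{ID: "acme", Domain: "example.com"}) {
		fmt.Println(step)
	}
}
```

In the real system these would be calls against the Nomad API and Caddy's admin interface rather than command strings, but the ordering (storage, then scheduling, then routing) is the point.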
Why Nomad Over Kubernetes
Kubernetes adds ~700 MB of system overhead per node (etcd, kubelet, kube-proxy, CNI, CSI, CRDs). Nomad is a single binary at ~100 MB. For single-replica tenant containers that need neither service discovery nor rolling updates, Nomad is the right tool.
Nomad's task driver model lets the isolation layer evolve without changing anything else in the stack:
| Phase | Driver | Isolation |
|---|---|---|
| Phase 1 (current) | Docker | Linux namespaces + cgroups |
| Phase 2 (planned) | Cloud Hypervisor | VM-level isolation + virtiofs |
| Phase 3 (future) | Firecracker | Minimal attack surface |
Key Design Rules
- Minimal runtime patches — configure via config files + Nomad Variables, not code changes
- Hybrid control — platform manages infra; users manage AI features natively
- No user SSH — all access mediated by Go API and Caddy auth gating
- Tenant isolation — Docker network isolation, per-tenant UID, AES-256-GCM encryption
- No node pinning — tenant data on JuiceFS; Nomad schedules to any available node
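The per-tenant UID rule above can be illustrated with a deterministic allocation sketch. This is an assumption-laden example: the `20000` base offset and the idea of deriving UIDs from a sequential tenant index are hypothetical, since the document does not specify the actual allocation scheme.

```go
package main

import "fmt"

// baseUID is a hypothetical offset above system and human user ranges.
const baseUID = 20000

// tenantUID derives a stable, distinct UID per tenant from a sequential
// index (assumed to come from the platform database). Distinct UIDs mean
// one tenant's process cannot read another tenant's files on JuiceFS.
func tenantUID(tenantIndex int) int {
	return baseUID + tenantIndex
}

func main() {
	fmt.Println(tenantUID(1)) // first tenant's UID
	fmt.Println(tenantUID(2)) // second tenant's UID, guaranteed distinct
}
```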