Docker & Kubernetes

Self-hosted deployments ship as standard cloud-native primitives: Helm charts and Compose files, GPU-aware, with the operational surface a platform team expects.

Deployment options

Option                  When to use
Docker Compose          Single-node pilots, development environments, small teams
Helm on Kubernetes      Production, multi-node, auto-scaling
Operator (K8s)          Advanced: CRD-based lifecycle management
Bare binary + systemd   Minimal-dependency environments

What you’re deploying

StellarBase ships as a set of stateless application services plus a small number of stateful dependencies. Stateless services scale horizontally; stateful components are industry-standard and can either run bundled or be swapped for managed offerings.

The architecture is intentionally boring: cloud-native primitives, a Helm umbrella chart, no exotic runtime. If your team operates Kubernetes, they already know how to operate StellarBase.

Stateful dependencies

Component                        Role
PostgreSQL 14+                   Metadata, config, audit log
Redis 7+                         Queues, rate limits, ephemeral state
Object storage (S3-compatible)   Documents, model weights, backups
Vector index                     Semantic search

All four are replaceable with managed offerings (RDS, ElastiCache, managed MinIO, etc.) or with your existing internal infrastructure. Sizing is workload-dependent — we provide reference values per tier in the Helm chart.

Kubernetes requirements

  • Kubernetes 1.27+
  • Ingress controller (NGINX, Traefik, Istio — your choice)
  • StorageClass for persistent volumes (SSD recommended)
  • NVIDIA GPU Operator for GPU workloads
  • cert-manager for TLS (or your own cert pipeline)
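  • Metrics-server for HPA resource metrics (CPU, memory); custom metrics such as queue depth also require a metrics adapter (for example, Prometheus Adapter)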

Helm chart structure

The deployment is a single umbrella chart with one sub-chart per service. The values file is the primary configuration surface. Common customisations:

  • Replica counts and resource requests / limits
  • GPU selector for inference pods
  • Storage class and sizes
  • Secrets source (K8s secrets, Vault, AWS Secrets Manager)
  • Ingress hostnames and TLS
  • Model choices (which LLMs to install)

We provide opinionated defaults for three tiers (pilot, department, enterprise). Override anything.
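Helm layers user-supplied values over chart defaults: nested maps are merged key by key, while scalars and lists are replaced wholesale. The sketch below mimics that behaviour in Python to show what "override anything" means in practice. The keys (`api`, `inference`, `replicas`, `nodeSelector`) and the `gpu-pool` label are hypothetical illustrations, not the chart's actual schema.

```python
from copy import deepcopy


def merge_values(defaults: dict, overrides: dict) -> dict:
    """Merge overrides into a copy of defaults.

    Nested dicts are merged recursively; any other value
    (scalar or list) replaces the default outright.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical tier defaults; the real chart's keys may differ.
PILOT_DEFAULTS = {
    "api": {
        "replicas": 1,
        "resources": {"requests": {"cpu": "500m", "memory": "1Gi"}},
    },
    "inference": {"replicas": 1, "nodeSelector": {}},
}

# Only the keys you want to change need to appear in the override.
overrides = {
    "api": {"replicas": 3},
    "inference": {"nodeSelector": {"gpu-pool": "inference"}},
}

effective = merge_values(PILOT_DEFAULTS, overrides)
```

In this sketch the override raises the API replica count while the default resource requests survive untouched, which is why a small override file on top of a tier default is usually enough.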

GPU scheduling

Inference pods declare their GPU requirements as resource requests. The NVIDIA GPU Operator exposes GPUs to the cluster, and the Kubernetes scheduler places pods onto nodes with matching capacity. Multi-GPU inference is handled automatically: you size the pool, we manage placement.

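For multi-tenant isolation, use GPU partitioning (MIG on H100 / H200) to assign dedicated slices to different tenants. MIG partitions the GPU in hardware; plain time-slicing shares a GPU without memory or fault isolation, so it is not a substitute.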
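On Kubernetes, a container asks for GPUs through the extended resource `nvidia.com/gpu`; when MIG is exposed with the "mixed" strategy, slices appear as resources such as `nvidia.com/mig-1g.10gb`. A minimal sketch of the container `resources` stanza, built as a Python dict: the helper name is hypothetical, and the available MIG profile names depend on the GPU model and how the cluster is configured.

```python
from typing import Optional


def gpu_resources(count: int = 1, mig_profile: Optional[str] = None) -> dict:
    """Build a container `resources` stanza requesting GPUs.

    Extended resources are set under `limits`; Kubernetes uses the
    same value as the request when no request is given.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if mig_profile:
        # Example profile: "1g.10gb" (an assumption; check what your nodes advertise).
        resource_name = f"nvidia.com/mig-{mig_profile}"
    else:
        resource_name = "nvidia.com/gpu"
    return {"limits": {resource_name: count}}


whole_gpus = gpu_resources(count=2)
tenant_slice = gpu_resources(count=1, mig_profile="1g.10gb")
```

Running `kubectl describe node` on a GPU node lists the resource names it actually advertises, which is the safest way to pick a value.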

Scaling policies

  • Horizontal Pod Autoscaler — stateless services scale on CPU + custom metrics (queue depth, request rate).
  • Vertical Pod Autoscaler — recommended for learning correct resource sizing. Start in recommendation mode, apply after a week.
  • Cluster autoscaler — for cloud K8s (EKS, GKE, AKS), scales nodes up / down with workload.
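To reason about how the Horizontal Pod Autoscaler reacts to a metric such as queue depth, it helps to apply its core rule by hand: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below is a simplification; the real controller also handles multiple metrics, unready pods, and stabilization windows, and the min/max bounds here are illustrative.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA replica calculation for a single metric."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: the HPA leaves the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas, average queue depth 150 per pod against a target of 100.
scaled_up = desired_replicas(4, current_metric=150, target_metric=100)
```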

Networking

Service mesh

Istio or Linkerd is recommended for mTLS between services. We provide PeerAuthentication and AuthorizationPolicy manifests (both are Istio resources).

Ingress

Single ingress per environment. TLS via cert-manager or your own certificates. WAF optional (recommended for internet-exposed deployments).

Egress

Default: block all egress, with an explicit allowlist for required destinations (LLM providers, connector targets). For air-gapped deployments, block all egress with no allowlist.
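One way to express this posture is a Kubernetes NetworkPolicy that selects every pod in the namespace and lists only the permitted destinations. The sketch below builds such a manifest as a Python dict (kubectl accepts JSON as well as YAML). The namespace, policy name, and CIDR are placeholders, and this is not necessarily how the bundled chart implements it. Note that NetworkPolicy matches IP ranges, not hostnames, and that DNS traffic is also blocked unless you allow it.

```python
import json


def egress_policy(namespace: str, allowed_cidrs: list[str], port: int = 443) -> dict:
    """Build a deny-by-default egress NetworkPolicy with an optional allowlist.

    An empty allowlist yields a policy with no egress rules,
    which blocks all outbound traffic (the air-gapped case).
    """
    rules = []
    if allowed_cidrs:
        rules.append(
            {
                "to": [{"ipBlock": {"cidr": cidr}} for cidr in allowed_cidrs],
                "ports": [{"protocol": "TCP", "port": port}],
            }
        )
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector: applies to every pod in the namespace
            "policyTypes": ["Egress"],
            "egress": rules,
        },
    }


# Placeholder namespace and documentation-range CIDR.
air_gapped = egress_policy("stellarbase", [])
allowlisted = egress_policy("stellarbase", ["203.0.113.0/24"])
manifest_json = json.dumps(allowlisted, indent=2)
```

The policy only takes effect if the cluster's network plugin enforces NetworkPolicy; hostname-based allowlists typically need a mesh or CNI-specific feature instead.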

Observability

  • Prometheus metrics on every service
  • OpenTelemetry traces for request-level debugging
  • Structured JSON logs, shippable to Loki / Elasticsearch / your SIEM
  • Grafana dashboards bundled in the Helm chart
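Because logs are structured JSON, a few lines of scripting are enough to slice them before they ever reach a SIEM. The sketch below filters log lines by correlation ID. The field name `correlation_id` and the sample entries are assumptions for illustration; check the actual log schema of your deployment.

```python
import json
from typing import Iterable, Iterator


def filter_logs(
    lines: Iterable[str],
    correlation_id: str,
    field: str = "correlation_id",
) -> Iterator[dict]:
    """Yield parsed log entries whose correlation field matches."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON lines (e.g. startup banners)
        if isinstance(entry, dict) and entry.get(field) == correlation_id:
            yield entry


# Hypothetical sample lines standing in for real service output.
sample = [
    '{"level": "info", "message": "request received", "correlation_id": "abc-123"}',
    "plain text line that is not JSON",
    '{"level": "error", "message": "upstream timeout", "correlation_id": "abc-123"}',
    '{"level": "info", "message": "other request", "correlation_id": "zzz-999"}',
]

matches = list(filter_logs(sample, "abc-123"))
```

The same function works on a stream, for example the lines of `sys.stdin` when piping in the output of `kubectl logs`.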

Backup & restore

Postgres: continuous backup to object storage with point-in-time recovery, tested quarterly.

Object storage: versioning enabled. Cross-region replication optional.

Config: IaC in your Git. The Helm chart + values.yaml is the source of truth.

Upgrades

Rolling updates via helm upgrade. Blue-green option for zero-downtime in production. Database migrations run automatically on startup.

Always upgrade in lower environments first. Read the release notes — minor versions occasionally require a brief read-only window for certain migrations.

Debugging

  • Health endpoints — every service exposes /healthz and /readyz
  • Correlation IDs — track a single request across services via X-Correlation-Id
  • Audit log — operational actions visible in the admin UI
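The health endpoints and the correlation header combine naturally into a small probe script. The sketch below assumes that `/readyz` answers with a 2xx status when the service is ready (a common convention, not confirmed here) and generates a fresh `X-Correlation-Id` when the caller does not supply one. The function names and the example URL are hypothetical.

```python
import urllib.error
import urllib.request
import uuid
from typing import Optional


def correlation_headers(existing: Optional[dict] = None) -> dict:
    """Return request headers that always carry an X-Correlation-Id."""
    headers = dict(existing or {})
    # Keep an incoming ID so the request stays traceable end to end.
    headers.setdefault("X-Correlation-Id", str(uuid.uuid4()))
    return headers


def is_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Probe <base_url>/readyz; treat any 2xx response as ready."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/readyz",
        headers=correlation_headers(),
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors, refused connections, and timeouts.
        return False


propagated = correlation_headers({"X-Correlation-Id": "abc-123"})
generated = correlation_headers()
```

A call such as `is_ready("http://localhost:8080")` (placeholder address) returns a boolean, and the ID sent with the probe can then be searched for in the logs of each service it touched.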

Docker Compose (for pilots)

Simpler for small deployments:

  • Single-host, services as containers
  • Same images as K8s
  • Production-grade but limited to vertical scaling on one machine
  • Good for pilots up to ~50 users
