On-Premise Deployment

The full platform, running in your own infrastructure. Same features, same UX, and same APIs as the managed cloud, but you own the data and the operations.

When to go on-premise

  • Regulatory requirement: data cannot leave your data centre
  • IP sensitivity: process parameters, trade secrets, privileged information
  • Volume economics: per-token pricing becomes expensive past a threshold
  • Integration tightness: you need the platform inside the same network as key internal systems
  • Latency: requests stay on your internal network, keeping network round-trip time to internal users minimal

For zero-internet-egress deployments, see Air-gapped. For a mix of managed + on-prem, see Hybrid.

What ships

Container images

Signed OCI images for the full StellarBase service set, imported into your private registry. See Docker & Kubernetes for the operational shape.

Helm chart

Kubernetes deployment with sensible defaults. Customisable values.yaml. Covers ingress, TLS, persistent volumes, resource limits, autoscaling rules.

Docker Compose (alternative)

For single-node or small multi-node deployments where Kubernetes would be overkill. Production-grade but less scalable.

Models

StellarCloud model weights packaged as signed tarballs. Choose which to install at deploy time: LLMs alone can be ~240 GB, while specialized models are much smaller.

Tooling

Admin CLI, health-check scripts, backup / restore utilities, migration scripts, monitoring setup.

Infrastructure requirements

Compute

Scale                      | Nodes | vCPU | RAM
Pilot (< 50 users)         | 3     | 24   | 96 GB
Department (< 500 users)   | 6     | 96   | 384 GB
Enterprise (< 5,000 users) | 12+   | 200+ | 1 TB+
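
As a rough illustration, the compute table can be turned into a lookup that picks a tier from a user count. This is a minimal sketch, not a StellarBase tool: the names ComputeTier and pick_tier are invented for this example, the vCPU and RAM figures are assumed to be cluster-wide totals, and the Enterprise row is treated as a set of lower bounds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComputeTier:
    name: str
    max_users: int  # exclusive upper bound from the table
    nodes: int      # minimum node count
    vcpu: int       # minimum vCPU (assumed cluster total)
    ram_gb: int     # minimum RAM in GB (assumed cluster total)


# Values copied from the compute table. Enterprise figures are lower
# bounds ("12+", "200+", "1 TB+"); 1 TB is written here as 1024 GB.
TIERS = [
    ComputeTier("Pilot", 50, 3, 24, 96),
    ComputeTier("Department", 500, 6, 96, 384),
    ComputeTier("Enterprise", 5000, 12, 200, 1024),
]


def pick_tier(users: int) -> Optional[ComputeTier]:
    """Return the smallest tier whose user limit exceeds `users`, else None."""
    if users < 0:
        raise ValueError("users must be non-negative")
    for tier in TIERS:
        if users < tier.max_users:
            return tier
    return None  # beyond the table: needs a dedicated sizing exercise


if __name__ == "__main__":
    print(pick_tier(120))
```

A count that lands exactly on a boundary (for example 50 users) falls into the next tier up, because the table states its limits as strict "less than" bounds.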

GPU (for LLMs)

Workload                | Minimum | Recommended
Specialized models only | 1x L4   | 2x L4 / L40S
+ GPT-OSS 120B          | 2x H100 | 4x H100
+ Qwen 3.5 397B         | 8x H100 | 8x H200

You don’t need to host all LLMs — you choose. Some customers host only the specialized models and route LLM calls to StellarCloud (if connected) or commercial providers via StellarGate.
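
The split between locally hosted and routed models could look roughly like the sketch below. Everything in it is an assumption made for illustration: the Route enum, the route_request function, the model identifiers, and the preference order (local first, then StellarCloud when connected, otherwise a commercial provider via StellarGate) are not taken from the StellarBase API.

```python
from enum import Enum


class Route(Enum):
    LOCAL = "local"                # served by GPUs in your own cluster
    STELLARCLOUD = "stellarcloud"  # only available when the deployment is connected
    STELLARGATE = "stellargate"    # commercial provider reached via StellarGate


def route_request(model: str, hosted_locally: set[str], cloud_connected: bool) -> Route:
    """Pick a destination for an inference call (assumed preference order)."""
    if model in hosted_locally:
        return Route.LOCAL
    if cloud_connected:
        return Route.STELLARCLOUD
    return Route.STELLARGATE


if __name__ == "__main__":
    # Hypothetical model identifiers, chosen only for this example.
    hosted = {"specialized-extractor"}
    print(route_request("specialized-extractor", hosted, cloud_connected=True))
    print(route_request("gpt-oss-120b", hosted, cloud_connected=True))
```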

Storage

Component                              | Typical size
Postgres (metadata, config)            | 100 GB (pilot) → 5 TB (enterprise)
Object storage (documents, embeddings) | Depends on corpus; plan ~15% of corpus size
Model weights                          | ~4 GB (specialized only) → 250 GB (with large LLMs)
Logs + audit                           | 100 GB / month for typical workloads
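
The storage figures can be combined into a back-of-the-envelope estimate. The sketch below is only an illustration: the function name and the log retention period are assumptions (the table gives a monthly log volume but no retention), and the 15% and 100 GB / month defaults are the rules of thumb from the table.

```python
def estimate_storage_gb(
    corpus_gb: float,
    postgres_gb: float,
    model_weights_gb: float,
    log_retention_months: int,           # assumption: retention is your own policy
    object_storage_ratio: float = 0.15,  # "~15% of corpus size" from the table
    logs_gb_per_month: float = 100.0,    # "100 GB / month for typical workloads"
) -> dict[str, float]:
    """Return a per-component storage estimate in GB, plus the total."""
    parts = {
        "postgres": postgres_gb,
        "object_storage": corpus_gb * object_storage_ratio,
        "model_weights": model_weights_gb,
        "logs_audit": logs_gb_per_month * log_retention_months,
    }
    parts["total"] = sum(parts.values())
    return parts


if __name__ == "__main__":
    # Example inputs: 2 TB corpus, pilot-sized Postgres, large LLMs, 12 months of logs.
    for name, size in estimate_storage_gb(2000, 100, 250, 12).items():
        print(f"{name}: {size:.0f} GB")
```

With these example inputs the log volume (1,200 GB over a year) outweighs every other component, so the retention period is worth settling early.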

Network

  • Internal network with low-latency connectivity between nodes (10 GbE recommended)
  • TLS-terminating ingress (we provide the config)
  • Egress policy — none required for air-gapped; controlled egress for connected deployments

Identity

Integration with your IdP (SAML, OIDC, LDAP/AD). Local users supported for emergencies.

Deployment process

  1. Planning (Week 1) — capacity sizing, network topology, security review
  2. Infrastructure (Week 2) — your team provisions Kubernetes, Postgres, object storage, GPUs
  3. Install (Week 3) — Helm deployment, configuration, first login
  4. Identity + connectors (Week 3–4) — SSO integration, first data sources
  5. Pilot (Week 4–6) — first users, validation, security review
  6. Go-live (Week 6–8) — full rollout

Typically 6–8 weeks for a straightforward deployment. Regulated or air-gapped deployments add 4–8 weeks for security review and certification.
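
To turn the week numbers into calendar dates for a project plan, something like the following works. It is a sketch under stated assumptions: week 1 is taken to begin on the chosen start date, the function names are invented here, and the regulated / air-gapped add-on is simply added to the overall range because the phase it falls into is not specified.

```python
from datetime import date, timedelta

# (phase, first week, last week) as listed in the deployment process.
PHASES = [
    ("Planning", 1, 1),
    ("Infrastructure", 2, 2),
    ("Install", 3, 3),
    ("Identity + connectors", 3, 4),
    ("Pilot", 4, 6),
    ("Go-live", 6, 8),
]


def phase_dates(start: date) -> dict[str, tuple[date, date]]:
    """Map each phase to (first day, last day), with week 1 starting on `start`."""
    plan = {}
    for name, first_week, last_week in PHASES:
        begin = start + timedelta(weeks=first_week - 1)
        end = start + timedelta(weeks=last_week) - timedelta(days=1)
        plan[name] = (begin, end)
    return plan


def total_weeks(regulated: bool = False) -> tuple[int, int]:
    """Overall duration range in weeks: 6-8, plus 4-8 if regulated or air-gapped."""
    low, high = 6, 8
    if regulated:
        low, high = low + 4, high + 8
    return low, high


if __name__ == "__main__":
    for phase, (begin, end) in phase_dates(date(2025, 1, 6)).items():
        print(f"{phase}: {begin} to {end}")
    print("Regulated total (weeks):", total_weeks(regulated=True))
```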

Updates

Updates ship as signed Helm chart bumps or image updates. Cadence:

  • Security patches — within 14 days of release (critical patches within 48 hours)
  • Minor releases — quarterly
  • Major releases — annually

You control when to apply. Blue-green deployments via Helm provide zero-downtime upgrades. Roll back with a single Helm command.
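
For teams that script their upgrades, the upgrade and rollback steps could be wrapped roughly as below. This is a sketch that only builds and prints the command lines. The release name, chart reference, and namespace are hypothetical placeholders, and it shows plain helm upgrade and helm rollback rather than the chart's blue-green switching, which is not described on this page.

```python
import shlex
from typing import Optional


def helm_upgrade_cmd(release: str, chart: str, namespace: str,
                     values_file: str = "values.yaml") -> list[str]:
    """Build a `helm upgrade` command that waits for resources to become ready."""
    return [
        "helm", "upgrade", release, chart,
        "--namespace", namespace,
        "--values", values_file,
        "--wait",
    ]


def helm_rollback_cmd(release: str, namespace: str,
                      revision: Optional[int] = None) -> list[str]:
    """Build a `helm rollback` command; with no revision Helm uses the previous one."""
    cmd = ["helm", "rollback", release]
    if revision is not None:
        cmd.append(str(revision))
    cmd += ["--namespace", namespace, "--wait"]
    return cmd


if __name__ == "__main__":
    # Hypothetical names; substitute the ones used in your deployment.
    print(shlex.join(helm_upgrade_cmd("stellarbase", "stellarbase/stellarbase", "stellarbase")))
    print(shlex.join(helm_rollback_cmd("stellarbase", "stellarbase")))
```

Passing the resulting list to subprocess.run(cmd, check=True) would execute it. Running helm history for the release first shows which revision a rollback would return to.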

Support model

Named engineers assigned to your deployment. Communication via:

  • Your preferred channel (email, Slack, Teams, phone)
  • Health-check data you share on request
  • Remote support (if permitted) via your controlled channels
  • On-site engineering support for installation + annual review (business tier)

Monitoring

Prometheus / Grafana stack integrated. Dashboards for:

  • Application health (request rate, latency, errors)
  • Ingestion pipeline (throughput, lag, failures)
  • LLM inference (tokens, GPU utilization, queue depth)
  • Storage (disk, backup status)
  • Security (auth failures, policy violations)

Metrics exportable to your existing observability stack (Datadog, New Relic, etc.).

Backup & recovery

  • Postgres: continuous backup to object storage
  • Object storage: versioning + cross-bucket replication (your choice of target)
  • Point-in-time recovery tested quarterly
  • Configuration / IaC in your Git

Licensing

Annual licence based on:

  • Number of users / Bases
  • Models included (some large LLMs have separate licence terms)
  • Support tier

Per-token pricing does not apply to self-hosted deployments: requests are unlimited within your licence.
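
A simple way to judge the volume-economics argument is to compute the yearly token volume at which per-token spend would equal the licence fee. The sketch below uses purely hypothetical prices (they are not StellarBase list prices) and ignores hardware and operating costs, which a real comparison would need to include.

```python
def break_even_tokens(annual_licence_cost: float, price_per_million_tokens: float) -> float:
    """Tokens per year at which per-token spend equals the annual licence cost."""
    if price_per_million_tokens <= 0:
        raise ValueError("price_per_million_tokens must be positive")
    return annual_licence_cost / price_per_million_tokens * 1_000_000


if __name__ == "__main__":
    # Hypothetical figures: a 200,000 licence versus 5.00 per million tokens.
    tokens = break_even_tokens(200_000, 5.0)
    print(f"Break-even at {tokens:,.0f} tokens per year")
```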
