Integration
Patterns for wiring StellarGate into your application. One URL change for the transparent proxy, an SDK for tokenized handoff, a Helm chart for self-hosted.
Mode 1 — Transparent Proxy
The simplest integration. If your app currently uses the OpenAI SDK (or an OpenAI-compatible client) pointed at OpenAI directly, change the base URL to https://gate.stellarbase.ai/v1. Nothing else changes.
Authentication
Use your StellarGate API key in place of your OpenAI key. StellarGate translates to the downstream provider’s auth model behind the scenes.
Supported providers
- OpenAI (GPT-4o, GPT-4, GPT-3.5, embeddings)
- Anthropic (Claude 4, Claude 3.5, Claude 3)
- Google (Gemini 2.5, Gemini 1.5)
- Mistral (Large, Medium, Small)
- Cohere (Command, Embed)
- Any custom OpenAI-compatible endpoint you configure
Request shape
Same as OpenAI’s API. Model name is part of the request body. StellarGate routes to the correct provider based on the model name.
What changes in responses
Nothing. You get the provider’s native response format, with tokens in the content resolved back to original values.
Mode 2 — Tokenized Handoff
Three API calls per interaction. Use the StellarGate SDK (Python or TypeScript) for the anonymize / de-anonymize calls; the middle call (to the LLM) is your normal provider SDK.
Conceptual flow
- Call
gate.anonymize(text)— returnssanitized_text+map_id - Call the LLM with
sanitized_textusing whatever SDK you prefer - Call
gate.deanonymize(llm_response, map_id)— returns the resolved text
Map ID lifecycle
Each anonymize call issues a unique map_id. The mapping is stored in an encrypted vault scoped to your tenant and expires automatically after 24 hours (configurable). De-anonymization after expiry returns an error.
For very long-running workflows, request an extended-lifetime map or persist the mapping yourself (the SDK supports export/import of mappings).
Streaming
For streaming responses, de-anonymization happens on the final assembled text, not token-by-token. If you need token-level streaming with de-anonymization, use Mode 1 (transparent proxy) — which handles the streaming mechanics internally.
Mode 3 — Self-Hosted
Deployment
StellarGate ships as:
- Docker Compose — simplest, for single-node deployments
- Helm chart — production Kubernetes
- Bare binary — for minimal-dependency environments (systemd service)
Images are signed and published to our private registry; air-gapped customers receive them as signed tarballs for import into their own registry.
Required infrastructure
- Postgres — for configuration, audit log, HITL queue. Any recent version (14+).
- Redis — for per-request token vault (ephemeral). Standalone or clustered.
- GPU — for the ML detection engine. Sized for your throughput; reference values per tier in the Helm chart.
- Networking — ingress to your app (TLS), egress to LLM providers (optional — can be zero-egress for local LLMs)
Upgrade path
Blue-green deployments via Helm. Zero-downtime if you have multiple replicas. Database migrations run automatically on startup.
SDK coverage
| Language | SDK | Status |
|---|---|---|
| Python 3.9+ | stellargate on PyPI | Production |
| TypeScript / Node 18+ | @stellarbase/gate on npm | Production |
| Go 1.21+ | github.com/stellarbase/gate-go | Beta |
| Java 17+ | Maven Central | Beta |
| .NET 8+ | NuGet | Beta |
| Rust | crates.io | Experimental |
For anything else, the HTTP API is standard — any HTTP client works. Swagger / OpenAPI spec available on the API reference page.
Configuration
Configuration is YAML for self-hosted, admin UI + API for managed. Common settings:
- Detection policy — strict / balanced / permissive
- Dictionaries — attached via API or UI
- Allowlists
- HITL triggers — per rule
- Per-model policies — different rules for different destinations
- Retention — how long to keep audit logs (default 90 days)
Observability
Prometheus metrics endpoint on all modes. Typical metrics:
- Request rate, anonymization latency, de-anonymization latency
- Detection counts per category
- HITL queue depth, approval rate, rejection rate
- Per-LLM-provider call rate, error rate, latency
OpenTelemetry traces available for detailed debugging.
