Integration

Patterns for wiring StellarGate into your application. One URL change for the transparent proxy, an SDK for tokenized handoff, a Helm chart for self-hosted.

Mode 1 — Transparent Proxy

The simplest integration. If your app currently uses the OpenAI SDK (or an OpenAI-compatible client) pointed at OpenAI directly, change the base URL to https://gate.stellarbase.ai/v1. Nothing else changes.

Authentication

Use your StellarGate API key in place of your OpenAI key. StellarGate translates to the downstream provider’s auth model behind the scenes.

Supported providers

OpenAI (GPT-4o, GPT-4, GPT-3.5, embeddings)
Anthropic (Claude 4, Claude 3.5, Claude 3)
Google (Gemini 2.5, Gemini 1.5)
Mistral (Large, Medium, Small)
Cohere (Command, Embed)
Any custom OpenAI-compatible endpoint you configure

Request shape

Same as OpenAI’s API. Model name is part of the request body. StellarGate routes to the correct provider based on the model name.

What changes in responses

Nothing. You get the provider’s native response format, with tokens in the content resolved back to original values.

Mode 2 — Tokenized Handoff

Three API calls per interaction. Use the StellarGate SDK (Python or TypeScript) for the anonymize / de-anonymize calls; the middle call (to the LLM) is your normal provider SDK.

Conceptual flow

Call gate.anonymize(text) — returns sanitized_text + map_id
Call the LLM with sanitized_text using whatever SDK you prefer
Call gate.deanonymize(llm_response, map_id) — returns the resolved text

Map ID lifecycle

Each anonymize call issues a unique map_id. The mapping is stored in an encrypted vault scoped to your tenant and expires automatically after 24 hours (configurable). De-anonymization after expiry returns an error.

For very long-running workflows, request an extended-lifetime map or persist the mapping yourself (the SDK supports export/import of mappings).

Streaming

For streaming responses, de-anonymization happens on the final assembled text, not token-by-token. If you need token-level streaming with de-anonymization, use Mode 1 (transparent proxy) — which handles the streaming mechanics internally.

Mode 3 — Self-Hosted

Deployment

StellarGate ships as:

Docker Compose — simplest, for single-node deployments
Helm chart — production Kubernetes
Bare binary — for minimal-dependency environments (systemd service)

Images are signed and published to our private registry; air-gapped customers receive them as signed tarballs for import into their own registry.

Required infrastructure

Postgres — for configuration, audit log, HITL queue. Any recent version (14+).
Redis — for per-request token vault (ephemeral). Standalone or clustered.
GPU — for the ML detection engine. Sized for your throughput; reference values per tier in the Helm chart.
Networking — ingress to your app (TLS), egress to LLM providers (optional — can be zero-egress for local LLMs)

Upgrade path

Blue-green deployments via Helm. Zero-downtime if you have multiple replicas. Database migrations run automatically on startup.

SDK coverage

Language	SDK	Status
Python 3.9+	`stellargate` on PyPI	Production
TypeScript / Node 18+	`@stellarbase/gate` on npm	Production
Go 1.21+	`github.com/stellarbase/gate-go`	Beta
Java 17+	Maven Central	Beta
.NET 8+	NuGet	Beta
Rust	crates.io	Experimental

For anything else, the HTTP API is standard — any HTTP client works. Swagger / OpenAPI spec available on the API reference page.

Configuration

Configuration is YAML for self-hosted, admin UI + API for managed. Common settings:

Detection policy — strict / balanced / permissive
Dictionaries — attached via API or UI
Allowlists
HITL triggers — per rule
Per-model policies — different rules for different destinations
Retention — how long to keep audit logs (default 90 days)

Observability

Prometheus metrics endpoint on all modes. Typical metrics:

Request rate, anonymization latency, de-anonymization latency
Detection counts per category
HITL queue depth, approval rate, rejection rate
Per-LLM-provider call rate, error rate, latency

OpenTelemetry traces available for detailed debugging.