API Reference
Endpoint-by-endpoint documentation for StellarCloud. OpenAI-compatible where possible. REST + JSON end-to-end.
Base URL
Managed deployments: https://api.stellarcloud.ai/v1
Self-hosted deployments: https://your-domain/v1
Versioning is path-based. Breaking changes bump to /v2 with a minimum 12-month deprecation window.
Authentication
Bearer token in the Authorization header. API keys are managed in the admin UI with per-key scopes, rate limits, and expiry.
Keys are either organization-wide or Base-scoped. Best practice: one key per service / application, rotated quarterly.
Endpoints overview
Chat & completions
| Endpoint | Purpose |
|---|---|
POST /chat/completions | OpenAI-compatible chat completion |
POST /completions | Legacy text completion (discouraged) |
POST /embeddings | OpenAI-compatible embeddings |
StellarCloud-specific
| Endpoint | Purpose |
|---|---|
POST /ocr | StellarOCR — document to structured output |
POST /detect-language | GlotLID language detection |
POST /ner | Named entity recognition |
POST /entity-linking | Link entities to canonical IDs |
POST /lemmatize | Stanza lemmatization |
POST /rerank | Cross-encoder re-ranking |
POST /image-embeddings | DINOv3 image embeddings |
Platform
| Endpoint | Purpose |
|---|---|
POST /search | Hybrid search over a Base |
POST /sources | Manage data source connectors |
POST /agents | Create / update agents |
POST /workflows/<id>/trigger | Run a workflow |
GET /audit-log | Query the audit log |
StellarGate
| Endpoint | Purpose |
|---|---|
POST /gate/anonymize | Mode 2 — anonymize without forwarding |
POST /gate/deanonymize | Mode 2 — resolve tokens with map_id |
POST /gate/dictionaries | Manage custom dictionaries |
POST /gate/hitl/queue | HITL approval queue |
For Mode 1 (transparent proxy), use POST /chat/completions — StellarGate handles anonymization transparently.
Request / response conventions
- All requests and responses are JSON, UTF-8 encoded
- Timestamps are ISO 8601 in UTC
- IDs are opaque strings, case-sensitive
- Pagination via
cursorparameter;next_cursorin the response - Rate limits communicated via
X-RateLimit-*headers
Errors
Standard HTTP status codes + JSON error body:
| Status | Meaning |
|---|---|
| 400 | Bad request — invalid JSON or parameters |
| 401 | Missing or invalid API key |
| 403 | API key lacks permission for the resource |
| 404 | Resource not found |
| 409 | Conflict (e.g. concurrent modification) |
| 422 | Input failed schema validation |
| 429 | Rate limit exceeded — retry after Retry-After seconds |
| 500 | Internal server error — report to support if persistent |
| 502 / 503 | Upstream provider unavailable |
Error body: code (machine-readable), message (human-readable), request_id (for support correlation).
Rate limits
Defaults:
- 50 requests per second per API key
- 100 concurrent connections per API key
- Daily spending cap configurable per key
Higher limits available on request.
Streaming
Chat completions support Server-Sent Events (SSE) streaming via stream: true. Individual tokens are streamed in standard OpenAI SSE format. Ends with data: [DONE].
Idempotency
Non-idempotent endpoints (POST that creates resources) accept an Idempotency-Key header. If we see the same key within 24 hours, we return the cached response rather than re-executing.
Webhooks
For long-running operations (large OCR jobs, workflow runs), webhooks notify your endpoint on completion. Configure webhook URLs in the admin UI. Payloads are HMAC-signed — verify the signature before trusting.
SDKs
Official SDKs wrap the REST API with language-specific ergonomics:
- Python —
stellarbaseon PyPI - TypeScript / Node —
@stellarbase/sdkon npm - Go, Java, .NET, Rust — beta
All SDKs support the OpenAI compatibility layer — if you’re already using openai-python or openai-node, the minimum change is just the base URL.
OpenAPI spec
Full OpenAPI 3.1 spec available at https://api.stellarcloud.ai/v1/openapi.json. Import into Postman, Insomnia, or any code-generator to scaffold client code.
