StellarCloud Overview

Managed open-source AI inference, EU-hosted, per-token billing. Every model StellarBase uses internally is available as a standalone API for your own applications.

What StellarCloud is

StellarCloud is a collection of production-grade AI models served over HTTP APIs. The same endpoints that power StellarBase internally are exposed for your own use — so you get the exact inference stack we trust for regulated work, without having to deploy it.

Key properties:

Open-weights stack — leading open-source models for generation, embeddings, vision, and NER, plus our composite StellarOCR
EU-hosted — Frankfurt, Amsterdam, Prague. No transatlantic data transfers.
Zero retention — your prompts are not logged, not stored, not used to train anything
OpenAI-compatible — drop-in replacement for apps that use the OpenAI SDK
EUR billing — predictable per-token pricing, no hidden surcharges

Model families

Text processing

Language detection, lemmatization, named entity recognition, entity linking. Cheap, fast, multilingual. Detailed in Specialized Models.

Document processing

StellarOCR — one endpoint for text + layout + tables + figures + math from any PDF or image.

Embeddings & retrieval

Multilingual text embeddings (fast and high-recall tiers, plus a long-context variant), image embeddings, and rerankers.

Large language models

Open-weights LLMs for generation, reasoning, and coding — small, mid, and frontier tiers. Detailed in LLMs.

Smart routing

For the LLM endpoint, smart routing selects the cheapest / fastest / highest-quality model matching your constraints. You don’t have to pick a model by name unless you want to.

How it’s used

From StellarBase

Automatically. When your agent needs an embedding, or your workflow needs an LLM, StellarBase calls StellarCloud under the hood. No configuration required beyond the platform itself.

From your own apps

HTTP API, with a Python and TypeScript SDK. The chat endpoint is OpenAI-compatible (same request/response format), so existing OpenAI-based apps work by changing one URL.

Via StellarGate

If you want to use StellarCloud with PII anonymization — for GDPR-critical workloads — route your calls through StellarGate. Same API surface, privacy proxy in the middle.

Billing

Pay-as-you-go in EUR. Every API has a per-unit price (per million tokens, per thousand pages, per thousand images, per thousand searches). No minimums, no tiers — you pay for what you use.

See API pricing for the full breakdown.

Limits

Limit	Default	Max on request
Requests per second	50	500+
Concurrent connections	100	1,000+
Max prompt length	256K tokens	model-dependent
Max response length	32K tokens	model-dependent
File upload (OCR)	100 MB / file	1 GB on request

Reliability

99.7% SLA on the business-critical tier. Public status page. Incident history open. Multi-region failover for eligible endpoints. See Managed EU Cloud for operational details.

Self-hosted

The same models that power StellarCloud can run inside your own infrastructure. You pay an annual licence instead of per-token, get unlimited requests, and keep every byte in-network. See On-Premise.

Specialized Models — the full model catalog
LLMs — generative models in detail
Smart Routing — automatic model selection
API Reference — endpoint-by-endpoint docs