Open-source AI models,
production-ready.
We host and serve the best open-source AI models — language detection, OCR, NER, embeddings, rerankers, and LLMs. Simple API, transparent per-token pricing, EU infrastructure.
Purpose-built. State of the art. Hosted for you.
Small, fast, accurate models for NLP and vision tasks. Curated and tuned by us — you just call the API.
Language Detection
Identify 1,000+ languages with high accuracy — including rare and low-resource ones.
GlotLID v3
Lemmatization
Neural lemmatizer reducing words to base forms across 60+ languages.
Stanza
Named Entity Recognition
Zero-shot NER — detect any entity type (person, org, amount, custom) without fine-tuning.
GLiNER-relex large
Entity Linking
Link detected entities to knowledge bases and canonical identifiers (Wikidata, custom KB).
GLiNER Linker large
StellarOCR
One endpoint · structured output
Upload a PDF or image. StellarOCR detects structure, extracts body text, handles complex tables, and preserves math formulas, returning clean, structured output. No stitching, no orchestration.
Headers, paragraphs, columns, figures, reading order — preserved.
Printed, handwritten, multi-language. Fast and layout-aware.
Dedicated high-fidelity extraction for tables and difficult scans.
Inline and display equations converted to clean LaTeX.
Text Embeddings — Small
Fast multilingual dense embeddings for semantic search and similarity. 1024-dim.
Qwen3-Embedding 0.6B
Text Embeddings — Large
Higher-quality multilingual embeddings for retrieval where recall matters most.
Qwen3-Embedding 8B
Text Embeddings — Long Context
8K context multilingual embeddings. Ideal for whole-document retrieval.
BGE-M3
Image Embeddings
Visual embeddings for image search and cross-modal retrieval. Self-supervised.
DINOv3 ViT-L
Reranking
Cross-encoder re-ranking of search results. Dramatic quality boost over bi-encoders.
Qwen3-Reranker 0.6B
The best open-weights LLMs, served on our GPUs
We run them on our infrastructure so you don't have to. Full generative capability without sending your data to US model providers.
GPT-OSS 120B
Apache 2.0
Open-weights LLM by OpenAI. Instruction-tuned, broad capabilities.
Devstral 2
Apache 2.0
Mistral coding model. Strong on software tasks, tool use, agentic workflows.
Qwen 3.5 397B A17B
Apache 2.0
Alibaba MoE flagship. Multilingual reasoning, long context, frontier-tier quality.
Ship in under 5 minutes
No complex setup. Sign up, generate keys, integrate. We handle model serving, scaling, and uptime.
Sign up
Create your account, choose a plan.
Create an API key
Scoped tokens with per-model limits.
Call any model
OpenAI-compatible endpoints.
Pay per token
Transparent usage, invoiced monthly.
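As a rough illustration of the "Call any model" step, here is a Python sketch that builds the same request as the curl command on this page, using only the standard library. The function names `build_chat_request` and `chat` are hypothetical, and the response parsing assumes the usual OpenAI chat-completions shape, which the page implies ("OpenAI-compatible") but does not spell out.

```python
import json
import os
import urllib.request

# Endpoint and model id are taken from the curl example on this page.
API_URL = "https://api.stellarcloud.ai/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str,
                       model: str = "gpt-oss-120b") -> urllib.request.Request:
    """Build the POST request for a single-turn chat, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the request and return the reply text.

    Assumption: the response body follows the OpenAI chat-completions
    layout, i.e. choices[0].message.content holds the reply.
    """
    request = build_chat_request(os.environ["STELLARCLOUD_KEY"], prompt)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With `STELLARCLOUD_KEY` set in the environment, `chat("Hello")` would send the request. Because the endpoints are described as OpenAI-compatible, an existing OpenAI client library pointed at this base URL may also work, though that is not confirmed here.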
curl https://api.stellarcloud.ai/v1/chat/completions \
  -H "Authorization: Bearer $STELLARCLOUD_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
Inference in Europe. Governed by Europe.
All models run on GPUs in EU data centers. Your prompts and responses never leave the region. No US CLOUD Act exposure. GDPR-compliant by design.
Your data stays in the EU. No foreign government can subpoena it via extraterritorial laws.
GDPR, EU Data Act, DORA — built in. No SCCs needed for EU data transfers.
Sub-100ms to your EU customers. No transatlantic round-trip.
Pay per million tokens. No minimums.
Transparent usage-based pricing. Only pay for what you use. Pre-paid credits or monthly invoicing for teams.
Per-token billing
Simple rates per million input/output tokens. Different rates per model.
Free tier included
Every account gets free monthly credits to test all models before committing.
Real-time usage dashboard
Track spend per model, per key, per team member. Set hard limits and alerts.
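To make per-million-token billing concrete, the sketch below estimates the cost of a request from its token counts. The rates in the usage note are invented placeholders, not actual prices, and `estimate_cost` is a hypothetical helper. Real rates differ per model.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: str, output_rate: str) -> Decimal:
    """Estimate spend given rates quoted per million tokens.

    Rates are passed as strings (e.g. "0.50") so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_rate)
    output_cost = Decimal(output_tokens) * Decimal(output_rate)
    return (input_cost + output_cost) / TOKENS_PER_MILLION
```

For example, with placeholder rates of 0.50 per million input tokens and 1.50 per million output tokens, `estimate_cost(250_000, 50_000, "0.50", "1.50")` comes to 0.125 + 0.075 = 0.20 in the billing currency.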
Open-source models, zero hassle.
Create an account, generate an API key, and ship. Free credits on signup.
