Search & Discovery

Natural-language retrieval across every connected source. Hybrid (semantic + keyword + structured), cross-lingual, permission-aware, citation-bound.

What “search” covers

Three layered capabilities, all exposed through the same UI and API:

Retrieval — given a query, return ranked passages from your corpus
Question answering — given a question, return an answer with citations to source passages
Discovery — exploration without a specific query: clusters, recent activity, related content

Hybrid retrieval

Three retrieval signals combine in every query:

Signal	What it captures
Semantic	Meaning — “low-cost airlines” matches “budget carriers”
Keyword	Exact terms — names, IDs, technical jargon
Structured	Metadata, entity, date, author filters

Pure semantic search can miss exact-match queries (a specific contract ID, a specific gene name). Pure keyword search misses paraphrases. Hybrid retrieval covers both failure modes.
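One common way to combine several ranked lists is reciprocal rank fusion (RRF). The sketch below is illustrative only: this page does not say which fusion method the platform uses, and the function name, document IDs, and the constant k=60 are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a
    documented platform setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-signal rankings for one query.
semantic = ["doc_budget_carriers", "doc_airline_fees", "doc_contract_4471"]
keyword = ["doc_contract_4471", "doc_airline_fees"]
structured = ["doc_contract_4471", "doc_budget_carriers"]

fused = reciprocal_rank_fusion([semantic, keyword, structured])
print(fused)
```

In this toy run the contract document ranks first because two signals put it at the top, even though the semantic signal alone ranked it last.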

Cross-lingual

A query in Czech finds relevant German, French, and Polish results. Multilingual embeddings put content from every language into a shared vector space. See Multilingual.
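To make the shared vector space concrete, here is a minimal sketch of ranking by cosine similarity. The vectors are hand-written toy values, not output from any real embedding model, and the document IDs are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (assumed): a Czech query about employment contracts,
# a German document on the same topic, and an unrelated French document.
query_cs = [0.9, 0.1, 0.0]
docs = {
    "de-employment-contract": [0.8, 0.2, 0.1],
    "fr-cooking-recipe": [0.0, 0.1, 0.9],
}

# Rank documents by similarity to the query, regardless of language.
ranked = sorted(docs, key=lambda d: cosine(query_cs, docs[d]), reverse=True)
print(ranked)
```

Because similarity is computed on vectors rather than words, the language of the query and of the document does not enter the comparison.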

Permission-aware

Search respects every layer of permissions. Documents a user can’t read won’t appear in results — even if they’re semantically perfect matches. This is enforced at the index layer, not as a post-filter, so there’s no information leak about what exists.
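The difference between index-layer enforcement and post-filtering can be sketched as follows. This is a simplified illustration, not the platform's implementation: the `IndexedDoc` type, the group-based ACL model, and the term-count scoring are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    text: str
    allowed_groups: frozenset  # ACL stored with the document in the index (assumed model)


def search(index, query_terms, user_groups, top_k=10):
    """Score only the documents the user may read.

    The permission check is part of candidate selection, so unreadable
    documents never receive a score, a rank, or a place in a result count.
    """
    results = []
    for doc in index:
        if not (doc.allowed_groups & user_groups):
            continue  # skipped before scoring, not filtered out afterwards
        score = sum(doc.text.lower().count(t.lower()) for t in query_terms)
        if score > 0:
            results.append((score, doc.doc_id))
    results.sort(key=lambda r: (-r[0], r[1]))
    return [doc_id for _, doc_id in results[:top_k]]


index = [
    IndexedDoc("public-faq", "merger timeline and FAQ", frozenset({"all-staff"})),
    IndexedDoc("board-memo", "confidential merger merger terms", frozenset({"board"})),
]

staff_hits = search(index, ["merger"], frozenset({"all-staff"}))
board_hits = search(index, ["merger"], frozenset({"board", "all-staff"}))
print(staff_hits, board_hits)
```

A post-filter would instead rank everything and then drop forbidden hits, which can reveal that hidden documents exist through gaps in pagination or mismatched result counts.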

Re-ranking

After initial retrieval, a reranker re-scores the candidates for relevance, which typically yields a more precise top 10 than embedding similarity alone. Reranking is applied automatically to agent and chat queries and is configurable for direct API access.
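The two-stage pattern can be sketched as below. The `term_overlap` function is a toy stand-in for a real reranker model, and all names here are hypothetical.

```python
def rerank(query, candidates, score_fn, top_k=10):
    """Re-score first-stage candidates with a more precise relevance function.

    candidates is a list of (doc_id, text) pairs from initial retrieval.
    """
    scored = [(score_fn(query, text), doc_id) for doc_id, text in candidates]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by ID
    return [doc_id for _, doc_id in scored[:top_k]]


def term_overlap(query, text):
    """Toy scorer: fraction of query terms that appear in the passage."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q)


# Candidates in first-stage order; p1 is topically close but not a direct match.
candidates = [
    ("p1", "airline pricing overview"),
    ("p2", "budget carriers cut baggage fees"),
    ("p3", "baggage fees at budget carriers in 2024"),
]

top = rerank("budget carriers baggage fees", candidates, term_overlap, top_k=2)
print(top)
```

The first stage stays cheap because it only has to produce a candidate pool; the more expensive scorer runs on that small pool rather than on the whole corpus.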

Filters

Combine semantic + structured filters in one query:

Source: only documents from a specific connector or collection
Date: published after / before / between
Language: only documents in specific languages
Entity: documents mentioning a specific person / organization / project
Type: PDFs only, emails only, etc.
Custom metadata: any field you’ve added during ingest
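As a rough illustration of how these filters compose, the sketch below models them as one object whose active fields must all pass. The class, field names, connector names, and document shape are hypothetical, and matching any listed entity (rather than all of them) is an assumption; the API Reference defines the actual query syntax.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class SearchFilters:
    """Hypothetical filter set; an empty field means 'no restriction'."""
    sources: set = field(default_factory=set)
    languages: set = field(default_factory=set)
    doc_types: set = field(default_factory=set)
    entities: set = field(default_factory=set)
    published_after: Optional[date] = None
    published_before: Optional[date] = None
    metadata: dict = field(default_factory=dict)

    def matches(self, doc: dict) -> bool:
        """Return True only if the document passes every active filter."""
        if self.sources and doc["source"] not in self.sources:
            return False
        if self.languages and doc["language"] not in self.languages:
            return False
        if self.doc_types and doc["type"] not in self.doc_types:
            return False
        if self.entities and not (self.entities & set(doc["entities"])):
            return False
        if self.published_after and doc["published"] <= self.published_after:
            return False
        if self.published_before and doc["published"] >= self.published_before:
            return False
        doc_meta = doc.get("metadata", {})
        return all(doc_meta.get(k) == v for k, v in self.metadata.items())


docs = [
    {"id": "a", "source": "connector-a", "language": "de", "type": "pdf",
     "entities": ["Acme GmbH"], "published": date(2024, 5, 2),
     "metadata": {"matter": "M-17"}},
    {"id": "b", "source": "connector-a", "language": "fr", "type": "email",
     "entities": ["Acme GmbH"], "published": date(2024, 6, 1),
     "metadata": {"matter": "M-17"}},
    {"id": "c", "source": "connector-b", "language": "de", "type": "pdf",
     "entities": ["Other Ltd"], "published": date(2023, 1, 15),
     "metadata": {}},
]

filters = SearchFilters(
    languages={"de", "fr"},
    doc_types={"pdf"},
    entities={"Acme GmbH"},
    published_after=date(2024, 1, 1),
    metadata={"matter": "M-17"},
)
matching_ids = [d["id"] for d in docs if filters.matches(d)]
print(matching_ids)
```

In a real query, the semantic part would rank whatever survives these structured constraints.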

Question answering

For natural-language questions, search retrieves relevant passages, then an LLM synthesizes an answer grounded in those passages. Every answer cites its sources; click any citation to jump to the source passage. If no relevant passages exist, the system says so rather than hallucinating.

QA is the default behaviour in chat. For direct retrieval (just give me passages, don’t synthesize), there’s a separate retrieval mode.
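The retrieve-then-synthesize flow, including the "no relevant passages" case, can be sketched like this. The retriever and synthesizer are stubs, and the `min_score` relevance threshold is an assumption rather than a documented setting.

```python
def answer_question(question, retrieve, synthesize, min_score=0.5):
    """Answer only from retrieved passages; decline when none are relevant."""
    passages = [p for p in retrieve(question) if p["score"] >= min_score]
    if not passages:
        # Nothing to ground an answer in: say so instead of generating one.
        return {"answer": None, "citations": []}
    answer = synthesize(question, passages)
    return {"answer": answer, "citations": [p["doc_id"] for p in passages]}


def fake_retrieve(question):
    """Stub retriever over a one-passage corpus (hypothetical data)."""
    corpus = {
        "notice period": [
            {"doc_id": "contract-12#p4",
             "text": "The notice period is 90 days.",
             "score": 0.91},
        ],
    }
    return [p for key, ps in corpus.items() if key in question.lower() for p in ps]


def fake_synthesize(question, passages):
    """Stub for the LLM step: simply joins the passage texts."""
    return " ".join(p["text"] for p in passages)


grounded = answer_question("What is the notice period?", fake_retrieve, fake_synthesize)
ungrounded = answer_question("Who won the 1998 final?", fake_retrieve, fake_synthesize)
print(grounded)
print(ungrounded)
```

Retrieval mode corresponds to stopping after the first step and returning the passages themselves.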

Discovery views

Beyond explicit search, the platform offers exploratory views:

Recent — what’s been added or modified recently
Trending — what your team has been searching / accessing
Related — given a document, what else is conceptually adjacent
Topic clusters — auto-grouped documents by theme
Citation graph — for academic / legal corpora, the network of who-cites-whom

Saved searches

Save a query for one-click re-execution. Combined with notifications, a saved search alerts you when new documents match it, which is useful for monitoring incoming literature, tracking a regulatory topic, or watching a counterparty.
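Conceptually, alerting means running each saved query against newly ingested documents only. A minimal sketch, assuming a simple all-terms-present matcher in place of real hybrid retrieval; the function names and sample data are hypothetical.

```python
def find_alerts(saved_searches, new_docs, matches):
    """Map each saved-search name to the IDs of new documents that match it."""
    alerts = {}
    for name, query in saved_searches.items():
        hits = [doc["doc_id"] for doc in new_docs if matches(query, doc)]
        if hits:  # only searches with at least one new match produce an alert
            alerts[name] = hits
    return alerts


def all_terms_present(query, doc):
    """Toy matcher: every query term must occur in the document text."""
    text = doc["text"].lower()
    return all(term in text for term in query.lower().split())


saved = {
    "mdr-updates": "medical device regulation",
    "counterparty-acme": "acme",
}
new_docs = [
    {"doc_id": "n1", "text": "Draft guidance on the Medical Device Regulation"},
    {"doc_id": "n2", "text": "Quarterly cafeteria menu"},
]

alerts = find_alerts(saved, new_docs, all_terms_present)
print(alerts)
```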

Search via API

Programmatic access for integration with other systems. The API is standard REST and returns ranked results with provenance. Confidence scores are not yet included in responses; they are scheduled to ship in Q3 2026. See API Reference.

Performance

Operation	Typical latency
Hybrid retrieval (top-10 from < 1M docs)	< 200 ms
Hybrid retrieval (top-10 from 100M docs)	< 600 ms
Question answering (with synthesis)	1–5 s, depending on model

Common patterns

Find supporting evidence for a claim

Pose the claim as a question. The QA flow returns the strongest supporting passages with citations.

Find counter-evidence

Negation often confuses naive search. Use the structured filter contradicts:<claim> for explicit counter-evidence retrieval.

Track a topic over time

Save the query and enable notifications to receive a weekly digest of new matches.

Compare two perspectives

Run the same query against two different collections (legal team’s notes vs. opposing counsel’s submissions, your protocol vs. a published guideline) and view side-by-side.
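The side-by-side pattern amounts to running one query against each collection and pairing the results by rank. A minimal sketch with a toy substring search; the function names and sample passages are hypothetical.

```python
from itertools import zip_longest


def compare_collections(query, search, left, right, top_k=3):
    """Run one query against two collections and pair the results by rank.

    Rows are (left_hit, right_hit); the shorter side is padded with None.
    """
    left_hits = search(query, left)[:top_k]
    right_hits = search(query, right)[:top_k]
    return list(zip_longest(left_hits, right_hits))


def toy_search(query, collection):
    """Toy search: keep passages containing any query term, in stored order."""
    terms = query.lower().split()
    return [p for p in collection if any(t in p.lower() for t in terms)]


our_protocol = ["Dose is adjusted weekly.", "Follow-up at 6 months."]
guideline = ["Dose adjustment every two weeks is recommended."]

rows = compare_collections("dose follow-up", toy_search, our_protocol, guideline)
for left_hit, right_hit in rows:
    print(left_hit, "|", right_hit)
```

A row with None on one side shows at a glance where one collection addresses a point that the other does not.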

See also

DSM — the entity graph that powers structured queries
Multilingual — cross-language details
Chat — conversational search
API Reference — programmatic access