Conversational knowledge base · Hybrid GraphRAG · Multi-tenant

The hybrid RAG engine that answers right, fast, at any scale.

Vector + full-text + knowledge-graph retrieval on a 100% Rust stack. Every number below is measured on the live platform — replayable benchmark reports included.

Datasets benchmarked

Proven on real-world corpora — not just public QA.

Every category below was ingested and evaluated end-to-end on the live platform.

4 000+
benchmarks passed
🧪4,409
Public QA

MMLongBench & multi-hop QA — fact-checked independently, zero hallucination.

💰12/12
Finance & Private Equity

Fund prospectuses, fees, ISIN codes, vintages — exact figures, exhaustive lists.

⚖️10/10
Legal codes

Civil/insurance codes (FR + DZ), 20,000-line manuals — article-anchored, amendment-aware.

🛠️7/7
Enterprise support / SAV

After-sales & support documents — procedures, warranties, ticket knowledge bases.

📦1,515 p.
Client catalogues

Full product catalogues (100 MB+), batched OCR + image vision search — specs and product images.

🌍FR · EN · AR
Multilingual

First-class Arabic (RTL, broken-plural recall) alongside French and English — same precision.

100%
factual precision
4,409 QA public benchmarks — independently fact-checked, zero hallucination
265 ms
first results
real-time SSE streaming
500+
concurrent queries
p95 < 2 s on a single instance
25 MB
per microservice
100% Rust — cold start < 50 ms

p50 167 ms under 50 concurrent queries · ×3 faster than a Python stack · 66 unit tests green · TTFB 9 ms

Why UltraRAG

Built for precision. Engineered for load.

Vectors (Qdrant) + full-text (PostgreSQL) + dedicated entity channel + knowledge graph (Neo4j). Deterministic RRF fusion, BM25 rerank, entity boost.

Vector · Qdrant 1024-d
Full-text · PostgreSQL BM25
Entity channel · typed NER
Knowledge graph · Neo4j Cypher

→ RRF fusion + entity boost + MMR diversity

DOCUMENTSPDFDOCXJSONchunkCHUNKSQdrant1024-d vecPostgresfull-textKNOWLEDGE GRAPH — Neo4jSeven2ORGCEOPERSONFundFUNDParisPLACEISINCFOM&AHQLEADSOWNSNeo4jentity graphRRF FusionBM25 + MMR + boostQuerySSE streamANSWERsourced &verifiedLLM RouterMistral/GPT/…
Vector · QdrantFull-text · PostgresEntities · Neo4jRRF Fusion → Answer

Domain-tuned · Multilingual

Tuned for 5 domains — and tunable for yours

“Tuned” is not a setting — it’s specialized retrieval algorithms that change how the engine ranks, anchors and traces evidence for each kind of corpus. Pick a profile and the pipeline re-optimizes itself. Need a new vertical (medical, insurance, after-sales…)? We tune a dedicated profile for it.

§

Legal

Article anchoring, amendment & temporal tracing (created → modified → abrogated), in-force-version bias.

Enterprise docs

Entity & knowledge-graph emphasis — companies, people, funds, relationships always consulted.

Timeline

Chronology-aware retrieval, "as it stands now" bias, version history across document updates.

📦

Catalogue

Product corpora — exact-match boost + reranking; batched OCR for 100 MB+ catalogues; product images via vision search.

Generic

Balanced hybrid defaults — the query planner adapts the strategy per question.

Languages — full retrieval + answer quality

Françaisdefault
Englishnative
العربية — ArabicRTL · morphology

Arabic is first-class: correct stemming & normalization (hamza, tashkeel), a dedicated morphology channel for broken plurals, right-to-left rendering, and the same node & temporal tracing as French. Postgres FTS covers 8+ more analyzers (English, Spanish, German, Italian, Portuguese, Russian…).

3
languages fully tuned
FR · EN · AR — more on request

Per-workspace control

Define each corpus — no code, no re-deploy

Every workspace is an isolated corpus with its own parameters, config and metadata. Set them from the console or one API call. All optional — leave blank for sensible defaults. Domain knowledge lives in the config, never hardcoded in the engine.

  • Theme profile — the retrieval algorithm preset
  • FTS language — the corpus analyzer (incl. Arabic)
  • Predefined entities — domain terms you know matter
  • Summary hint — for count / statistics questions
  • Isolated, JWT-scoped per organization & client
Advanced — corpus options (optional)
Theme / Profile
Legal — articles + amendment/temporal tracing
switches the retrieval algorithm
Full-text search language
العربية — Arabic
per-corpus analyzer · set before ingest
Predefined entities
المادة, القانون المدني, Seven2
always boosted + highlighted
Summary hint
Répartition par … · statistiques
powers "how many" answers

What it does

An answer engine, not a search box

Hybrid 4-channel retrieval

Vector + full-text BM25 + typed entity channel + knowledge graph, fused by deterministic RRF — recall a single method misses.

Explainable answers

Every answer is sourced, with clickable inline [N] citations that open the exact PDF page or highlighted section. A live graph trace shows which entities answered.

Temporal & amendment tracing

Tracks a fact across versions — created, modified, abrogated — and answers "as it stood in 1985" vs "in force today".

Verified abstention

The engine decomposes its answer into claims, scores each against the sources, and says "insufficient evidence" rather than invent.

Vision search — real images

Ask for “the shoe cabinet with a continuous front” and get the product photo itself, linked from the page that describes it.

Interactive reading (/read)

Document on the left, chat on the right: the passages used light up live while the answer streams; click a citation to jump to the section.

Big-document ingestion

100 MB+ / 1,500-page catalogues via batched OCR; PDF, DOCX, PPTX, XLSX, images. Idempotent re-ingest by content hash.

Cost transparency

Per-query cost in dollars (embedding + LLM in/out), priced automatically from the configured model — plus a simulator to estimate a corpus before ingesting.

Multilingual — incl. Arabic

French, English and Arabic fully tuned (RTL, morphology), plus 8+ full-text analyzers. Same quality across languages.

Domain profiles

Legal, enterprise, timeline and catalogue presets re-optimize the algorithm per corpus — and we tune new verticals on request.

Resilient ingestion

Crash-safe job queue, orphan recovery, heartbeats, bounded retries with backoff, circuit breaker, and true cancellation of a running job.

Any model, every stage

LLM, embeddings, reranker, OCR and vision are each swappable — OpenAI, Mistral, Gemini, or fully local via Ollama/vLLM. One env file.

Under the hood · World-class

The S+ engine — the techniques the best RAG systems are built on

Every advanced retrieval method that defines the state of the art is built in — each one optional, each one measured. Turn on what a corpus needs; the benchmarked path stays untouched until you do.

ColBERT

Late-interaction reranking

Token-level MaxSim (ColBERT) — the precision technique behind the leading commercial engines, as an optional local sidecar.

GraphRAG

Graph-native retrieval

Entity-anchored multi-hop (local) and community-scale synthesis (global) over the knowledge graph — relational answers vectors miss.

RAPTOR

Hierarchical summaries

RAPTOR-style section summaries indexed beside the leaves: synthesis questions read the overview, detail questions read the source.

Self-RAG

Self-correcting retrieval

Self-RAG-style recovery: when evidence is weak the engine reformulates and retries once before it answers — instead of guessing.

Agentic

Agentic decomposition

Multi-part and comparative questions are split into sub-questions, each retrieved on its own, then merged into one cited answer.

/extract

Structured extraction

A JSON schema in, exact field values out — each with the source passage it came from, and an honest “not found” when it isn’t there.

Tested & verifiedAltaroc 12/12 · SPF 10/10 · +11 pts vs EdgeQuake on MMLongBench · a zero-regression eval gate guards every release.

Built for high-stakes domains

Where precision is non-negotiable

Legal & regulatory

Codes, contracts, jurisprudence. Article anchoring, amendment history, "in-force" reasoning — answers a lawyer can cite.

Finance

Funds, fees, ISINs, counterparty and portfolio Q&A. Exhaustive lists, deterministic counts, zero hallucination.

Enterprise knowledge

Policies, HR, procurement, board minutes across thousands of documents — with entities, relations and cross-references.

Catalogues & after-sales

Full product catalogues (100 MB+) and support documents — specs, warranties, procedures, and the product images themselves.

Public sector

Sovereign, on-premise, multilingual. Citizen and agent assistance behind your firewall, with full auditability.

Ingestion Pipeline

📄
IngestPDF DOCX XLSX…
Chunksection-aware
Embed1024-d · Redis
🕸
NERFUND ORG PERSON
IndexQdrant + Neo4j
Readyinstant search
PDFDOCXPPTXXLSXCSVHTMLJSONJSONLMD

Universal ingestion

Universal ingestion

PDF, DOCX, PPTX, XLSX, CSV, HTML, JSON, JSONL (one document per line) and Markdown. Section-aware chunking, concurrent embedding batches (×4 throughput), LLM entity extraction with rule-based fallback.

×4
throughput batching
<50ms
cold start
8MB
binary size
14MB
idle RAM

Multi-tenant · Conversational

Conversational & multi-tenant

Persistent conversations, automatic follow-up condensation, per-conversation memory, isolated workspaces per organization and client.

  • ✓ Conversations persistées par utilisateur
  • ✓ Condensation automatique des questions de suivi
  • ✓ Workspaces isolés par organisation & client
  • ✓ JWT auth · SSE streaming token par token
  • ✓ API OpenAI-compatible
workspace: code-civile-dz-001
Que prévoit l'article 41 du code civil algérien ?

L'article 41 dispose que…

265ms · 3 sources · score 0.94

📎 art41.pdf · §2📎 code-civil-2023.pdf · §8📎 jurisprudence.pdf · §1
Ask a question…Send

Providers — swap with one env var

OpenAI
Mistral
DeepSeek
Qwen
Cohere
Jina
Voyage
HuggingFace
Ollama / local
LLM_PROVIDER=mistral → zero code changes

No vendor lock-in

No vendor lock-in

LLM, embeddings, NER and reranker are each independently swappable: OpenAI, Mistral, DeepSeek, Qwen, Kimi, Cohere, Jina, Voyage, HuggingFace — or fully local with Ollama/TEI. One env file, zero code changes.

White-label · Sovereign · No lock-in

Your brand. Your infrastructure. Your data stays home.

UltraRAG ships as a platform you operate, not a service you depend on. Deploy it wherever your data lives, put your own brand on it, and swap any model with one env var. Nothing leaves your perimeter.

Your cloud (VPC)
Runs in your AWS / Azure / GCP / OVH account.
On-premise
Your servers, your network, your control.
Air-gapped
Fully offline with local models (Ollama / TEI).
Managed
We host it — sovereign EU infrastructure.

White-label & OEM

  • Your name, logo, colors and domain
  • Embeddable console + API under your product
  • Resell to your own customers as multi-tenant
  • Custom domain profiles tuned for your vertical

No vendor lock-in

  • Any LLM / embeddings / OCR — OpenAI, Mistral, or fully local
  • Bring your own keys; zero data retention
  • Standard stores: PostgreSQL · Qdrant · Neo4j · Redis
  • One .env — no code changes

Security & sovereignty

JWT authWorkspace isolationFail-closed multi-tenant modeData residencyZero data retentionBring-your-own-modelAudit-readyGDPR / AI-Act aligned

Architecture

6 Rust microservices, 4 specialized stores

PostgreSQL (full-text) · Qdrant (1024-d vectors) · Neo4j (entity & relation graph) · Redis (queues + cache) — observed by Prometheus + Grafana + MLflow

Rust microservices — the hot path

api-gateway
Orchestration, SSE, conversations, JWT auth
:5150
retrieval-engine
Hybrid fusion: RRF + BM25 + graph + MMR
:5151
llm-router
Multi-provider LLM (OpenAI-compatible), token-by-token streaming
:5153
embedding-service
1024-d embeddings, Redis cache (<1 ms hit)
:5154
graph-service
Typed NER (FUND/ORG/PERSON…), Neo4j Cypher
:5155
ingestion-worker
File pipeline ×N replicas, Redis queue
×2 daemon

AI sidecars & apps

reranker (TEI)
Cross-encoder rerank — Deep mode
:5158
tei-embed
Local embeddings (bge-m3) — offline option
:5159
frontend
Chat & retrieval console (Next.js)
:5058
admin-ui
Workspace & ingestion API
:8188

Data stores

PostgreSQL
Full-text (BM25) + metadata + chunks
:5433
Qdrant
Vector store — 1024-d
:6333
Neo4j
Entity & relation graph
:7474
Redis
Ingestion queue + embedding cache
:6380

Observability

Grafana
Dashboards — latency, cost, throughput
:5056
Prometheus
Metrics scrape
:9090
MLflow
Experiment & benchmark tracking
:5062

Proven reliability

Zero hallucination. Every answer is sourced.

Replayable benchmarks

  • Finance suite (funds, fees, ISIN): 12/12
  • 20,000-line document suite: 10/10
  • Fact-checking independent of the LLM judge
  • MMLongBench head-to-head: beats EdgeQuake on its own 123-question set (+11 pts)

Deterministic answers

  • Same question ⇒ same answer (fixed seed)
  • Exhaustive lists: 15/15 items, 8 runs out of 8
  • “Not in corpus” instead of a hallucination

Under real load

  • 50 concurrent: p50 167 ms
  • 500 concurrent: p95 < 2 s
  • 0.3% CPU / 54 MB RAM on the gateway

Engagement

Premium, by design

A platform license — not metered tokens. Start with a measured pilot, scale to production, own it end-to-end.

Pilot

Fixed fee
2-week evaluation
  • Your documents, your benchmarks
  • Replayable accuracy report
  • Hosted or your cloud
  • Go / no-go in two weeks
Start a pilot
Most chosen

Production

Subscription
Managed or your cloud
  • SLA + support
  • Multi-tenant workspaces
  • Monitoring (Grafana/MLflow)
  • Model & profile tuning
Talk to us

Enterprise · White-label

Custom
On-prem / air-gapped / OEM
  • Your brand & domain
  • On-premise or air-gapped
  • Custom vertical profiles
  • Source escrow on request
Request a quote

Deployment

One command. Full platform.

$ ./start_rust_stack.sh

✓ postgres · redis · qdrant · neo4j        healthy
✓ 6 Rust microservices                     healthy
✓ reranker · tei-embed                     healthy
✓ frontend · admin · vitrine               ready
✓ grafana · prometheus · mlflow            monitoring

→ App        http://localhost:5058
→ API        http://localhost:5150
→ Dashboards http://localhost:5056   (Grafana)
→ MLflow     http://localhost:5062
→ Metrics    http://localhost:9090   (Prometheus)

~25 MB Docker images per service. Runs on one server or a cluster. On-premise, private cloud or SaaS — your data stays home.

Investors · Buyers · Testers

Precision is the product. Come measure it yourself.

Live demo on your own documents, replayable benchmarks, open technical due diligence. License, white-label or acquisition — let’s talk.

contact@syctra.comDemo on your own documents · replayable benchmarks · open due diligence.

Reply within 24–48h · contact@syctra.com