Conversational knowledge base · Hybrid GraphRAG · Multi-tenant

The hybrid RAG engine that answers right, fast, at any scale.

Vector + full-text + knowledge-graph retrieval on a 100% Rust stack. Every number below is measured on the live platform — replayable benchmark reports included.

Request a demo → · Demos → · See capabilities →

Datasets benchmarked

Proven on real-world corpora — not just public QA.

Every category below was ingested and evaluated end-to-end on the live platform.

4,000+

benchmarks passed

🧪4,409

Public QA

MMLongBench & multi-hop QA — fact-checked independently, zero hallucination.

💰12/12

Finance & Private Equity

Fund prospectuses, fees, ISIN codes, vintages — exact figures, exhaustive lists.

⚖️10/10

Legal codes

Civil/insurance codes (FR + DZ), 20,000-line manuals — article-anchored, amendment-aware.

🛠️7/7

Enterprise support / after-sales

After-sales & support documents — procedures, warranties, ticket knowledge bases.

📦1,515 p.

Client catalogues

Full product catalogues (100 MB+), batched OCR + image vision search — specs and product images.

🌍FR · EN · AR

Multilingual

First-class Arabic (RTL, broken-plural recall) alongside French and English — same precision.

100%

factual precision

4,409 public QA benchmarks, independently fact-checked, zero hallucination

265 ms

first results

real-time SSE streaming

500+

concurrent queries

p95 < 2 s on a single instance

25 MB

per microservice (Docker image)

100% Rust — cold start < 50 ms

p50 167 ms under 50 concurrent queries · ×3 faster than a Python stack · 66 unit tests green · TTFB 9 ms

Why UltraRAG

Built for precision. Engineered for load.

Vectors (Qdrant) + full-text (PostgreSQL) + dedicated entity channel + knowledge graph (Neo4j). Deterministic RRF fusion, BM25 rerank, entity boost.

Vector · Qdrant 1024-d

Full-text · PostgreSQL BM25

Entity channel · typed NER

Knowledge graph · Neo4j Cypher

→ RRF fusion + entity boost + MMR diversity

Vector · Qdrant | Full-text · Postgres | Entities · Neo4j → RRF Fusion → Answer
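The pipeline summary above ends with MMR diversity. As a rough illustration of what maximal marginal relevance does after fusion, here is a minimal sketch. The function names, the `lam` trade-off weight and the toy vectors are assumptions made for this example and do not describe the engine's internals.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr(query_vec, candidates, k, lam=0.5):
    """Pick k ids from `candidates` (id -> vector), trading relevance for novelty.

    lam=1.0 is pure relevance; lower values penalize near-duplicates harder.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(cid):
            relevance = cosine(query_vec, remaining[cid])
            # Similarity to the closest passage already selected.
            redundancy = max(
                (cosine(remaining[cid], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        # Iterating in sorted id order keeps ties deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With two near-identical passages and one different passage, the second pick goes to the different one even though the near-duplicate is slightly more relevant.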

Domain-tuned · Multilingual

Tuned for 5 domains — and tunable for yours

“Tuned” is not a setting: each profile applies specialized retrieval algorithms that change how the engine ranks, anchors and traces evidence for that kind of corpus. Pick a profile and the pipeline re-optimizes itself. Need a new vertical (medical, insurance, after-sales…)? We tune a dedicated profile for it.

Legal

Article anchoring, amendment & temporal tracing (created → modified → abrogated), in-force-version bias.

⛓

Enterprise docs

Entity & knowledge-graph emphasis — companies, people, funds, relationships always consulted.

◷

Timeline

Chronology-aware retrieval, "as it stands now" bias, version history across document updates.

📦

Catalogue

Product corpora — exact-match boost + reranking; batched OCR for 100 MB+ catalogues; product images via vision search.

◆

Generic

Balanced hybrid defaults — the query planner adapts the strategy per question.

Languages — full retrieval + answer quality

Français · default

English · native

العربية (Arabic) · RTL · morphology

Arabic is first-class: correct stemming & normalization (hamza, tashkeel), a dedicated morphology channel for broken plurals, right-to-left rendering, and the same node & temporal tracing as French. Postgres FTS covers 8+ more analyzers (English, Spanish, German, Italian, Portuguese, Russian…).
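To make the normalization step concrete, here is a minimal sketch of two common Arabic normalizations: stripping tashkeel and folding hamza-carrying alef forms to a bare alef. The exact rules the engine applies are not stated here, so the character sets and the function name `normalize_arabic` are assumptions for illustration.

```python
import re

# Tashkeel marks (U+064B..U+0652) plus tatweel (U+0640), the elongation character.
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")

# Alef with madda, hamza above, or hamza below -> bare alef (U+0627).
_ALEF_FORMS = str.maketrans({
    "\u0622": "\u0627",
    "\u0623": "\u0627",
    "\u0625": "\u0627",
})

def normalize_arabic(text: str) -> str:
    """Remove diacritics, then fold alef variants, so spelling variants match."""
    return _DIACRITICS.sub("", text).translate(_ALEF_FORMS)
```

Applying the same function to both the indexed text and the query is what lets a vocalized form and its bare spelling hit the same terms.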

3 languages fully tuned

FR · EN · AR — more on request

Per-workspace control

Define each corpus — no code, no re-deploy

Every workspace is an isolated corpus with its own parameters, config and metadata. Set them from the console or one API call. All optional — leave blank for sensible defaults. Domain knowledge lives in the config, never hardcoded in the engine.

✓ Theme profile — the retrieval algorithm preset
✓ FTS language — the corpus analyzer (incl. Arabic)
✓ Predefined entities — domain terms you know matter
✓ Summary hint — for count / statistics questions
✓ Isolated, JWT-scoped per organization & client
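The "all optional, sensible defaults" behaviour can be pictured as a small config object merged with fallbacks. This is only a sketch: the field names, the `DEFAULTS` values and the `effective_config` helper are hypothetical, chosen to mirror the four options listed above, and are not the platform's actual schema.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional

@dataclass
class WorkspaceConfig:
    """Per-workspace corpus options; every field may be left unset."""
    theme_profile: Optional[str] = None      # retrieval algorithm preset
    fts_language: Optional[str] = None       # full-text analyzer for the corpus
    predefined_entities: List[str] = field(default_factory=list)
    summary_hint: Optional[str] = None       # hint for count / statistics questions

# Assumed fallbacks for this example only.
DEFAULTS = {"theme_profile": "generic", "fts_language": "french"}

def effective_config(cfg: WorkspaceConfig) -> dict:
    """Fill blank fields with defaults; explicit values always win."""
    merged = asdict(cfg)
    for key, fallback in DEFAULTS.items():
        if not merged[key]:
            merged[key] = fallback
    return merged
```

Keeping domain knowledge in a structure like this, rather than in engine code, is what allows a corpus to be redefined without a re-deploy.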

Advanced — corpus options (optional)

Theme / Profile

Legal: articles + amendment/temporal tracing

switches the retrieval algorithm

Full-text search language

العربية (Arabic)

per-corpus analyzer · set before ingest

Predefined entities

المادة, القانون المدني, Seven2

always boosted + highlighted

Summary hint

Répartition par … · statistiques

powers "how many" answers

What it does

An answer engine, not a search box

Hybrid 4-channel retrieval

Vector + full-text BM25 + typed entity channel + knowledge graph, fused by deterministic RRF — recall a single method misses.
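Reciprocal rank fusion is simple enough to show in full. The sketch below fuses ranked id lists from several channels and applies an optional additive entity boost. The constant `k=60` comes from the original RRF paper. The channel names, the boost mechanism and the tie-break rule are assumptions for illustration, not the engine's actual parameters.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60, boosts=None):
    """Fuse per-channel rankings (channel -> ordered list of ids) with RRF.

    Each channel contributes 1 / (k + rank) per document, with rank starting at 1.
    `boosts` optionally adds a fixed bonus to ids that matched a known entity.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    for doc_id, bonus in (boosts or {}).items():
        if doc_id in scores:  # never boost a document no channel retrieved
            scores[doc_id] += bonus
    # Sort by score, then by id, so equal scores always come out in the same order.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because RRF uses only ranks, never raw scores, channels with incomparable score scales (cosine similarity, BM25, graph hits) can be combined without calibration, and the result is reproducible run to run.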

Explainable answers

Every answer is sourced, with clickable inline [N] citations that open the exact PDF page or highlighted section. A live graph trace shows which entities answered.

Temporal & amendment tracing

Tracks a fact across versions — created, modified, abrogated — and answers "as it stood in 1985" vs "in force today".
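One way to picture "as it stood" versus "in force today" is a list of versions with validity intervals. The sketch below is an assumption about the data shape, not the engine's storage model. `Version`, `as_of` and the dates in the usage note are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class Version:
    text: str
    valid_from: date                 # date the version was created or modified
    valid_to: Optional[date] = None  # None means still in force

def as_of(versions: List[Version], when: date) -> Optional[Version]:
    """Return the version in force on `when`, or None if there was none."""
    for v in versions:
        if v.valid_from <= when and (v.valid_to is None or when < v.valid_to):
            return v
    return None
```

For an article created in 1975 and modified in 2005, a query dated 1985 resolves to the original text and a query dated today resolves to the amended one. An abrogated article, whose last version has an end date and no successor, resolves to `None`.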

Verified abstention

The engine decomposes its answer into claims, scores each claim against the sources, and says "insufficient evidence" rather than inventing an answer.
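The claim-by-claim check can be sketched with a deliberately crude scorer. Here each sentence of a draft answer counts as one claim, and support is the share of its words found in the best-matching source passage. Both the sentence split and the lexical overlap score are stand-ins chosen for this example. How the engine actually decomposes and scores claims is not described here.

```python
import re

def _tokens(text):
    """Lowercased word tokens as a set."""
    return set(re.findall(r"\w+", text.lower()))

def support(claim, sources):
    """Best fraction of the claim's tokens found in any single source passage."""
    claim_toks = _tokens(claim)
    if not claim_toks:
        return 0.0
    return max(
        (len(claim_toks & _tokens(s)) / len(claim_toks) for s in sources),
        default=0.0,
    )

def answer_or_abstain(draft, sources, threshold=0.8):
    """Return the draft only if every claim clears the support threshold."""
    claims = [c.strip() for c in re.split(r"(?<=[.!?])\s+", draft) if c.strip()]
    if claims and all(support(c, sources) >= threshold for c in claims):
        return draft
    return "insufficient evidence"
```

The design point is that one unsupported claim is enough to withhold the whole answer: abstaining is preferred to a partially grounded response.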

Vision search — real images

Ask for “the shoe cabinet with a continuous front” and get the product photo itself, linked from the page that describes it.

Interactive reading (/read)

Document on the left, chat on the right: the passages used light up live while the answer streams; click a citation to jump to the section.

Big-document ingestion

100 MB+ / 1,500-page catalogues via batched OCR; PDF, DOCX, PPTX, XLSX, images. Idempotent re-ingest by content hash.
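Idempotent re-ingest by content hash can be sketched as a registry keyed on a digest of the file bytes. SHA-256 and the in-memory dictionary are assumptions for this example, since the source only says "content hash". A real deployment would persist the registry.

```python
import hashlib

class IngestRegistry:
    """Tracks which file contents have already been submitted for ingestion."""

    def __init__(self):
        self._seen = {}  # digest -> first filename seen with that content

    def submit(self, filename: str, data: bytes):
        """Queue new content; skip bytes that were already ingested."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._seen:
            return ("skipped", digest)
        self._seen[digest] = filename
        return ("queued", digest)
```

Keying on content rather than filename means a renamed copy is skipped, while an edited file under the same name is ingested again.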

Cost transparency

Per-query cost in dollars (embedding + LLM in/out), priced automatically from the configured model — plus a simulator to estimate a corpus before ingesting.
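The per-query figure is plain arithmetic over three token counts. The sketch below assumes prices are quoted per million tokens. The `Pricing` fields and the numbers in the usage note are placeholders, not real model prices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pricing:
    """Dollar prices per one million tokens for the configured models."""
    embed_per_mtok: float
    llm_in_per_mtok: float
    llm_out_per_mtok: float

def query_cost(p: Pricing, embed_tokens: int, in_tokens: int, out_tokens: int) -> float:
    """Cost of one query in dollars: embedding + LLM input + LLM output."""
    total = (
        embed_tokens * p.embed_per_mtok
        + in_tokens * p.llm_in_per_mtok
        + out_tokens * p.llm_out_per_mtok
    )
    return total / 1_000_000
```

With placeholder prices of 0.10, 1.00 and 3.00 dollars per million tokens, a query that embeds 20 tokens, sends 3,000 prompt tokens and receives 400 completion tokens costs (2 + 3,000 + 1,200) / 1,000,000 = 0.004202 dollars. The same function, summed over an estimated token count for a corpus, gives a pre-ingestion estimate.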

Multilingual — incl. Arabic

French, English and Arabic fully tuned (RTL, morphology), plus 8+ full-text analyzers. Same quality across languages.

Domain profiles

Legal, enterprise, timeline and catalogue presets re-optimize the algorithm per corpus — and we tune new verticals on request.

Resilient ingestion

Crash-safe job queue, orphan recovery, heartbeats, bounded retries with backoff, circuit breaker, and true cancellation of a running job.
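Two of the mechanisms named above, bounded retries with backoff and a circuit breaker, fit in a few lines. This is a minimal sketch under simplifying assumptions: exponential backoff with no jitter, and a breaker that opens after a fixed number of consecutive failures and never half-opens. Names and default values are illustrative.

```python
import time

class CircuitOpen(Exception):
    """Raised when the breaker refuses to call a failing dependency."""

class CircuitBreaker:
    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0  # consecutive failures so far

    def call(self, fn):
        if self.failures >= self.max_failures:
            raise CircuitOpen("too many consecutive failures")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success closes the circuit again
        return result

def run_with_retries(fn, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # retries are bounded: give up and surface the error
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting.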

Any model, every stage

LLM, embeddings, reranker, OCR and vision are each swappable — OpenAI, Mistral, Gemini, or fully local via Ollama/vLLM. One env file.

Under the hood · World-class

The S+ engine — the techniques the best RAG systems are built on

Every advanced retrieval method that defines the state of the art is built in — each one optional, each one measured. Turn on what a corpus needs; the benchmarked path stays untouched until you do.

ColBERT

Late-interaction reranking

Token-level MaxSim (ColBERT) — the precision technique behind the leading commercial engines, as an optional local sidecar.
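The MaxSim score itself is compact: for each query token embedding, take its best dot product against any document token embedding, then sum those maxima. The sketch below uses plain Python lists and assumes every document has at least one token vector. A production sidecar would use batched tensor operations instead.

```python
def maxsim(query_tokens, doc_tokens):
    """Late-interaction score: sum over query tokens of the best-matching doc token."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

def rerank(query_tokens, docs):
    """Order doc ids (id -> list of token vectors) by descending MaxSim."""
    return sorted(
        docs,
        key=lambda doc_id: (-maxsim(query_tokens, docs[doc_id]), doc_id),
    )
```

Unlike a single-vector comparison, a document scores highly only if it has a good match for each query token, which is what gives late interaction its precision on multi-term queries.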

GraphRAG

Graph-native retrieval

Entity-anchored multi-hop (local) and community-scale synthesis (global) over the knowledge graph — relational answers vectors miss.

RAPTOR

Hierarchical summaries

RAPTOR-style section summaries indexed beside the leaves: synthesis questions read the overview, detail questions read the source.

Self-RAG

Self-correcting retrieval

Self-RAG-style recovery: when evidence is weak the engine reformulates and retries once before it answers — instead of guessing.
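The retry-once control flow can be sketched independently of any model. In the example below, `retrieve` and `reformulate` are caller-supplied functions (in practice a retriever and an LLM rewrite step), and `min_score` is an assumed threshold for "weak evidence". None of these names come from the platform.

```python
def retrieve_with_retry(question, retrieve, reformulate, min_score=0.5):
    """Retrieve once; if the best hit is weak, reformulate and retry exactly once.

    `retrieve` maps a question to a list of (passage_id, score) pairs.
    Returns the question actually used and its hits.
    """
    hits = retrieve(question)
    if hits and max(score for _, score in hits) >= min_score:
        return question, hits
    retry_question = reformulate(question)
    # Single retry only: whatever comes back is passed on as-is, even if still weak.
    return retry_question, retrieve(retry_question)
```

Capping the loop at one retry bounds latency and cost. If the second pass is still weak, the caller can abstain rather than guess.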

Agentic

Agentic decomposition

Multi-part and comparative questions are split into sub-questions, each retrieved on its own, then merged into one cited answer.

/extract

Structured extraction

A JSON schema in, exact field values out — each with the source passage it came from, and an honest “not found” when it isn’t there.

Tested & verified: Altaroc 12/12 · SPF 10/10 · +11 pts vs EdgeQuake on MMLongBench · a zero-regression eval gate guards every release.

Built for high-stakes domains

Where precision is non-negotiable

Legal & regulatory

Codes, contracts, jurisprudence. Article anchoring, amendment history, "in-force" reasoning — answers a lawyer can cite.

Finance

Funds, fees, ISINs, counterparty and portfolio Q&A. Exhaustive lists, deterministic counts, zero hallucination.

Enterprise knowledge

Policies, HR, procurement, board minutes across thousands of documents — with entities, relations and cross-references.

Catalogues & after-sales

Full product catalogues (100 MB+) and support documents — specs, warranties, procedures, and the product images themselves.

Public sector

Sovereign, on-premise, multilingual. Citizen and agent assistance behind your firewall, with full auditability.

Ingestion Pipeline

📄 Ingest · PDF, DOCX, XLSX…

→ ✂ Chunk · section-aware

→ ⟡ Embed · 1024-d · Redis

→ 🕸 NER · FUND, ORG, PERSON

→ ⚡ Index · Qdrant + Neo4j

→ ✓ Ready · instant search

PDF · DOCX · PPTX · XLSX · CSV · HTML · JSON · JSONL · MD

Universal ingestion

PDF, DOCX, PPTX, XLSX, CSV, HTML, JSON, JSONL (one document per line) and Markdown. Section-aware chunking, concurrent embedding batches (×4 throughput), LLM entity extraction with rule-based fallback.
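Concurrent embedding batches can be sketched with a thread pool: split the chunks into batches, embed several batches at once, and keep the output in input order. The batch size, the worker count and the `embed_batch` callable (standing in for a call to an embedding service) are assumptions. The sketch does not claim to reproduce the measured throughput gain.

```python
from concurrent.futures import ThreadPoolExecutor

def batched(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def embed_all(chunks, embed_batch, batch_size=32, workers=4):
    """Embed every chunk, running up to `workers` batches concurrently.

    `embed_batch` takes a list of texts and returns one vector per text.
    """
    batches = batched(chunks, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so vectors line up with chunks.
        results = pool.map(embed_batch, batches)
        return [vec for batch in results for vec in batch]
```

Threads suit this step because the time is spent waiting on the embedding backend, not computing in the caller.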

×4

throughput batching

<50ms

cold start

8MB

binary size

14MB

idle RAM

Multi-tenant · Conversational

Conversational & multi-tenant

Persistent conversations, automatic follow-up condensation, per-conversation memory, isolated workspaces per organization and client.

✓ Persistent conversations per user
✓ Automatic condensation of follow-up questions
✓ Isolated workspaces per organization & client
✓ JWT auth · token-by-token SSE streaming
✓ API OpenAI-compatible

workspace: code-civile-dz-001

What does Article 41 of the Algerian Civil Code provide?

⚡

Article 41 provides that…

265ms · 3 sources · score 0.94

📎 art41.pdf · §2 | 📎 code-civil-2023.pdf · §8 | 📎 jurisprudence.pdf · §1


Providers — swap with one env var

OpenAI

Mistral

DeepSeek

Qwen

Cohere

Jina

Voyage

HuggingFace

Ollama / local

LLM_PROVIDER=mistral → zero code changes

No vendor lock-in

LLM, embeddings, NER and reranker are each independently swappable: OpenAI, Mistral, DeepSeek, Qwen, Kimi, Cohere, Jina, Voyage, HuggingFace — or fully local with Ollama/TEI. One env file, zero code changes.
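Switching providers through one environment variable usually comes down to a lookup at startup. The sketch below reads `LLM_PROVIDER` (the variable shown above) and validates it against a known set. The set contents, the `openai` default and the function name are assumptions for this example.

```python
import os

# Provider names accepted by this example; the real list may differ.
SUPPORTED = {"openai", "mistral", "deepseek", "qwen", "cohere", "ollama"}

def select_provider(env=None, default="openai"):
    """Resolve the LLM provider from the environment, failing fast on typos."""
    env = os.environ if env is None else env
    name = env.get("LLM_PROVIDER", default).strip().lower()
    if name not in SUPPORTED:
        raise ValueError(f"unsupported LLM_PROVIDER: {name!r}")
    return name
```

Resolving the name once at startup and rejecting unknown values keeps a misconfigured env file from surfacing as an obscure failure mid-request.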

White-label · Sovereign · No lock-in

Your brand. Your infrastructure. Your data stays home.

UltraRAG ships as a platform you operate, not a service you depend on. Deploy it wherever your data lives, put your own brand on it, and swap any model with one env var. Nothing leaves your perimeter.

Your cloud (VPC)

Runs in your AWS / Azure / GCP / OVH account.

On-premise

Your servers, your network, your control.

Air-gapped

Fully offline with local models (Ollama / TEI).

Managed

We host it — sovereign EU infrastructure.

White-label & OEM

✓ Your name, logo, colors and domain
✓ Embeddable console + API under your product
✓ Resell to your own customers as multi-tenant
✓ Custom domain profiles tuned for your vertical

No vendor lock-in

✓ Any LLM / embeddings / OCR — OpenAI, Mistral, or fully local
✓ Bring your own keys; zero data retention
✓ Standard stores: PostgreSQL · Qdrant · Neo4j · Redis
✓ One .env — no code changes

Security & sovereignty

JWT auth · Workspace isolation · Fail-closed multi-tenant mode · Data residency · Zero data retention · Bring-your-own-model · Audit-ready · GDPR / AI-Act aligned

Architecture

6 Rust microservices, 4 specialized stores

PostgreSQL (full-text) · Qdrant (1024-d vectors) · Neo4j (entity & relation graph) · Redis (queues + cache) — observed by Prometheus + Grafana + MLflow

Rust microservices — the hot path

api-gateway

Orchestration, SSE, conversations, JWT auth

:5150

retrieval-engine

Hybrid fusion: RRF + BM25 + graph + MMR

:5151

llm-router

Multi-provider LLM (OpenAI-compatible), token-by-token streaming

:5153

embedding-service

1024-d embeddings, Redis cache (<1 ms hit)

:5154

graph-service

Typed NER (FUND/ORG/PERSON…), Neo4j Cypher

:5155

ingestion-worker

File pipeline ×N replicas, Redis queue

×2 daemon

AI sidecars & apps

reranker (TEI)

Cross-encoder rerank — Deep mode

:5158

tei-embed

Local embeddings (bge-m3) — offline option

:5159

frontend

Chat & retrieval console (Next.js)

:5058

admin-ui

Workspace & ingestion API

:8188

Data stores

PostgreSQL

Full-text (BM25) + metadata + chunks

:5433

Qdrant

Vector store — 1024-d

:6333

Neo4j

Entity & relation graph

:7474

Redis

Ingestion queue + embedding cache

:6380

Observability

Grafana

Dashboards — latency, cost, throughput

:5056

Prometheus

Metrics scrape

:9090

MLflow

Experiment & benchmark tracking

:5062

Proven reliability

Zero hallucination. Every answer is sourced.

Replayable benchmarks

✓ Finance suite (funds, fees, ISIN): 12/12
✓ 20,000-line document suite: 10/10
✓ Fact-checking independent of the LLM judge
✓ MMLongBench head-to-head: beats EdgeQuake on its own 123-question set (+11 pts)

Deterministic answers

✓ Same question ⇒ same answer (fixed seed)
✓ Exhaustive lists: 15/15 items, 8 runs out of 8
✓ “Not in corpus” instead of a hallucination

Under real load

✓ 50 concurrent: p50 167 ms
✓ 500 concurrent: p95 < 2 s
✓ 0.3% CPU / 54 MB RAM on the gateway
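For readers replaying the load figures, p50 and p95 can be computed from raw latency samples with the nearest-rank method. This is one common definition among several. Which definition the benchmark reports use is not stated here, so treat the choice as an assumption.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it.

    `samples` must be non-empty and `p` must be in the range 1..100.
    """
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

A p95 under 2 s means that 95% of the measured queries finished in under two seconds. The slowest 5% are not bounded by that figure.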

Engagement

Premium, by design

A platform license — not metered tokens. Start with a measured pilot, scale to production, own it end-to-end.

Pilot

Fixed fee

2-week evaluation

✓ Your documents, your benchmarks
✓ Replayable accuracy report
✓ Hosted or your cloud
✓ Go / no-go in two weeks

Start a pilot

Most chosen

Production

Subscription

Managed or your cloud

✓ SLA + support
✓ Multi-tenant workspaces
✓ Monitoring (Grafana/MLflow)
✓ Model & profile tuning

Talk to us

Enterprise · White-label

Custom

On-prem / air-gapped / OEM

✓ Your brand & domain
✓ On-premise or air-gapped
✓ Custom vertical profiles
✓ Source escrow on request

Request a quote

Deployment

One command. Full platform.

$ ./start_rust_stack.sh

✓ postgres · redis · qdrant · neo4j        healthy
✓ 6 Rust microservices                     healthy
✓ reranker · tei-embed                     healthy
✓ frontend · admin · vitrine               ready
✓ grafana · prometheus · mlflow            monitoring

→ App        http://localhost:5058
→ API        http://localhost:5150
→ Dashboards http://localhost:5056   (Grafana)
→ MLflow     http://localhost:5062
→ Metrics    http://localhost:9090   (Prometheus)

~25 MB Docker images per service. Runs on one server or a cluster. On-premise, private cloud or SaaS — your data stays home.

Investors · Buyers · Testers

Precision is the product. Come measure it yourself.

Live demo on your own documents, replayable benchmarks, open technical due diligence. License, white-label or acquisition — let’s talk.

✉ contact@syctra.com · Demo on your own documents · replayable benchmarks · open due diligence.