Architecture
You don’t need to know how Cortex is built to use it — but understanding the moving parts makes its behavior (especially around real-time events and delivery guarantees) far easier to reason about.
Cortex is a backend platform. You interact with it through a single REST API; behind that API a set of specialized workers do the heavy lifting.
The big picture
The control plane: the API
A single, stateless, horizontally scalable NestJS REST API is the only thing you talk to. It handles authentication and multi-tenant resource management (sessions, messages, gatherings, plugins, functions, webhooks, sanctions), publishes domain events to the internal event bus, and serves Server-Sent Events for live updates.
Everything the dashboard can do, you can do through this API — the dashboard itself is just a client of it (built on the TypeScript SDK).
The worker fleet
Workers subscribe to the event bus and each owns one responsibility. They run as multiple replicas and coordinate with distributed locks so that no event is processed twice.
| Worker | What it does |
|---|---|
| Bot Worker | Joins an ODIN room as a headless peer, captures per-speaker audio, runs voice-activity detection, transcribes each utterance (OpenAI Whisper or a local engine) and posts messages back to the API. |
| Plugin Worker | Runs installed plugins against matching events (e.g. toxicity analysis on message.created) and stores the resulting annotations. |
| Webhook Worker | Drains a durable PostgreSQL outbox and delivers webhooks to your endpoints with retries and HMAC signing. |
| Functions Invoke Worker | A reverse proxy that routes HTTP invocations and event deliveries to your serverless functions running on ODIN Fleet. |
The bot receives an individual audio stream per peer, not a mixed-down room. That’s what makes accurate speaker attribution possible: every utterance is tied to the peer that produced it.
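The Webhook Worker row in the table above mentions HMAC signing. On the receiving side, verification could look roughly like the sketch below. It assumes a hex-encoded HMAC-SHA256 over the raw request body; the actual algorithm, encoding, and the way the signature is transmitted are not stated here, and `signPayload` and `verifySignature` are hypothetical helper names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: the signature is a hex-encoded HMAC-SHA256 of the raw body.
function signPayload(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Recomputes the signature and compares it in constant time.
function verifySignature(secret: string, rawBody: string, received: string): boolean {
  const expected = Buffer.from(signPayload(secret, rawBody), "hex");
  const actual = Buffer.from(received, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== actual.length) return false;
  return timingSafeEqual(expected, actual);
}
```

Verify against the raw bytes of the request body rather than a re-serialized JSON object, since re-serialization can change whitespace or key order and invalidate the signature.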
The data plane
- PostgreSQL stores all durable state: sessions, messages, participants, gatherings, sanctions, webhook subscriptions, the webhook outbox and more. Every row is scoped by tenant and (almost always) project. See Multi-Tenancy.
- Redis is the internal event bus (Pub/Sub) and provides the distributed locks that keep worker replicas from double-processing.
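To make the locking idea concrete, here is a minimal sketch of how replicas might claim an event before processing it. It is not Cortex's implementation: `InMemoryLockStore` and `processOnce` are hypothetical names, and the in-memory map stands in for Redis, where the equivalent acquire step would typically be a `SET key owner NX PX ttl` command.

```typescript
// In-memory stand-in for a lock store (hypothetical, for illustration only).
class InMemoryLockStore {
  private locks = new Map<string, { owner: string; expiresAt: number }>();

  // Succeeds only if no unexpired lock exists for the key.
  acquire(key: string, owner: string, ttlMs: number, now: number = Date.now()): boolean {
    const current = this.locks.get(key);
    if (current && current.expiresAt > now) return false;
    this.locks.set(key, { owner, expiresAt: now + ttlMs });
    return true;
  }

  // Only the owner may release, so one replica cannot drop another's lock.
  release(key: string, owner: string): boolean {
    const current = this.locks.get(key);
    if (!current || current.owner !== owner) return false;
    this.locks.delete(key);
    return true;
  }
}

// Each replica tries to claim the event; only the winner runs the handler.
// The lock is left to expire so that later attempts within the TTL are skipped.
function processOnce(
  store: InMemoryLockStore,
  replica: string,
  eventId: string,
  handler: (id: string) => void,
): boolean {
  if (!store.acquire(`lock:event:${eventId}`, replica, 30000)) return false;
  handler(eventId);
  return true;
}

const store = new InMemoryLockStore();
const handled: string[] = [];
const results = ["replica-a", "replica-b", "replica-c"].map((replica) =>
  processOnce(store, replica, "evt_1", (id) => {
    handled.push(`${replica}:${id}`);
  }),
);
console.log(results, handled);
```

In this run only the first replica acquires the lock and handles the event; the other two see the existing lock and skip it.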
From a spoken word to your code
The end-to-end path of a single utterance:
- Someone speaks in the ODIN room.
- The Bot Worker detects the utterance, transcribes it, and POSTs a message to the API.
- The API stores the message and publishes a message.created event on the bus.
- In parallel, every interested consumer reacts:
  - The Plugin Worker runs toxicity/summarization plugins and stores annotations (which themselves emit *.annotation.created events).
  - The Webhook Worker writes an outbox row for each matching subscription and delivers it to your URL, retrying on failure.
  - The Functions Invoke Worker delivers the event to any function subscribed to it.
  - SSE streams the event to connected dashboards.
This is why the same change reaches you through several surfaces — webhooks, SSE and functions are all driven by the same domain-event catalog.
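The fan-out described above can be sketched as a small in-process bus. This is only an illustration of the routing idea: Cortex's bus is Redis Pub/Sub, the `EventBus` class and `matches` function are hypothetical, and treating `*` as exactly one dot-separated segment is an assumption about how patterns such as *.annotation.created behave.

```typescript
type DomainEvent = { id: string; type: string; payload: unknown };
type Handler = (event: DomainEvent) => void;

// "*" matches exactly one dot-separated segment (assumed convention).
function matches(pattern: string, type: string): boolean {
  const p = pattern.split(".");
  const t = type.split(".");
  return p.length === t.length && p.every((seg, i) => seg === "*" || seg === t[i]);
}

class EventBus {
  private subscribers: { name: string; pattern: string; handler: Handler }[] = [];

  subscribe(name: string, pattern: string, handler: Handler): void {
    this.subscribers.push({ name, pattern, handler });
  }

  // Delivers the event to every matching subscriber and returns their names.
  publish(event: DomainEvent): string[] {
    const hit = this.subscribers.filter((s) => matches(s.pattern, event.type));
    hit.forEach((s) => s.handler(event));
    return hit.map((s) => s.name);
  }
}

const bus = new EventBus();
const log: string[] = [];
for (const name of ["plugin-worker", "webhook-worker", "functions-invoke-worker", "sse"]) {
  bus.subscribe(name, "message.created", (e) => {
    log.push(`${name}:${e.id}`);
  });
}
bus.subscribe("annotation-listener", "*.annotation.created", (e) => {
  log.push(`annotation-listener:${e.id}`);
});

const delivered = bus.publish({ id: "evt_1", type: "message.created", payload: { text: "hello" } });
console.log(delivered);
```

One published event reaches all four message.created consumers, while the annotation listener only fires for annotation events.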
Delivery guarantees
- Webhooks are durable. Events are written to a PostgreSQL outbox before delivery, so a temporarily-down endpoint never loses events. The worker retries with exponential backoff. Design your receiver to be idempotent (dedupe by event id); an event may be delivered more than once.
- SSE is live, not replayed. Server-Sent Events are for real-time UI. If a client disconnects, reconnect and re-fetch current state (the SDK’s watch() does this for you).
- Functions receive events through the same reliable outbox path as webhooks.
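Because webhook deliveries can repeat, a receiver needs some way to recognize an event it has already handled. The following is a minimal sketch of deduplication by event id. `IdempotentReceiver` and the `WebhookEvent` shape are hypothetical, and a real receiver would keep seen ids in durable storage (for example a table with a unique key) rather than in process memory.

```typescript
// Assumed event shape; only the id field is taken from the text above.
type WebhookEvent = { id: string; type: string; data: unknown };

class IdempotentReceiver {
  private seen = new Set<string>();
  private handle: (event: WebhookEvent) => void;

  constructor(handle: (event: WebhookEvent) => void) {
    this.handle = handle;
  }

  // Returns "processed" for the first delivery of an id, "duplicate" for repeats.
  receive(event: WebhookEvent): "processed" | "duplicate" {
    if (this.seen.has(event.id)) return "duplicate";
    this.handle(event);
    // Record the id only after the handler succeeds, so a failed attempt
    // can still be processed when the delivery is retried.
    this.seen.add(event.id);
    return "processed";
  }
}

const processedIds: string[] = [];
const receiver = new IdempotentReceiver((e) => {
  processedIds.push(e.id);
});
const event: WebhookEvent = { id: "evt_1", type: "message.created", data: {} };
console.log(receiver.receive(event), receiver.receive(event));
```

Recording the id after the handler runs means a crash mid-handler leads to reprocessing on retry, so the handler's own side effects should also be safe to repeat.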