Architecture

You don’t need to know how Cortex is built to use it — but understanding the moving parts makes its behavior (especially around real-time events and delivery guarantees) far easier to reason about.

Cortex is a backend platform. You interact with it through a single REST API; behind that API a set of specialized workers do the heavy lifting.

The big picture

The control plane: the API

A single, stateless, horizontally scalable NestJS REST API is the only thing you talk to. It handles authentication and multi-tenant resource management (sessions, messages, gatherings, plugins, functions, webhooks, sanctions), publishes domain events to the internal event bus, and serves Server-Sent Events for live updates.

Everything the dashboard can do, you can do through this API — the dashboard itself is just a client of it (built on the TypeScript SDK).

The worker fleet

Workers subscribe to the event bus and each owns one responsibility. They run as multiple replicas and coordinate with distributed locks so that two replicas do not process the same event.

| Worker | What it does |
| --- | --- |
| Bot Worker | Joins an ODIN room as a headless peer, captures per-speaker audio, runs voice-activity detection, transcribes each utterance (OpenAI Whisper or a local engine), and posts messages back to the API. |
| Plugin Worker | Runs installed plugins against matching events (e.g. toxicity analysis on message.created) and stores the resulting annotations. |
| Webhook Worker | Drains a durable PostgreSQL outbox and delivers webhooks to your endpoints with retries and HMAC signing. |
| Functions Invoke Worker | A reverse proxy that routes HTTP invocations and event deliveries to your serverless functions running on ODIN Fleet. |

Per-speaker audio matters

The bot receives an individual audio stream per peer, not a mixed-down room. That’s what makes accurate speaker attribution possible: every utterance is tied to the peer that produced it.

The data plane

  • PostgreSQL stores all durable state — sessions, messages, participants, gatherings, sanctions, webhook subscriptions, the webhook outbox and more. Every row is scoped by tenant and (almost always) project. See Multi-Tenancy.
  • Redis is the internal event bus (Pub/Sub) and provides the distributed locks that keep worker replicas from double-processing.
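To make the locking idea concrete, here is a minimal sketch of how replicas might avoid double-processing. It is not Cortex's implementation: the `LockStore` class, the `lock:event:<id>` key format, the replica names and the 30-second TTL are all hypothetical. The store is an in-memory stand-in that mimics the semantics of Redis `SET key value NX PX ttl` (set only if absent, with an expiry), so the example runs without a Redis server.

```typescript
// In-memory stand-in for a Redis-style lock ("set if absent, with expiry").
// All names, keys and TTLs here are hypothetical, chosen for illustration only.
class LockStore {
  private locks = new Map<string, { owner: string; expiresAt: number }>();

  // Mirrors the semantics of Redis `SET key owner NX PX ttlMs`.
  tryAcquire(key: string, owner: string, ttlMs: number, now: number = Date.now()): boolean {
    const held = this.locks.get(key);
    if (held && held.expiresAt > now) return false; // another replica still holds it
    this.locks.set(key, { owner, expiresAt: now + ttlMs });
    return true;
  }

  // Only the owner may release, so one replica cannot free another replica's lock.
  release(key: string, owner: string): boolean {
    const held = this.locks.get(key);
    if (!held || held.owner !== owner) return false;
    this.locks.delete(key);
    return true;
  }
}

// Every replica sees the event on the bus; only the lock winner processes it.
// The lock is left to expire via its TTL rather than released after processing.
function handleEvent(locks: LockStore, replica: string, eventId: string, processed: string[]): void {
  if (!locks.tryAcquire(`lock:event:${eventId}`, replica, 30_000)) return;
  processed.push(`${replica}:${eventId}`);
}

const store = new LockStore();
const processed: string[] = [];
for (const replica of ["plugin-worker-1", "plugin-worker-2", "plugin-worker-3"]) {
  handleEvent(store, replica, "evt_123", processed);
}
console.log(processed); // [ 'plugin-worker-1:evt_123' ]
```

The TTL matters in a sketch like this: if the winning replica crashes mid-processing, the lock eventually expires and another replica can pick the work up instead of the event being stuck forever.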

From a spoken word to your code

The end-to-end path of a single utterance:

  1. Someone speaks in the ODIN room.
  2. The Bot Worker detects the utterance, transcribes it, and POSTs a message to the API.
  3. The API stores the message and publishes a message.created event on the bus.
  4. In parallel, every interested consumer reacts:
    • The Plugin Worker runs toxicity/summarization plugins and stores annotations (which themselves emit *.annotation.created events).
    • The Webhook Worker writes an outbox row for each matching subscription and delivers it to your URL, retrying on failure.
    • The Functions Invoke Worker delivers the event to any function subscribed to it.
    • SSE streams the event to connected dashboards.

This is why the same change reaches you through several surfaces — webhooks, SSE and functions are all driven by the same domain-event catalog.
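The fan-out in step 4 can be sketched with a tiny in-process publish/subscribe bus. This is an illustration only: the `EventBus` class, the consumer labels and the concrete `message.annotation.created` event name are assumptions made for the example (the real bus is Redis Pub/Sub, and real consumers run in parallel in separate processes rather than synchronously as here).

```typescript
// Minimal in-process pub/sub, standing in for the Redis-backed event bus.
// Class, consumer labels and the annotation event name are hypothetical.
type DomainEvent = { id: string; type: string; payload: unknown };
type Handler = (event: DomainEvent) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(type: string, handler: Handler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  // Delivers the event to every handler registered for its type.
  publish(event: DomainEvent): void {
    for (const handler of this.handlers.get(event.type) ?? []) handler(event);
  }
}

const bus = new EventBus();
const deliveries: string[] = [];

// Plugin consumer: records the delivery, then emits a follow-up annotation event.
bus.subscribe("message.created", (e) => {
  deliveries.push(`plugin-worker:${e.type}`);
  bus.publish({ id: `${e.id}-ann`, type: "message.annotation.created", payload: { plugin: "toxicity" } });
});
bus.subscribe("message.created", (e) => deliveries.push(`webhook-worker:${e.type}`));
bus.subscribe("message.created", (e) => deliveries.push(`functions-invoke-worker:${e.type}`));
// The SSE surface listens to both the original event and the derived annotation event.
bus.subscribe("message.created", (e) => deliveries.push(`sse:${e.type}`));
bus.subscribe("message.annotation.created", (e) => deliveries.push(`sse:${e.type}`));

// One stored message produces one event, which reaches every subscribed surface.
bus.publish({ id: "evt_1", type: "message.created", payload: { text: "hello" } });
console.log(deliveries);
```

A single `publish` call produces five deliveries here: four for `message.created` and one for the derived annotation event, which is the same shape as the path described above.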

Delivery guarantees

  • Webhooks are durable. Events are written to a PostgreSQL outbox before delivery, so events are not lost while your endpoint is temporarily down; the worker retries with exponential backoff. Delivery is therefore at-least-once: an event may be delivered more than once, so design your receiver to be idempotent (dedupe by event id).
  • SSE is live, not replayed. Server-Sent Events are for real-time UI. If a client disconnects, reconnect and re-fetch current state (the SDK’s watch() does this for you).
  • Functions receive events through the same reliable outbox path as webhooks.
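As an illustration of an idempotent webhook receiver, the sketch below verifies a signature and then dedupes by event id. Several details are assumptions, not documented Cortex behavior: the signature is taken to be a hex-encoded HMAC-SHA256 of the raw request body, the event JSON is taken to carry a top-level `id` field, and the names `verifySignature`, `receiveWebhook` and the secret value are hypothetical. Check the webhook reference for the actual header name and signing scheme before relying on any of this.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: signature = hex(HMAC-SHA256(secret, rawBody)). The real scheme may differ.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// In-memory dedupe set for the sketch; a real receiver would use a persistent store.
const seenEventIds = new Set<string>();

type Result = "processed" | "duplicate" | "rejected";

// ASSUMPTION: the event body is JSON with a top-level `id` field.
function receiveWebhook(
  rawBody: string,
  signatureHex: string,
  secret: string,
  handle: (event: { id: string }) => void,
): Result {
  if (!verifySignature(rawBody, signatureHex, secret)) return "rejected";
  const event = JSON.parse(rawBody) as { id: string };
  if (seenEventIds.has(event.id)) return "duplicate"; // a retry of an event already handled
  handle(event);
  seenEventIds.add(event.id); // mark as seen only after the handler succeeds
  return "processed";
}

// Simulate the same event being delivered twice (for example, after a retry).
const secret = "example-secret"; // hypothetical value
const body = JSON.stringify({ id: "evt_123", type: "message.created" });
const signature = createHmac("sha256", secret).update(body).digest("hex");
const handled: string[] = [];

const first = receiveWebhook(body, signature, secret, (e) => handled.push(e.id));
const second = receiveWebhook(body, signature, secret, (e) => handled.push(e.id));
console.log(first, second, handled); // processed duplicate [ 'evt_123' ]
```

Recording the event id only after the handler returns means a handler that throws leaves the event unmarked, so the sender's next retry gets another chance to process it.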