GitHub - SmooAI/smooth-operator: Polyglot AI agent service — knowledge chat, tools, durable checkpoints, human-in-the-loop, and multi-participant conversations over one schema-driven WebSocket protocol. Built on the smooth-operator-core engine (5-language parity). Deploy to Kubernetes, AWS serverless, or run locally. Hosted at lom.smoo.ai.


smooth-operator gives you hybrid retrieval (dense + sparse + rerank), durable agent checkpoints, human-in-the-loop approvals, and multi-participant conversations — one operator binary that runs the same way on Kubernetes, AWS serverless, or a single laptop process. Built in the open, test-first.

What is this?

smooth-operator is a polyglot AI agent service. Agent orchestration is handled by smooth-operator-core, a 5-language parity engine; the service wraps it with conversations, knowledge ingestion + retrieval, a tool catalog, and one schema-driven WebSocket protocol that clients in five languages speak natively.

You get hybrid retrieval (dense + sparse + rerank), durable agent checkpoints, human-in-the-loop approvals, and multi-participant conversations (user · ai-agent · human-agent) — behind a stable wire protocol, with storage, backplane, and auth selected by config, not by a code fork.
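To make "dense + sparse" concrete, here is a minimal sketch of one common way to merge two ranked result lists: reciprocal rank fusion. This README does not say which fusion method the service uses, so treat the technique, the function name, the `k = 60` constant, and the document IDs as illustrative assumptions rather than the project's implementation.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
// with rank starting at 1. k = 60 is a conventional default, not a project setting.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores: Record<string, number> = Object.create(null);
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      scores[docId] = (scores[docId] ?? 0) + 1 / (k + index + 1);
    });
  }
  // Highest fused score first.
  return Object.keys(scores).sort((a, b) => scores[b] - scores[a]);
}

// Hypothetical document IDs from a dense (vector) and a sparse (keyword) search.
const dense = ['returns-policy', 'shipping-faq', 'warranty'];
const sparse = ['returns-policy', 'warranty', 'gift-cards'];

const fused = reciprocalRankFusion([dense, sparse]);
console.log(fused);
// A reranker would then rescore the top of `fused` against the query.
```

Documents that both searches rank highly rise to the top, which is the point of combining the two signals before the rerank step.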

One operator binary, three deployment flavors (see below):

Kubernetes — the primary self-host target: a long-running service with Postgres + pgvector and a Redis/NATS backplane for multi-replica scale-out.
AWS serverless — API Gateway WebSocket + Lambda + DynamoDB + S3 Vectors, deployed with SST.
Local — a single in-memory process with auth off and zero external services, for laptop dev or to embed in-process.

The same binary picks its flavor from the environment (SMOOTH_AGENT_STORAGE · SMOOTH_AGENT_BACKPLANE · AUTH_MODE). No build flags, no second codebase.
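The selection logic can be pictured as a small pure function over the environment. The sketch below is not the server's code (the real binary is Rust); the three variable names come from this README, while the function name, the `memory` backplane value, and the unset defaults are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real binary is Rust and its parsing rules may differ.
type Storage = 'memory' | 'postgres' | 'dynamodb';
type Flavor = 'local' | 'kubernetes' | 'serverless';

interface FlavorConfig {
  flavor: Flavor;
  storage: Storage;
  backplane: string; // value names such as 'memory' or 'redis' are assumed
  authMode: string; // 'none' | 'jwt' | 'smoo'
}

// Storage backend -> flavor, following the three flavors described above.
const FLAVOR_BY_STORAGE: Record<Storage, Flavor> = {
  memory: 'local',
  postgres: 'kubernetes',
  dynamodb: 'serverless',
};

function resolveFlavor(env: Record<string, string | undefined>): FlavorConfig {
  // Unset variables fall back to the local flavor's defaults.
  const storage = env.SMOOTH_AGENT_STORAGE ?? 'memory';
  if (!Object.prototype.hasOwnProperty.call(FLAVOR_BY_STORAGE, storage)) {
    throw new Error(`unknown SMOOTH_AGENT_STORAGE: ${storage}`);
  }
  return {
    flavor: FLAVOR_BY_STORAGE[storage as Storage],
    storage: storage as Storage,
    backplane: env.SMOOTH_AGENT_BACKPLANE ?? 'memory',
    authMode: env.AUTH_MODE ?? 'none',
  };
}

console.log(resolveFlavor({}).flavor); // local
console.log(resolveFlavor({ SMOOTH_AGENT_STORAGE: 'postgres', AUTH_MODE: 'jwt' }).flavor); // kubernetes
```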

Built in the open, test-first. See docs/Planning/Roadmap.md for what works today and what's queued.

30-second quickstart

Run the reference server locally — fully in-memory, no database, no auth, no AWS — and drive a real agent turn. The server talks to the SmooAI LLM gateway (llm.smoo.ai); bring a gateway key.

git clone https://github.com/SmooAI/smooth-operator && cd smooth-operator/rust

# Point at the gateway and seed a distinctive "17-day return window" demo doc.
export SMOOAI_GATEWAY_KEY=sk-…           # your llm.smoo.ai key
export SMOOTH_AGENT_SEED_KB=1            # seeds the demo knowledge docs

cargo run -p smooai-smooth-operator-server
# → smooth-operator-server (local flavor) listening on ws://127.0.0.1:8787/ws (model claude-haiku-4-5)

That's it — an agent backend on ws://127.0.0.1:8787/ws, with knowledge retrieval, tool-calling, and streaming. With no env set, the binary boots the local flavor: in-memory storage, in-memory backplane, loopback bind, admin off. Set SMOOTH_AGENT_STORAGE=postgres (or dynamodb) and a backplane to graduate the same binary to the k8s or serverless flavor.

No key? The server still boots and answers protocol actions — only send_message (which needs the LLM) errors cleanly until SMOOAI_GATEWAY_KEY is set.

You can also embed the local flavor in-process from Rust — smooth_operator_server::local::serve_local("127.0.0.1:8787"), or LocalServer::builder().seed_kb(true).spawn() for a handle with a graceful-shutdown switch. See deploy/local/README.md.

Watch it stream

Connect, start a session, send a turn, and watch tokens stream in — then await the authoritative terminal response. Here in TypeScript (@smooai/smooth-operator); the same shape exists in Go, .NET, Python, and Rust.

import { SmoothAgentClient } from '@smooai/smooth-operator';

const client = new SmoothAgentClient({ url: 'ws://127.0.0.1:8787/ws' });
await client.connect();

// `agentId` is assumed to be the ID of an agent already defined on your server.
const session = await client.createConversationSession({ agentId, userName: 'Alice' });

// One turn. Iterate the stream; `await` the same handle for the final state.
const turn = client.sendMessage({ sessionId: session.sessionId, message: 'How long is your return window?' });

for await (const ev of turn) {
  if (ev.type === 'stream_chunk') console.error(`  ↳ node: ${ev.node}`); // knowledge_search, response_gen, …
  if (ev.type === 'stream_token') process.stdout.write(ev.token ?? '');  // "Our return window is 17 days…"
  if (ev.type === 'write_confirmation_required') {
    // HITL: a tool wants to write — approve, and the resumed stream flows back into this same turn.
    client.confirmToolAction({ sessionId: session.sessionId, requestId: turn.requestId, approved: true });
  }
}

const final = await turn; // EventualResponse — cost, tokens, messageId

The model autonomously calls knowledge_search, retrieves the seeded 17-day return window, and grounds its answer in it — verified live against llm.smoo.ai and across every client.

Need an embeddable web UI? The TypeScript side ships a React binding and an embeddable widget (a custom element) on top of the same client.

Deployment flavors

One operator binary, one codebase. The StorageAdapter + backplane + auth seams are what let the same agent code run on any of three flavors — application code never names a backend. The flavor is selected by config, not by a build.

	Kubernetes (primary self-host)	AWS serverless (SST)	Local (dev / embed)
Compute	Long-running pods	API GW WebSocket → Lambda	One in-process server
Storage	Postgres + pgvector	DynamoDB + S3 Vectors	In-memory
Backplane	Redis / NATS (multi-replica)	API GW connections	In-memory (single process)
Auth	`AUTH_MODE=jwt` / `smoo`	`AUTH_MODE=jwt` / `smoo`	`AUTH_MODE=none` (dev only)
`SMOOTH_AGENT_STORAGE`	`postgres`	`dynamodb`	`memory` (default)
Deploy	`helm install smooth-operator ./deploy/k8s`	`npx sst deploy` in `deploy/sst`	`cargo run -p smooai-smooth-operator-server`

# Kubernetes (Helm + ArgoCD) — service + WS ingress, Postgres + pgvector, Redis/NATS backplane
helm install smooth-operator ./deploy/k8s --set image.tag=$(git rev-parse --short HEAD)

# AWS serverless (SST) — API GW WebSocket + Lambda + DynamoDB + S3 Vectors
cd deploy/sst && pnpm install && npx sst deploy --stage prod

# Local — fully in-memory, auth off, no external services
cargo run -p smooai-smooth-operator-server

What every flavor keeps: hybrid (vector + keyword) retrieval with reranking, a clean Chat · RAG · Agents · Actions decomposition, connector-style ingestion, document-level ACLs over org isolation, and the MIT, batteries-included self-host story. See deploy/README.md and docs/DEPLOY.md for the full matrix.
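"Document-level ACLs over org isolation" describes a two-layer check: the org boundary is enforced first, and the per-document ACL only narrows access inside it. The sketch below illustrates that ordering. Every type, field, and rule in it is hypothetical (including the choice that a document with no ACL entries is visible org-wide); docs/ACCESS-CONTROL.md is the place to look for the real model.

```typescript
// Hypothetical shapes; the real model is described in docs/ACCESS-CONTROL.md.
interface Principal {
  orgId: string;
  userId: string;
  groups: string[];
}

interface KnowledgeDoc {
  orgId: string;
  allowedUsers?: string[];
  allowedGroups?: string[];
}

function canAccess(principal: Principal, doc: KnowledgeDoc): boolean {
  // Layer 1: org isolation. Nothing crosses an org boundary.
  if (principal.orgId !== doc.orgId) return false;

  const users = doc.allowedUsers ?? [];
  const groups = doc.allowedGroups ?? [];

  // Assumption: a document with no ACL entries is visible to its whole org.
  if (users.length === 0 && groups.length === 0) return true;

  // Layer 2: document-level ACL by user or group membership.
  if (users.indexOf(principal.userId) !== -1) return true;
  return principal.groups.some((g) => groups.indexOf(g) !== -1);
}

const alice: Principal = { orgId: 'acme', userId: 'alice', groups: ['support'] };
console.log(canAccess(alice, { orgId: 'acme', allowedGroups: ['support'] })); // true
console.log(canAccess(alice, { orgId: 'other', allowedUsers: ['alice'] })); // false
```

Checking the org first means a misconfigured ACL can never leak a document to another tenant.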

Architecture

One protocol in front; a swappable engine and storage behind it. A client never names a language, a backend, or whether the engine is embedded or remote — it only ever sees the protocol.

%%{init: {'theme':'base','themeVariables':{
  'background':'#020618','primaryColor':'#0b1426','primaryTextColor':'#e6edf6','primaryBorderColor':'#2b3a52',
  'lineColor':'#7c8aa0','secondaryColor':'#0b1426','tertiaryColor':'#0b1426','fontFamily':'ui-sans-serif, system-ui, sans-serif',
  'clusterBkg':'#0b1426','clusterBorder':'#22304a'}}}%%
flowchart LR
  CLIENTS["5 native clients<br/>TS · Go · .NET · Python · Rust"]
  CLIENTS -->|"WebSocket protocol"| SVC

  subgraph SVC["smooth-operator · service"]
    PROTO["Protocol layer"] --> RT["KnowledgeChatRuntime"]
  end

  RT -->|"Agent::run"| ENGINE["smooth-operator-core<br/>5-language engine"]
  ENGINE -->|"LlmProvider"| GW[("llm.smoo.ai<br/>or BYO gateway")]
  RT -->|"StorageAdapter"| KB[("Knowledge + conversations<br/>pgvector / DynamoDB + S3 Vectors / in-memory")]

  classDef warm fill:#f49f0a,stroke:#ff6b6c,color:#1a0f00;
  classDef teal fill:#00a6a6,stroke:#00c2c2,color:#011;
  class ENGINE warm
  class GW,KB teal

An agent turn, end to end

%%{init: {'theme':'base','themeVariables':{
  'background':'#020618','primaryColor':'#0b1426','primaryTextColor':'#e6edf6','primaryBorderColor':'#2b3a52',
  'lineColor':'#7c8aa0','actorBkg':'#0b1426','actorBorder':'#2b3a52','actorTextColor':'#e6edf6',
  'signalColor':'#7c8aa0','signalTextColor':'#e6edf6','noteBkgColor':'#f49f0a','noteTextColor':'#1a0f00','noteBorderColor':'#ff6b6c',
  'fontFamily':'ui-sans-serif, system-ui, sans-serif'}}}%%
sequenceDiagram
  participant C as Client
  participant S as Service
  participant A as Agent
  participant K as Knowledge / Tools
  participant L as LLM gateway

  C->>S: send_message { sessionId, message }
  S->>A: run turn (replay prior messages)
  S-->>C: immediate_response (202, ack)
  A->>K: knowledge_search("return window")
  K-->>A: top-K snippets (the 17-day fact)
  A->>L: chat completion (grounded prompt)
  L-->>A: token deltas …
  A-->>S: TokenDelta / PhaseStart / ToolCallComplete
  S-->>C: stream_token "Our" "return" "window" …
  S-->>C: stream_chunk { node: response_gen }
  A-->>S: Completed { cost, tokens }
  S-->>C: eventual_response (200, final)
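
The diagram shows the service translating engine events into wire events. As a rough sketch of that translation step: the event names below are taken from the diagram, but the payload field names, the choice to surface `ToolCallComplete` as a `stream_chunk`, and the function itself are assumptions. The authoritative AgentEvent mapping lives in docs/PROTOCOL.md.

```typescript
// Engine-side events (names from the diagram, fields assumed).
type AgentEvent =
  | { kind: 'TokenDelta'; text: string }
  | { kind: 'PhaseStart'; phase: string }
  | { kind: 'ToolCallComplete'; tool: string }
  | { kind: 'Completed'; cost: number; tokens: number };

// Client-facing events (names from the diagram, fields assumed).
type WireEvent =
  | { type: 'stream_token'; token: string }
  | { type: 'stream_chunk'; node: string }
  | { type: 'eventual_response'; status: 200; cost: number; tokens: number };

function toWireEvent(ev: AgentEvent): WireEvent {
  switch (ev.kind) {
    case 'TokenDelta':
      return { type: 'stream_token', token: ev.text };
    case 'PhaseStart':
      return { type: 'stream_chunk', node: ev.phase };
    case 'ToolCallComplete':
      // Assumption: tool completion is reported as a chunk naming the tool.
      return { type: 'stream_chunk', node: ev.tool };
    case 'Completed':
      return { type: 'eventual_response', status: 200, cost: ev.cost, tokens: ev.tokens };
  }
}

// Made-up values, for illustration only.
const engineEvents: AgentEvent[] = [
  { kind: 'PhaseStart', phase: 'response_gen' },
  { kind: 'TokenDelta', text: 'Our' },
  { kind: 'Completed', cost: 0.0004, tokens: 128 },
];
const frames = engineEvents.map(toWireEvent);
console.log(frames.map((f) => f.type).join(' -> '));
```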

Protocol lifecycle (incl. HITL)

%%{init: {'theme':'base','themeVariables':{
  'background':'#020618','primaryColor':'#0b1426','primaryTextColor':'#e6edf6','primaryBorderColor':'#2b3a52',
  'lineColor':'#7c8aa0','secondaryColor':'#0b1426','tertiaryColor':'#0b1426','fontFamily':'ui-sans-serif, system-ui, sans-serif'}}}%%
stateDiagram-v2
  [*] --> Connected: connect
  Connected --> SessionOpen: create_session
  SessionOpen --> Streaming: send_message
  Streaming --> Streaming: stream_token · chunk
  Streaming --> AwaitingApproval: confirm_required
  AwaitingApproval --> Streaming: approve
  Streaming --> AwaitingOtp: otp_required
  AwaitingOtp --> Streaming: verify_otp
  Streaming --> SessionOpen: eventual_response
  SessionOpen --> [*]: disconnect

Full action/event tables, the AgentEvent mapping, and connection-state keys are in docs/PROTOCOL.md.
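
The lifecycle diagram translates directly into a transition table, which can be handy for client-side validation or tests. This sketch uses the abbreviated labels from the diagram (the real action and event names are in docs/PROTOCOL.md); the `Disconnected` state name, standing in for the diagram's start/end marker, and the helper functions are assumptions.

```typescript
type State =
  | 'Disconnected' // stands in for the diagram's [*] marker
  | 'Connected'
  | 'SessionOpen'
  | 'Streaming'
  | 'AwaitingApproval'
  | 'AwaitingOtp';

// One entry per arrow in the state diagram.
const TRANSITIONS: Record<State, Record<string, State>> = {
  Disconnected: { connect: 'Connected' },
  Connected: { create_session: 'SessionOpen' },
  SessionOpen: { send_message: 'Streaming', disconnect: 'Disconnected' },
  Streaming: {
    stream_token: 'Streaming',
    stream_chunk: 'Streaming',
    confirm_required: 'AwaitingApproval',
    otp_required: 'AwaitingOtp',
    eventual_response: 'SessionOpen',
  },
  AwaitingApproval: { approve: 'Streaming' },
  AwaitingOtp: { verify_otp: 'Streaming' },
};

function step(state: State, trigger: string): State {
  const allowed = TRANSITIONS[state];
  if (!Object.prototype.hasOwnProperty.call(allowed, trigger)) {
    throw new Error(`illegal transition: ${state} --${trigger}-->`);
  }
  return allowed[trigger];
}

function run(triggers: string[]): State {
  let state: State = 'Disconnected';
  for (const trigger of triggers) state = step(state, trigger);
  return state;
}

// A turn that pauses for a human approval and then completes.
console.log(
  run(['connect', 'create_session', 'send_message', 'stream_token',
       'confirm_required', 'approve', 'stream_token', 'eventual_response']),
); // SessionOpen
```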

The polyglot story (honest status)

One protocol, defined once in spec/ (JSON Schema). Everything else is generated or hand-written to match it.

Surface	Status
Engine (`smooth-operator-core`)	5-language parity engine — Rust · C# · Python · TypeScript · Go, each published (crates.io / NuGet / PyPI / npm / Go module). Rust is the reference; the others mirror its surface.
Protocol clients	All five languages — TypeScript (`@smooai/smooth-operator`), Go, .NET (with a `Microsoft.Extensions.AI` `IChatClient` facade), Python, Rust. The TS side also ships a React binding and an embeddable widget.
Servers	All five languages — Rust · C# · Python · TypeScript · Go, each consuming its own language's engine so a host can run the full service in its native stack. Rust + C# carry the full surface (ingestion, admin, ACL, storage adapters); Python/TS/Go are native servers (transport · frame dispatch · per-turn engine · sessions · auth · graceful drain). All five run the shared scenario conformance corpus — protocol parity, tested.

All five native servers now exist and run the same spec/conformance/scenarios corpus, driven by the engine's deterministic mock, so they must produce identical protocol output. The corpus has already caught real error-handling divergences in the TS and C# servers, which have since been fixed. The Rust + C# servers carry the full surface; the Python/TS/Go servers are native and at protocol parity, growing toward the full feature surface. The five clients, five engines, and five servers are all real.

Test-driven by default

Nothing here is vibe-coded — it's verified against a real LLM gateway. Substring tests prove a reply contains the right number; an LLM-as-judge proves the agent reasoned its way there and didn't hallucinate. We run both.

%%{init: {'theme':'base','themeVariables':{
  'background':'#020618','primaryColor':'#0b1426','primaryTextColor':'#e6edf6','primaryBorderColor':'#2b3a52',
  'lineColor':'#7c8aa0','secondaryColor':'#0b1426','tertiaryColor':'#0b1426','fontFamily':'ui-sans-serif, system-ui, sans-serif'}}}%%
flowchart TD
  U["Unit tests<br/>chunker · SSRF guard · can_access"] --> C
  C["Testcontainers conformance<br/>pgvector + DynamoDB-Local"] --> E
  E["Live cross-language E2E<br/>all 5 clients, real WebSocket turns"] --> J
  J["LLM-as-judge quality evals<br/>real gateway, rubric-scored 1–5"]

  classDef warm fill:#f49f0a,stroke:#ff6b6c,color:#1a0f00;
  classDef teal fill:#00a6a6,stroke:#00c2c2,color:#011;
  class U teal
  class J warm

All five native servers run a shared scenario conformance corpus (spec/conformance/scenarios) — language-neutral protocol flows driven by the engine's deterministic mock, so every server must produce identical output. That's the polyglot parity oracle, on top of each server's own protocol/ingestion/ACL/rerank/embedder suites and the engine's offline suite (337 tests on a deterministic MockLlmClient). The five protocol clients are exercised against a real WebSocket in a cross-language E2E harness.

The proof story

The headline isn't a count — it's a real defect a substring test would have missed. On the first live run, our LLM-as-judge scored a multi-turn answer 1/5: the runtime built a fresh agent per turn, so turn 2 had no memory of turn 1's delivery date and couldn't compute the last return day. A contains("the 22nd") assertion would have stayed green on a hallucinated guess. The judge caught it; the fix wired per-session memory; it now scores 5/5.

That's the whole bet: quality regressions that only a grader can see, caught in CI. Details — the five scenarios, the rubric, the same-model-judge knob — in docs/EVALS.md.
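
A small worked example shows why the substring check is too weak. The date arithmetic is what the agent has to get right; the assertion only looks for a phrase. The delivery date below is hypothetical (chosen so a 17-day window lands on the 22nd), and treating the delivery day as day zero is an assumption, as are both sample replies.

```typescript
// Last return day = delivery date + window length (assumes the delivery day is day 0).
function lastReturnDay(deliveredIso: string, windowDays = 17): string {
  const date = new Date(`${deliveredIso}T00:00:00Z`);
  date.setUTCDate(date.getUTCDate() + windowDays); // rolls over month ends
  return date.toISOString().slice(0, 10);
}

// The kind of substring assertion the text warns about.
function substringCheck(reply: string): boolean {
  return reply.indexOf('the 22nd') !== -1;
}

// Hypothetical delivery date chosen so the answer lands on the 22nd.
console.log(lastReturnDay('2025-03-05')); // 2025-03-22

const grounded = 'Delivered on the 5th plus 17 days: return by the 22nd.';
const guessed = 'I am not sure when it arrived, maybe the 22nd?';
console.log(substringCheck(grounded), substringCheck(guessed)); // true true
```

Both replies pass the substring check, but only the first one carries the reasoning from turn 1 forward. Telling them apart takes a grader that reads the whole answer.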

Gated, never silently skipped

Live tests need a gateway key. They are gated, not deleted: with SMOOTH_AGENT_E2E=1 + SMOOAI_GATEWAY_KEY they run (and print every per-scenario score under --nocapture); without them they print an explicit skip and return — so credential-free cargo test and CI stay green, and the nightly job runs the full live suite. The gateway key is read from the environment and never printed.
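
The gating rule is simple enough to state as a function. The sketch below is an illustration in TypeScript rather than the repo's Rust test code; the two variable names are from this README, and the function names and skip message wording are made up. Note that the key's value is only tested for presence and never placed in any output.

```typescript
type Gate = { run: true } | { run: false; reason: string };

// Live tests run only when both the opt-in flag and the gateway key are present.
function liveTestGate(env: Record<string, string | undefined>): Gate {
  if (env.SMOOTH_AGENT_E2E !== '1') {
    return { run: false, reason: 'SMOOTH_AGENT_E2E is not set to 1' };
  }
  if (!env.SMOOAI_GATEWAY_KEY) {
    return { run: false, reason: 'SMOOAI_GATEWAY_KEY is not set' };
  }
  return { run: true };
}

// An explicit skip line instead of a silent pass. The key itself is never included.
function describeGate(gate: Gate): string {
  return gate.run ? 'running live suite' : `SKIPPED live suite: ${gate.reason}`;
}

console.log(describeGate(liveTestGate({})));
console.log(describeGate(liveTestGate({ SMOOTH_AGENT_E2E: '1', SMOOAI_GATEWAY_KEY: 'sk-example' })));
```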

# Unit + conformance — no creds, runs everywhere
cd rust && cargo test

# + live LLM-as-judge evals
export SMOOAI_GATEWAY_KEY=sk-… SMOOTH_AGENT_E2E=1
cargo test -p smooai-smooth-operator-evals --test llm_judge -- --nocapture --test-threads=1

Smoo-powered or bring-your-own

A recurring principle across the whole stack: same code, two postures.

Capability	Smoo-powered (hosted)	Bring-your-own (self-host)
LLM gateway	`llm.smoo.ai`	any OpenAI-compatible endpoint
Embeddings	gateway (`text-embedding-3-small`)	`DeterministicEmbedder` or your provider
Web search	Smoo provider	Brave / Bing / Tavily via `WebSearchProvider`
Identity / RBAC	Smoo identity (`AUTH_MODE=smoo`)	`AUTH_MODE=jwt` (BYO JWT/OIDC)
Connectors	managed GitHub/Slack apps	your tokens, same `Connector` trait

Self-hosters bring their own; the hosted service wires in Smoo's apps. The seams are identical: see docs/INGESTION.md, docs/TOOLS.md, and docs/STORAGE.md.

The two-repo split

Repo	What it is
`smooth-operator-core`	The agent engine — `Agent`, `Workflow`, `Tool`, `CheckpointStore`, `LlmProvider`, `Memory`, `KnowledgeBase`. A 5-language parity engine (Rust · C# · Python · TypeScript · Go), each published.
`smooth-operator` (this repo)	The service — conversations, knowledge ingestion + retrieval, the tool catalog, the WebSocket protocol, the five clients, the management console, and the Kubernetes / AWS / local deploy flavors.

Repository layout

smooth-operator/
├── spec/         # The language-neutral wire protocol (JSON Schema) — source of truth for all clients
├── rust/         # Reference server + service crate (smooai-smooth-operator) + adapters, lambda, evals, ingestion
├── typescript/   # @smooai/smooth-operator — client + React binding + embeddable widget
├── go/           # github.com/SmooAI/smooth-operator/go — protocol.Client
├── dotnet/       # SmooAI.SmoothOperator — client (+ Microsoft.Extensions.AI facade) and the C# server
├── python/       # smooth-operator (import smooth_operator) — async client
├── console/      # Next.js management console for the auth-gated /admin/* API
├── adapters/     # Storage adapters: postgres (pgvector) and dynamodb (S3 Vectors)
├── deploy/
│   ├── k8s/      # Kubernetes (Helm + ArgoCD) — Postgres + pgvector + Redis/NATS backplane
│   ├── sst/      # AWS serverless (API GW WebSocket + Lambda + DynamoDB + S3 Vectors)
│   └── local/    # Local / embed-in-process — in-memory, auth off, no external services
└── docs/         # Architecture, protocol, storage, evals, ingestion, access-control, observability, deploy, roadmap

Run it hosted

Don't want to operate it yourself? lom.smoo.ai runs smooth-operator as a managed, multi-tenant service.

Documentation

Doc	What
`docs/ARCHITECTURE.md`	System design, the agent pipeline, how it consumes the engine
`docs/PROTOCOL.md`	The schema-driven WebSocket protocol
`docs/STORAGE.md`	The `StorageAdapter` trait; Postgres and DynamoDB/S3 Vectors designs
`docs/EVALS.md`	The LLM-as-judge quality harness (the 1/5 → 5/5 story)
`docs/INGESTION.md`	Connectors, chunking, the embedder seam
`docs/TOOLS.md`	The built-in tool catalog + authoring your own
`docs/ACCESS-CONTROL.md`	Document-level ACLs over org isolation
`docs/ADMIN-API.md`	The auth-gated `/admin/*` API the console consumes
`docs/OBSERVABILITY.md`	OpenTelemetry `gen_ai.*` tracing
`docs/DEPLOY.md`	The three deploy flavors + the shared `SmooAI/deploy` package
`docs/Planning/Roadmap.md`	Phased build plan + current status

🧩 Part of Smoo AI

smooth-operator is built and open-sourced by Smoo AI — the AI-powered business platform with AI built into every product: CRM, customer support, campaigns, field service, observability, and developer tools.

🚀 smooth-operator on the platform — smoo.ai/th
🧰 More open source from Smoo AI — smoo.ai/open-source
🧩 Sibling packages — smooth-operator-core (the 5-language engine this wraps), @smooai/deploy, smooth (the th CLI)
☁️ Hosted — lom.smoo.ai runs smooth-operator for you, managed and multi-tenant

🤝 Contributing

Built in the open, test-first. Issues and PRs welcome — see the docs vault for architecture, protocol, and the eval harness, and docs/Planning/Roadmap.md for what's queued.

📄 License

MIT. See the LICENSE file in the repository root.

Built by Smoo AI — AI built into every product.
