Technical Deep Dive into OpenClaw Architecture
An engineering guide exploring OpenClaw's architecture—lane-based concurrency, hybrid memory search, WebSocket protocol, and the agent loop that powers autonomous AI.
Introduction
In November 2025, Peter Steinberger (founder of PSPDFKit) released an open-source project that would spark both excitement and controversy across the AI community. Within two months, it had surpassed 100K GitHub stars. Security researchers raised alarms. Developers called it "the closest thing to Jarvis we've ever had."
That project is OpenClaw—an autonomous AI agent that runs entirely on your machine, remembers conversations from weeks ago, and can proactively reach out to you when something needs attention.
But this guide isn't about the hype. It's about the engineering.
By the end, you'll understand:
- How a single process safely handles messages from WhatsApp, Telegram, Discord, and more—simultaneously
- Why traditional async/await fails in agent systems (and the elegant "lane" solution)
- How persistent memory survives context window limits across weeks of conversation
- The clever trick that makes browser automation actually reliable
Let's dive in.
Table of Contents
- What is OpenClaw?
- The Big Picture: System Architecture
- How Messages Become Actions: The Agent Loop
- The Secret Sauce: Lane-Based Concurrency
- How the Gateway Talks to Everything: WebSocket Protocol
- Memory That Actually Works: Hybrid Search
- Surviving Long Conversations: Pre-Compaction Memory Flush
- Try It Yourself: Docker Setup
- Key Takeaways & What's Next
1. What is OpenClaw?
OpenClaw isn't just another AI chatbot. It's an autonomous AI agent that can think, act, and improve itself—all running locally on your hardware.
What Makes It Revolutionary
Self-Improving Agent — Most assistants wait for you to ask, then forget everything. OpenClaw can autonomously write new skills to extend its own capabilities. Need it to handle a workflow it doesn't know? It writes the code, creates the skill, and executes—without you touching a line.
Persistent Memory Across Weeks — The "amnesia problem" is solved. OpenClaw maintains a memory layer that retains project context, user preferences, and conversation history across sessions. Ask about something you mentioned 14 days ago, and it remembers.
Proactive "Heartbeat" — Traditional assistants are reactive: you ask, they answer. OpenClaw has a heartbeat—the ability to wake up proactively, monitor conditions, trigger automations, and reach out when something needs attention. It's the difference between a tool and a colleague.
True Autonomous Execution — OpenClaw doesn't just talk about doing things—it does them. Shell commands, file system navigation, web browsing, email, calendars, multi-step workflows. A single message like "research competitors and draft a summary email" triggers a chain of real actions.
Multi-Platform Messaging — Chat through WhatsApp, Telegram, Discord, Slack, Signal, or iMessage—wherever you already communicate. The same agent, accessible from phone, desktop, or group chats.
Sovereign AI / Local-First — All data, memory, and API keys stay on your hardware. No cloud dependency. No third-party data access. Viable for regulated industries where data cannot leave the device.
Model Agnostic — Connect to Claude, GPT, Gemini, or run fully offline with local models via Ollama. You choose the brain; OpenClaw provides the body.
The Ecosystem
OpenClaw isn't just software—it's a growing ecosystem:
| Component | What It Does |
|---|---|
| ClawHub | 3,000+ community-built skills for email, data analysis, media control |
| Moltbook | AI-agent social network where autonomous agents interact independently |
| MCP Integration | Connect to 100+ third-party services via Model Context Protocol |
The Tech Stack
| Layer | Technology |
|---|---|
| Language | TypeScript (Node.js) |
| Real-time Communication | WebSocket |
| Database | SQLite with vector extensions (sqlite-vec) |
| Browser Automation | Playwright (semantic snapshots) |
| Messaging Platforms | WhatsApp, Telegram, Discord, Slack, iMessage |
The Tradeoff: Power Requires Responsibility
OpenClaw runs with significant system access—executing commands, modifying files, browsing the web autonomously. This power comes with risk. Security researchers have flagged potential vulnerabilities (prompt injection, credential storage, elevated access). OpenClaw is suited for users who understand the implications of running autonomous agents. Docker sandboxing is strongly recommended.
Now let's look under the hood to see how it all works.
2. The Big Picture: System Architecture
Before we dive into code, let's understand what pieces exist and how they connect.
Architecture Diagram
Figure 1: OpenClaw system architecture—messaging adapters, gateway daemon, and workspace.
The best way to understand OpenClaw's architecture is to follow what happens when you send a message. Let's trace the complete flow:
1. Message Entry → The Messaging Layer
When you type "What's on my calendar today?" in WhatsApp, the message first hits OpenClaw's messaging adapters. Each platform (WhatsApp, Telegram, Discord, Slack, iMessage) has its own adapter using platform-specific libraries. For example, WhatsApp uses Baileys (a reverse-engineered protocol), while Telegram uses MTProto. These adapters normalize messages into a common format—regardless of where a message comes from, the rest of the system sees it the same way.
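To make "normalize into a common format" concrete, here is a minimal sketch of what a shared message shape and an adapter contract might look like. All names here (InboundMessage, MessagingAdapter, RawTelegramClient) are illustrative, not OpenClaw's actual types:

```typescript
// Hypothetical normalized shape: every platform adapter produces this
interface InboundMessage {
  platform: "whatsapp" | "telegram" | "discord" | "slack" | "imessage";
  chatId: string;     // platform-specific conversation identifier
  senderId: string;
  text: string;
  receivedAt: number; // epoch milliseconds
}

// Every adapter hides its platform library behind the same small contract
interface MessagingAdapter {
  name: string;
  onMessage(handler: (msg: InboundMessage) => void): void;
  send(chatId: string, text: string): Promise<void>;
}

// Minimal view of an underlying Telegram client (shape assumed for the sketch)
interface RawTelegramClient {
  on(
    event: "message",
    cb: (raw: { chat: { id: number }; from: { id: number }; text?: string }) => void
  ): void;
  sendMessage(chatId: string, text: string): Promise<unknown>;
}

class TelegramAdapter implements MessagingAdapter {
  name = "telegram";
  constructor(private client: RawTelegramClient) {}

  onMessage(handler: (msg: InboundMessage) => void): void {
    this.client.on("message", (raw) => {
      handler({
        platform: "telegram",
        chatId: String(raw.chat.id),
        senderId: String(raw.from.id),
        text: raw.text ?? "",
        receivedAt: Date.now(),
      });
    });
  }

  async send(chatId: string, text: string): Promise<void> {
    await this.client.sendMessage(chatId, text);
  }
}
```

Downstream code only ever sees InboundMessage, so adding a new platform means writing one adapter rather than touching the Gateway.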
2. Single Point of Control → The Gateway Daemon
Every normalized message flows into the Gateway Daemon—a single, long-running process that acts as the central nervous system. This is a deliberate architectural choice: having one "boss" process eliminates coordination problems between multiple services.
Inside the Gateway, your message passes through four stages:
- Session Router determines which conversation context this message belongs to. Is it a private DM? A group chat? A new conversation or continuing one?
- Command Queue (Lanes) ensures messages don't collide. If you send three messages rapidly, they're queued and processed in order—never simultaneously. This prevents race conditions and corrupted state.
- Agent Orchestrator manages the AI's thinking loop: assembling context, calling the LLM, executing tools, and iterating until a response is ready.
- Tool Executor handles any actions the AI needs to take—running shell commands, automating browsers, reading files, or searching memory.
3. Intelligence → The LLM Provider
When the Agent Orchestrator needs to "think," it sends the assembled context (your message, conversation history, memory, skills) to an external LLM Provider—Anthropic Claude, OpenAI, or a local model like Ollama. The AI responds with either text (the answer) or tool calls (actions to take). If tools are called, results feed back into the orchestrator for another thinking round.
4. Persistence → The Workspace
Throughout this process, the Gateway reads from and writes to the Workspace—a folder structure on your machine. MEMORY.md holds long-term facts about you, memory/ contains daily conversation logs, skills/ defines special capabilities, and sessions/ stores conversation histories. This file-based approach means everything is human-readable and debuggable.
5. Response Delivery → Back Through the Stack
Once the AI produces a final response, it flows back up: Gateway → messaging adapter → your WhatsApp. The entire round trip typically completes in under a second.
The Key Architectural Insight
This is a hub-and-spoke architecture with the Gateway as the hub. All messaging channels, all client apps, all tools—everything connects through that single daemon. This simplifies state management (one source of truth), eliminates distributed coordination problems, and makes debugging straightforward. The tradeoff is that the Gateway becomes a single point of failure—but for a local-first personal assistant, that's an acceptable tradeoff for simplicity.
3. How Messages Become Actions: The Agent Loop
When you send "What meetings do I have today?" to your AI assistant, what actually happens? Let's trace the full journey.
Agent Loop Diagram
Figure 2: Agent loop phases—intake, context assembly, model inference, tool execution, streaming, persistence.
The Six Phases of Execution
Every agent run follows six distinct phases. Understanding these helps you debug issues, optimize performance, and extend the system.
Phase 1: INTAKE
The moment a message arrives, OpenClaw validates it (checking authentication and session permissions), generates a unique runId for tracing, and immediately returns an acknowledgment. This "ack-first" pattern is critical—users get instant feedback that their message was received, even if the actual processing takes several seconds. The alternative (waiting for full completion) creates a poor user experience and risks timeout errors.
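Here is a minimal sketch of that pattern; the names (handleIntake, runAgent) are illustrative, not OpenClaw's actual functions:

```typescript
import { randomUUID } from "node:crypto";

// A minimal sketch of "ack-first": acknowledge with a tracking id right away,
// then run the slow agent work in the background.
interface Ack {
  runId: string;
  status: "accepted";
}

function handleIntake(
  message: { sessionKey: string; text: string },
  runAgent: (runId: string, msg: { sessionKey: string; text: string }) => Promise<void>
): Ack {
  const runId = randomUUID(); // unique id used to trace this run

  // Kick off processing without awaiting it...
  runAgent(runId, message).catch((err) => {
    console.error(`run ${runId} failed`, err);
  });

  // ...and return instantly so the user sees immediate feedback.
  return { runId, status: "accepted" };
}
```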
Phase 2: CONTEXT ASSEMBLY
Before the LLM sees anything, the system assembles the full context window. This includes the conversation history from the session's JSONL file, any relevant memory (from MEMORY.md and recent daily logs), the system prompt defining the agent's behavior, and the currently loaded skills. Getting this assembly right determines whether the agent has enough context to respond intelligently—or hallucinates due to missing information.
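A hedged sketch of what that assembly might look like, assuming the file layout described later in this guide (the SYSTEM.md name and the assembly logic are assumptions):

```typescript
import { readFile } from "node:fs/promises";
import * as os from "node:os";
import * as path from "node:path";

// Illustrative only: the file layout follows the workspace described in this
// guide, but the assembly logic and SYSTEM.md name are assumptions.
async function assembleContext(sessionId: string, userMessage: string) {
  const base = path.join(os.homedir(), ".openclaw");
  const read = (p: string) => readFile(p, "utf8").catch(() => "");

  const [systemPrompt, longTermMemory, history] = await Promise.all([
    read(path.join(base, "workspace", "SYSTEM.md")),         // system prompt (assumed name)
    read(path.join(base, "workspace", "MEMORY.md")),         // long-term facts
    read(path.join(base, "sessions", `${sessionId}.jsonl`)), // prior turns, one JSON per line
  ]);

  // Each JSONL line is one prior message in the conversation
  const turns = history
    .split("\n")
    .filter(Boolean)
    .map((line) => JSON.parse(line));

  return {
    system: systemPrompt,
    memory: longTermMemory,
    messages: [...turns, { role: "user", content: userMessage }],
  };
}
```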
Skills, Tools, and Plugins
OpenClaw's extensibility comes in three forms:
Skills follow the AgentSkills spec—an open standard used by Claude Code, Cursor, and others. Each skill is a folder containing a SKILL.md file with YAML frontmatter (name, description, requirements) and Markdown instructions.

OpenClaw ships with bundled default skills; you can override or extend them via workspace skills (~/.openclaw/workspace/skills/) or managed skills (~/.openclaw/skills/). Priority follows: Workspace > Managed > Bundled. At load time, skills are filtered based on environment and binary requirements (e.g., a skill requiring gemini CLI won't load if it's not installed). A file watcher detects changes and updates the skills snapshot automatically.

When loaded, skills are injected as a compact XML list into the system prompt—token cost is deterministic: ~195 characters base overhead plus ~97 characters per skill plus content length. Community skills are shared via ClawHub, with 3,000+ available.
Skills Loading Priority
1. Workspace skills (~/.openclaw/workspace/skills/) → Highest priority
2. Managed skills (~/.openclaw/skills/) → Shared across agents
3. Bundled skills (npm package / OpenClaw.app) → Default skills shipped with OpenClaw
Workspace skills override managed skills, which override bundled skills.
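As a small illustration of both the priority rule and the token math quoted above, here is a sketch; the type and function names are illustrative, not OpenClaw's internals:

```typescript
// Sketch of priority-based skill resolution.
interface Skill {
  name: string;
  source: "workspace" | "managed" | "bundled";
  content: string; // the SKILL.md body
}

function resolveSkills(workspace: Skill[], managed: Skill[], bundled: Skill[]): Skill[] {
  const byName = new Map<string, Skill>();
  // Lowest priority first; later entries overwrite earlier ones by name.
  for (const skill of [...bundled, ...managed, ...workspace]) {
    byName.set(skill.name, skill);
  }
  return [...byName.values()];
}

// Rough prompt-size estimate from the figures quoted above:
// ~195 chars base + ~97 chars per skill + each skill's content length.
function estimateSkillPromptChars(skills: Skill[]): number {
  return 195 + skills.reduce((sum, s) => sum + 97 + s.content.length, 0);
}
```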
Tools are built-in capabilities: exec (shell commands with safety allowlist), browser (Playwright via semantic snapshots), memory_search, memory_get, and apply_patch. For third-party integrations, OpenClaw supports MCP (Model Context Protocol) via the mcporter skill or native support, connecting to services like Notion, Linear, and Stripe.
Plugins are TypeScript code that hooks into the system lifecycle (before_tool_call, agent_end, before_compaction). Plugins can also ship their own skills via openclaw.plugin.json. Skills change what the agent knows; tools define what it can do; plugins change how the system operates.
You can read more about skills in the official documentation: Skills - OpenClaw
Phase 3: MODEL INFERENCE
The assembled context goes to the LLM provider. The model either returns final text (the response) or requests tool calls (actions to perform). Responses stream back token-by-token rather than waiting for completion, which enables real-time UI updates and reduces perceived latency.
Phase 4: TOOL EXECUTION
When the model requests a tool call—say, checking your calendar—execution happens here. The tool runs, results are appended to the conversation, and control returns to Phase 3 for another inference round. This loop continues until the model produces final text or hits the turn limit (typically ~20 iterations). This iterative loop is what distinguishes an "agent" from a simple chatbot: the ability to take multiple actions, observe results, and adjust course.
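In rough terms, the loop looks like the sketch below; the Provider and tool interfaces are assumptions rather than OpenClaw's real types:

```typescript
// Simplified sketch of the inference / tool-execution loop.
type ModelTurn =
  | { kind: "text"; content: string }
  | { kind: "tool_call"; name: string; args: unknown };

interface Provider {
  complete(messages: unknown[]): Promise<ModelTurn>;
}

async function runAgentLoop(
  provider: Provider,
  tools: Record<string, (args: unknown) => Promise<string>>,
  messages: unknown[],
  maxTurns = 20 // the turn limit mentioned above
): Promise<string> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const result = await provider.complete(messages);

    if (result.kind === "text") {
      return result.content; // final answer, loop ends
    }

    // The model asked for a tool: run it, append the result, loop again.
    const tool = tools[result.name];
    const output = tool ? await tool(result.args) : `Unknown tool: ${result.name}`;

    messages.push({ role: "assistant", tool_call: result });
    messages.push({ role: "tool", name: result.name, content: output });
  }
  return "Turn limit reached without a final answer.";
}
```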
Phase 5: STREAMING
As the model generates its final response, tokens stream to the client in real-time. Users see text appear progressively rather than waiting for a complete response. This isn't just cosmetic—it lets users interrupt or redirect if the response is heading in the wrong direction.
Phase 6: PERSISTENCE
Once complete, the full conversation (user message, tool calls, tool results, assistant response) is appended to the session's JSONL file. Memory files are updated if the agent wrote notes. A lifecycle:end event broadcasts completion. This persistence ensures continuity—the next message picks up exactly where this one left off.
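As a rough sketch (record shape and path assumed, not OpenClaw's actual schema), appending to the JSONL transcript is essentially a one-liner:

```typescript
import { appendFile } from "node:fs/promises";

// Sketch: persist one message as a JSON Lines record.
async function persistTurn(
  sessionFile: string, // e.g. the session's .jsonl file under ~/.openclaw/sessions/
  record: { role: string; content: string; timestamp: number }
): Promise<void> {
  // One JSON object per line; appending keeps earlier history untouched.
  await appendFile(sessionFile, JSON.stringify(record) + "\n", "utf8");
}
```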
Why the Tool Loop Matters
The Phase 4 → Phase 3 loop is the core of agentic behavior. A single user message like "Cancel my 3pm meeting and notify attendees" might trigger: calendar lookup → event identification → cancellation API call → contact retrieval → email composition → email send → confirmation. Each step feeds into the next, with the LLM deciding what to do based on accumulated results. Without this loop, you have a chatbot. With it, you have an agent.
4. The Secret Sauce: Lane-Based Concurrency
This is where OpenClaw gets clever. If you've written any async JavaScript, you know how easy it is to create race conditions. OpenClaw's "lane" system prevents them elegantly.
Lane-Based Concurrency Diagram
Figure 3: Lane-based concurrency—per-session queues prevent race conditions.
The Problem: Why Normal Async/Await Fails
Imagine three users send messages at the same time:
// NIGHTMARE SCENARIO - DON'T DO THIS!
async function handleMessage(user, message) {
const session = await loadSession(user); // 🔴 Race condition!
const response = await callAI(session); // 🔴 Interleaved!
await saveSession(user, response); // 🔴 Corrupted!
}
// Alice, Bob, and Carol all send messages at once:
handleMessage("alice", "Hello"); // Starts loading Alice's session
handleMessage("bob", "Hi there"); // Starts loading Bob's session
handleMessage("carol", "Hey!"); // Starts loading Carol's session
// 💥 CHAOS: Sessions get mixed up, outputs interleave, state corrupts
What goes wrong:
- Alice's response might include Bob's context
- Carol's session file might be half-written when Alice tries to read it
- Log files become unreadable garbage
The Solution: Think in "Lanes"
Instead of asking "what do I need to lock?", ask: "what can safely run in parallel?"
The Lane Mental Model:
| Lane Type | Concurrency | Purpose |
|---|---|---|
| session:alice | 1 | Only ONE operation on Alice's session at a time |
| session:bob | 1 | Only ONE operation on Bob's session at a time |
| main (global) | 4 | Maximum 4 total operations across all users |
How It Works:
1. Message arrives for Alice
   - Check Alice's session lane → Is it free? ✅
   - Check global lane → Is there a slot? ✅
   - Start processing!
2. Another message arrives for Alice (while the first is running)
   - Check Alice's session lane → Is it free? ❌ (still processing the first message)
   - Queue it. Alice's messages always run in order.
3. Message arrives for Bob (while Alice is running)
   - Check Bob's session lane → Is it free? ✅
   - Check global lane → Is there a slot? ✅
   - Start processing! (Alice and Bob run in parallel, safely)
Code Example: Lane Implementation
// Each queued job carries an id and the work to run
interface QueueItem {
  id: string;
  run: () => Promise<void>;
}

class Lane {
  private queue: QueueItem[] = [];
  private active: Set<string> = new Set();

  constructor(
    public name: string,          // e.g. "session:alice" or "main"
    public maxConcurrency: number // 1 for sessions, 4 for the global lane
  ) {}

  enqueue(item: QueueItem): void {
    this.queue.push(item); // Add to the back (FIFO)
  }

  canRun(): boolean {
    // Can we start a new job?
    return this.active.size < this.maxConcurrency && this.queue.length > 0;
  }

  dequeue(): QueueItem | undefined {
    if (!this.canRun()) return undefined;
    const item = this.queue.shift(); // Take from front (FIFO)
    if (item) this.active.add(item.id);
    return item;
  }

  complete(id: string): void {
    this.active.delete(id);
    // Now there's room for the next item!
  }
}
The beauty: No locks, no mutexes, no deadlocks. Just queues.
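The Lane class above manages a single queue. To get the behavior described in "How It Works", something has to consult both the per-session lane and the global lane before starting a job. Here is one possible sketch of that dispatcher, reusing the Lane and QueueItem types from the example above (including the enqueue helper); the scheduling policy is an illustration, not OpenClaw's exact implementation:

```typescript
// Dispatcher sketch: a job starts only when its session lane AND the global
// concurrency cap both have room.
class LaneScheduler {
  private sessionLanes = new Map<string, Lane>();
  private globalActive = 0;
  private readonly globalLimit = 4; // the "main" lane concurrency

  private laneFor(sessionKey: string): Lane {
    let lane = this.sessionLanes.get(sessionKey);
    if (!lane) {
      lane = new Lane(`session:${sessionKey}`, 1); // one job per session at a time
      this.sessionLanes.set(sessionKey, lane);
    }
    return lane;
  }

  submit(sessionKey: string, item: QueueItem): void {
    this.laneFor(sessionKey).enqueue(item);
    this.pump(sessionKey);
  }

  private pump(sessionKey: string): void {
    const lane = this.laneFor(sessionKey);

    // Start only if BOTH the session lane and the global cap allow it.
    if (!lane.canRun() || this.globalActive >= this.globalLimit) return;

    const item = lane.dequeue();
    if (!item) return;
    this.globalActive++;

    item
      .run()
      .catch((err) => console.error(`job ${item.id} failed`, err))
      .finally(() => {
        lane.complete(item.id);
        this.globalActive--;
        this.pump(sessionKey); // see if the next queued job can start now
      });
  }
}
```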
Queue Modes: Handling Different Situations
What if someone sends 5 messages while the AI is still thinking about the first one?
| Mode | What Happens | Use Case |
|---|---|---|
| collect (default) | Combine all queued messages into one | User types in bursts |
| steer | Interrupt current thinking, pivot to new message | "Actually, do this instead!" |
| followup | Queue for after current run finishes | "When you're done, also check X" |
| interrupt | Abort everything, start fresh | "STOP! Emergency!" |
Example of collect mode:
User: "Hello"
User: "Check my email"
User: "And the weather"
// Without collect: 3 separate AI runs (slow, expensive)
// With collect: 1 AI run with all 3 messages (efficient!)
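A minimal sketch of what that merging could look like (the joining format is an assumption):

```typescript
// Sketch: messages that queued up during an active run are merged into one
// prompt, so the next run handles them together instead of one run each.
function collectQueued(queued: string[]): string {
  return queued.join("\n"); // ["Hello", "Check my email", "And the weather"] -> one prompt
}
```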
5. How the Gateway Talks to Everything: WebSocket Protocol
The Gateway uses WebSocket to communicate with all clients in real-time. Let's see the conversation flow.
WebSocket Connection Lifecycle Diagram
Figure 4: WebSocket connection lifecycle—connect, subscribe, stream, disconnect.
Understanding the Message Types
OpenClaw uses three types of WebSocket messages:
1. REQUEST (Client → Gateway)
{
type: "req",
id: "msg-001", // For tracking responses
method: "agent", // What action to take
params: {
sessionKey: "main",
message: "What's the weather?"
}
}
2. RESPONSE (Gateway → Client)
{
type: "res",
id: "msg-001", // Matches the request!
ok: true,
payload: {
runId: "run-abc123",
status: "accepted"
}
}
3. EVENT (Gateway → Client, server-push)
{
type: "event",
event: "agent",
payload: {
stream: "assistant",
delta: "The weather today is " // Streaming text!
}
}
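These three shapes can be modeled as a small discriminated union in TypeScript. The type names below are illustrative, not OpenClaw's published client types:

```typescript
// Illustrative TypeScript model of the three WebSocket frame types.
interface GatewayRequest {
  type: "req";
  id: string;               // client-chosen id, echoed in the response
  method: string;           // e.g. "connect", "agent"
  params?: Record<string, unknown>;
}

interface GatewayResponse {
  type: "res";
  id: string;               // matches the originating request
  ok: boolean;
  payload?: Record<string, unknown>;
}

interface GatewayEvent {
  type: "event";
  event: string;            // e.g. "agent", "lifecycle:end"
  payload?: Record<string, unknown>;
}

type GatewayFrame = GatewayRequest | GatewayResponse | GatewayEvent;

// The "type" field lets a client dispatch on a single switch statement:
function dispatch(frame: GatewayFrame): void {
  switch (frame.type) {
    case "req":   /* server side: route by frame.method */ break;
    case "res":   /* resolve the pending promise for frame.id */ break;
    case "event": /* push to UI stream handlers */ break;
  }
}
```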
The Connection Dance (Step by Step)
1. Client opens WebSocket connection
   - "Hey Gateway, I want to talk!"
2. Client sends connect request (MUST be first)
   - "Here's my authentication token and device info"
3. Gateway responds with hello-ok
   - "Welcome! Here's who else is connected"
4. Client sends agent request
   - "User wants to know the weather"
5. Gateway responds IMMEDIATELY with ACK
   - "Got it! Tracking number: run-abc123"
6. Gateway streams events as AI thinks
   - event: tool → "Looking up weather..."
   - event: assistant → "The weather today is..."
7. Gateway sends final response
   - "Done! Here's the complete answer"
Why Immediate ACK Matters
Without ACK:
User sends message → ⏳ 30 seconds of nothing → Response appears
(User thinks: "Did it work? Should I resend?")
With ACK:
User sends message → ✅ "Got it!" (0.1 seconds) → Streaming updates → Response
(User thinks: "Cool, it's working on it!")
6. Memory That Actually Works: Hybrid Search
How does OpenClaw remember that you mentioned your dog's name three weeks ago? Two search methods working together.
Hybrid Memory Search Diagram
Figure 5: Hybrid memory search—vector (semantic) and BM25 (keyword) combined.
The Problem: One Search Method Isn't Enough
| Search Type | What It's Good At | What It Misses |
|---|---|---|
| Vector (Semantic) | "meeting" matches "appointment" | Exact codes like AUTH_TOKEN_123 |
| BM25 (Text) | Exact matches, IDs, error codes | Paraphrases ("car" won't match "automobile") |
Real example:
- You told the AI: "My API key is sk-abc123xyz"
- Later you ask: "What's my API key?"
- Vector search might miss it (semantic similarity is low)
- BM25 finds it instantly (exact text match)
How Hybrid Search Works
// Minimal shapes for the two search backends (illustrative)
interface VectorHit { id: string; score: number } // cosine similarity, 0..1
interface TextHit   { id: string; rank: number }  // BM25 rank, 0 = best
interface Result    { id: string; score: number }

// Assume the vector and full-text indexes provide these (implementations elided)
declare function vectorSearch(query: string): Promise<VectorHit[]>;
declare function bm25Search(query: string): Promise<TextHit[]>;

async function hybridSearch(query: string): Promise<Result[]> {
  // Step 1: Search BOTH systems in parallel
  const [vectorResults, bm25Results] = await Promise.all([
    vectorSearch(query), // "Does this MEAN similar things?"
    bm25Search(query)    // "Does this CONTAIN similar words?"
  ]);

  // Step 2: Merge results by chunk ID
  const merged = new Map<string, { vectorScore?: number; textScore?: number }>();
  for (const result of vectorResults) {
    merged.set(result.id, { vectorScore: result.score });
  }
  for (const result of bm25Results) {
    const existing = merged.get(result.id) || {};
    existing.textScore = 1 / (1 + result.rank); // Convert rank to score
    merged.set(result.id, existing);
  }

  // Step 3: Calculate combined score
  const scored: Result[] = [...merged.entries()].map(([id, scores]) => ({
    id,
    // Vector gets more weight (0.7) because semantic is usually better
    score: 0.7 * (scores.vectorScore || 0) + 0.3 * (scores.textScore || 0)
  }));

  // Step 4: Sort and return best matches
  return scored.sort((a, b) => b.score - a.score);
}
The Memory File Structure
~/.openclaw/workspace/
├── MEMORY.md # Long-term notes about you
│ └── "User prefers morning meetings"
│ └── "Their dog is named Max"
│ └── "Allergic to peanuts"
│
└── memory/
├── 2026-01-31.md # What happened Jan 31
├── 2026-02-01.md # What happened Feb 1
├── 2026-02-02.md # What happened yesterday
└── 2026-02-03.md # What happened today
Why plain files?
- You can read them yourself (no special tools needed)
- Easy to debug ("why did it remember that?")
- Easy to edit ("let me correct this memory")
7. Surviving Long Conversations: Pre-Compaction Memory Flush
AI models have limited "context windows" (how much text they can see at once). What happens when a conversation gets too long?
Pre-Compaction Memory Flush Diagram
Figure 6: Pre-compaction memory flush—silent write before summarization.
The Problem: Context Windows Fill Up
Imagine you've been chatting with your AI assistant for an hour. The conversation is now 50,000 tokens—but the AI can only see 100,000 tokens at once. Add your memory files, system prompt, and skills, and suddenly you're running out of room.
Without flush:
Context fills up → AI summarizes everything → Important details lost!
"What was that API key you mentioned 45 minutes ago?"
"I don't recall any API key." 😱
With flush:
Context fills up → AI secretly saves important details → Then summarizes
"What was that API key you mentioned 45 minutes ago?"
"It was sk-abc123xyz, which you mentioned at 2:15 PM."
How Pre-Compaction Flush Works
1. System monitors context usage
   - "We've used 80,000 of 100,000 tokens..."
2. Threshold reached
   - "⚠️ We're getting close to the limit!"
3. Silent agent turn (user doesn't see this!)
   - System: "Session nearing compaction. Store durable memories now."
   - AI: [Writes important details to memory/2026-02-03.md]
   - AI: "NO_REPLY" (signals done, nothing to say to user)
4. Compaction proceeds
   - Old messages get summarized
   - But memory files are PRESERVED
5. Context freed up, memories intact
   - AI can recall what it wrote to memory
   - User never knew this happened
The Genius Part
The AI essentially takes notes before an exam, knowing it won't be able to see the original textbook later. Self-preservation through proactive memory management.
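A hedged sketch of the trigger logic, with the threshold and prompt text taken from the walkthrough above and all function names assumed:

```typescript
const CONTEXT_LIMIT = 100_000; // tokens the model can see
const FLUSH_THRESHOLD = 0.8;   // flush at 80% usage, per the example above

async function maybeFlushBeforeCompaction(
  usedTokens: number,
  runSilentTurn: (systemNote: string) => Promise<string>, // returns the AI's reply
  compact: () => Promise<void>
): Promise<void> {
  if (usedTokens < CONTEXT_LIMIT * FLUSH_THRESHOLD) return; // plenty of room left

  // Silent agent turn: the user never sees this exchange.
  const reply = await runSilentTurn(
    "Session nearing compaction. Store durable memories now."
  );

  // "NO_REPLY" signals the agent wrote its notes and has nothing to say.
  if (reply.trim() !== "NO_REPLY") {
    console.warn("unexpected reply during silent flush:", reply);
  }

  await compact(); // old messages are summarized; memory files survive
}
```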
8. Try It Yourself: Docker Setup
Why Docker? As Simon Willison put it: "I'm not brave enough to run OpenClaw directly on my Mac." Docker provides filesystem isolation, network control, easy cleanup, and a reproducible environment.
Prerequisites
- Docker Desktop installed and running
- Git installed
- A Telegram account (optional, for mobile access)
- An API key (OpenAI, Anthropic, or ChatGPT subscription)
Step 1: Clone the Repository
git clone https://github.com/openclaw/openclaw
cd openclaw
The repository contains docker-setup.sh which uses Docker Compose and their docker-compose.yml file.
Step 2: Run the Setup Script
./docker-setup.sh
This creates two folders on your machine (mounted as volumes in the container):
| Folder | Purpose |
|---|---|
| ~/.openclaw | Configuration, memory, API keys, session data |
| ~/openclaw/workspace | Files available to the agent (reads and writes here) |
Step 3: Answer the Setup Questions
The setup wizard asks several questions. Here are the recommended answers:
| Question | Recommended Answer |
|---|---|
| Onboarding mode | manual |
| What do you want to set up? | Local gateway (this machine) |
| Model provider | Choose based on your preference (see below) |
| Tailscale | no (say no unless you specifically need remote access) |
Model Provider Options
| Provider | Setup | Cost Consideration |
|---|---|---|
| ChatGPT OAuth | Authenticate with your ChatGPT account | Capped by $20/month subscription (recommended for cost control) |
| OpenAI API | Provide OPENAI_API_KEY | Pay per token (can get expensive!) |
| Anthropic | Provide ANTHROPIC_API_KEY | Pay per token (best results with Claude) |
| Local (Ollama) | Install Ollama, pull a model | Free but slower |
Tip: Simon Willison chose ChatGPT OAuth because "I've heard that OpenClaw can spend a lot of tokens on API plans, and using ChatGPT put an easy upper limit on how much it could spend."
ChatGPT OAuth Flow
If you choose ChatGPT OAuth:
- OpenClaw gives you a URL to open in your browser
- You authenticate with ChatGPT
- It redirects to a localhost URL that shows an error (this is expected!)
- Copy that localhost URL and paste it back into OpenClaw to complete authentication
Step 4: Verify the Container is Running
docker ps
You should see a container running the image openclaw:local with a name like openclaw-openclaw-gateway-1.
Step 5: Run Administrative Commands
Use the openclaw-cli container for management commands. Important: Run these from the same folder as docker-compose.yml:
# Check status
docker compose run --rm openclaw-cli status
# List devices
docker compose run --rm openclaw-cli devices list
Step 6: Access the Web UI
OpenClaw runs a web UI on port 18789.
Get your access token:
docker compose run --rm openclaw-cli dashboard --no-open
This outputs a URL like:
http://localhost:18789?token=YOUR_SECRET_TOKEN
Open this URL in your browser to access the dashboard.
Troubleshooting: "pairing required" Error
If you see disconnected (1008): pairing required, you need to approve the device:
# List pending pairings
docker compose exec openclaw-gateway \
node dist/index.js devices list
# Approve a pending request (use the Request ID from the list)
docker compose exec openclaw-gateway \
node dist/index.js devices approve \
YOUR-REQUEST-ID-HERE
Step 7: Set Up Telegram (Optional but Recommended)
Telegram lets you control OpenClaw from your phone—very useful!
Create a Telegram Bot:
1. Open Telegram and start a chat with @BotFather
2. Send the command /newbot
3. Follow the prompts to name your bot
4. Copy the token BotFather gives you
Provide the token during setup (or add it later in configuration).
Approve the pairing:
# OpenClaw will send you a pairing code via Telegram
docker compose run --rm openclaw-cli pairing approve telegram <CODE>
Now you can message your bot directly from Telegram on your phone!
Step 8: Installing Extra Packages (Optional)
The OpenClaw bot runs as a non-root user (for security). To install extra packages:
# Get a root shell in the container
docker compose exec -u root openclaw-gateway bash
# Install packages (example: ripgrep)
apt-get update && apt-get install -y ripgrep
Quick Reference: Common Commands
| Task | Command |
|---|---|
| Check container status | docker ps |
| View OpenClaw status | docker compose run --rm openclaw-cli status |
| Get dashboard URL | docker compose run --rm openclaw-cli dashboard --no-open |
| List devices | docker compose exec openclaw-gateway node dist/index.js devices list |
| Approve device | docker compose exec openclaw-gateway node dist/index.js devices approve <ID> |
| Approve Telegram | docker compose run --rm openclaw-cli pairing approve telegram <CODE> |
| Root shell | docker compose exec -u root openclaw-gateway bash |
| Stop OpenClaw | docker compose down |
| Start OpenClaw | docker compose up -d openclaw-gateway |
Security Reminders
- Use Docker — Don't run directly on your machine
- ChatGPT OAuth — Consider this to cap spending
- Skip Tailscale — Unless you specifically need remote access
- Review approvals — Check ~/.openclaw/exec-approvals.json regularly
- Limit workspace — Only put files in ~/openclaw/workspace that you want the agent to access
9. Key Takeaways & What's Next
What We Learned
1. Architecture Matters
- One Gateway daemon simplifies everything
- File-based memory is debuggable and simple
- WebSocket enables real-time communication
2. Concurrency is Hard (Lanes Make It Easier)
- Don't ask "what do I lock?" — ask "what can run in parallel?"
- Session lanes ensure ordering
- Global lanes prevent resource exhaustion
3. Memory Needs Both Semantic and Exact Search
- Vector search catches meaning
- BM25 catches exact matches
- Together they cover each other's blind spots
4. Long Conversations Need Proactive Memory
- Don't wait until compaction to save important details
- Silent agent turns preserve critical context
Design Patterns to Remember
| Pattern | What It Does | When to Use |
|---|---|---|
| Acknowledge-Then-Execute | Return immediately, process async | Any long-running operation |
| Dual-Queue Serialization | Two queue layers (session + global) | Multi-tenant systems |
| Event Stream Multiplexing | One connection, multiple logical streams | Real-time UIs |
| Content-Hash Caching | Skip re-processing unchanged content | Embedding, indexing |
| Graceful Degradation | Fall back when services fail | Production resilience |
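For example, the Content-Hash Caching row could look like this minimal sketch (the embed callback and in-memory cache are assumptions, not OpenClaw's implementation):

```typescript
import { createHash } from "node:crypto";

// Sketch of content-hash caching: only re-embed a chunk when its text changes.
const embeddingCache = new Map<string, number[]>(); // hash -> embedding

async function embedCached(
  text: string,
  embed: (t: string) => Promise<number[]> // provider call, assumed
): Promise<number[]> {
  const hash = createHash("sha256").update(text).digest("hex");
  const hit = embeddingCache.get(hash);
  if (hit) return hit;                // unchanged content: skip the API call
  const vector = await embed(text);
  embeddingCache.set(hash, vector);
  return vector;
}
```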
Next Steps
To learn more:
- Read the full engineering guide
- Explore the TypeScript examples
- Try modifying the code and see what happens!
Questions to explore:
- How would you add a new messaging platform?
- What happens if the Gateway crashes mid-conversation?
- How do "skills" differ from "plugins"?
Build something:
- Create a custom skill for your workflow
- Build a simple Gateway client in your favorite language
- Experiment with different queue modes
Appendix: Quick Reference
File Locations
~/.openclaw/
├── config.json # API keys, settings
├── workspace/
│ ├── MEMORY.md # Long-term memory
│ ├── memory/ # Daily logs
│ └── skills/ # Skill definitions
└── sessions/ # Conversation histories
WebSocket Endpoints
| Endpoint | Purpose |
|---|---|
| ws://127.0.0.1:18789 | Gateway WebSocket API |
| http://localhost:18793 | Canvas UI |
Lane Defaults
| Lane | Concurrency | Purpose |
|---|---|---|
| session:* | 1 | Per-session serialization |
| main | 4 | Primary agent runs |
| subagent | 8 | Parallel sub-tasks |
| cron | 2 | Scheduled jobs |