Technical Deep Dive into OpenClaw Architecture
An engineering guide exploring OpenClaw's architecture—lane-based concurrency, hybrid memory search, WebSocket protocol, and the agent loop that powers autonomous AI.
Introduction
In November 2025, Peter Steinberger (founder of PSPDFKit) released an open-source project that would spark both excitement and controversy across the AI community. Within two months, it had surpassed 100K GitHub stars. Security researchers raised alarms. Developers called it "the closest thing to Jarvis we've ever had."
That project is OpenClaw—an autonomous AI agent that runs entirely on your machine, remembers conversations from weeks ago, and can proactively reach out to you when something needs attention.
But this guide isn't about the hype. It's about the engineering.
By the end, you'll understand:
- How a single process safely handles messages from WhatsApp, Telegram, Discord, and more—simultaneously
- Why traditional async/await fails in agent systems (and the elegant "lane" solution)
- How persistent memory survives context window limits across weeks of conversation
- The clever trick that makes browser automation actually reliable
Let's dive in.
Table of Contents
- What is OpenClaw?
- The Big Picture: System Architecture
- How Messages Become Actions: The Agent Loop
- The Secret Sauce: Lane-Based Concurrency
- How the Gateway Talks to Everything: WebSocket Protocol
- Memory That Actually Works: Hybrid Search
- Surviving Long Conversations: Pre-Compaction Memory Flush
- Try It Yourself: Docker Setup
- Key Takeaways & What's Next
1. What is OpenClaw?
OpenClaw isn't just another AI chatbot. It's an autonomous AI agent that can think, act, and improve itself—all running locally on your hardware.
What Makes It Revolutionary
Self-Improving Agent — Most assistants wait for you to ask, then forget everything. OpenClaw can autonomously write new skills to extend its own capabilities. Need it to handle a workflow it doesn't know? It writes the code, creates the skill, and executes—without you touching a line.
Persistent Memory Across Weeks — The "amnesia problem" is solved. OpenClaw maintains a memory layer that retains project context, user preferences, and conversation history across sessions. Ask about something you mentioned 14 days ago, and it remembers.
Proactive "Heartbeat" — Traditional assistants are reactive: you ask, they answer. OpenClaw has a heartbeat—the ability to wake up proactively, monitor conditions, trigger automations, and reach out when something needs attention. It's the difference between a tool and a colleague.
True Autonomous Execution — OpenClaw doesn't just talk about doing things—it does them. Shell commands, file system navigation, web browsing, email, calendars, multi-step workflows. A single message like "research competitors and draft a summary email" triggers a chain of real actions.
Multi-Platform Messaging — Chat through WhatsApp, Telegram, Discord, Slack, Signal, or iMessage—wherever you already communicate. The same agent, accessible from phone, desktop, or group chats.
Sovereign AI / Local-First — All data, memory, and API keys stay on your hardware. No cloud dependency. No third-party data access. Viable for regulated industries where data cannot leave the device.
Model Agnostic — Connect to Claude, GPT, Gemini, or run fully offline with local models via Ollama. You choose the brain; OpenClaw provides the body.
The Ecosystem
OpenClaw isn't just software—it's a growing ecosystem:
| Component | What It Does |
|---|---|
| ClawHub | 3,000+ community-built skills for email, data analysis, media control |
| Moltbook | AI-agent social network where autonomous agents interact independently |
| MCP Integration | Connect to 100+ third-party services via Model Context Protocol |
The Tech Stack
| Layer | Technology |
|---|---|
| Language | TypeScript (Node.js) |
| Real-time Communication | WebSocket |
| Database | SQLite with vector extensions (sqlite-vec) |
| Browser Automation | Playwright (semantic snapshots) |
| Messaging Platforms | WhatsApp, Telegram, Discord, Slack, iMessage |
The Tradeoff: Power Requires Responsibility
OpenClaw runs with significant system access—executing commands, modifying files, browsing the web autonomously. This power comes with risk. Security researchers have flagged potential vulnerabilities (prompt injection, credential storage, elevated access). OpenClaw is suited for users who understand the implications of running autonomous agents. Docker sandboxing is strongly recommended.
Now let's look under the hood to see how it all works.
2. The Big Picture: System Architecture
Before we dive into code, let's understand what pieces exist and how they connect.
Architecture Diagram
Figure 1: OpenClaw system architecture—messaging adapters, gateway daemon, and workspace.
The best way to understand OpenClaw's architecture is to follow what happens when you send a message. Let's trace the complete flow:
1. Message Entry → The Messaging Layer
When you type "What's on my calendar today?" in WhatsApp, the message first hits OpenClaw's messaging adapters. Each platform (WhatsApp, Telegram, Discord, Slack, iMessage) has its own adapter using platform-specific libraries. For example, WhatsApp uses Baileys (a reverse-engineered protocol), while Telegram uses MTProto. These adapters normalize messages into a common format—regardless of where a message comes from, the rest of the system sees it the same way.
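To make "normalize into a common format" concrete, here is a minimal sketch of what a shared message shape and an adapter contract might look like. All names here (InboundMessage, MessagingAdapter, RawTelegramClient) are illustrative, not OpenClaw's actual types:

```typescript
// Hypothetical normalized shape: every platform adapter produces this
interface InboundMessage {
  platform: "whatsapp" | "telegram" | "discord" | "slack" | "imessage";
  chatId: string;     // platform-specific conversation identifier
  senderId: string;
  text: string;
  receivedAt: number; // epoch milliseconds
}

// Every adapter hides its platform library behind the same small contract
interface MessagingAdapter {
  name: string;
  onMessage(handler: (msg: InboundMessage) => void): void;
  send(chatId: string, text: string): Promise<void>;
}

// Minimal view of an underlying Telegram client (shape assumed for the sketch)
interface RawTelegramClient {
  on(
    event: "message",
    cb: (raw: { chat: { id: number }; from: { id: number }; text?: string }) => void
  ): void;
  sendMessage(chatId: string, text: string): Promise<unknown>;
}

class TelegramAdapter implements MessagingAdapter {
  name = "telegram";
  constructor(private client: RawTelegramClient) {}

  onMessage(handler: (msg: InboundMessage) => void): void {
    this.client.on("message", (raw) => {
      handler({
        platform: "telegram",
        chatId: String(raw.chat.id),
        senderId: String(raw.from.id),
        text: raw.text ?? "",
        receivedAt: Date.now(),
      });
    });
  }

  async send(chatId: string, text: string): Promise<void> {
    await this.client.sendMessage(chatId, text);
  }
}
```

Downstream code only ever sees InboundMessage, so adding a new platform means writing one adapter rather than touching the Gateway.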
2. Single Point of Control → The Gateway Daemon
Every normalized message flows into the Gateway Daemon—a single, long-running process that acts as the central nervous system. This is a deliberate architectural choice: having one "boss" process eliminates coordination problems between multiple services.
Inside the Gateway, your message passes through four stages:
- Session Router determines which conversation context this message belongs to. Is it a private DM? A group chat? A new conversation or continuing one?
- Command Queue (Lanes) ensures messages don't collide. If you send three messages rapidly, they're queued and processed in order—never simultaneously. This prevents race conditions and corrupted state.
- Agent Orchestrator manages the AI's thinking loop: assembling context, calling the LLM, executing tools, and iterating until a response is ready.
- Tool Executor handles any actions the AI needs to take—running shell commands, automating browsers, reading files, or searching memory.
3. Intelligence → The LLM Provider
When the Agent Orchestrator needs to "think," it sends the assembled context (your message, conversation history, memory, skills) to an external LLM Provider—Anthropic Claude, OpenAI, or a local model like Ollama. The AI responds with either text (the answer) or tool calls (actions to take). If tools are called, results feed back into the orchestrator for another thinking round.
4. Persistence → The Workspace
Throughout this process, the Gateway reads from and writes to the Workspace—a folder structure on your machine. MEMORY.md holds long-term facts about you, memory/ contains daily conversation logs, skills/ defines special capabilities, and sessions/ stores conversation histories. This file-based approach means everything is human-readable and debuggable.
5. Response Delivery → Back Through the Stack
Once the AI produces a final response, it flows back up: Gateway → messaging adapter → your WhatsApp. The entire round trip typically completes in under a second.
The Key Architectural Insight
This is a hub-and-spoke architecture with the Gateway as the hub. All messaging channels, all client apps, all tools—everything connects through that single daemon. This simplifies state management (one source of truth), eliminates distributed coordination problems, and makes debugging straightforward. The tradeoff is that the Gateway becomes a single point of failure—but for a local-first personal assistant, that's an acceptable tradeoff for simplicity.
3. How Messages Become Actions: The Agent Loop
When you send "What meetings do I have today?" to your AI assistant, what actually happens? Let's trace the full journey.
Agent Loop Diagram
Figure 2: Agent loop phases—intake, context assembly, model inference, tool execution, streaming, persistence.
The Six Phases of Execution
Every agent run follows six distinct phases. Understanding these helps you debug issues, optimize performance, and extend the system.
Phase 1: INTAKE
The moment a message arrives, OpenClaw validates it (checking authentication and session permissions), generates a unique runId for tracing, and immediately returns an acknowledgment. This "ack-first" pattern is critical—users get instant feedback that their message was received, even if the actual processing takes several seconds. The alternative (waiting for full completion) creates a poor user experience and risks timeout errors.
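Here is a minimal sketch of that pattern; the names (handleIntake, runAgent) are illustrative, not OpenClaw's actual functions:

```typescript
import { randomUUID } from "node:crypto";

// A minimal sketch of "ack-first": acknowledge with a tracking id right away,
// then run the slow agent work in the background.
interface Ack {
  runId: string;
  status: "accepted";
}

function handleIntake(
  message: { sessionKey: string; text: string },
  runAgent: (runId: string, msg: { sessionKey: string; text: string }) => Promise<void>
): Ack {
  const runId = randomUUID(); // unique id used to trace this run

  // Kick off processing without awaiting it...
  runAgent(runId, message).catch((err) => {
    console.error(`run ${runId} failed`, err);
  });

  // ...and return instantly so the user sees immediate feedback.
  return { runId, status: "accepted" };
}
```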
Phase 2: CONTEXT ASSEMBLY
Before the LLM sees anything, the system assembles the full context window. This includes the conversation history from the session's JSONL file, any relevant memory (from MEMORY.md and recent daily logs), the system prompt defining the agent's behavior, and the currently loaded skills. Getting this assembly right determines whether the agent has enough context to respond intelligently—or hallucinates due to missing information.
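A hedged sketch of what that assembly might look like, assuming the file layout described later in this guide (the SYSTEM.md name and the assembly logic are assumptions):

```typescript
import { readFile } from "node:fs/promises";
import * as os from "node:os";
import * as path from "node:path";

// Illustrative only: the file layout follows the workspace described in this
// guide, but the assembly logic and SYSTEM.md name are assumptions.
async function assembleContext(sessionId: string, userMessage: string) {
  const base = path.join(os.homedir(), ".openclaw");
  const read = (p: string) => readFile(p, "utf8").catch(() => "");

  const [systemPrompt, longTermMemory, history] = await Promise.all([
    read(path.join(base, "workspace", "SYSTEM.md")),         // system prompt (assumed name)
    read(path.join(base, "workspace", "MEMORY.md")),         // long-term facts
    read(path.join(base, "sessions", `${sessionId}.jsonl`)), // prior turns, one JSON per line
  ]);

  // Each JSONL line is one prior message in the conversation
  const turns = history
    .split("\n")
    .filter(Boolean)
    .map((line) => JSON.parse(line));

  return {
    system: systemPrompt,
    memory: longTermMemory,
    messages: [...turns, { role: "user", content: userMessage }],
  };
}
```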
Skills, Tools, and Plugins
OpenClaw's extensibility comes in three forms:
Skills follow the AgentSkills spec—an open standard used by Claude Code, Cursor, and others. Each skill is a folder containing a SKILL.md file with YAML frontmatter (name, description, requirements) and Markdown instructions.

OpenClaw ships with bundled default skills; you can override or extend them via workspace skills (~/.openclaw/workspace/skills/) or managed skills (~/.openclaw/skills/). Priority follows: Workspace > Managed > Bundled. At load time, skills are filtered based on environment and binary requirements (e.g., a skill requiring gemini CLI won't load if it's not installed). A file watcher detects changes and updates the skills snapshot automatically.

When loaded, skills are injected as a compact XML list into the system prompt—token cost is deterministic: ~195 characters base overhead plus ~97 characters per skill plus content length. Community skills are shared via ClawHub, with 3,000+ available.
Skills Loading Priority
1. Workspace skills (~/.openclaw/workspace/skills/) → Highest priority
2. Managed skills (~/.openclaw/skills/) → Shared across agents
3. Bundled skills (npm package / OpenClaw.app) → Default skills shipped with OpenClaw
Workspace skills override managed skills, which override bundled skills.
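As a small illustration of both the priority rule and the token math quoted above, here is a sketch; the type and function names are illustrative, not OpenClaw's internals:

```typescript
// Sketch of priority-based skill resolution.
interface Skill {
  name: string;
  source: "workspace" | "managed" | "bundled";
  content: string; // the SKILL.md body
}

function resolveSkills(workspace: Skill[], managed: Skill[], bundled: Skill[]): Skill[] {
  const byName = new Map<string, Skill>();
  // Lowest priority first; later entries overwrite earlier ones by name.
  for (const skill of [...bundled, ...managed, ...workspace]) {
    byName.set(skill.name, skill);
  }
  return [...byName.values()];
}

// Rough prompt-size estimate from the figures quoted above:
// ~195 chars base + ~97 chars per skill + each skill's content length.
function estimateSkillPromptChars(skills: Skill[]): number {
  return 195 + skills.reduce((sum, s) => sum + 97 + s.content.length, 0);
}
```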
Tools are built-in capabilities: exec (shell commands with safety allowlist), browser (Playwright via semantic snapshots), memory_search, memory_get, and apply_patch. For third-party integrations, OpenClaw supports MCP (Model Context Protocol) via the mcporter skill or native support, connecting to services like Notion, Linear, and Stripe.
Plugins are TypeScript code that hooks into the system lifecycle (before_tool_call, agent_end, before_compaction). Plugins can also ship their own skills via openclaw.plugin.json. Skills change what the agent knows; tools define what it can do; plugins change how the system operates.
You can read more about skills in the official documentation: Skills - OpenClaw
Phase 3: MODEL INFERENCE
The assembled context goes to the LLM provider. The model either returns final text (the response) or requests tool calls (actions to perform). Responses stream back token-by-token rather than waiting for completion, which enables real-time UI updates and reduces perceived latency.
Phase 4: TOOL EXECUTION
When the model requests a tool call—say, checking your calendar—execution happens here. The tool runs, results are appended to the conversation, and control returns to Phase 3 for another inference round. This loop continues until the model produces final text or hits the turn limit (typically ~20 iterations). This iterative loop is what distinguishes an "agent" from a simple chatbot: the ability to take multiple actions, observe results, and adjust course.
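In rough terms, the loop looks like the sketch below; the Provider and tool interfaces are assumptions rather than OpenClaw's real types:

```typescript
// Simplified sketch of the inference / tool-execution loop.
type ModelTurn =
  | { kind: "text"; content: string }
  | { kind: "tool_call"; name: string; args: unknown };

interface Provider {
  complete(messages: unknown[]): Promise<ModelTurn>;
}

async function runAgentLoop(
  provider: Provider,
  tools: Record<string, (args: unknown) => Promise<string>>,
  messages: unknown[],
  maxTurns = 20 // the turn limit mentioned above
): Promise<string> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const result = await provider.complete(messages);

    if (result.kind === "text") {
      return result.content; // final answer, loop ends
    }

    // The model asked for a tool: run it, append the result, loop again.
    const tool = tools[result.name];
    const output = tool ? await tool(result.args) : `Unknown tool: ${result.name}`;

    messages.push({ role: "assistant", tool_call: result });
    messages.push({ role: "tool", name: result.name, content: output });
  }
  return "Turn limit reached without a final answer.";
}
```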
Phase 5: STREAMING
As the model generates its final response, tokens stream to the client in real-time. Users see text appear progressively rather than waiting for a complete response. This isn't just cosmetic—it lets users interrupt or redirect if the response is heading in the wrong direction.
Phase 6: PERSISTENCE
Once complete, the full conversation (user message, tool calls, tool results, assistant response) is appended to the session's JSONL file. Memory files are updated if the agent wrote notes. A lifecycle:end event broadcasts completion. This persistence ensures continuity—the next message picks up exactly where this one left off.
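As a rough sketch (record shape and path assumed, not OpenClaw's actual schema), appending to the JSONL transcript is essentially a one-liner:

```typescript
import { appendFile } from "node:fs/promises";

// Sketch: persist one message as a JSON Lines record.
async function persistTurn(
  sessionFile: string, // e.g. the session's .jsonl file under ~/.openclaw/sessions/
  record: { role: string; content: string; timestamp: number }
): Promise<void> {
  // One JSON object per line; appending keeps earlier history untouched.
  await appendFile(sessionFile, JSON.stringify(record) + "\n", "utf8");
}
```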
Why the Tool Loop Matters
The Phase 4 → Phase 3 loop is the core of agentic behavior. A single user message like "Cancel my 3pm meeting and notify attendees" might trigger: calendar lookup → event identification → cancellation API call → contact retrieval → email composition → email send → confirmation. Each step feeds into the next, with the LLM deciding what to do based on accumulated results. Without this loop, you have a chatbot. With it, you have an agent.
4. The Secret Sauce: Lane-Based Concurrency
This is where OpenClaw gets clever. If you've written any async JavaScript, you know how easy it is to create race conditions. OpenClaw's "lane" system prevents them elegantly.
Lane-Based Concurrency Diagram
Figure 3: Lane-based concurrency—per-session queues prevent race conditions.
The Problem: Why Normal Async/Await Fails
Imagine three users send messages at the same time:
// NIGHTMARE SCENARIO - DON'T DO THIS!
async function handleMessage(user, message) {
const session = await loadSession(user); // 🔴 Race condition!
const response = await callAI(session); // 🔴 Interleaved!
await saveSession(user, response); // 🔴 Corrupted!
}
// Alice, Bob, and Carol all send messages at once:
handleMessage("alice", "Hello"); // Starts loading Alice's session
handleMessage("bob", "Hi there"); // Starts loading Bob's session
handleMessage("carol", "Hey!"); // Starts loading Carol's session
// 💥 CHAOS: Sessions get mixed up, outputs interleave, state corrupts
What goes wrong:
- Alice's response might include Bob's context
- Carol's session file might be half-written when Alice tries to read it
- Log files become unreadable garbage
The Solution: Think in "Lanes"
Instead of asking "what do I need to lock?", ask: "what can safely run in parallel?"
The Lane Mental Model:
| Lane Type | Concurrency | Purpose |
|---|---|---|
| session:alice | 1 | Only ONE operation on Alice's session at a time |
| session:bob | 1 | Only ONE operation on Bob's session at a time |
| main (global) | 4 | Maximum 4 total operations across all users |
How It Works:
1. Message arrives for Alice
   - Check Alice's session lane → Is it free? ✅
   - Check global lane → Is there a slot? ✅
   - Start processing!
2. Another message arrives for Alice (while the first is running)
   - Check Alice's session lane → Is it free? ❌ (still processing the first message)
   - Queue it. Alice's messages always run in order.
3. Message arrives for Bob (while Alice is running)
   - Check Bob's session lane → Is it free? ✅
   - Check global lane → Is there a slot? ✅
   - Start processing! (Alice and Bob run in parallel, safely)
Code Example: Lane Implementation
// Each queued job carries an id and the work to run
interface QueueItem {
  id: string;
  run: () => Promise<void>;
}

class Lane {
  private queue: QueueItem[] = [];
  private active: Set<string> = new Set();

  constructor(
    public name: string,          // e.g. "session:alice" or "main"
    public maxConcurrency: number // 1 for sessions, 4 for the global lane
  ) {}

  enqueue(item: QueueItem): void {
    this.queue.push(item); // Add to the back (FIFO)
  }

  canRun(): boolean {
    // Can we start a new job?
    return this.active.size < this.maxConcurrency && this.queue.length > 0;
  }

  dequeue(): QueueItem | undefined {
    if (!this.canRun()) return undefined;
    const item = this.queue.shift(); // Take from front (FIFO)
    if (item) this.active.add(item.id);
    return item;
  }

  complete(id: string): void {
    this.active.delete(id);
    // Now there's room for the next item!
  }
}
The beauty: No locks, no mutexes, no deadlocks. Just queues.
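The Lane class above manages a single queue. To get the behavior described in "How It Works", something has to consult both the per-session lane and the global lane before starting a job. Here is one possible sketch of that dispatcher, reusing the Lane and QueueItem types from the example above (including the enqueue helper); the scheduling policy is an illustration, not OpenClaw's exact implementation:

```typescript
// Dispatcher sketch: a job starts only when its session lane AND the global
// concurrency cap both have room.
class LaneScheduler {
  private sessionLanes = new Map<string, Lane>();
  private globalActive = 0;
  private readonly globalLimit = 4; // the "main" lane concurrency

  private laneFor(sessionKey: string): Lane {
    let lane = this.sessionLanes.get(sessionKey);
    if (!lane) {
      lane = new Lane(`session:${sessionKey}`, 1); // one job per session at a time
      this.sessionLanes.set(sessionKey, lane);
    }
    return lane;
  }

  submit(sessionKey: string, item: QueueItem): void {
    this.laneFor(sessionKey).enqueue(item);
    this.pump(sessionKey);
  }

  private pump(sessionKey: string): void {
    const lane = this.laneFor(sessionKey);

    // Start only if BOTH the session lane and the global cap allow it.
    if (!lane.canRun() || this.globalActive >= this.globalLimit) return;

    const item = lane.dequeue();
    if (!item) return;
    this.globalActive++;

    item
      .run()
      .catch((err) => console.error(`job ${item.id} failed`, err))
      .finally(() => {
        lane.complete(item.id);
        this.globalActive--;
        this.pump(sessionKey); // see if the next queued job can start now
      });
  }
}
```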
Queue Modes: Handling Different Situations
What if someone sends 5 messages while the AI is still thinking about the first one?
| Mode | What Happens | Use Case |
|---|---|---|
| collect (default) | Combine all queued messages into one | User types in bursts |
| steer | Interrupt current thinking, pivot to new message | "Actually, do this instead!" |
| followup | Queue for after current run finishes | "When you're done, also check X" |
| interrupt | Abort everything, start fresh | "STOP! Emergency!" |
Example of collect mode:
User: "Hello"
User: "Check my email"
User: "And the weather"
// Without collect: 3 separate AI runs (slow, expensive)
// With collect: 1 AI run with all 3 messages (efficient!)
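A minimal sketch of what that merging could look like (the joining format is an assumption):

```typescript
// Sketch: messages that queued up during an active run are merged into one
// prompt, so the next run handles them together instead of one run each.
function collectQueued(queued: string[]): string {
  return queued.join("\n"); // ["Hello", "Check my email", "And the weather"] -> one prompt
}
```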
5. How the Gateway Talks to Everything: WebSocket Protocol
The Gateway uses WebSocket to communicate with all clients in real-time. Let's see the conversation flow.
WebSocket Connection Lifecycle Diagram
Figure 4: WebSocket connection lifecycle—connect, subscribe, stream, disconnect.
Understanding the Message Types
OpenClaw uses three types of WebSocket messages:
1. REQUEST (Client → Gateway)
{
type: "req",
id: "msg-001", // For tracking responses
method: "agent", // What action to take
params: {
sessionKey: "main",
message: "What's the weather?"
}
}
2. RESPONSE (Gateway → Client)
{
type: "res",
id: "msg-001", // Matches the request!
ok: true,
payload: {
runId: "run-abc123",
status: "accepted"
}
}
3. EVENT (Gateway → Client, server-push)
{
type: "event",
event: "agent",
payload: {
stream: "assistant",
delta: "The weather today is " // Streaming text!
}
}
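These three shapes can be modeled as a small discriminated union in TypeScript. The type names below are illustrative, not OpenClaw's published client types:

```typescript
// Illustrative TypeScript model of the three WebSocket frame types.
interface GatewayRequest {
  type: "req";
  id: string;               // client-chosen id, echoed in the response
  method: string;           // e.g. "connect", "agent"
  params?: Record<string, unknown>;
}

interface GatewayResponse {
  type: "res";
  id: string;               // matches the originating request
  ok: boolean;
  payload?: Record<string, unknown>;
}

interface GatewayEvent {
  type: "event";
  event: string;            // e.g. "agent", "lifecycle:end"
  payload?: Record<string, unknown>;
}

type GatewayFrame = GatewayRequest | GatewayResponse | GatewayEvent;

// The "type" field lets a client dispatch on a single switch statement:
function dispatch(frame: GatewayFrame): void {
  switch (frame.type) {
    case "req":   /* server side: route by frame.method */ break;
    case "res":   /* resolve the pending promise for frame.id */ break;
    case "event": /* push to UI stream handlers */ break;
  }
}
```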
The Connection Dance (Step by Step)
1. Client opens WebSocket connection
   - "Hey Gateway, I want to talk!"
2. Client sends connect request (MUST be first)
   - "Here's my authentication token and device info"
3. Gateway responds with hello-ok
   - "Welcome! Here's who else is connected"
4. Client sends agent request
   - "User wants to know the weather"
5. Gateway responds IMMEDIATELY with ACK
   - "Got it! Tracking number: run-abc123"
6. Gateway streams events as AI thinks
   - event: tool → "Looking up weather..."
   - event: assistant → "The weather today is..."
7. Gateway sends final response
   - "Done! Here's the complete answer"
Why Immediate ACK Matters
Without ACK:
User sends message → ⏳ 30 seconds of nothing → Response appears
(User thinks: "Did it work? Should I resend?")
With ACK:
User sends message → ✅ "Got it!" (0.1 seconds) → Streaming updates → Response
(User thinks: "Cool, it's working on it!")
6. Memory That Actually Works: Hybrid Search
How does OpenClaw remember that you mentioned your dog's name three weeks ago? Two search methods working together.
Hybrid Memory Search Diagram
Figure 5: Hybrid memory search—vector (semantic) and BM25 (keyword) combined.
The Problem: One Search Method Isn't Enough
| Search Type | What It's Good At | What It Misses |
|---|---|---|
| Vector (Semantic) | "meeting" matches "appointment" | Exact codes like AUTH_TOKEN_123 |
| BM25 (Text) | Exact matches, IDs, error codes | Paraphrases ("car" won't match "automobile") |
Real example:
- You told the AI: "My API key is sk-abc123xyz"
- Later you ask: "What's my API key?"
- Vector search might miss it (semantic similarity is low)
- BM25 finds it instantly (exact text match)
How Hybrid Search Works
// Minimal shapes for the two search backends (illustrative)
interface VectorHit { id: string; score: number } // cosine similarity, 0..1
interface TextHit   { id: string; rank: number }  // BM25 rank, 0 = best
interface Result    { id: string; score: number }

// Assume the vector and full-text indexes provide these (implementations elided)
declare function vectorSearch(query: string): Promise<VectorHit[]>;
declare function bm25Search(query: string): Promise<TextHit[]>;

async function hybridSearch(query: string): Promise<Result[]> {
  // Step 1: Search BOTH systems in parallel
  const [vectorResults, bm25Results] = await Promise.all([
    vectorSearch(query), // "Does this MEAN similar things?"
    bm25Search(query)    // "Does this CONTAIN similar words?"
  ]);

  // Step 2: Merge results by chunk ID
  const merged = new Map<string, { vectorScore?: number; textScore?: number }>();
  for (const result of vectorResults) {
    merged.set(result.id, { vectorScore: result.score });
  }
  for (const result of bm25Results) {
    const existing = merged.get(result.id) || {};
    existing.textScore = 1 / (1 + result.rank); // Convert rank to score
    merged.set(result.id, existing);
  }

  // Step 3: Calculate combined score
  const scored: Result[] = [...merged.entries()].map(([id, scores]) => ({
    id,
    // Vector gets more weight (0.7) because semantic is usually better
    score: 0.7 * (scores.vectorScore || 0) + 0.3 * (scores.textScore || 0)
  }));

  // Step 4: Sort and return best matches
  return scored.sort((a, b) => b.score - a.score);
}
The Memory File Structure
~/.openclaw/workspace/
├── MEMORY.md # Long-term notes about you
│ └── "User prefers morning meetings"
│ └── "Their dog is named Max"
│ └── "Allergic to peanuts"
│
└── memory/
├── 2026-01-31.md # What happened Jan 31
├── 2026-02-01.md # What happened Feb 1
├── 2026-02-02.md # What happened yesterday
└── 2026-02-03.md # What happened today
Why plain files?
- You can read them yourself (no special tools needed)
- Easy to debug ("why did it remember that?")
- Easy to edit ("let me correct this memory")
7. Surviving Long Conversations: Pre-Compaction Memory Flush
AI models have limited "context windows" (how much text they can see at once). What happens when a conversation gets too long?
Pre-Compaction Memory Flush Diagram
Figure 6: Pre-compaction memory flush—silent write before summarization.
The Problem: Context Windows Fill Up
Imagine you've been chatting with your AI assistant for an hour. The conversation is now 50,000 tokens—but the AI can only see 100,000 tokens at once. Add your memory files, system prompt, and skills, and suddenly you're running out of room.
Without flush:
Context fills up → AI summarizes everything → Important details lost!
"What was that API key you mentioned 45 minutes ago?"
"I don't recall any API key." 😱
With flush:
Context fills up → AI secretly saves important details → Then summarizes
"What was that API key you mentioned 45 minutes ago?"
"It was sk-abc123xyz, which you mentioned at 2:15 PM."
How Pre-Compaction Flush Works
1. System monitors context usage
   - "We've used 80,000 of 100,000 tokens..."
2. Threshold reached
   - "⚠️ We're getting close to the limit!"
3. Silent agent turn (user doesn't see this!)
   - System: "Session nearing compaction. Store durable memories now."
   - AI: [Writes important details to memory/2026-02-03.md]
   - AI: "NO_REPLY" (signals done, nothing to say to user)
4. Compaction proceeds
   - Old messages get summarized
   - But memory files are PRESERVED
5. Context freed up, memories intact
   - AI can recall what it wrote to memory
   - User never knew this happened
The Genius Part
The AI essentially takes notes before an exam, knowing it won't be able to see the original textbook later. Self-preservation through proactive memory management.
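A hedged sketch of the trigger logic, with the threshold and prompt text taken from the walkthrough above and all function names assumed:

```typescript
const CONTEXT_LIMIT = 100_000; // tokens the model can see
const FLUSH_THRESHOLD = 0.8;   // flush at 80% usage, per the example above

async function maybeFlushBeforeCompaction(
  usedTokens: number,
  runSilentTurn: (systemNote: string) => Promise<string>, // returns the AI's reply
  compact: () => Promise<void>
): Promise<void> {
  if (usedTokens < CONTEXT_LIMIT * FLUSH_THRESHOLD) return; // plenty of room left

  // Silent agent turn: the user never sees this exchange.
  const reply = await runSilentTurn(
    "Session nearing compaction. Store durable memories now."
  );

  // "NO_REPLY" signals the agent wrote its notes and has nothing to say.
  if (reply.trim() !== "NO_REPLY") {
    console.warn("unexpected reply during silent flush:", reply);
  }

  await compact(); // old messages are summarized; memory files survive
}
```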
8. Try It Yourself: Docker Setup
Why Docker? As Simon Willison put it: "I'm not brave enough to run OpenClaw directly on my Mac." Docker provides filesystem isolation, network control, easy cleanup, and a reproducible environment.
Prerequisites
- Docker Desktop installed and running
- Git installed
- A Telegram account (optional, for mobile access)
- An API key (OpenAI, Anthropic, or ChatGPT subscription)
Step 1: Clone the Repository
git clone https://github.com/openclaw/openclaw
cd openclaw
The repository contains docker-setup.sh which uses Docker Compose and their docker-compose.yml file.
Step 2: Run the Setup Script
./docker-setup.sh
This creates two folders on your machine (mounted as volumes in the container):
| Folder | Purpose |
|---|---|
| ~/.openclaw | Configuration, memory, API keys, session data |
| ~/openclaw/workspace | Files available to the agent (reads and writes here) |
Step 3: Answer the Setup Questions
The setup wizard asks several questions. Here are the recommended answers:
| Question | Recommended Answer |
|---|---|
| Onboarding mode | manual |
| What do you want to set up? | Local gateway (this machine) |
| Model provider | Choose based on your preference (see below) |
| Tailscale | no (say no unless you specifically need remote access) |
Model Provider Options
| Provider | Setup | Cost Consideration |
|---|---|---|
| ChatGPT OAuth | Authenticate with your ChatGPT account | Capped by $20/month subscription (recommended for cost control) |
| OpenAI API | Provide OPENAI_API_KEY | Pay per token (can get expensive!) |
| Anthropic | Provide ANTHROPIC_API_KEY | Pay per token (best results with Claude) |
| Local (Ollama) | Install Ollama, pull a model | Free but slower |
Tip: Simon Willison chose ChatGPT OAuth because "I've heard that OpenClaw can spend a lot of tokens on API plans, and using ChatGPT put an easy upper limit on how much it could spend."
ChatGPT OAuth Flow
If you choose ChatGPT OAuth:
- OpenClaw gives you a URL to open in your browser
- You authenticate with ChatGPT
- It redirects to a localhost URL that shows an error (this is expected!)
- Copy that localhost URL and paste it back into OpenClaw to complete authentication
Step 4: Verify the Container is Running
docker ps
You should see a container running the image openclaw:local with a name like openclaw-openclaw-gateway-1.
Step 5: Run Administrative Commands
Use the openclaw-cli container for management commands. Important: Run these from the same folder as docker-compose.yml:
# Check status
docker compose run --rm openclaw-cli status
# List devices
docker compose run --rm openclaw-cli devices list
Step 6: Access the Web UI
OpenClaw runs a web UI on port 18789.
Get your access token:
docker compose run --rm openclaw-cli dashboard --no-open
This outputs a URL like:
http://localhost:18789?token=YOUR_SECRET_TOKEN
Open this URL in your browser to access the dashboard.
Troubleshooting: "pairing required" Error
If you see disconnected (1008): pairing required, you need to approve the device:
# List pending pairings
docker compose exec openclaw-gateway \
node dist/index.js devices list
# Approve a pending request (use the Request ID from the list)
docker compose exec openclaw-gateway \
node dist/index.js devices approve \
YOUR-REQUEST-ID-HERE
Step 7: Set Up Telegram (Optional but Recommended)
Telegram lets you control OpenClaw from your phone—very useful!
Create a Telegram Bot:
1. Open Telegram and start a chat with @BotFather
2. Send the command /newbot
3. Follow the prompts to name your bot
4. Copy the token BotFather gives you
Provide the token during setup (or add it later in configuration).
Approve the pairing:
# OpenClaw will send you a pairing code via Telegram
docker compose run --rm openclaw-cli pairing approve telegram <CODE>
Now you can message your bot directly from Telegram on your phone!
Step 8: Installing Extra Packages (Optional)
The OpenClaw bot runs as a non-root user (for security). To install extra packages:
# Get a root shell in the container
docker compose exec -u root openclaw-gateway bash
# Install packages (example: ripgrep)
apt-get update && apt-get install -y ripgrep
Quick Reference: Common Commands
| Task | Command |
|---|---|
| Check container status | docker ps |
| View OpenClaw status | docker compose run --rm openclaw-cli status |
| Get dashboard URL | docker compose run --rm openclaw-cli dashboard --no-open |
| List devices | docker compose exec openclaw-gateway node dist/index.js devices list |
| Approve device | docker compose exec openclaw-gateway node dist/index.js devices approve <ID> |
| Approve Telegram | docker compose run --rm openclaw-cli pairing approve telegram <CODE> |
| Root shell | docker compose exec -u root openclaw-gateway bash |
| Stop OpenClaw | docker compose down |
| Start OpenClaw | docker compose up -d openclaw-gateway |
Security Reminders
- Use Docker — Don't run directly on your machine
- ChatGPT OAuth — Consider this to cap spending
- Skip Tailscale — Unless you specifically need remote access
- Review approvals — Check ~/.openclaw/exec-approvals.json regularly
- Limit workspace — Only put files in ~/openclaw/workspace that you want the agent to access
9. Key Takeaways & What's Next
What We Learned
1. Architecture Matters
- One Gateway daemon simplifies everything
- File-based memory is debuggable and simple
- WebSocket enables real-time communication
2. Concurrency is Hard (Lanes Make It Easier)
- Don't ask "what do I lock?" — ask "what can run in parallel?"
- Session lanes ensure ordering
- Global lanes prevent resource exhaustion
3. Memory Needs Both Semantic and Exact Search
- Vector search catches meaning
- BM25 catches exact matches
- Together they cover each other's blind spots
4. Long Conversations Need Proactive Memory
- Don't wait until compaction to save important details
- Silent agent turns preserve critical context
Design Patterns to Remember
| Pattern | What It Does | When to Use |
|---|---|---|
| Acknowledge-Then-Execute | Return immediately, process async | Any long-running operation |
| Dual-Queue Serialization | Two queue layers (session + global) | Multi-tenant systems |
| Event Stream Multiplexing | One connection, multiple logical streams | Real-time UIs |
| Content-Hash Caching | Skip re-processing unchanged content | Embedding, indexing |
| Graceful Degradation | Fall back when services fail | Production resilience |
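For example, the Content-Hash Caching row could look like this minimal sketch (the embed callback and in-memory cache are assumptions, not OpenClaw's implementation):

```typescript
import { createHash } from "node:crypto";

// Sketch of content-hash caching: only re-embed a chunk when its text changes.
const embeddingCache = new Map<string, number[]>(); // hash -> embedding

async function embedCached(
  text: string,
  embed: (t: string) => Promise<number[]> // provider call, assumed
): Promise<number[]> {
  const hash = createHash("sha256").update(text).digest("hex");
  const hit = embeddingCache.get(hash);
  if (hit) return hit;                // unchanged content: skip the API call
  const vector = await embed(text);
  embeddingCache.set(hash, vector);
  return vector;
}
```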
Next Steps
To learn more:
- Read the full engineering guide
- Explore the TypeScript examples
- Try modifying the code and see what happens!
Questions to explore:
- How would you add a new messaging platform?
- What happens if the Gateway crashes mid-conversation?
- How do "skills" differ from "plugins"?
Build something:
- Create a custom skill for your workflow
- Build a simple Gateway client in your favorite language
- Experiment with different queue modes
Appendix: Quick Reference
File Locations
~/.openclaw/
├── config.json # API keys, settings
├── workspace/
│ ├── MEMORY.md # Long-term memory
│ ├── memory/ # Daily logs
│ └── skills/ # Skill definitions
└── sessions/ # Conversation histories
WebSocket Endpoints
| Endpoint | Purpose |
|---|---|
| ws://127.0.0.1:18789 | Gateway WebSocket API |
| http://localhost:18793 | Canvas UI |
Lane Defaults
| Lane | Concurrency | Purpose |
|---|---|---|
| session:* | 1 | Per-session serialization |
| main | 4 | Primary agent runs |
| subagent | 8 | Parallel sub-tasks |
| cron | 2 | Scheduled jobs |