Part 2: Understand the Architecture — AI Lobster-Raising Guide: OpenClaw from Beginner to Advanced

If you think of OpenClaw as a bot that replies to messages, much of the system will seem overbuilt. If you think of it as a long-running agent platform, the architecture starts to feel coherent.

01 The Three-Layer Architecture

OpenClaw is organized into three layers: Gateway, Node, and Channel.

Layer	Role	Key idea
Gateway	Central control plane	Maintains WebSocket connections, sessions, and agent routing
Node	Execution layer	Handles local commands, device access, recording, and system actions
Channel	Messaging ingress	Connects Telegram, Discord, Feishu, DingTalk, and more

How one message flows through the system

A typical interaction looks like this:

User sends message -> Channel receives it -> Gateway routes it -> Agent reasons -> Node executes -> Response goes back
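The flow above can be sketched in a few lines of Python. This is only an illustration of the hand-offs between layers: the class names, the `agent_reason` stand-in, and the `trace` list are all hypothetical and are not OpenClaw's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Message:
    channel: str  # ingress channel, e.g. "telegram"
    sender: str   # channel-specific user id
    text: str


@dataclass
class Node:
    """Execution layer (hypothetical): runs the actions the agent asks for."""
    actions: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def execute(self, action: str, arg: str) -> str:
        handler = self.actions.get(action)
        if handler is None:
            return f"unknown action: {action}"
        return handler(arg)


@dataclass
class Gateway:
    """Control plane (hypothetical): routes a message to an agent, then to a Node."""
    node: Node
    trace: List[str] = field(default_factory=list)

    def agent_reason(self, msg: Message) -> Tuple[str, str]:
        # Stand-in for the model call: choose an action from the message text.
        if msg.text.startswith("echo "):
            return "echo", msg.text[len("echo "):]
        return "noop", ""

    def handle(self, msg: Message) -> str:
        self.trace.append(f"channel:{msg.channel}")  # Channel received it
        self.trace.append("gateway:route")           # Gateway routes it
        action, arg = self.agent_reason(msg)         # Agent reasons
        self.trace.append(f"agent:{action}")
        result = self.node.execute(action, arg)      # Node executes
        self.trace.append("node:execute")
        return result                                # Response goes back


node = Node(actions={"echo": lambda s: s, "noop": lambda s: ""})
gateway = Gateway(node=node)
reply = gateway.handle(Message("telegram", "123456789", "echo hello"))
```

After the call, `gateway.trace` records the hops in the same order as the flow line above.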

Why it binds to localhost by default

OpenClaw follows a loopback-first design. Gateway binds to 127.0.0.1 by default.

That gives you three immediate benefits:

No public port exposure by default
Lower latency for local execution
Explicit remote access only when you choose to expose it
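To make the loopback-first idea concrete, here is a minimal Python sketch of a listener that binds to 127.0.0.1 unless told otherwise. It uses only the standard `socket` and `ipaddress` modules; the function names are illustrative and do not come from OpenClaw.

```python
import ipaddress
import socket


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Open a TCP listener; by default it is reachable from this machine only."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))  # port 0 asks the OS for any free port
    sock.listen()
    return sock


def is_loopback_only(sock: socket.socket) -> bool:
    """True if the socket is bound to an address in 127.0.0.0/8."""
    bound_host, _port = sock.getsockname()
    return ipaddress.ip_address(bound_host).is_loopback
```

Binding to `0.0.0.0` instead would accept connections on every interface, which is the explicit choice the third point refers to.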

A practical rule is to run only one Gateway per host, especially for channels that rely on an exclusive authenticated session such as WhatsApp Web.

02 Memory: Why OpenClaw Can “Remember”

Memory is one of the defining differences between OpenClaw and a normal chatbot. A useful way to model it is as four layers:

Layer	Stored in	Purpose
SOUL	`SOUL.md`	Core identity, values, and agent personality
TOOLS	Skills and extensions	What the agent can currently do
USER	`MEMORY.md` + vector store	Long-term facts, preferences, and decisions
Session	In-memory state + `sessions.json`	Current conversational context

Daily logs

OpenClaw writes daily interaction logs to `memory/YYYY-MM-DD.md`. When a new session starts, it can load recent logs to preserve continuity.
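A minimal Python sketch of this pattern follows. Only the `memory/YYYY-MM-DD.md` path comes from the text above; the function names and the two-day lookback window are assumptions made for illustration.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import List, Optional


def daily_log_path(workspace: Path, day: date) -> Path:
    # date.isoformat() yields YYYY-MM-DD, matching the file naming above.
    return workspace / "memory" / f"{day.isoformat()}.md"


def append_log(workspace: Path, entry: str, day: Optional[date] = None) -> Path:
    """Append one entry to the log for `day` (today if omitted)."""
    path = daily_log_path(workspace, day or date.today())
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(entry.rstrip("\n") + "\n")
    return path


def load_recent_logs(
    workspace: Path, days: int = 2, today: Optional[date] = None
) -> List[str]:
    """Return the last `days` daily logs, oldest first, skipping missing days."""
    today = today or date.today()
    chunks = []
    for offset in range(days - 1, -1, -1):
        path = daily_log_path(workspace, today - timedelta(days=offset))
        if path.exists():
            chunks.append(path.read_text(encoding="utf-8"))
    return chunks
```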

Pre-compaction

When context approaches token limits, OpenClaw can:

1. Detect the threshold
2. Save important information into MEMORY.md and daily logs
3. Compress old context to free room
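The three steps can be sketched as a single function. Everything here is an assumption for illustration: the 4-characters-per-token estimate, the 80% threshold, the number of turns kept, and the placeholder summary line. A real implementation would use the model's tokenizer and a proper summarizer.

```python
from typing import Callable, List, Tuple


def estimate_tokens(text: str) -> int:
    # Crude stand-in: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def compact_if_needed(
    turns: List[str],
    limit: int,
    save: Callable[[str], None],
    threshold: float = 0.8,
    keep_recent: int = 2,
) -> Tuple[List[str], bool]:
    """Return (possibly compacted turns, whether compaction happened)."""
    used = sum(estimate_tokens(t) for t in turns)
    # Step 1: detect the threshold.
    if used < threshold * limit or len(turns) <= keep_recent:
        return turns, False
    split = len(turns) - keep_recent
    old, recent = turns[:split], turns[split:]
    # Step 2: persist old turns before anything is dropped.
    for turn in old:
        save(turn)
    # Step 3: replace the old turns with a short placeholder summary.
    summary = f"[compacted {len(old)} earlier turns]"
    return [summary] + recent, True
```

Saving before compressing is the important ordering: nothing leaves the context until it has been written somewhere durable.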

That is one of the reasons a long-running OpenClaw agent feels more persistent than a simple chat window.

Semantic memory search

Memory retrieval combines:

embedding-based semantic lookup
keyword-style retrieval such as BM25

Typical tools include:

memory_search
memory_get
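As a rough illustration of how the two retrieval signals can be combined, the sketch below blends a BM25 keyword score with cosine similarity over embeddings. It is not the logic behind `memory_search`: the equal weighting (`alpha`), the max-normalization of BM25 scores, and the bag-of-words "embedding" are all assumptions, with the last one standing in for a real embedding model.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def bm25_scores(
    query: str, docs: Sequence[str], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n if n else 0.0
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def bag_of_words_embed(vocab: Sequence[str]) -> Callable[[str], List[float]]:
    """Toy stand-in for an embedding model: term counts over a fixed vocabulary."""
    def embed(text: str) -> List[float]:
        counts = Counter(tokenize(text))
        return [float(counts[term]) for term in vocab]
    return embed


def hybrid_search(
    query: str,
    docs: Sequence[str],
    embed: Callable[[str], List[float]],
    alpha: float = 0.5,
    top_k: int = 3,
) -> List[str]:
    """Rank docs by alpha * semantic score + (1 - alpha) * normalized BM25."""
    keyword = bm25_scores(query, docs)
    max_kw = max(keyword, default=0.0)
    q_vec = embed(query)
    ranked = []
    for doc, kw in zip(docs, keyword):
        kw_norm = kw / max_kw if max_kw > 0 else 0.0
        ranked.append((alpha * cosine(q_vec, embed(doc)) + (1 - alpha) * kw_norm, doc))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _score, doc in ranked[:top_k]]
```

The keyword side catches exact terms such as file names or IDs, while the embedding side catches paraphrases; blending the two is a common way to cover both cases.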

03 Workspace: Everything Is Files

A lot of OpenClaw’s power comes from making system state legible as files.

workspace/
├── AGENTS.md
├── SOUL.md
├── USER.md
├── MEMORY.md
├── HEARTBEAT.md
├── memory/
│   └── YYYY-MM-DD.md
├── skills/
└── sessions.json

What these files do

File	Purpose
`AGENTS.md`	Role, behavior boundaries, and response style
`SOUL.md`	Deeper stable identity
`USER.md`	User information and preferences
`MEMORY.md`	Long-term memory
`HEARTBEAT.md`	Timers and proactive behavior
`skills/`	Workspace-specific Skills
`sessions.json`	Session metadata

This is classic OpenClaw: a complex runtime represented in plain files rather than opaque internal state.
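Because the state is plain files, inspecting or assembling it takes very little code. The sketch below reads the top-level Markdown files from the tree above into a dictionary; the function name and the choice to skip missing files are assumptions, not a description of how OpenClaw loads them.

```python
from pathlib import Path
from typing import Dict

# File names taken from the workspace tree above.
CONTEXT_FILES = ("AGENTS.md", "SOUL.md", "USER.md", "MEMORY.md", "HEARTBEAT.md")


def load_workspace(workspace: Path) -> Dict[str, str]:
    """Read whichever context files exist; missing files are simply skipped."""
    loaded = {}
    for name in CONTEXT_FILES:
        path = workspace / name
        if path.is_file():
            loaded[name] = path.read_text(encoding="utf-8")
    return loaded
```

The same property means ordinary tools such as `grep`, `diff`, and version control work on the agent's state without any special export step.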

04 Sessions and Identity

OpenClaw is strict about who can talk to the agent and how contexts are separated.

DM pairing

Unknown users do not automatically enter a private conversation. By default, they first receive a pairing code and must be approved before full interaction begins.

That protects against:

random users burning your API budget
unapproved access to private workflows
your chat channels becoming public attack surfaces

allowFrom

Known accounts can be pre-approved:

allowFrom:
  - telegram:123456789
  - whatsapp:+8613800138000
  - discord:user#1234
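Pairing and `allowFrom` can be pictured as one gate in front of the agent. The Python sketch below is a hypothetical model of that gate: the class, the method names, and the six-character code format are assumptions, and only the `channel:id` entry shape comes from the config above.

```python
import secrets
from typing import Dict, Iterable, Optional, Set


class PairingGate:
    """Hypothetical access gate: pre-approved senders pass, others must pair."""

    def __init__(self, allow_from: Optional[Iterable[str]] = None) -> None:
        self.approved: Set[str] = set(allow_from or ())
        self.pending: Dict[str, str] = {}  # pairing code -> sender id

    def check(self, sender: str) -> Optional[str]:
        """Return None if the sender may talk to the agent, else a pairing code."""
        if sender in self.approved:
            return None
        for code, who in self.pending.items():
            if who == sender:
                return code  # reuse the code already issued to this sender
        code = secrets.token_hex(3).upper()  # 6 hex characters
        self.pending[code] = sender
        return code

    def approve(self, code: str) -> bool:
        """Owner approves a pending code; the sender becomes trusted."""
        sender = self.pending.pop(code, None)
        if sender is None:
            return False
        self.approved.add(sender)
        return True
```

Until `approve` is called, an unknown sender never reaches the model, so they cannot consume API budget.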

requireMention in groups

In groups, the safest default is to respond only when explicitly mentioned. That cuts down both noise and unnecessary token usage.
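The check itself is small. Here is a sketch of the decision in Python, assuming mentions look like `@name`; the function and parameter names are illustrative, and real channels usually expose mentions as structured metadata rather than raw text.

```python
import re


def should_respond(
    text: str, is_group: bool, bot_name: str, require_mention: bool = True
) -> bool:
    """Private chats always get a reply; groups only when the bot is mentioned."""
    if not is_group or not require_mention:
        return True
    # Match @bot_name as a whole handle, so "@claw" does not match "@clawdia".
    pattern = rf"(?<!\w)@{re.escape(bot_name)}(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```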

Session isolation

Scenario	Behavior	Long-term memory loaded
Private chat	Shared `main` session	Yes
Group chat	Separate session per group	No
Cross-channel private chat	Can converge into the same main session	Yes
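The table can be read as a routing rule from an incoming chat to a session key. The sketch below encodes that rule; the key format and function names are assumptions chosen for illustration.

```python
def session_key(channel: str, chat_type: str, chat_id: str) -> str:
    """Map an incoming chat to a session, following the table above."""
    if chat_type == "group":
        # Each group gets its own isolated session.
        return f"group:{channel}:{chat_id}"
    # Private chats on any channel converge on the shared main session.
    return "main"


def loads_long_term_memory(key: str) -> bool:
    """Only the main session loads long-term memory."""
    return key == "main"
```

Keeping long-term memory out of group sessions means personal facts stored in `MEMORY.md` are not surfaced in front of other group members.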

05 Design Philosophy

OpenClaw’s technical choices reflect a clear philosophy.

Unix style

The project heavily favors:

small tools
composability
text-based state

Minimal core tools

The core tool set is intentionally small:

Read
Write
Edit
Bash

Fewer tools mean a shorter system prompt and cleaner agent decision-making.
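To show how far four primitives go, here is a minimal Python sketch of what such a tool set might look like. The signatures and the "exactly one match" rule for `edit` are assumptions, not OpenClaw's actual tool definitions, and `bash` here runs the system's default shell.

```python
import subprocess
from pathlib import Path


def read(path: Path) -> str:
    return path.read_text(encoding="utf-8")


def write(path: Path, content: str) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")


def edit(path: Path, old: str, new: str) -> None:
    """Replace exactly one occurrence of `old`; refuse ambiguous edits."""
    text = read(path)
    matches = text.count(old)
    if matches != 1:
        raise ValueError(f"expected exactly one match, found {matches}")
    write(path, text.replace(old, new, 1))


def bash(command: str, timeout: float = 30.0) -> str:
    """Run a command in the default shell and return its stdout."""
    done = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return done.stdout
```

Anything else, from calling an HTTP API to converting a file, can be reached through `bash` plus a script the agent writes with `write`.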

It does not treat MCP as the center of the world

OpenClaw can bridge to MCP (Model Context Protocol) workflows when needed, but it strongly favors CLI-based extension. The underlying idea is that an agent that can write and run its own tools has a higher ceiling.

Self-extension

A distinctive OpenClaw pattern is:

1. Encounter a missing capability
2. Write or modify a Skill
3. Reload and continue the original task

That is part of why it feels more like a live system than a static assistant.
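The loop can be sketched in Python under a simplifying assumption: each Skill is a single Markdown file in `skills/`, and "reloading" means rescanning that directory. The real Skill layout and reload mechanism may differ, and the `author` callable stands in for the agent writing the Skill.

```python
from pathlib import Path
from typing import Callable, Dict


def load_skills(skills_dir: Path) -> Dict[str, str]:
    """Rescan the directory: one Markdown file per Skill (assumed layout)."""
    if not skills_dir.is_dir():
        return {}
    return {
        p.stem: p.read_text(encoding="utf-8")
        for p in sorted(skills_dir.glob("*.md"))
    }


def ensure_skill(
    name: str, skills_dir: Path, author: Callable[[str], str]
) -> Dict[str, str]:
    """Return the loaded Skills, writing `name` first if it is missing."""
    skills = load_skills(skills_dir)
    if name not in skills:                        # 1. missing capability
        skills_dir.mkdir(parents=True, exist_ok=True)
        (skills_dir / f"{name}.md").write_text(   # 2. write the Skill
            author(name), encoding="utf-8"
        )
        skills = load_skills(skills_dir)          # 3. reload and continue
    return skills
```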

06 Scale and Runtime Footprint

OpenClaw is not a tiny utility.

Metric	Value
Code size	About 430k lines of TypeScript
Runtime memory	About 1 GB
Startup time	Roughly 3 to 5 seconds
Official extensions	40+
Built-in Skills	55

That tells you two things:

it is a real platform, not a toy demo
you should operate it with the mindset of a persistent service