
Architecture

How the packages fit together, the security model, and key design decisions.

Package Dependency Graph

dependency-graph
@walkie-talkie/shared          ← Protocol types & constants
    │
    ├──▶ @walkie-talkie/server   ← Express + WS + node-pty
    │        │
    │        ├──▶ @walkie-talkie/cli      ← CLI launcher (bundles server)
    │        └──▶ @walkie-talkie/service  ← Electron tray app
    │
    ├──▶ @walkie-talkie/client   ← Framework-agnostic WS client
    │        │
    │        └──▶ @walkie-talkie/react    ← React hooks + TerminalView
    │                  │
    │                  └──▶ @walkie-talkie/web  ← Next.js web client
    │
    └──▶ @walkie-talkie/www      ← Landing page (independent)

The shared package is the root — it defines the contract between server and client. Everything else builds on top.

Layered Architecture

Protocol Layer (shared)

TypeScript interfaces define every message type. Both server and client import these types, so the protocol is enforced at compile time. No runtime validation is needed between trusted packages.

shared/src/protocol.ts
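
As a rough illustration of compile-time enforcement, the sketch below models a few of the message names that appear on this page as discriminated unions. It is not the contents of shared/src/protocol.ts: the `terminalId`, `data`, and `reason` field names and both helper functions are assumptions made for the example.

```typescript
// Hypothetical subset of the protocol; the real definitions live in shared/src/protocol.ts.
type ClientMessage =
  | { type: "auth"; token: string }
  | { type: "auth:resume"; sessionId: string }
  | { type: "terminal:input"; terminalId: string; data: string }; // terminalId/data are assumed fields

type ServerMessage =
  | { type: "auth:ok"; sessionId: string }
  | { type: "auth:fail"; reason: "invalid_token" } // "reason" is an assumed field name
  | { type: "terminal:output"; terminalId: string; data: string };

// Exhaustive switch: adding a ClientMessage variant without handling it here
// becomes a compile error because of the `never` assignment.
function describeClientMessage(msg: ClientMessage): string {
  switch (msg.type) {
    case "auth":
      return `auth with token ${msg.token}`;
    case "auth:resume":
      return `resume session ${msg.sessionId}`;
    case "terminal:input":
      return `input for ${msg.terminalId}: ${msg.data.length} chars`;
    default: {
      const unreachable: never = msg;
      return unreachable;
    }
  }
}

// The return type forces every reply to be a valid ServerMessage.
function replyToAuth(valid: boolean, sessionId: string): ServerMessage {
  return valid
    ? { type: "auth:ok", sessionId }
    : { type: "auth:fail", reason: "invalid_token" };
}
```

Because both sides would import the same unions, a misspelled `type` or a missing field fails the build instead of surfacing at runtime.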

Server Layer

The server has three concerns:

  1. Terminal management: TerminalSession wraps node-pty, maintains a scrollback buffer (100KB max), and emits data/exit events.
  2. Authentication: TokenManager handles token generation, consumption, and session persistence.
  3. WebSocket bridge: Routes messages between authenticated clients and their terminals. Each session is isolated.

server/src/index.ts

Client Layer

The client is deliberately framework-agnostic. WalkieTalkieClient is a plain TypeScript class with no React, Vue, or Angular dependencies.

client/src/client.ts

React Layer

A thin wrapper over the client. useWalkieTalkie manages reactive state (terminal list, connection state) and output buffering. TerminalView is an xterm.js component with auto-fit and resize detection.

react/src/useWalkieTalkie.ts

Security Model

Token Lifecycle

token-lifecycle
Server generates token
    │
    │  Token: a1b2-c3d4-e5f6-7890
    │  TTL: 5 minutes
    │  Usage: single-use
    │  Storage: in-memory only
    │
    ▼
Client receives token (QR code, URL, or CLI output)
    │
    ▼
Client sends: { type: "auth", token: "a1b2-..." }
    │
    ▼
Server validates:
    ├── Token exists? ─── No  → auth:fail (invalid_token)
    ├── Already used?  ── Yes → auth:fail (invalid_token)
    ├── Expired?       ── Yes → auth:fail (invalid_token)
    └── Valid ─────────────── → Consume token
                                │
                                ▼
                              Generate sessionId (UUID)
                              Persist to ~/.walkie-talkie/sessions.json
                              Send: { type: "auth:ok", sessionId: "..." }
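
The validation branch of the lifecycle can be sketched as a small in-memory store. This is a minimal sketch, not the actual TokenManager: the class name `SketchTokenManager`, the `register` and `consume` methods, and the injectable clock are all hypothetical. Token generation is left to the caller here; a real server would draw tokens from a cryptographically secure random source.

```typescript
const TOKEN_TTL_MS = 5 * 60 * 1000; // 5-minute TTL from the lifecycle diagram

type ConsumeResult = { ok: true } | { ok: false; reason: "invalid_token" };

// Hypothetical stand-in for TokenManager; names and shape are assumptions.
class SketchTokenManager {
  // token -> expiry timestamp in ms; held in memory only
  private readonly tokens = new Map<string, number>();
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Record a freshly issued token with its expiry time.
  register(token: string): void {
    this.tokens.set(token, this.now() + TOKEN_TTL_MS);
  }

  // Unknown, already-used, and expired tokens all produce the same failure,
  // matching the three auth:fail branches in the diagram.
  consume(token: string): ConsumeResult {
    const expiresAt = this.tokens.get(token);
    // Deleting on every lookup makes the token single-use and drops expired entries.
    this.tokens.delete(token);
    if (expiresAt === undefined || this.now() >= expiresAt) {
      return { ok: false, reason: "invalid_token" };
    }
    return { ok: true };
  }
}
```

Returning one failure reason for every rejected case means a caller cannot tell a guessed token from an expired one.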

Session Persistence

Sessions are persisted to ~/.walkie-talkie/sessions.json, so they survive server restarts. Each session expires after 24 hours.
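
A minimal sketch of how the 24-hour expiry might be applied when the persisted file is read back. The record shape (`sessionId`, `createdAt`), the function name `loadLiveSessions`, and the choice to measure age from creation time are assumptions; the real sessions.json layout is not shown on this page.

```typescript
const SESSION_TTL_MS = 24 * 60 * 60 * 1000; // 24-hour expiry

// Assumed record shape; the real file format may differ.
interface SessionRecord {
  sessionId: string;
  createdAt: number; // ms since epoch; measuring from creation is an assumption
}

// Parse persisted JSON and keep only well-formed sessions younger than 24 hours.
function loadLiveSessions(json: string, now: number): SessionRecord[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(
    (s): s is SessionRecord =>
      typeof s === "object" &&
      s !== null &&
      typeof s.sessionId === "string" &&
      typeof s.createdAt === "number" &&
      now - s.createdAt < SESSION_TTL_MS
  );
}
```

Passing `now` as a parameter keeps the function pure, so the expiry rule can be checked without touching the file system or the clock.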

Session Isolation

Each session has its own set of terminals. Session A cannot see or interact with Session B's terminals, even on the same server.
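
One way to picture this isolation is an ownership map that every terminal operation consults before doing anything. The sketch below is an assumption about structure, not the server's actual code; `SessionTerminals` and its methods are hypothetical names.

```typescript
// sessionId -> ids of the terminals that session owns
class SessionTerminals {
  private readonly bySession = new Map<string, Set<string>>();

  // Record that a terminal belongs to a session.
  add(sessionId: string, terminalId: string): void {
    const owned = this.bySession.get(sessionId) ?? new Set<string>();
    owned.add(terminalId);
    this.bySession.set(sessionId, owned);
  }

  // A session only ever lists its own terminals.
  list(sessionId: string): string[] {
    const owned = this.bySession.get(sessionId);
    return owned ? Array.from(owned) : [];
  }

  // Input, resize, and close requests would be gated on this check.
  canAccess(sessionId: string, terminalId: string): boolean {
    return this.bySession.get(sessionId)?.has(terminalId) ?? false;
  }
}
```

Keying every lookup by sessionId first means a terminal id that leaks from one session is useless to another.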

Data Flow

data-flow
┌──────────┐    WebSocket     ┌──────────┐     node-pty    ┌──────────┐
│          │  ──────────────▶ │          │ ──────────────▶ │          │
│  Browser │   JSON messages  │  Server  │   raw bytes     │   PTY    │
│  Client  │  ◀────────────── │          │ ◀────────────── │ (shell)  │
│          │                  │          │                 │          │
└──────────┘                  └──────────┘                 └──────────┘

  xterm.js                     Express +                   /bin/bash
  renders                      ws library                  or similar
  terminal                     routes msgs

Terminal I/O Path

  1. User types in xterm.js → onData callback fires
  2. Client sends terminal:input over WebSocket
  3. Server writes to PTY via pty.write(data)
  4. PTY processes input, produces output
  5. Server receives PTY output via pty.onData
  6. Server sends terminal:output over WebSocket
  7. Client writes to xterm.js via terminal.write(data)
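
The server side of steps 3, 5, and 6 can be sketched with two small stand-in interfaces, so the example runs without node-pty or a WebSocket library. `PtyLike`, `SocketLike`, `bridgeTerminal`, and the `terminalId` and `data` field names are assumptions; only `pty.write`, `pty.onData`, and the two message type names come from the steps above.

```typescript
// Minimal stand-ins; only the members used below are modelled.
interface PtyLike {
  write(data: string): void;
  onData(listener: (data: string) => void): void;
}
interface SocketLike {
  send(payload: string): void;
}

// Wires one terminal to one client socket and returns the handler for
// incoming client messages.
function bridgeTerminal(
  terminalId: string,
  pty: PtyLike,
  socket: SocketLike
): (raw: string) => void {
  // Steps 5 and 6: PTY output becomes a terminal:output message.
  pty.onData((data) => {
    socket.send(JSON.stringify({ type: "terminal:output", terminalId, data }));
  });

  // Step 3: terminal:input for this terminal is written to the PTY.
  return function handleClientMessage(raw: string): void {
    const msg = JSON.parse(raw) as { type?: string; terminalId?: string; data?: string };
    if (
      msg.type === "terminal:input" &&
      msg.terminalId === terminalId &&
      typeof msg.data === "string"
    ) {
      pty.write(msg.data);
    }
  };
}
```

Messages addressed to a different terminal id are ignored, which keeps one terminal's input from reaching another's PTY.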

Scrollback

The server maintains a 100KB scrollback buffer per terminal. On session resume, this buffer is replayed so the client sees the terminal exactly as it was. The client-side React hook also buffers 100KB per terminal for component re-mounts.
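
A bounded scrollback buffer could look roughly like the sketch below. The class name and methods are hypothetical, and the size is counted in string length (UTF-16 code units) for simplicity; whether the real limit is measured in bytes or characters is not stated here.

```typescript
const MAX_SCROLLBACK = 100 * 1024; // 100KB cap, counted in UTF-16 code units in this sketch

// Append-only buffer that drops the oldest output once the cap is exceeded.
class ScrollbackBuffer {
  private content = "";
  private readonly limit: number;

  constructor(limit: number = MAX_SCROLLBACK) {
    this.limit = limit;
  }

  append(chunk: string): void {
    this.content += chunk;
    if (this.content.length > this.limit) {
      // Keep only the newest `limit` units.
      this.content = this.content.slice(this.content.length - this.limit);
    }
  }

  // What would be replayed to a client on session resume.
  snapshot(): string {
    return this.content;
  }
}
```

Trimming at an arbitrary offset can cut through an escape sequence, so a production buffer might trim at a line boundary instead.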

Heartbeat / Keepalive

The server sends WebSocket pings every 30 seconds. If a client doesn't respond with a pong before the next ping, the connection is terminated. This detects dead connections from network drops or closed tabs.
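
The rule described above follows a common liveness-flag pattern, sketched here against a stand-in interface so it runs without a WebSocket library. `HeartbeatSocket`, `markAlive`, and `sweep` are hypothetical names. In a real server, `sweep` would run on a 30-second timer and `markAlive` would be called from the socket's pong handler.

```typescript
// Stand-in for a WebSocket connection; `isAlive` is a flag kept by the server.
interface HeartbeatSocket {
  isAlive: boolean;
  ping(): void;
  terminate(): void;
}

// Call when a pong arrives from the client.
function markAlive(socket: HeartbeatSocket): void {
  socket.isAlive = true;
}

// One heartbeat round. Sockets that never answered the previous ping are
// terminated; the rest are pinged again. Returns the sockets still open.
function sweep(sockets: HeartbeatSocket[]): HeartbeatSocket[] {
  const survivors: HeartbeatSocket[] = [];
  for (const socket of sockets) {
    if (!socket.isAlive) {
      socket.terminate();
      continue;
    }
    socket.isAlive = false; // must be set back to true by a pong before the next sweep
    socket.ping();
    survivors.push(socket);
  }
  return survivors;
}
```

With a 30-second interval, a dead connection is cleaned up between 30 and 60 seconds after it stops responding.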

Reconnection Strategy

reconnect
Disconnect detected
    │
    ▼
Wait 1s → reconnect attempt 1
    │ fail
    ▼
Wait 2s → reconnect attempt 2
    │ fail
    ▼
Wait 4s → reconnect attempt 3
    │ fail
    ▼
Wait 8s → reconnect attempt 4
    ...
    ▼
Wait 30s (max) → reconnect attempt N
    │ success
    ▼
Send auth:resume { sessionId }
    │
    ▼
Server replays terminal list + scrollback
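
The delays in the diagram double from 1 second and cap at 30 seconds, which can be written as a single function. The name `reconnectDelay` and the constants are hypothetical, and the sketch assumes no random jitter because the diagram shows none.

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

// `attempt` is 1-based: attempt 1 waits 1s, attempt 2 waits 2s, attempt 3
// waits 4s, and so on, capped at 30s.
function reconnectDelay(attempt: number): number {
  const exponent = Math.max(0, attempt - 1);
  return Math.min(BASE_DELAY_MS * Math.pow(2, exponent), MAX_DELAY_MS);
}
```

Under these assumptions the cap is reached on the sixth attempt, since the uncapped delay would be 32 seconds.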

Monorepo Structure

directory-layout
walkie-talkie/
├── shared/           @walkie-talkie/shared    — Protocol types
├── server/           @walkie-talkie/server    — Core server
├── client/           @walkie-talkie/client    — WS client
├── react/            @walkie-talkie/react     — React hooks + components
├── cli/              @walkie-talkie/cli       — CLI launcher
├── service/          @walkie-talkie/service   — Electron tray app
├── web/              @walkie-talkie/web       — Next.js web client (5 views)
├── www/              @walkie-talkie/www       — Landing page + docs
├── pnpm-workspace.yaml
└── package.json

All packages use workspace:* for internal dependencies, TypeScript for source, and compile to dist/ with type definitions.

Build Order

build-chain
# Build order: dependencies first (server and client each need only shared)
pnpm --filter @walkie-talkie/shared build    # Types first
pnpm --filter @walkie-talkie/server build    # Server (uses shared)
pnpm --filter @walkie-talkie/client build    # Client (uses shared)
pnpm --filter @walkie-talkie/react build     # React (uses client + shared)

# These can be parallel after the above:
pnpm --filter @walkie-talkie/cli build       # Bundles server with esbuild
pnpm --filter @walkie-talkie/web build       # Next.js build
pnpm --filter @walkie-talkie/www build       # Landing page build

Key Design Decisions

Why node-pty?

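Real pseudo-terminals support all the features users expect: shell prompts, tab completion, colors, and full-screen apps (vim, htop, etc.). Alternatives like child_process.exec connect the process through plain pipes rather than a TTY, so interactive programs don't behave correctly.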

Why framework-agnostic client?

By keeping the core client as a plain TypeScript class, it works in React, Vue, Svelte, Node.js scripts, Electron, or even a raw browser console. The React package is just a thin wrapper — other framework wrappers could be built trivially.

Why single-use tokens?

If a token leaks (for example, someone screenshots the QR code), it can be redeemed only once and only within its 5-minute TTL. Any later attempt with the same token fails, so an attacker would need a fresh one. Sessions are identified by a UUID, which is never displayed in a QR code.

Why JSON over WebSocket?

Simplicity. The protocol has ~13 message types. JSON is human-readable, easy to debug, and any language can parse it. The overhead is negligible compared to terminal output bandwidth.