
React in ≈120 lines

To truly understand how React works, we can build a tiny version of React from scratch.

By implementing a simple render function, a virtual DOM, and a diffing algorithm, we demystify the core ideas behind the library and appreciate its design tradeoffs.
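To make that concrete before we dive in, here is a minimal sketch of the first two ideas: a virtual DOM node as a plain object, and a render function that turns it into real DOM. The names h and render are illustrative, not the final code from this post.

// A virtual DOM node is just a plain object describing what we want on screen.
function h(type, props, ...children) {
  return { type, props: props || {}, children };
}

// render walks the virtual tree and produces real DOM nodes.
function render(vnode) {
  if (typeof vnode === "string") return document.createTextNode(vnode);
  const el = document.createElement(vnode.type);
  for (const [name, value] of Object.entries(vnode.props)) el.setAttribute(name, value);
  vnode.children.forEach(child => el.appendChild(render(child)));
  return el;
}

// Usage: document.body.appendChild(render(h("ul", null, h("li", null, "hello"))));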

Props & state

React components juggle two data sources. Props flow into a component from its parent and are immutable. State lives inside a component and can change – triggering a new render. Everything visible on screen is a pure function of (props, state).

Parent (state) → Child (props) → Grandchild (props)

Fig 9 – props cascade downward; only callbacks travel up.

A canonical pattern: parent owns state, passes value + callback to a child. The child never mutates data directly; it requests changes by calling the callback.

function Parent() {
  const [count, set] = React.useState(0);
  return (
    <Counter
      label="Clicks"
      value={count}                      // ✅ prop (read‑only by child)
      onIncrement={() => set(count + 1)} // ✅ callback prop
    />
  );
}

function Counter({ label, value, onIncrement }) {
  return (
    <button onClick={onIncrement}>
      {label}: {value}
    </button>
  );
}

The button displays value (a prop) yet changes it by invoking onIncrement. The update travels up, the state setter fires, React re‑renders Parent, and the new prop cascades back down.

User event → setState() → Render() → Commit DOM

Fig 10 – one click’s round‑trip through the data flow.

Local state is for information the component alone owns – e.g. UI toggles, transient form input:

function Toggle() {
  const [on, set] = React.useState(false); // ✅ local state
  return (
    <button onClick={() => set(!on)}>
      {on ? "ON" : "OFF"}
    </button>
  );
}

A golden rule: never mutate props. Doing so violates the parent’s expectations and destroys the guarantee that render is pure.

// ⚠️ anti‑pattern: child mutates prop object
function List({ items }) {
  items.push("⚠️");          // bad – breaks parent's assumptions
  return <ul>{items.map(i => <li key={i}>{i}</li>)}</ul>;
}

Listing O – anti‑pattern: mutating a prop breaks referential integrity and can cause phantom re‑renders.
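The remedy is to copy instead of mutate. A sketch of the same component, where the appended item is purely illustrative:

// ✅ derive new data instead of mutating the prop
function List({ items }) {
  const decorated = [...items, "⚠️"]; // copy; the parent's array stays untouched
  return <ul>{decorated.map(i => <li key={i}>{i}</li>)}</ul>;
}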

In practice, most state lives at the lowest shared ancestor that needs the data – a principle nicknamed “lift state up.” React’s one‑way data flow keeps mental overhead low: to understand any component, trace props in, local state inside, and callbacks out. No hidden observers, no two‑way bindings.
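As a small sketch of lifting state up (component names invented for illustration), a text input and a preview pane can share one draft by letting their common parent own it:

function Editor() {
  const [draft, setDraft] = React.useState(""); // state lifted to the shared parent
  return (
    <>
      <Input value={draft} onChange={setDraft} />
      <Preview text={draft} />
    </>
  );
}

function Input({ value, onChange }) {
  return <input value={value} onChange={e => onChange(e.target.value)} />;
}

function Preview({ text }) {
  return <p>{text}</p>;
}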

With a solid grasp of props & state we can now explore how React orchestrates their lifecycle – mounting, updating, and unmounting – while keeping your effects predictable.

Writing React 19 from scratch

From HTML parsing to a tiny React in ~120 lines — DOM cost, diffing, hooks, lifecycle.

1  From Electrons to Bytes

Every computer is a colossal city of microscopic switches called transistors. Each switch is either conducting electricity or blocking it. We label these states 1 and 0. A single bit isn’t impressive, yet eight neighbours joined at the hip form a byte—the basic currency of memory.

Fig 1 – eight bits dancing between 0 and 1 combine into a byte (0‑255).

Why care? Because every extra bit doubles the number of distinct patterns. One bit can only say “yes/no”; two bits track four states; ten bits already distinguish 1 024 cases. That exponential head‑room is how tiny chips stream 4K video and protect cryptographic keys.

Fig 2 – slide to feel how information explodes with more bits.
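The doubling is easy to verify with a short loop:

// Each extra bit doubles the number of distinct patterns: 2 ** n.
for (const bits of [1, 2, 4, 7, 10]) {
  console.log(`${bits} bit(s) can represent ${2 ** bits} distinct patterns`);
}
// 1 → 2, 2 → 4, 4 → 16, 7 → 128, 10 → 1024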

Fig 3 – turning raw patterns into real‑world meaning.

The same 0‑1 patterns can wear many hats. With one bit we encode truth; two bits steer a robot across the four cardinal points; four bits cover every hexadecimal digit you type in CSS; bump to seven bits and you hold the entire classic ASCII table.

A cleaner mental image of memory is a long street of numbered mailboxes. Address 0x0000 is the first mailbox, 0x0001 the next, and so on. Each mailbox stores exactly one byte. Ask the hardware “what’s at 0x0042?” and it walks to that mailbox and hands back the byte inside.

0x00: 48   0x01: 45   0x02: 4C   0x03: 4C
0x04: 4F   0x05: 0A   0x06: 00   0x07: 97
0x08: 3C   0x09: FA   0x0a: 5D   0x0b: 00
0x0c: 13   0x0d: 00   0x0e: BE   0x0f: EF

Fig 4 – sixteen consecutive mailboxes. The first six spell “HELLO\n” in ASCII.

Meaning lives in the reader. The sequence 48 45 4C 4C 4F might be text ("HELLO"), five CPU opcodes, or part of a JPEG header—depending on the software interpreting those addresses. A pointer in C is just a street address; a Python string is an address plus a length; an executable is a giant relocation map telling the OS which mailboxes should land where.
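JavaScript makes the point concrete: the same five bytes from Fig 4 read as text, as a list of small integers, or as one 32-bit number, depending purely on which view we ask for.

const bytes = new Uint8Array([0x48, 0x45, 0x4c, 0x4c, 0x4f]); // 48 45 4C 4C 4F

// View 1: text, decoded as UTF-8/ASCII
console.log(new TextDecoder().decode(bytes));               // "HELLO"

// View 2: five independent small integers
console.log([...bytes]);                                    // [72, 69, 76, 76, 79]

// View 3: the first four bytes as one little-endian 32-bit integer
console.log(new DataView(bytes.buffer).getUint32(0, true)); // 1280066888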

When a program launches, its bytes are copied onto this street and the CPU’s program counter points at the first instruction.

Fetch → Decode → Execute

Fig 5 – the CPU repeats a fetch → decode → execute ballet billions of times per second.

At this raw level the processor knows nothing about variables or objects—it only consumes opcodes. All the structure we enjoy in high‑level languages eventually compresses into this binary choreography.
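To make that loop tangible, here is a toy interpreter with a two-opcode instruction set invented purely for illustration; real CPUs follow the same fetch, decode, execute cycle with far richer opcodes:

// Toy instruction set (invented): 0x01 adds the next byte to the accumulator,
// 0xFF halts and returns it.
function run(program) {
  let pc = 0;  // program counter: which "mailbox" we read next
  let acc = 0; // a single register

  while (true) {
    const opcode = program[pc++];             // fetch
    switch (opcode) {                         // decode
      case 0x01: acc += program[pc++]; break; // execute: ADD imm8
      case 0xff: return acc;                  // execute: HALT
      default: throw new Error(`unknown opcode ${opcode}`);
    }
  }
}

console.log(run([0x01, 5, 0x01, 37, 0xff])); // 42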

48 65 6C 6C 6F 2C 20 43 50 55 21

Listing A – a hex dump of the ASCII string “Hello, CPU!” stored as raw bytes.

section .text
global  _start
_start:
    mov     rax, 60     ; sys_exit
    mov     rdi, 42     ; status code
    syscall

Listing B – minimal x86‑64 assembly: exit with status 42.

These two listings are different facets of the same reality: bytes that shape behaviour. In Listing A we gaze at data; in Listing B the bytes are behaviour. Grasping that duality prepares us for the next chapter, where we climb one rung up to study machine instructions—the specific binary patterns each CPU understands.

2  Machine Instructions

The journey from abstract algorithms to working silicon begins with machine instructions. These are small binary sentences the CPU can speak natively. Each instruction encodes what to do (an opcode) and who to do it with (registers, memory addresses, or immediate values).

A modern x86‑64 chip exposes sixteen 64‑bit general‑purpose registers. The first four – RAX, RBX, RCX and RDX – are wired into the arithmetic‑logic unit (ALU) that performs additions, bitwise ops and comparisons.

Unlike high‑level languages, machine code is obsessed with placement: which register holds which operand, how many bytes the immediate value spans, and whether memory operands cross cache‑line boundaries. This low‑level specificity is what makes code run quickly but also what makes it brittle across architectures.

48 05 05 00 00 00   C3
^  ^  ^^^^^^^^^^^   ^
|  |       |        └─ 0xC3 = RET
|  |       └────────── Immediate 32‑bit value 5
|  └────────────────── Opcode ADD (imm32 to RAX)
└───────────────────── REX.W (64‑bit operand)

Listing C – seven raw bytes implementing ADD RAX, 5 and RET. The initial 0x48 is the “REX.W” prefix that switches the CPU into 64‑bit operand mode.

; add 5 to RAX and return
mov     rax, 2          ; set RAX = 2
add     rax, 5          ; RAX = RAX + 5
ret

Listing D – human‑readable Intel syntax disassembly of the same idea.
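The encoding in Listing C is mechanical enough to reproduce by hand. The following sketch (illustrative, not a real assembler) emits those exact seven bytes:

// Emit the machine code for "add rax, imm32" followed by "ret", as in Listing C.
function emitAddRaxImm32(imm) {
  const code = new Uint8Array(7);
  code[0] = 0x48;                                   // REX.W prefix: 64-bit operand size
  code[1] = 0x05;                                   // opcode: ADD RAX, imm32
  new DataView(code.buffer).setInt32(2, imm, true); // little-endian 32-bit immediate
  code[6] = 0xc3;                                   // RET
  return code;
}

console.log([...emitAddRaxImm32(5)].map(b => b.toString(16).padStart(2, "0")).join(" "));
// "48 05 05 00 00 00 c3"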

Notice how a single assembly instruction add rax, 5 expanded into multiple machine‑code fields: a REX.W prefix byte, an opcode byte and a 32‑bit immediate operand (this accumulator‑specific encoding needs no ModR/M byte). The assembler takes care of that busywork for us, which is why the next chapter pivots to Assembly Language—a thin textual veneer over these raw opcodes that humans can actually edit.

Writing a programming language from scratch

Considerations for building a custom language optimized for AI collaboration.

Core files & formats

A SpecBundle distills every rule, contract and preference an LLM agent needs into five plainly‑named files. They live together under spec/, version independently and lint in CI. The stack below is opinionated but battle‑tested across Copilot, Cursor IDE and smol‑dev:

spec/ bundle (v 1.0): AGENTS.md · schema.json · tasks.yaml · permissions.yaml · llm.hints.toml

AGENTS.md carries the story—persona, style guide and worked examples. Markdown renders nicely in GitHub and IDE panels, and its free‑form nature encourages rich commentary. Keep critical “MUST / SHOULD” rules in bullet lists so both humans and models parse them deterministically.

## Coding style
* Prefer TypeScript strict mode.
* No console logs in production code.
* Tests live beside source files as `*.test.ts`.

### Example commit
```diff
+ export async function fetchUser(id: string): Promise { … }
```

Listing 3 – excerpt from AGENTS.md.

schema.json is the machine contract. Each tool call the model may request must validate against this schema before execution. Fail closed; refuse any payload that breaks the rules. A short excerpt:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/tools/writeFile",
  "type": "object",
  "required": ["path", "contents"],
  "properties": {
    "path": { "type": "string", "pattern": "^src/.*\\.tsx$" },
    "contents": { "type": "string", "maxLength": 10000 }
  }
}

Listing 4 – schema fragment for the writeFile tool.
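Enforcing the contract can be a handful of lines in the runner: compile the schema once, validate every proposed call, and refuse anything that fails. A sketch assuming the Ajv validator (mentioned later in this post) and that spec/schema.json holds the writeFile fragment above:

const Ajv2020 = require("ajv/dist/2020");               // draft 2020-12 build of Ajv
const writeFileSchema = require("./spec/schema.json");  // assumes the Listing 4 fragment

const ajv = new Ajv2020();
const validateWriteFile = ajv.compile(writeFileSchema);

function guardToolCall(payload) {
  if (!validateWriteFile(payload)) {
    // Fail closed: report the violation and never execute the call.
    throw new Error("rejected tool call: " + ajv.errorsText(validateWriteFile.errors));
  }
  return payload; // safe to hand to the real writeFile implementation
}

guardToolCall({ path: "src/App.tsx", contents: "export const x = 1;" }); // passes
// guardToolCall({ path: "../etc/passwd", contents: "" });              // throws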

While the schema enforces shape, tasks.yaml offers context. Each “card” ties a repository goal—“bump deps”, “add tests”—to preferred tools and hints. Agents can pick a card, execute its tools, then mark it done. YAML’s literal block scalars keep multi‑line guidance readable:

---
version: 1.2
cards:
  - id: refactor_component
    goal: "Modernise a legacy React class component to hooks"
    tools: ["writeFile", "runTests"]
    hints: |
      * Preserve existing snapshot outputs
      * Component must remain SSR‑safe
...

Listing 5 – a task card guiding a refactor.
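A runner can load the cards with any YAML parser and hand the chosen card’s goal, tools and hints to the agent. A minimal sketch assuming the js-yaml package:

const fs = require("node:fs");
const yaml = require("js-yaml");

// Load the task cards and pick one by id.
const { cards } = yaml.load(fs.readFileSync("spec/tasks.yaml", "utf8"));
const card = cards.find(c => c.id === "refactor_component");

console.log(card.goal);  // "Modernise a legacy React class component to hooks"
console.log(card.tools); // ["writeFile", "runTests"]
console.log(card.hints); // multi-line guidance from the literal block scalar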

Security lives in permissions.yaml: explicit allow‑lists for paths and network domains. The runner denies any file write or HTTP call outside this set—no exceptions, no surprises:

allowedPaths:
  - "src/**"
  - "!src/secrets/**"
network:
  domains:
    - "api.github.com"
    - "registry.npmjs.org"

Finally, llm.hints.toml tweaks model temperature, sampling and persona. TOML’s strict syntax beats YAML ambiguity for small config blobs:

[model]
temperature = 0.2
top_p       = 0.9

[persona]
prefix       = "Kai"
voice        = "concise"

Listing 6 – model & persona knobs.
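At startup a runner might parse these knobs and forward them to its model client. A sketch assuming the @iarna/toml parser and the official openai Node SDK; the model name is a placeholder, not part of the bundle:

const fs = require("node:fs");
const TOML = require("@iarna/toml");
const OpenAI = require("openai");

const hints = TOML.parse(fs.readFileSync("spec/llm.hints.toml", "utf8"));
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function ask(prompt) {
  return client.chat.completions.create({
    model: "gpt-4o",                      // placeholder model, not part of the bundle
    temperature: hints.model.temperature, // 0.2
    top_p: hints.model.top_p,             // 0.9
    messages: [
      { role: "system", content: `You are ${hints.persona.prefix}. Keep answers ${hints.persona.voice}.` },
      { role: "user", content: prompt },
    ],
  });
}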

Together these files replace the monolithic prompt with a layered, diff‑friendly bundle. Need to add a new tool? Touch only schema.json. Tighten network rules? PR to permissions.yaml. Narrative evolves daily, contracts less often—and CI keeps every layer in sync.

Specification paradigms

Early agent projects packed every rule, example and safety rail into a single monster prompt. This mono‑spec worked—until it didn’t. The file grew past 3 000 tokens, diffs turned to noise, and CI couldn’t tell if a change was harmless copy‑editing or a policy break. Developers started copy‑pasting snippets, models lost context windows, and subtle contradictions crept in.

“Mono‑spec”: a single‑file prompt (prose + rules + examples, 3 000+ tokens) that is hard to diff, lint or test. “Layered bundle”: AGENTS.md · schema.json · tasks.yaml · permissions.yaml · llm.hints.toml, each layer linting and versioning independently.

The industry’s response was layering. Instead of one opaque blob, we treat a spec like a software artifact with separable concerns:

  • AGENTS.md – long‑form narrative: persona, coding style, examples.
  • schema.json – machine contract: tool names, parameter shapes, structured outputs.
  • tasks.yaml – curated task “cards” that map goals to tool sequences.
  • permissions.yaml – explicit allow‑lists for files, URLs, shell ops.
  • llm.hints.toml – model tunables (temperature, bias, stop words).

Layering buys diff precision: narrative evolves weekly, schemas maybe monthly, permissions on incident day. Each file owns its own version field so linters can enforce stability. A breaking change in schema.json (say, renaming oldPath → path) need not disturb the prose or tasks—CI simply pins the model to the new schema once tests pass.

diff --git a/spec/AGENTS.md b/spec/AGENTS.md
@@
-## Version 1.3
+## Version 1.4  ← narrative tweak (typo fix)

diff --git a/spec/schema.json b/spec/schema.json
@@
-"$id": "tool.schema:1.1",
-"version": "1.1",
+"$id": "tool.schema:2.0",
+"version": "2.0",      ← breaking param rename

diff --git a/spec/tasks.yaml b/spec/tasks.yaml
@@
-version: 1.1
+version: 1.2           ← new “refactor” card added

Listing C – a single PR bumps three layers at their own cadence; reviewing risk is now surgical instead of holistic.

Layered specs also unlock polyglot tooling. JSON Schema can be validated by AJV or Pydantic, YAML cards may feed TUI dashboards, TOML hints tune the OpenAI SDK—and none of these tools need to parse Markdown. This composability keeps specs alive: they fail CI when wrong, surface docs when right, and scale gracefully with both humans and models.
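As a closing sketch of that CI loop (file names taken from the bundle above, the check itself invented for illustration), a version lint can fit in a dozen lines:

// CI lint sketch: every layer must declare a version, or the build fails.
const fs = require("node:fs");
const yaml = require("js-yaml");

const layers = {
  "spec/schema.json": JSON.parse(fs.readFileSync("spec/schema.json", "utf8")).version,
  "spec/tasks.yaml": yaml.load(fs.readFileSync("spec/tasks.yaml", "utf8")).version,
};

for (const [file, version] of Object.entries(layers)) {
  if (!version) {
    console.error(`missing version field in ${file}`);
    process.exit(1); // fail the pipeline
  }
  console.log(`${file} @ version ${version}`);
}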

In the search for LLM specs

Exploring benchmarks, evaluation suites and system cards that shape the spec space.