From fcedfddf00b3f994e4f4e40332ac7fc192c63244 Mon Sep 17 00:00:00 2001 From: polwex Date: Sun, 5 Oct 2025 21:56:51 +0700 Subject: claude is gud --- OLD.md | 685 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 685 insertions(+) create mode 100644 OLD.md (limited to 'OLD.md') diff --git a/OLD.md b/OLD.md new file mode 100644 index 0000000..8a00255 --- /dev/null +++ b/OLD.md @@ -0,0 +1,685 @@ + +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## Overview + +This is Vere, the Urbit runtime environment - the lowest layer of the Urbit stack. It includes the Nock virtual machine, I/O drivers, event log, and snapshotting system. The codebase is written in C and uses Zig as the build system. + +**Active Project**: We are porting Vere from C to OCaml 5.x with Eio, using a hybrid approach where jets remain in C and are called via FFI. + +## Current Vere (C) Documentation + +### Build System + +#### Primary Commands + +- **Build native debug binary**: `zig build` +- **Run specific tests**: `zig build --summary all` + - Available tests: `nock-test`, `ames-test`, `palloc-test`, `equality-test`, `hashtable-test`, `hamt-test`, `jets-test`, `retrieve-test`, `serial-test`, `boot-test`, `newt-test`, `vere-noun-test`, `unix-test`, `pact-test`, `tracy-test` +- **Build for all supported targets**: `zig build -Dall` +- **Release build**: `zig build -Drelease` +- **Build with optimization**: `zig build -Doptimize=ReleaseFast` + +#### Common Build Options + +- `-Dtarget=`: Cross-compile (e.g., `aarch64-linux-musl`, `x86_64-macos`) +- `-Doptimize=`: Debug (default), ReleaseSafe, ReleaseSmall, ReleaseFast +- `-Dpace=`: Release train - once (default), live, soon, edge +- `-Dcopt=`: Additional compiler flags (can be specified multiple times) +- `-Dasan`: Enable address sanitizer (native only, requires llvm@19) +- `-Dubsan`: Enable undefined behavior sanitizer (native only, requires 
llvm@19) +- `-Dtracy`: Enable Tracy profiler integration +- `-Dcpu-dbg`: Enable CPU debug mode (-DU3_CPU_DEBUG) +- `-Dmem-dbg`: Enable memory debug mode (-DU3_MEMORY_DEBUG) + +#### Build Output + +Built binaries are placed in `zig-out//` (e.g., `zig-out/x86_64-linux-musl/urbit`) + +### Architecture + +#### Package Structure + +The runtime is organized into distinct packages in the `pkg/` directory: + +- **pkg/c3**: Basic utilities for Urbit's C style (types, macros, portability) +- **pkg/ent**: Cross-platform entropy source wrapper (`getentropy(2)`) +- **pkg/ur**: Jam/cue implementation (Urbit's bitwise noun serialization) +- **pkg/noun**: Nock virtual machine, memory management, jets, and snapshotting +- **pkg/past**: Parser for Urbit's bytecode format +- **pkg/vere**: I/O drivers, event log (LMDB), main event loop, pier management + +#### Key Components + +**Nock VM (pkg/noun)**: +- `nock.c`: Nock interpreter and execution +- `jets.c`: Jet registration and acceleration (optimized implementations of Nock formulas) +- `allocate.c`: Loom memory management (arena allocator) +- `manage.c`: Noun memory lifecycle and garbage collection +- `events.c`: Event processing and persistence +- `jets/`: Directory tree of jet implementations organized by Hoon library structure + +**I/O System (pkg/vere/io)**: +- `ames.c`: UDP networking driver (Urbit's P2P protocol) +- `behn.c`: Timer driver +- `term.c`: Terminal I/O +- `http.c`: HTTP server/client (using h2o) +- `unix.c`: Filesystem synchronization +- `conn.c`: IPC connections +- `lick.c`: Inter-process communication +- `mesa.c`: Alternative networking protocol +- `cttp.c`: HTTP client effects + +**Event Persistence**: +- `pkg/vere/disk.c`: Event log management using LMDB +- Events are persisted to disk and can be replayed for crash recovery + +**Main Entry Points**: +- `pkg/vere/main.c`: CLI argument parsing and runtime initialization +- `pkg/vere/lord.c`: "Lord" process management (worker process coordination) +- 
`pkg/vere/pier.c`: Pier (ship instance) lifecycle + +#### Memory Management + +Urbit uses a custom memory model called the "loom": +- Fixed-size arena allocator (2GB or 4GB depending on architecture) +- All Urbit data structures (nouns) live in the loom +- Snapshot-based persistence allows complete memory dumps +- Reference counting and mark-and-sweep GC for cleanup + +### Development Workflow + +#### Working with Fake Ships + +Always develop on fake ships, not live network ships. Fake ships use deterministic keys and communicate over local loopback. + +**Boot a new fake ship**: +```console +zig build +./zig-out//urbit -F zod +``` + +**Boot with development pill** (faster): +```console +./zig-out//urbit -F zod -B solid.pill +``` + +**Launch existing fake ship**: +```console +./zig-out//urbit zod +``` + +#### Debugging + +**GDB debugging**: +```bash +zig build +gdb --args ./zig-out//urbit zod +``` + +In GDB, set: +```gdb +set follow-fork-mode child +handle SIGSEGV nostop noprint +``` + +**macOS lldb debugging**: + +On macOS, you must configure lldb to handle Mach exceptions properly. Start the ship with `-t` flag when debugging, or attach after starting. Then run: +```lldb +p (void)darwin_register_mach_exception_handler() +pro hand -p true -s false -n false SIGBUS +pro hand -p true -s false -n false SIGSEGV +``` + +### Git Workflow + +#### Branch Naming + +All branches for review must follow: `i//` where `` is the GitHub issue number. + +#### Commit Style + +- Use imperative mood for commit messages +- Include short description (required) and optional long description + +#### Pull Request Format + +```markdown +### Description + +Resolves #. + +[Thorough description of changes] + +### Related + +[Related issues, links, papers, etc.] +``` + +#### Branch Structure + +- `develop` (default): edge train - for runtime developers +- `release`: soon train - for early adopters +- `master`: live train - for production + +PRs should target `develop` by default. 
+ +### Testing + +Tests are colocated with implementation code: +- `pkg/noun/*_tests.c`: Noun system tests +- `pkg/vere/*_tests.c`: Vere I/O and persistence tests +- `pkg/ur/tests.c`, `pkg/ent/tests.c`: Package-specific tests + +Run individual test suites with `zig build --summary all`. + +### Dependencies + +External libraries (managed by Zig build system in `ext/`): +- **GMP**: Multi-precision arithmetic +- **OpenSSL**: Cryptography +- **libuv**: Async I/O event loop +- **LMDB**: Memory-mapped database for event log +- **h2o**: HTTP server +- **curl**: HTTP client +- **libsigsegv**: Signal handling for memory protection +- **urcrypt**: Urbit cryptographic primitives +- **wasm3**: WebAssembly interpreter +- **natpmp**: NAT port mapping +- **zlib**: Compression + +### Important Defines + +- `U3_OS_osx` / `U3_OS_linux` / `U3_OS_windows`: Platform detection +- `U3_CPU_aarch64`: ARM64 architecture +- `U3_CPU_DEBUG`: Enable CPU debugging +- `U3_MEMORY_DEBUG`: Enable memory debugging +- `U3_GUARD_PAGE`: Enable guard pages for loom +- `U3_SNAPSHOT_VALIDATION`: Validate snapshots on load +- `C3DBG`: Enable debug assertions + +### Code Style + +The codebase uses Urbit-specific C conventions: +- Custom types: `c3_w` (word), `c3_y` (byte), `c3_o` (loobean), etc. +- Naming: `u3_` prefix for public APIs, `u3X_` for module X +- Heavy use of macros for memory management and control flow +- Arena-based allocation rather than malloc/free + +--- + +## OCaml Port Plan + +### Executive Summary + +This is a phased approach to porting Urbit's runtime (Vere) from C to OCaml 5.x with Eio. The port targets ~32,500 lines of C code (excluding jets) and uses a hybrid approach in which jets remain in C and are called via FFI. + +### Why OCaml? + +1. **Functional alignment**: Hoon (Urbit's language) is functional; OCaml's paradigm matches better than imperative C +2. **Safety**: Strong typing, exhaustive pattern matching, immutability by default prevent entire classes of bugs +3. 
**Performance**: OCaml 5.x multicore + Eio provides excellent performance with effect handlers +4. **GC integration**: OCaml's GC can be integrated with noun reference counting more naturally than manual C memory management +5. **Maintenance**: More maintainable codebase with algebraic data types and pattern matching + +### Hybrid Strategy: Keep Jets in C + +**Key Decision**: Keep jets in C, call via FFI from OCaml. This dramatically reduces scope: +- **Don't port**: 187 jet files (~15k LOC) +- **Don't port**: urcrypt, wasm3, softfloat dependencies +- **Do port**: Core noun system, Nock interpreter, I/O drivers + +#### Why Keep Jets in C? + +1. **Proven implementations**: Jets are highly optimized and battle-tested +2. **Crypto dependencies**: Ed25519, ECDSA, etc. already use C libraries (urcrypt) +3. **WebAssembly**: wasm3 integration (3k LOC jet) stays in C +4. **Floating point**: softfloat dependency for IEEE compliance +5. **Reduced risk**: Don't need to reimplement/validate 187 jets +6. 
**Performance**: C jets are already fast; FFI overhead negligible for typical jet calls + +### Target OCaml Structure + +``` +urbit-ocaml/ +├── dune-project # Project metadata +├── dune-workspace # Workspace config +│ +├── lib/ # OCaml libraries +│ ├── noun/ # Core noun system (port from pkg/noun) +│ │ ├── dune +│ │ ├── types.ml[i] # Noun ADTs +│ │ ├── loom.ml[i] # Memory management +│ │ ├── jam.ml[i] # Serialization +│ │ ├── nock.ml[i] # Nock interpreter +│ │ ├── jets_ffi.ml[i] # FFI to C jets +│ │ └── jets_registry.ml[i] # Jet dispatch +│ │ +│ ├── runtime/ # Runtime I/O (port from pkg/vere) +│ │ ├── dune +│ │ ├── db.ml[i] # Event log +│ │ ├── pier.ml[i] # Pier management +│ │ ├── io/ +│ │ │ ├── ames.ml[i] # UDP networking +│ │ │ ├── http.ml[i] # HTTP server/client +│ │ │ ├── term.ml[i] # Terminal +│ │ │ ├── unix.ml[i] # Filesystem +│ │ │ └── behn.ml[i] # Timers +│ │ └── king.ml[i] # Main orchestrator +│ │ +│ └── c_bridge/ # C FFI bridge +│ ├── dune # Links against existing C code +│ ├── noun_ffi.ml[i] # Noun <-> C noun conversion +│ ├── jets_ffi.ml[i] # Call C jets +│ └── stubs/ # C stubs for FFI +│ ├── noun_stubs.c +│ └── jet_stubs.c +│ +├── bin/ # Executables +│ ├── dune +│ └── urbit.ml # Main entry point +│ +├── test/ # Tests +│ ├── dune +│ ├── test_noun.ml # Noun system tests +│ ├── test_nock.ml # Nock interpreter tests +│ ├── test_jets.ml # Jet FFI tests +│ └── test_integration.ml # End-to-end tests +│ +├── bench/ # Benchmarks +│ ├── dune +│ └── bench_nock.ml +│ +├── c/ # Keep existing C code +│ ├── pkg/noun/ # Copied from vere +│ ├── pkg/ur/ +│ └── ext/ # External deps (gmp, urcrypt, etc.) 
+│ +└── doc/ # Documentation + └── architecture.md +``` + +--- + +## Phase-by-Phase Implementation Plan + +### Phase 0: Foundation & Development Environment + +**Goal**: Set up OCaml development environment and validate approach + +#### Prerequisites + +```bash +# OCaml 5.2+ (for multicore/effects) +opam switch create vere-ocaml 5.2.1 +eval $(opam env) + +# Core build tools +opam install dune ocamlformat ocaml-lsp-server + +# Essential libraries +# eio_main: effects-based I/O; zarith: bignum arithmetic; cmdliner: CLI parsing; +# logs: structured logging; fmt: formatting/pretty-printing +opam install eio_main zarith cmdliner logs fmt +# alcotest: testing; qcheck: property-based testing; qcheck-alcotest: QCheck integration; +# bechamel: benchmarking +opam install alcotest qcheck qcheck-alcotest bechamel +# ctypes: FFI to C; ctypes-foreign: dynamic FFI +opam install ctypes ctypes-foreign + +# Optional but recommended +# odoc: documentation generation; utop: better REPL; landmarks: profiling +# (ocaml-lsp-server and ocamlformat are already installed above) +opam install odoc utop landmarks +``` + +#### Initial Project Setup + +```bash +# Create directory outside existing vere repo +cd ~/code/urbit +mkdir vere-ocaml && cd vere-ocaml + +# Initialize dune project +cat > dune-project << 'EOF' +(lang dune 3.16) +(name urbit) +(version 0.1.0) + +(generate_opam_files true) + +(package + (name urbit) + (synopsis "Urbit runtime in OCaml") + (description "Urbit's Nock VM and I/O drivers implemented in OCaml with Eio") + (depends + (ocaml (>= 5.2.0)) + dune + eio_main + zarith + cmdliner + logs + fmt + ctypes + ctypes-foreign + (alcotest :with-test) + (qcheck :with-test) + (qcheck-alcotest :with-test))) + +(using ctypes 0.3) +EOF +``` + +**Deliverables**: +- Working `dune build` and `dune test` +- FFI examples calling C Vere functions from OCaml +- CI/CD pipeline (GitHub Actions) + +--- + +### Phase 1: Core Noun System + +**Goal**: Implement the foundational noun data structures and basic operations + +#### 1.1: Noun Type System + +**Implementation**: 
+```ocaml +(* types.ml *) +type noun = + | Direct of int (* 31-bit direct atoms *) + | Indirect of indirect + +and indirect = + | Atom of Z.t (* Arbitrary-precision atoms (Zarith) *) + | Cell of noun * noun (* Pairs [head tail] *) +``` + +#### 1.2: Jam/Cue Serialization + +**Files to port**: +- `pkg/ur/serial.c` (~500 LOC) + +**Strategy**: +- Maintain wire-format compatibility with C version +- Optimize for OCaml's GC characteristics + +#### 1.3: Memory Management + +**OCaml approach**: +```ocaml +(* loom.ml *) +module Loom : sig + type t + val create : size:int -> t + val allocate : t -> noun -> noun (* Intern in loom *) + val snapshot : t -> bytes (* For persistence *) + val restore : bytes -> t +end +``` + +**Strategy**: Hybrid approach - Use OCaml GC for most nouns, reserve loom for snapshot/restore + +--- + +### Phase 2: Nock Interpreter + +**Goal**: Implement a working Nock interpreter that can execute Nock formulas + +#### 2.1: Basic Interpreter + +**Implementation**: +```ocaml +(* nock.ml *) +type formula = + | Axis of int + | Const of noun + | Cell of formula * formula + | Inc of formula + | Eq of formula * formula + | If of formula * formula * formula + | Compose of formula * formula + | Push of formula * formula + | Hint of hint * formula + | ... + +(* Signature, to be exposed in nock.mli *) +val nock : subject:noun -> formula:noun -> noun +``` + +#### 2.2: Jet Infrastructure & FFI + +**FFI Strategy**: +```ocaml +(* jets_ffi.ml *) +module C = struct + (* Convert OCaml noun to C u3_noun *) + let to_c_noun : Types.noun -> Unsigned.uint32 = (* ... *) + + (* Convert C u3_noun to OCaml noun *) + let of_c_noun : Unsigned.uint32 -> Types.noun = (* ... 
*) + + (* Foreign function binding *) + let u3qa_add = foreign "u3qa_add" + (uint32_t @-> uint32_t @-> returning uint32_t) +end + +(* High-level wrapper *) +let add a b = + let a_c = C.to_c_noun a in + let b_c = C.to_c_noun b in + let result_c = C.u3qa_add a_c b_c in + C.of_c_noun result_c +``` + +--- + +### Phase 3: I/O System with Eio + +**Goal**: Port I/O drivers to Eio's structured concurrency model + +#### 3.1: Event Log & Persistence + +**Implementation**: +```ocaml +(* db.ml *) +module EventLog : sig + type t + val open_ : path:string -> t + val append : t -> event:noun -> unit + val read : t -> from:int -> noun Seq.t + val snapshot : t -> noun -> unit +end +``` + +#### 3.2: Ames (UDP Networking) + +```ocaml +(* io/ames.ml *) +module Ames : sig + val start : + sw:Eio.Switch.t -> + net:_ Eio.Net.t -> + port:int -> + on_packet:(noun -> unit) -> + unit +end +``` + +#### 3.3: HTTP Server/Client + +**Strategy**: Use OCaml-native HTTP (cohttp-eio or dream) instead of porting h2o + +#### 3.4: Other I/O Drivers +- Terminal I/O using Lambda-Term or Notty +- Unix filesystem using Eio.Path +- Timers using Eio.Time +- IPC using Unix domain sockets + +--- + +### Phase 4: Pier Management & Orchestration + +**Goal**: Implement high-level runtime orchestration + +```ocaml +(* pier.ml *) +module Pier : sig + type t + + val boot : + sw:Eio.Switch.t -> + env:_ Eio.Stdenv.t -> + path:string -> + pill:noun -> + t + + val resume : + sw:Eio.Switch.t -> + env:_ Eio.Stdenv.t -> + path:string -> + t + + val poke : t -> noun -> unit + val scry : t -> path:noun -> noun option +end +``` + +--- + +### Phase 5: Performance & Polish + +**Goal**: Match or exceed C performance and prepare for production + +#### Key Optimizations +1. Noun allocation/deallocation +2. Nock interpreter inner loop +3. Jet dispatch via FFI +4. Serialization (jam/cue) +5. 
Hash table operations + +#### Production Readiness +- Comprehensive error handling +- Structured logging +- Documentation (odoc) +- Network compatibility testing with C Vere +- Distribution packaging + +--- + +## Testing Strategy + +### 1. Unit Tests (Alcotest) +```ocaml +(* test/test_noun.ml *) +let test_atom_small () = + let n = Types.atom (Z.of_int 42) in + Alcotest.(check bool) "is atom" true (Types.is_atom n) +``` + +### 2. Property Tests (QCheck) +```ocaml +(* Roundtrip property: cue(jam(x)) = x *) +let prop_jam_cue_roundtrip = + QCheck.Test.make ~name:"jam/cue roundtrip" + (arbitrary_noun ()) + (fun n -> + let serialized = Jam.jam n in + let deserialized = Jam.cue serialized in + noun_equal n deserialized) +``` + +### 3. FFI Validation +Compare C jet output vs OCaml Nock interpretation for all jet calls + +### 4. Cross-Validation +Test binary compatibility with existing C Vere for: +- Network protocols +- Event logs +- Snapshots +- Pills + +--- + +## Success Criteria + +### Milestone 1 +- [ ] Can execute basic Nock programs +- [ ] Jam/cue roundtrip works +- [ ] Jets callable via FFI +- [ ] Performance within 2x of C version + +### Milestone 2 +- [ ] Can boot a fake ship +- [ ] Event log persistence works +- [ ] Basic I/O (Ames, HTTP, terminal) functional +- [ ] Can process simple pokes + +### Milestone 3 +- [ ] Full feature parity with C Vere +- [ ] Performance at or better than C version +- [ ] Production-ready (error handling, logging, monitoring) +- [ ] Network-compatible with C Vere + +--- + +## Risk Assessment & Mitigation + +### High Risks + +1. **Memory Model Mismatch** + - **Risk**: OCaml GC vs C loom semantics + - **Mitigation**: Hybrid approach, extensive testing, gradual migration + +2. **Performance Regression** + - **Risk**: OCaml slower than hand-tuned C + - **Mitigation**: Benchmark-driven development, optimization phase, compiler flags + +### Medium Risks + +3. 
**FFI Complexity** + - **Risk**: Noun conversion overhead between OCaml and C + - **Mitigation**: Optimize conversion layer, batch operations + +4. **I/O Performance** + - **Risk**: Eio maturity, performance characteristics + - **Mitigation**: Benchmarks, fallback to Lwt if needed + +--- + +## Revised Timeline + + +**Phases**: +- Foundation & Environment +- Core Noun System +- Nock Interpreter & Jet FFI +- I/O System +- Pier Management +- Performance & Polish + + +--- + +## Next Steps + +1. Set up OCaml development environment +2. Create project structure with dune +3. Implement basic noun types +4. Create FFI bridge to C jets +5. Begin porting jam/cue serialization + +--- + +*This plan is a living document. Update as we learn from implementation.*
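
As a possible starting point for the "implement basic noun types" step above, here is a minimal sketch of a noun type with the Nock slot (axis) operator and a tiny evaluator. It is illustrative only and makes several assumptions that are not part of the plan: a native `int` stands in for arbitrary-precision atoms (the plan uses Zarith), the direct/indirect split is dropped, and only opcodes 0 (slot), 1 (constant), 4 (increment) and autocons are handled. The names `slot`, `nock`, and `Nock_crash` are hypothetical.

```ocaml
(* Hypothetical sketch; not the Vere API. *)
type noun =
  | Atom of int            (* assumption: native int instead of a bignum *)
  | Cell of noun * noun    (* pair [head tail] *)

exception Nock_crash

(* Slot operator: axis 1 is the whole noun, axis 2 the head, axis 3 the
   tail; axis 2n is the head of axis n and 2n+1 is the tail of axis n. *)
let rec slot axis n =
  if axis < 1 then raise Nock_crash
  else if axis = 1 then n
  else
    match slot (axis / 2) n with
    | Cell (h, t) -> if axis land 1 = 0 then h else t
    | Atom _ -> raise Nock_crash

(* Evaluate a formula against a subject (opcodes 0, 1, 4 and autocons). *)
let rec nock subject formula =
  match formula with
  | Cell ((Cell _ as f1), f2) ->
    (* autocons: a cell of two formulas yields a cell of both results *)
    Cell (nock subject f1, nock subject f2)
  | Cell (Atom 0, Atom axis) -> slot axis subject
  | Cell (Atom 1, constant) -> constant
  | Cell (Atom 4, f) ->
    (match nock subject f with
     | Atom a -> Atom (a + 1)
     | Cell _ -> raise Nock_crash)
  | _ -> raise Nock_crash
```

For example, `nock (Cell (Atom 42, Atom 7)) (Cell (Atom 4, Cell (Atom 0, Atom 2)))` increments the head of the subject. Raising an exception on a crash is a design choice made for brevity; a real port would need to decide how crashes map onto Vere's bail semantics.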