Diffstat (limited to 'OLD.md')
-rw-r--r-- OLD.md 685
1 files changed, 685 insertions, 0 deletions
diff --git a/OLD.md b/OLD.md
new file mode 100644
index 0000000..8a00255
--- /dev/null
+++ b/OLD.md
@@ -0,0 +1,685 @@
+
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Overview
+
+This is Vere, the Urbit runtime environment - the lowest layer of the Urbit stack. It includes the Nock virtual machine, I/O drivers, event log, and snapshotting system. The codebase is written in C and uses Zig as the build system.
+
+**Active Project**: We are porting Vere from C to OCaml 5.x with Eio, using a hybrid approach where jets remain in C and are called via FFI.
+
+## Current Vere (C) Documentation
+
+### Build System
+
+#### Primary Commands
+
+- **Build native debug binary**: `zig build`
+- **Run specific tests**: `zig build <test-name> --summary all`
+ - Available tests: `nock-test`, `ames-test`, `palloc-test`, `equality-test`, `hashtable-test`, `hamt-test`, `jets-test`, `retrieve-test`, `serial-test`, `boot-test`, `newt-test`, `vere-noun-test`, `unix-test`, `pact-test`, `tracy-test`
+- **Build for all supported targets**: `zig build -Dall`
+- **Release build**: `zig build -Drelease`
+- **Build with optimization**: `zig build -Doptimize=ReleaseFast`
+
+#### Common Build Options
+
+- `-Dtarget=<target>`: Cross-compile (e.g., `aarch64-linux-musl`, `x86_64-macos`)
+- `-Doptimize=<mode>`: Debug (default), ReleaseSafe, ReleaseSmall, ReleaseFast
+- `-Dpace=<train>`: Release train - once (default), live, soon, edge
+- `-Dcopt=<flag>`: Additional compiler flags (can be specified multiple times)
+- `-Dasan`: Enable address sanitizer (native only, requires llvm@19)
+- `-Dubsan`: Enable undefined behavior sanitizer (native only, requires llvm@19)
+- `-Dtracy`: Enable Tracy profiler integration
+- `-Dcpu-dbg`: Enable CPU debug mode (-DU3_CPU_DEBUG)
+- `-Dmem-dbg`: Enable memory debug mode (-DU3_MEMORY_DEBUG)
+
+#### Build Output
+
+Built binaries are placed in `zig-out/<target-triple>/` (e.g., `zig-out/x86_64-linux-musl/urbit`)
+
+### Architecture
+
+#### Package Structure
+
+The runtime is organized into distinct packages in the `pkg/` directory:
+
+- **pkg/c3**: Basic utilities for Urbit's C style (types, macros, portability)
+- **pkg/ent**: Cross-platform entropy source wrapper (`getentropy(2)`)
+- **pkg/ur**: Jam/cue implementation (Urbit's bitwise noun serialization)
+- **pkg/noun**: Nock virtual machine, memory management, jets, and snapshotting
+- **pkg/past**: Parser for Urbit's bytecode format
+- **pkg/vere**: I/O drivers, event log (LMDB), main event loop, pier management
+
+#### Key Components
+
+**Nock VM (pkg/noun)**:
+- `nock.c`: Nock interpreter and execution
+- `jets.c`: Jet registration and acceleration (optimized implementations of Nock formulas)
+- `allocate.c`: Loom memory management (arena allocator)
+- `manage.c`: Noun memory lifecycle and garbage collection
+- `events.c`: Event processing and persistence
+- `jets/`: Directory tree of jet implementations organized by Hoon library structure
+
+**I/O System (pkg/vere/io)**:
+- `ames.c`: UDP networking driver (Urbit's P2P protocol)
+- `behn.c`: Timer driver
+- `term.c`: Terminal I/O
+- `http.c`: HTTP server/client (using h2o)
+- `unix.c`: Filesystem synchronization
+- `conn.c`: IPC connections
+- `lick.c`: Inter-process communication
+- `mesa.c`: Alternative networking protocol
+- `cttp.c`: HTTP client effects
+
+**Event Persistence**:
+- `pkg/vere/disk.c`: Event log management using LMDB
+- Events are persisted to disk and can be replayed for crash recovery
+
+**Main Entry Points**:
+- `pkg/vere/main.c`: CLI argument parsing and runtime initialization
+- `pkg/vere/lord.c`: "Lord" process management (worker process coordination)
+- `pkg/vere/pier.c`: Pier (ship instance) lifecycle
+
+#### Memory Management
+
+Urbit uses a custom memory model called the "loom":
+- Fixed-size arena allocator (2GB or 4GB depending on architecture)
+- All Urbit data structures (nouns) live in the loom
+- Snapshot-based persistence allows complete memory dumps
+- Reference counting and mark-and-sweep GC for cleanup
+
+### Development Workflow
+
+#### Working with Fake Ships
+
+Always develop on fake ships, not live network ships. Fake ships use deterministic keys and communicate over local loopback.
+
+**Boot a new fake ship**:
+```console
+zig build
+./zig-out/<target>/urbit -F zod
+```
+
+**Boot with development pill** (faster):
+```console
+./zig-out/<target>/urbit -F zod -B solid.pill
+```
+
+**Launch existing fake ship**:
+```console
+./zig-out/<target>/urbit zod
+```
+
+#### Debugging
+
+**GDB debugging**:
+```bash
+zig build
+gdb --args ./zig-out/<target>/urbit zod
+```
+
+In GDB, set:
+```gdb
+set follow-fork-mode child
+handle SIGSEGV nostop noprint
+```
+
+**macOS lldb debugging**:
+
+On macOS, you must configure lldb to handle Mach exceptions properly. Start the ship with the `-t` flag when debugging, or attach to it after it has started. Then run:
+```lldb
+p (void)darwin_register_mach_exception_handler()
+pro hand -p true -s false -n false SIGBUS
+pro hand -p true -s false -n false SIGSEGV
+```
+
+### Git Workflow
+
+#### Branch Naming
+
+All branches for review must follow: `i/<N>/<description>` where `<N>` is the GitHub issue number.
+
+#### Commit Style
+
+- Use imperative mood for commit messages
+- Include short description (required) and optional long description
+
+#### Pull Request Format
+
+```markdown
+### Description
+
+Resolves #<N>.
+
+[Thorough description of changes]
+
+### Related
+
+[Related issues, links, papers, etc.]
+```
+
+#### Branch Structure
+
+- `develop` (default): edge train - for runtime developers
+- `release`: soon train - for early adopters
+- `master`: live train - for production
+
+PRs should target `develop` by default.
+
+### Testing
+
+Tests are colocated with implementation code:
+- `pkg/noun/*_tests.c`: Noun system tests
+- `pkg/vere/*_tests.c`: Vere I/O and persistence tests
+- `pkg/ur/tests.c`, `pkg/ent/tests.c`: Package-specific tests
+
+Run individual test suites with `zig build <test-name> --summary all`.
+
+### Dependencies
+
+External libraries (managed by Zig build system in `ext/`):
+- **GMP**: Multi-precision arithmetic
+- **OpenSSL**: Cryptography
+- **libuv**: Async I/O event loop
+- **LMDB**: Memory-mapped database for event log
+- **h2o**: HTTP server
+- **curl**: HTTP client
+- **libsigsegv**: Signal handling for memory protection
+- **urcrypt**: Urbit cryptographic primitives
+- **wasm3**: WebAssembly interpreter
+- **natpmp**: NAT port mapping
+- **zlib**: Compression
+
+### Important Defines
+
+- `U3_OS_osx` / `U3_OS_linux` / `U3_OS_windows`: Platform detection
+- `U3_CPU_aarch64`: ARM64 architecture
+- `U3_CPU_DEBUG`: Enable CPU debugging
+- `U3_MEMORY_DEBUG`: Enable memory debugging
+- `U3_GUARD_PAGE`: Enable guard pages for loom
+- `U3_SNAPSHOT_VALIDATION`: Validate snapshots on load
+- `C3DBG`: Enable debug assertions
+
+### Code Style
+
+The codebase uses Urbit-specific C conventions:
+- Custom types: `c3_w` (word), `c3_y` (byte), `c3_o` (loobean), etc.
+- Naming: `u3_` prefix for public APIs, `u3X_` for module X
+- Heavy use of macros for memory management and control flow
+- Arena-based allocation rather than malloc/free
+
+---
+
+## OCaml Port Plan
+
+### Executive Summary
+
+This is a phased approach to porting Urbit's runtime (Vere) from C to OCaml 5.x with Eio. The port targets ~32,500 lines of C code (excluding jets) and uses a hybrid approach in which jets remain in C and are called via FFI.
+
+### Why OCaml?
+
+1. **Functional alignment**: Hoon (Urbit's language) is functional; OCaml's paradigm matches better than imperative C
+2. **Safety**: Strong typing, exhaustive pattern matching, immutability by default prevent entire classes of bugs
+3. **Performance**: OCaml 5.x multicore + Eio provides excellent performance with effect handlers
+4. **GC integration**: OCaml's GC can be integrated with noun reference counting more naturally than manual C memory management
+5. **Maintenance**: More maintainable codebase with algebraic data types and pattern matching
+
+### Hybrid Strategy: Keep Jets in C
+
+**Key Decision**: Keep jets in C, call via FFI from OCaml. This dramatically reduces scope:
+- **Don't port**: 187 jet files (~15k LOC)
+- **Don't port**: urcrypt, wasm3, softfloat dependencies
+- **Do port**: Core noun system, Nock interpreter, I/O drivers
+
+#### Why Keep Jets in C?
+
+1. **Proven implementations**: Jets are highly optimized and battle-tested
+2. **Crypto dependencies**: Ed25519, ECDSA, etc. already use C libraries (urcrypt)
+3. **WebAssembly**: wasm3 integration (3k LOC jet) stays in C
+4. **Floating point**: softfloat dependency for IEEE compliance
+5. **Reduced risk**: Don't need to reimplement/validate 187 jets
+6. **Performance**: C jets are already fast; FFI overhead negligible for typical jet calls
+
+### Target OCaml Structure
+
+```
+urbit-ocaml/
+├── dune-project # Project metadata
+├── dune-workspace # Workspace config
+│
+├── lib/ # OCaml libraries
+│ ├── noun/ # Core noun system (port from pkg/noun)
+│ │ ├── dune
+│ │ ├── types.ml[i] # Noun ADTs
+│ │ ├── loom.ml[i] # Memory management
+│ │ ├── jam.ml[i] # Serialization
+│ │ ├── nock.ml[i] # Nock interpreter
+│ │ ├── jets_ffi.ml[i] # FFI to C jets
+│ │ └── jets_registry.ml[i] # Jet dispatch
+│ │
+│ ├── runtime/ # Runtime I/O (port from pkg/vere)
+│ │ ├── dune
+│ │ ├── db.ml[i] # Event log
+│ │ ├── pier.ml[i] # Pier management
+│ │ ├── io/
+│ │ │ ├── ames.ml[i] # UDP networking
+│ │ │ ├── http.ml[i] # HTTP server/client
+│ │ │ ├── term.ml[i] # Terminal
+│ │ │ ├── unix.ml[i] # Filesystem
+│ │ │ └── behn.ml[i] # Timers
+│ │ └── king.ml[i] # Main orchestrator
+│ │
+│ └── c_bridge/ # C FFI bridge
+│ ├── dune # Links against existing C code
+│ ├── noun_ffi.ml[i] # Noun <-> C noun conversion
+│ ├── jets_ffi.ml[i] # Call C jets
+│ └── stubs/ # C stubs for FFI
+│ ├── noun_stubs.c
+│ └── jet_stubs.c
+│
+├── bin/ # Executables
+│ ├── dune
+│ └── urbit.ml # Main entry point
+│
+├── test/ # Tests
+│ ├── dune
+│ ├── test_noun.ml # Noun system tests
+│ ├── test_nock.ml # Nock interpreter tests
+│ ├── test_jets.ml # Jet FFI tests
+│ └── test_integration.ml # End-to-end tests
+│
+├── bench/ # Benchmarks
+│ ├── dune
+│ └── bench_nock.ml
+│
+├── c/ # Keep existing C code
+│ ├── pkg/noun/ # Copied from vere
+│ ├── pkg/ur/
+│ └── ext/ # External deps (gmp, urcrypt, etc.)
+│
+└── doc/ # Documentation
+ └── architecture.md
+```
+
+---
+
+## Phase-by-Phase Implementation Plan
+
+### Phase 0: Foundation & Development Environment
+
+**Goal**: Set up OCaml development environment and validate approach
+
+#### Prerequisites
+
+```bash
+# OCaml 5.2+ (for multicore/effects)
+opam switch create vere-ocaml 5.2.1
+eval $(opam env)
+
+# Core build tools
+opam install dune ocamlformat ocaml-lsp-server
+
+# Essential libraries:
+#   eio_main (effects-based I/O), zarith (bignum arithmetic),
+#   cmdliner (CLI parsing), logs (structured logging), fmt (pretty-printing)
+opam install eio_main zarith cmdliner logs fmt
+
+# Testing and benchmarking:
+#   alcotest (unit tests), qcheck (property-based testing),
+#   qcheck-alcotest (QCheck integration), bechamel (benchmarking)
+opam install alcotest qcheck qcheck-alcotest bechamel
+
+# FFI to C: ctypes (bindings), ctypes-foreign (dynamic FFI)
+opam install ctypes ctypes-foreign
+
+# Optional but recommended:
+#   odoc (documentation generation), utop (better REPL), landmarks (profiling)
+# (ocaml-lsp-server and ocamlformat are already installed above)
+opam install odoc utop landmarks
+```
+
+#### Initial Project Setup
+
+```bash
+# Create directory outside existing vere repo
+cd ~/code/urbit
+mkdir vere-ocaml && cd vere-ocaml
+
+# Initialize dune project
+cat > dune-project << 'EOF'
+(lang dune 3.16)
+(name urbit)
+(version 0.1.0)
+
+(generate_opam_files true)
+
+(package
+ (name urbit)
+ (synopsis "Urbit runtime in OCaml")
+ (description "Urbit's Nock VM and I/O drivers implemented in OCaml with Eio")
+ (depends
+ (ocaml (>= 5.2.0))
+ dune
+ eio_main
+ zarith
+ cmdliner
+ logs
+ fmt
+ ctypes
+ ctypes-foreign
+ (alcotest :with-test)
+ (qcheck :with-test)
+ (qcheck-alcotest :with-test)))
+
+(using ctypes 0.3)
+EOF
+```
+
+**Deliverables**:
+- Working `dune build` and `dune test`
+- FFI examples calling C Vere functions from OCaml
+- CI/CD pipeline (GitHub Actions)
+
+---
+
+### Phase 1: Core Noun System
+
+**Goal**: Implement the foundational noun data structures and basic operations
+
+#### 1.1: Noun Type System
+
+**Implementation**:
+```ocaml
+(* types.ml *)
+type noun =
+ | Direct of int (* 31-bit direct atoms *)
+ | Indirect of indirect
+
+and indirect =
+ | Atom of Z.t (* Arbitrary-precision atoms, via Zarith *)
+ | Cell of noun * noun (* Pairs [head tail] *)
+```
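+
+For the FFI conversion layer it may help to see how a 32-bit C noun word could be classified. The sketch below rests on an assumption that should be checked against `pkg/noun/allocate.h`: the high bit marks direct versus indirect, the next bit marks atom versus cell, and the low 30 bits hold a loom offset. The names `c_noun_kind`, `classify`, and `fits_direct` are hypothetical, and a 64-bit OCaml `int` is assumed.
+
+```ocaml
+(* Hypothetical view of a 32-bit C noun word (assumed tagging scheme). *)
+type c_noun_kind =
+  | Direct_atom of int        (* value stored in the word itself *)
+  | Indirect_atom_ref of int  (* loom offset of an indirect atom *)
+  | Cell_ref of int           (* loom offset of a cell *)
+
+(* Classify a word by its two high bits; assumes a 64-bit OCaml int. *)
+let classify (w : int) : c_noun_kind =
+  if w < 0 || w > 0xFFFF_FFFF then invalid_arg "classify: not a 32-bit word"
+  else if w lsr 31 = 0 then Direct_atom w
+  else if w lsr 30 = 0b10 then Indirect_atom_ref (w land 0x3FFF_FFFF)
+  else Cell_ref (w land 0x3FFF_FFFF)
+
+(* True when a non-negative value fits in a 31-bit direct atom. *)
+let fits_direct (n : int) : bool = n >= 0 && n < 1 lsl 31
+```
+
+A smart constructor for `Types.noun` could use a check like `fits_direct` to choose between `Direct` and `Indirect (Atom _)`.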
+
+#### 1.2: Jam/Cue Serialization
+
+**Files to port**:
+- `pkg/ur/serial.c` (~500 LOC)
+
+**Strategy**:
+- Maintain wire-format compatibility with C version
+- Optimize for OCaml's GC characteristics
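+
+To make the wire format concrete, here is a minimal jam/cue sketch. It is deliberately simplified: atoms are limited to non-negative native ints, bits are kept in a list (least-significant first), and backreferences are omitted, so its output matches the C encoder only for nouns where no backreference would be emitted. All names (`mat`, `rub`, `jam_bits`, `cue_bits`, `int_of_bits`) are illustrative, not the planned `Jam` module API.
+
+```ocaml
+(* Simplified noun: atoms are non-negative native ints for illustration. *)
+type noun = Atom of int | Cell of noun * noun
+
+(* Number of significant bits in a non-negative int (0 for 0). *)
+let rec bit_length n = if n = 0 then 0 else 1 + bit_length (n lsr 1)
+
+(* The low [len] bits of [n], least-significant first. *)
+let bits_of n len = List.init len (fun i -> (n lsr i) land 1 = 1)
+
+(* Length-prefixed atom encoding: unary size-of-size, size, then value. *)
+let mat a =
+  if a = 0 then [ true ]
+  else
+    let b = bit_length a in
+    let c = bit_length b in
+    List.init c (fun _ -> false) @ [ true ] @ bits_of b (c - 1) @ bits_of a b
+
+(* Jam without backreferences: tag 0 for atoms, tag 1,0 for cells. *)
+let rec jam_bits = function
+  | Atom a -> false :: mat a
+  | Cell (h, t) -> true :: false :: (jam_bits h @ jam_bits t)
+
+(* Read [n] bits (LSB first) as an int; return the value and the rest. *)
+let take n bits =
+  let rec go i acc bits =
+    if i = n then (acc, bits)
+    else
+      match bits with
+      | b :: rest -> go (i + 1) (if b then acc lor (1 lsl i) else acc) rest
+      | [] -> failwith "cue: truncated input"
+  in
+  go 0 0 bits
+
+(* Inverse of [mat]. *)
+let rub bits =
+  let rec zeros c = function
+    | false :: rest -> zeros (c + 1) rest
+    | true :: rest -> (c, rest)
+    | [] -> failwith "cue: truncated input"
+  in
+  let c, rest = zeros 0 bits in
+  if c = 0 then (0, rest)
+  else
+    let low, rest = take (c - 1) rest in
+    take (low lor (1 lsl (c - 1))) rest
+
+(* Inverse of [jam_bits]; rejects backreferences (tag 1,1). *)
+let rec cue_bits = function
+  | false :: rest ->
+    let a, rest = rub rest in
+    (Atom a, rest)
+  | true :: false :: rest ->
+    let h, rest = cue_bits rest in
+    let t, rest = cue_bits rest in
+    (Cell (h, t), rest)
+  | true :: true :: _ -> failwith "cue: backreferences not supported here"
+  | _ -> failwith "cue: truncated input"
+
+(* Pack a short LSB-first bit list into an int, for inspection. *)
+let int_of_bits bits =
+  List.fold_right (fun b acc -> (acc lsl 1) lor (if b then 1 else 0)) bits 0
+```
+
+For example, `int_of_bits (jam_bits (Atom 1))` evaluates to `12`, and `cue_bits (jam_bits n)` returns `(n, [])` for any noun this sketch can encode. A real port would write into a `Buffer` or Bigarray rather than a list, and would need backreferences for full compatibility.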
+
+#### 1.3: Memory Management
+
+**OCaml approach**:
+```ocaml
+(* loom.ml *)
+module Loom : sig
+ type t
+ val create : size:int -> t
+ val allocate : t -> noun -> noun (* Intern in loom *)
+ val snapshot : t -> bytes (* For persistence *)
+ val restore : bytes -> t
+end
+```
+
+**Strategy**: Hybrid approach. Use the OCaml GC for most nouns and reserve the loom for snapshot/restore.
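+
+As a rough illustration of letting the OCaml GC own nouns while still supporting snapshot/restore, the sketch below uses the standard library's `Marshal` module as a stand-in. This is only a placeholder: the `Marshal` format is OCaml-specific and is not compatible with C Vere snapshots, so a real implementation would need the loom image format or jam. The simplified `noun` type and the `snapshot`/`restore` names here are assumptions for the example.
+
+```ocaml
+(* Simplified noun living on the OCaml heap; the GC reclaims it. *)
+type noun = Atom of int | Cell of noun * noun
+
+(* Serialize a root noun to bytes (placeholder for a real snapshot). *)
+let snapshot (root : noun) : bytes = Marshal.to_bytes root []
+
+(* Rebuild a noun from bytes produced by [snapshot]. The type annotation
+   is required because Marshal cannot check types at runtime. *)
+let restore (image : bytes) : noun = Marshal.from_bytes image 0
+```
+
+Because `Marshal` is not type-safe, `restore` should only be fed bytes produced by `snapshot` from the same program version.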
+
+---
+
+### Phase 2: Nock Interpreter
+
+**Goal**: Implement a working Nock interpreter that can execute Nock formulas
+
+#### 2.1: Basic Interpreter
+
+**Implementation**:
+```ocaml
+(* nock.ml *)
+type formula =
+ | Axis of int
+ | Const of noun
+ | Cell of formula * formula
+ | Inc of formula
+ | Eq of formula * formula
+ | If of formula * formula * formula
+ | Compose of formula * formula
+ | Push of formula * formula
+ | Hint of hint * formula
+ (* ... remaining opcodes elided ... *)
+
+val nock : subject:noun -> formula:noun -> noun
+```
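+
+The following is a minimal sketch of a direct noun interpreter for Nock opcodes 0 through 8, assuming a simplified noun type with native-int atoms. It interprets the formula noun as-is rather than first decoding it into the `formula` type above, and it omits opcodes 9 to 12, hints, and jets. The `Crash` exception and `loob` helper are names invented for this example.
+
+```ocaml
+type noun = Atom of int | Cell of noun * noun
+
+exception Crash of string
+
+(* Tree addressing: axis 1 is the whole noun, 2 the head, 3 the tail. *)
+let rec slot axis n =
+  match axis, n with
+  | 1, _ -> n
+  | 2, Cell (h, _) -> h
+  | 3, Cell (_, t) -> t
+  | a, _ when a > 3 -> slot (2 + (a land 1)) (slot (a lsr 1) n)
+  | _ -> raise (Crash "slot")
+
+(* Nock booleans ("loobeans"): 0 is yes, 1 is no. *)
+let loob b = Atom (if b then 0 else 1)
+
+let rec nock subject formula =
+  match formula with
+  (* Autocons: a cell in head position evaluates both halves. *)
+  | Cell ((Cell _ as f), g) -> Cell (nock subject f, nock subject g)
+  | Cell (Atom 0, Atom axis) -> slot axis subject
+  | Cell (Atom 1, c) -> c
+  | Cell (Atom 2, Cell (b, c)) -> nock (nock subject b) (nock subject c)
+  | Cell (Atom 3, b) ->
+    (match nock subject b with Cell _ -> loob true | Atom _ -> loob false)
+  | Cell (Atom 4, b) ->
+    (match nock subject b with
+     | Atom n -> Atom (n + 1)
+     | Cell _ -> raise (Crash "increment of cell"))
+  | Cell (Atom 5, Cell (b, c)) -> loob (nock subject b = nock subject c)
+  | Cell (Atom 6, Cell (b, Cell (c, d))) ->
+    (match nock subject b with
+     | Atom 0 -> nock subject c
+     | Atom 1 -> nock subject d
+     | _ -> raise (Crash "if: test is not a loobean"))
+  | Cell (Atom 7, Cell (b, c)) -> nock (nock subject b) c
+  | Cell (Atom 8, Cell (b, c)) -> nock (Cell (nock subject b, subject)) c
+  | _ -> raise (Crash "unsupported formula")
+```
+
+For instance, `nock (Atom 42) (Cell (Atom 4, Cell (Atom 0, Atom 1)))` increments the subject and yields `Atom 43`.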
+
+#### 2.2: Jet Infrastructure & FFI
+
+**FFI Strategy**:
+```ocaml
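+(* jets_ffi.ml *)
+open Ctypes
+open Foreign
+
+module C = struct
+  (* Convert OCaml noun to C u3_noun (conversion still to be written) *)
+  let to_c_noun : Types.noun -> Unsigned.UInt32.t =
+    fun _ -> failwith "to_c_noun: not implemented"
+
+  (* Convert C u3_noun to OCaml noun (conversion still to be written) *)
+  let of_c_noun : Unsigned.UInt32.t -> Types.noun =
+    fun _ -> failwith "of_c_noun: not implemented"
+
+  (* Foreign function binding *)
+  let u3qa_add = foreign "u3qa_add"
+    (uint32_t @-> uint32_t @-> returning uint32_t)
+end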
+
+(* High-level wrapper *)
+let add a b =
+ let a_c = C.to_c_noun a in
+ let b_c = C.to_c_noun b in
+ let result_c = C.u3qa_add a_c b_c in
+ C.of_c_noun result_c
+```
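+
+Jet dispatch itself can be sketched independently of the FFI. The example below assumes a hypothetical registry keyed by jet name, where a jet may decline a sample by returning `None` and the caller then falls back to plain Nock interpretation. The C call is replaced by an ordinary OCaml function (`add_jet`) so the sketch runs without any bindings; none of these names come from Vere.
+
+```ocaml
+type noun = Atom of int | Cell of noun * noun
+
+(* Hypothetical registry: jet name -> accelerated implementation.
+   A jet returns None to decline, signalling "interpret instead". *)
+let registry : (string, noun -> noun option) Hashtbl.t = Hashtbl.create 16
+
+let register name jet = Hashtbl.replace registry name jet
+
+(* Use the jet when one is registered and it accepts the sample;
+   otherwise run the fallback (the Nock interpreter in a real port). *)
+let dispatch ~name ~(fallback : noun -> noun) sample =
+  match Hashtbl.find_opt registry name with
+  | Some jet ->
+    (match jet sample with Some result -> result | None -> fallback sample)
+  | None -> fallback sample
+
+(* Stand-in for a C jet reached through the FFI wrapper. *)
+let add_jet = function
+  | Cell (Atom a, Atom b) -> Some (Atom (a + b))
+  | _ -> None
+
+let () = register "add" add_jet
+```
+
+With this in place, `dispatch ~name:"add" ~fallback sample` uses `add_jet` for a pair of atoms and calls `fallback` for anything else.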
+
+---
+
+### Phase 3: I/O System with Eio
+
+**Goal**: Port I/O drivers to Eio's structured concurrency model
+
+#### 3.1: Event Log & Persistence
+
+**Implementation**:
+```ocaml
+(* db.ml *)
+module EventLog : sig
+ type t
+ val open_ : path:string -> t
+ val append : t -> event:noun -> unit
+ val read : t -> from:int -> noun Seq.t
+ val snapshot : t -> noun -> unit
+end
+```
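+
+Before wiring up LMDB, an in-memory stand-in with the same shape can be useful for tests. The sketch below is one possible version, assuming events are numbered from 1 and using a simplified `noun` type; `Mem_log` and `create` are hypothetical names, and nothing here touches disk.
+
+```ocaml
+type noun = Atom of int | Cell of noun * noun
+
+module Mem_log = struct
+  (* Events are stored newest-first so that append is O(1). *)
+  type t = { mutable events : noun list; mutable count : int }
+
+  let create () = { events = []; count = 0 }
+
+  (* Append one event; it receives the next event number. *)
+  let append t ~event =
+    t.events <- event :: t.events;
+    t.count <- t.count + 1
+
+  (* All events numbered [from] or later, oldest first (1-based). *)
+  let read t ~from =
+    List.rev t.events
+    |> List.filteri (fun i _ -> i + 1 >= from)
+    |> List.to_seq
+end
+```
+
+Replaying after a crash then amounts to folding the interpreter over `Mem_log.read log ~from:n`, where `n` is the first event not covered by the latest snapshot.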
+
+#### 3.2: Ames (UDP Networking)
+
+```ocaml
+(* io/ames.ml *)
+module Ames : sig
+ val start :
+ sw:Eio.Switch.t ->
+ net:_ Eio.Net.t ->
+ port:int ->
+ on_packet:(noun -> unit) ->
+ unit
+end
+```
+
+#### 3.3: HTTP Server/Client
+
+**Strategy**: Use an OCaml-native HTTP stack (e.g., cohttp-eio; Dream is Lwt-based and would need an Eio bridge) instead of porting h2o
+
+#### 3.4: Other I/O Drivers
+- Terminal I/O using Lambda-Term or Notty
+- Unix filesystem using Eio.Path
+- Timers using Eio.Time
+- IPC using Unix domain sockets
+
+---
+
+### Phase 4: Pier Management & Orchestration
+
+**Goal**: Implement high-level runtime orchestration
+
+```ocaml
+(* pier.ml *)
+module Pier : sig
+ type t
+
+ val boot :
+ sw:Eio.Switch.t ->
+ env:Eio_unix.Stdenv.base ->
+ path:string ->
+ pill:noun ->
+ t
+
+ val resume :
+ sw:Eio.Switch.t ->
+ env:Eio_unix.Stdenv.base ->
+ path:string ->
+ t
+
+ val poke : t -> noun -> unit
+ val scry : t -> path:noun -> noun option
+end
+```
+
+---
+
+### Phase 5: Performance & Polish
+
+**Goal**: Match or exceed C performance and prepare for production
+
+#### Key Optimizations
+1. Noun allocation/deallocation
+2. Nock interpreter inner loop
+3. Jet dispatch via FFI
+4. Serialization (jam/cue)
+5. Hash table operations
+
+#### Production Readiness
+- Comprehensive error handling
+- Structured logging
+- Documentation (odoc)
+- Network compatibility testing with C Vere
+- Distribution packaging
+
+---
+
+## Testing Strategy
+
+### 1. Unit Tests (Alcotest)
+```ocaml
+(* test/test_noun.ml *)
+let test_atom_small () =
+ let n = Types.atom (Z.of_int 42) in
+ Alcotest.(check bool) "is atom" true (Types.is_atom n)
+```
+
+### 2. Property Tests (QCheck)
+```ocaml
+(* Roundtrip property: cue(jam(x)) = x *)
+let prop_jam_cue_roundtrip =
+ QCheck.Test.make ~name:"jam/cue roundtrip"
+ (arbitrary_noun ())
+ (fun n ->
+ let serialized = Jam.jam n in
+ let deserialized = Jam.cue serialized in
+ noun_equal n deserialized)
+```
+
+### 3. FFI Validation
+Compare C jet output vs OCaml Nock interpretation for all jet calls
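+
+One way to structure that comparison is a small differential check that runs a jet and a reference implementation over the same samples and reports any disagreement. In the sketch below both sides are plain OCaml stand-ins: `jet_add` plays the role of the C jet behind the FFI, and `reference_add` mimics interpretation by repeated increment. All names are hypothetical.
+
+```ocaml
+type noun = Atom of int | Cell of noun * noun
+
+(* Stand-in for the C jet reached through FFI. *)
+let jet_add = function
+  | Cell (Atom a, Atom b) -> Some (Atom (a + b))
+  | _ -> None
+
+(* Stand-in for interpreted Nock: add by repeated increment. *)
+let reference_add = function
+  | Cell (Atom a, Atom b) ->
+    let rec go acc n = if n = 0 then acc else go (acc + 1) (n - 1) in
+    Some (Atom (go a b))
+  | _ -> None
+
+(* Samples on which the jet and the reference disagree. *)
+let mismatches ~jet ~reference samples =
+  List.filter (fun s -> jet s <> reference s) samples
+```
+
+An empty result from `mismatches ~jet:jet_add ~reference:reference_add samples` means the two agree on every sample; in a property test the samples would come from a noun generator.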
+
+### 4. Cross-Validation
+Test binary compatibility with existing C Vere for:
+- Network protocols
+- Event logs
+- Snapshots
+- Pills
+
+---
+
+## Success Criteria
+
+### Milestone 1
+- [ ] Can execute basic Nock programs
+- [ ] Jam/cue roundtrip works
+- [ ] Jets callable via FFI
+- [ ] Performance within 2x of C version
+
+### Milestone 2
+- [ ] Can boot a fake ship
+- [ ] Event log persistence works
+- [ ] Basic I/O (Ames, HTTP, terminal) functional
+- [ ] Can process simple pokes
+
+### Milestone 3
+- [ ] Full feature parity with C Vere
+- [ ] Performance at or better than C version
+- [ ] Production-ready (error handling, logging, monitoring)
+- [ ] Network-compatible with C Vere
+
+---
+
+## Risk Assessment & Mitigation
+
+### High Risks
+
+1. **Memory Model Mismatch**
+ - **Risk**: OCaml GC vs C loom semantics
+ - **Mitigation**: Hybrid approach, extensive testing, gradual migration
+
+2. **Performance Regression**
+ - **Risk**: OCaml slower than hand-tuned C
+ - **Mitigation**: Benchmark-driven development, optimization phase, compiler flags
+
+### Medium Risks
+
+3. **FFI Complexity**
+ - **Risk**: Noun conversion overhead between OCaml and C
+ - **Mitigation**: Optimize conversion layer, batch operations
+
+4. **I/O Performance**
+ - **Risk**: Eio maturity, performance characteristics
+ - **Mitigation**: Benchmarks, fallback to Lwt if needed
+
+---
+
+## Revised Timeline
+
+
+**Phases**:
+- Foundation & Environment
+- Core Noun System
+- Nock Interpreter & Jet FFI
+- I/O System
+- Pier Management
+- Performance & Polish
+
+
+---
+
+## Next Steps
+
+1. Set up OCaml development environment
+2. Create project structure with dune
+3. Implement basic noun types
+4. Create FFI bridge to C jets
+5. Begin porting jam/cue serialization
+
+---
+
+*This plan is a living document. Update as we learn from implementation.*