path: root/ocaml/LOOM.md
author: polwex <polwex@sortug.com> 2025-10-20 13:13:39 +0700
committer: polwex <polwex@sortug.com> 2025-10-20 13:13:39 +0700
commit: d21900836f89b2bf9cd55ff1708a4619c8b89656
# The Loom Question: Memory Architecture in Neovere

## What is Vere's Loom?

Vere allocates a **contiguous arena** of memory (originally 2GB, now expandable to larger sizes) called the "loom". All Urbit nouns live within this arena. The loom is:

1. **Pre-allocated**: a 2GB (or larger) region of virtual memory is reserved at startup
2. **Contiguous**: all nouns sit at known offsets within this range
3. **Self-contained**: the entire Urbit state fits in one memory region
4. **Snapshottable**: the whole loom can be written to disk for persistence
5. **Fixed-size** (traditionally): it cannot grow beyond the initial allocation
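
The scheme above can be sketched with a toy model (hypothetical illustration, not Vere's actual allocator): the arena is a fixed array, a "noun reference" is just an integer offset into it, and allocation bumps a frontier pointer.

```ocaml
(* Toy sketch of loom-style allocation (not Vere's real allocator):
   the arena is a fixed array of words, a noun reference is an offset. *)
let loom = Array.make 1024 0   (* tiny arena for illustration *)
let hat = ref 0                (* allocation frontier *)

let alloc words =
  let off = !hat in
  if off + words > Array.length loom then failwith "loom exhausted";
  hat := off + words;
  off

(* A two-word cell lives at consecutive offsets; its reference is the offset. *)
let cell head tail =
  let off = alloc 2 in
  loom.(off) <- head;
  loom.(off + 1) <- tail;
  off

let () =
  let c = cell 7 11 in
  Printf.printf "cell at offset %d: [%d %d]\n" c loom.(c) loom.(c + 1)
```

The fixed `Array.length loom` is exactly the hard limit in point 5: once `hat` reaches the end, allocation fails.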

### Why Did Vere Choose This?

Vere is written in C, which has no garbage collector. The loom provides:

**1. Memory Management**
- Custom allocator for nouns
- Reference counting for cleanup
- Memory pools for different noun sizes
- Compaction/defragmentation

**2. Persistence**
- Snapshot = copy the loom to disk
- Resume = load the loom from disk
- Extremely fast (essentially a `memcpy`)
- Deterministic memory layout

**3. Performance**
- Nouns are 32-bit offsets into the loom (compact!)
- Pointer arithmetic for tree navigation
- Cache-friendly sequential allocation
- No malloc/free overhead

**4. Determinism**
- Same events → same memory layout
- Reproducible across machines
- Critical for consensus/jets

**5. Jets**
- C code can use raw pointers to nouns
- No GC to worry about
- Direct memory access for performance

### The Loom's Limitations

**Hard limit**: 2GB was the original ceiling
- Large ships hit it and crashed
- Recent Vere expanded it to 4GB, 8GB, etc.
- But the loom remains fundamentally bounded

**Fragmentation**: memory can't be reclaimed easily
- Dead nouns leave holes
- Compaction is expensive
- The loom can run out of contiguous space

**Complexity**: manual memory management
- Reference-counting bugs
- Memory leaks
- Segfaults

## Sword's Revolution: Loom-Free Architecture

Sword (the Rust implementation) took a radically different approach:

**No Fixed Arena**
- Nouns live on Rust's heap
- No size limit (memory grows with actual usage)
- Standard Rust allocator

**Hash-Consing**
- Every noun gets a unique hash
- Duplicate nouns share memory automatically
- Efficient deduplication

**Indirect References**
- Nouns reference each other via handles/IDs
- Not raw pointers
- Relocatable (for GC)
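
A minimal sketch of the handle idea (hypothetical, not Sword's actual representation): a reference is a table key rather than a pointer, so the runtime can relocate or compact the backing store without touching any noun references.

```ocaml
(* Hypothetical handle-based nouns: a reference is a table key, not a pointer,
   so the backing store can be moved or rewritten behind the handles. *)
type handle = int
type node = Atom of int | Cell of handle * handle

let store : (handle, node) Hashtbl.t = Hashtbl.create 64
let next = ref 0

(* Store a node and hand back its key. *)
let intern node =
  let h = !next in
  incr next;
  Hashtbl.replace store h node;
  h

let deref h = Hashtbl.find store h

let () =
  let a = intern (Atom 42) in
  let c = intern (Cell (a, a)) in
  match deref c with
  | Cell (h, _) -> assert (deref h = Atom 42)
  | Atom _ -> assert false
```

Because consumers only ever hold keys, a GC is free to rebuild `store` (moving nodes, dropping dead ones) as long as the key-to-node mapping is preserved.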

**Persistence**
- Uses jam/cue serialization
- Snapshot = jam the entire state
- Resume = cue from disk
- Slower than `memcpy`, but more flexible

### Sword's Wins

✅ **No Memory Limit**: ships can grow arbitrarily large
✅ **Simpler**: Rust's allocator handles the complexity
✅ **Safer**: no manual memory-management bugs
✅ **Still Deterministic**: hash-consing ensures consistency

### Sword's Tradeoffs

❌ **Slower Snapshots**: jam/cue vs. a raw memory copy
❌ **Larger Memory Overhead**: Rust allocator + hashes
❌ **More Complex Jets**: can't use raw pointers

But the wins far outweigh the costs!

## OCaml's Position: Best of Both Worlds?

OCaml gives us unique advantages:

### What We Already Have

**1. Automatic GC**
- OCaml's generational GC manages memory
- No manual allocation/deallocation
- Mature, well-tested, fast

**2. Zarith (GMP) for Atoms**
- Arbitrary-precision integers
- Efficient representation
- Automatic memory management

**3. Jam/Cue Serialization**
- Already implemented and working!
- Can serialize the entire kernel
- Proven approach (Sword uses it)

**4. No Segfaults**
- Memory safety by default
- Type system prevents many bugs
- Runtime checks where needed

### Three Possible Approaches

#### Option 1: Pure OCaml Heap (Current - Recommended!)

**Implementation:**
```ocaml
type noun =
  | Atom of Z.t         (* Zarith handles memory *)
  | Cell of noun * noun (* OCaml GC handles cells *)
```

**Pros:**
- ✅ Simple, already working
- ✅ No memory limit
- ✅ Automatic memory management
- ✅ Type-safe
- ✅ Persistence via jam/cue (working!)

**Cons:**
- ❌ Larger memory overhead than the loom
- ❌ Slower snapshots (jam/cue vs. memcpy)
- ❌ GC pauses (though the generational GC keeps them short)

**Status:** This is what we have now and it works great!
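
One concrete payoff of heap-allocated nouns is free structural sharing. A dependency-free sketch (using `int` atoms in place of `Z.t` so it runs without Zarith):

```ocaml
(* Sketch with int atoms standing in for Z.t: both legs of a cell can be the
   same heap object, so a duplicated subtree costs nothing extra. *)
type noun = Atom of int | Cell of noun * noun

(* Logical node count, counting shared subtrees each time they appear. *)
let rec size = function
  | Atom _ -> 1
  | Cell (h, t) -> 1 + size h + size t

let () =
  let sub = Cell (Atom 1, Atom 2) in
  let big = Cell (sub, sub) in
  (* logically, both legs count... *)
  assert (size big = 7);
  (* ...but physically they are one shared allocation *)
  match big with
  | Cell (h, t) -> assert (h == t)
  | Atom _ -> assert false
```

The `==` check is physical equality: the GC traces `sub` once, regardless of how many cells point at it.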

#### Option 2: OCaml Loom (Not Recommended)

**Implementation:**
```ocaml
(* Allocate a 2GB Bigarray as the arena (requires a 64-bit OCaml) *)
let loom =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout
    (2 * 1024 * 1024 * 1024)

type noun_offset = int (* Offset into the loom *)
```

**Pros:**
- ✅ Matches Vere's architecture
- ✅ Fast snapshots (write the Bigarray to disk)
- ✅ Compact representation

**Cons:**
- ❌ 2GB hard limit (or whatever size we choose)
- ❌ Manual memory management (complex!)
- ❌ Fights OCaml's GC
- ❌ Unsafe (an out-of-bounds index means a crash)
- ❌ Fragmentation issues
- ❌ Complex to implement correctly

**Status:** Possible but not worth it - we'd be reimplementing malloc in OCaml!

#### Option 3: Hybrid with Hash-Consing

**Implementation:**
```ocaml
module NounTable = Hashtbl.Make (struct
  type t = noun
  let equal = ( = )         (* structural equality: deep compare *)
  let hash = Hashtbl.hash   (* hashes a bounded prefix of the tree *)
end)

let noun_cache : noun NounTable.t = NounTable.create 100_000

(* Smart constructor: return the cached cell when an equal one exists.
   Consing children bottom-up keeps the deep compares cheap in practice. *)
let make_cell h t =
  let candidate = Cell (h, t) in
  match NounTable.find_opt noun_cache candidate with
  | Some existing -> existing (* Reuse! *)
  | None ->
    NounTable.add noun_cache candidate candidate;
    candidate
```

**Pros:**
- ✅ Automatic deduplication
- ✅ Memory savings (like Sword)
- ✅ Still no size limit
- ✅ Deterministic (same nouns = same structure)

**Cons:**
- ❌ Hash-table overhead
- ❌ GC pressure from large hash tables
- ❌ Need to decide when to clear the cache

**Status:** Optional future optimization, not needed yet

## Recommendation: Stay Loom-Free

**Thesis:** The loom is a C-ism that doesn't make sense in OCaml (or Rust).

### Why No Loom?

1. **OCaml has a GC** - using it is simpler and safer than fighting it
2. **No size limit** - ships can grow as large as system RAM allows
3. **Jam/cue works** - we already have working persistence
4. **Type safety** - OCaml prevents many memory bugs automatically
5. **Simpler code** - no custom allocator to maintain

### Counter: "But Vere snapshots are instant!"

True, but:
- Our jam is fast enough (1.2s for the solid pill)
- Most ships snapshot infrequently
- Correctness > raw speed for snapshots
- We can optimize jam/cue later if needed

### Counter: "But the loom is deterministic!"

OCaml is deterministic too:
- Same nouns = same structure (referential transparency)
- Jam output is deterministic (same noun = same bytes)
- GC doesn't affect logical structure
- Hash-consing can ensure physical equality if needed

### Counter: "But jets need raw pointers!"

Not in modern languages:
- Sword jets work with indirect references
- OCaml jets can operate on the `noun` type directly
- Type safety > raw speed (until a bottleneck is proven)
- We can reach for unsafe code when absolutely necessary

## Practical Implications

### Current Architecture (Keep It!)

**Allocation:**
```ocaml
(* Atoms: Zarith manages memory *)
let a = Atom (Z.of_int 42)

(* Cells: OCaml GC manages memory *)
let c = Cell (a, a)
```

**Persistence:**
```ocaml
(* Snapshot: jam to bytes, write to file (OCaml >= 4.14 channels) *)
let snapshot kernel =
  let bytes = Serial.jam kernel in
  Out_channel.with_open_bin "kernel.jam" (fun oc ->
    Out_channel.output_bytes oc bytes)

(* Resume: read file, cue back to a noun *)
let resume () =
  In_channel.with_open_bin "kernel.jam" (fun ic ->
    Serial.cue (Bytes.of_string (In_channel.input_all ic)))
```

**Memory Growth:**
- Automatic! The GC expands the heap as needed
- System RAM is the only limit
- Large ships work fine
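
That growth is observable directly through the standard `Gc` module (a small demo, not project code):

```ocaml
(* Demo: the OCaml heap expands on demand; Gc.stat reports its size in words. *)
let before = (Gc.stat ()).Gc.heap_words

(* Keep a couple hundred thousand boxed values live. *)
let data = Array.init 200_000 (fun i -> string_of_int i)

let after = (Gc.stat ()).Gc.heap_words

let () =
  Printf.printf "heap: %d -> %d words (%d strings live)\n"
    before after (Array.length data)
```

No configuration is needed: the runtime resizes the major heap as allocations demand, which is exactly the property the loom gives up by fixing its arena size up front.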

### Future Optimizations (If Needed)

**1. Hash-Consing** (memory deduplication)
- Add a global noun cache
- Reuse identical nouns
- Reduces memory footprint

**2. Incremental Snapshots**
- Snapshot diffs instead of the full kernel
- Faster save/resume
- More complex but doable

**3. Memory-Mapped I/O**
- Use mmap for large nouns
- The OS handles paging
- Transparent to our code

**4. Generational Snapshots**
- Keep old/new state separated
- Only snapshot the changed parts
- Faster incremental saves

## Comparison Table

| Feature | Vere Loom | Sword Heap | Neovere (Current) |
|---------|-----------|------------|-------------------|
| Size limit | 2-8GB | None | None |
| Memory management | Manual | Rust allocator | OCaml GC |
| Snapshot speed | Instant (memcpy) | Slower (jam/cue) | Slower (jam/cue) |
| Memory safety | Unsafe (C) | Safe (Rust) | Safe (OCaml) |
| Complexity | High | Medium | Low |
| Deduplication | Manual | Hash-consing | Optional |
| Growing ships | Hard limit | ✓ Works | ✓ Works |

## Conclusion

**We should NOT implement a loom in OCaml.**

Instead:
1. ✅ Use OCaml's GC (already working)
2. ✅ Use jam/cue for persistence (already working)
3. ✅ Let memory grow naturally (no limits)
4. ⏳ Add hash-consing later if needed (optimization)
5. ⏳ Optimize jam/cue if snapshots become a bottleneck

The loom made sense for C in 2013. It doesn't make sense for OCaml in 2025.

**Follow Sword's lead:** modern languages with GC don't need manual memory arenas.

## Open Questions

1. **Should we add hash-consing now or later?**
   - Later. The current approach works fine.
   - Add it when memory usage becomes a concern.

2. **How do we handle very large ships (100GB+ kernel)?**
   - Cross that bridge when we come to it
   - System RAM is cheap
   - Incremental snapshots if needed

3. **What about jets that need performance?**
   - Start with safe OCaml
   - Profile to find bottlenecks
   - Use unsafe code/`Obj.magic` only where proven necessary
   - Still better than C's unsafety

4. **Can we beat Vere's snapshot speed?**
   - Probably not for a raw memcpy
   - But we can get close with optimized jam/cue
   - And we win on flexibility/safety

## References

- Vere loom: `pkg/noun/allocate.c`, `pkg/noun/manage.c`
- Sword architecture: `rust/sword/src/mem.rs`, `rust/sword/src/noun.rs`
- OCaml GC: https://v2.ocaml.org/manual/gc.html
- Zarith: https://github.com/ocaml/Zarith

## Decision

**Status: LOOM-FREE ARCHITECTURE ✅**

We will continue using OCaml's heap and GC. No loom needed.