# The Loom Question: Memory Architecture in Neovere

## What is Vere's Loom?

Vere allocates a **contiguous arena** of memory (originally 2GB, now expandable) called the "loom". All Urbit nouns live within this arena. The loom is:

1. **Pre-allocated**: grabs 2GB (or more) of virtual memory at startup
2. **Contiguous**: all nouns sit at known offsets within this range
3. **Self-contained**: the entire Urbit state fits in one memory region
4. **Snapshottable**: the whole loom can be written to disk for persistence
5. **Fixed-size** (traditionally): can't grow beyond the initial allocation

### Why Did Vere Choose This?

Vere is written in C, which has no garbage collector. The loom provides:

**1. Memory Management**
- Custom allocator for nouns
- Reference counting for cleanup
- Memory pools for different noun sizes
- Compaction/defragmentation

**2. Persistence**
- Snapshot = copy the loom to disk
- Resume = load the loom from disk
- Extremely fast (essentially a `memcpy`)
- Deterministic memory layout

**3. Performance**
- Nouns are 32-bit offsets into the loom (compact!)
- Pointer arithmetic for tree navigation
- Cache-friendly sequential allocation
- No malloc/free overhead

**4. Determinism**
- Same events → same memory layout
- Reproducible across machines
- Critical for consensus/jets

**5. Jets**
- C code can use raw pointers to nouns
- No GC to worry about
- Direct memory access for performance

### The Loom's Limitations

**Hard Limit**: 2GB was the original ceiling
- Large ships hit this and crashed
- Recent Vere builds expanded it to 4GB, 8GB, etc.
- But still fundamentally limited

**Fragmentation**: memory can't be reclaimed easily
- Dead nouns leave holes
- Compaction is expensive
- Can run out of contiguous space

**Complexity**: manual memory management
- Reference-counting bugs
- Memory leaks
- Segfaults

## Sword's Revolution: Loom-Free Architecture

Sword (the Rust implementation) took a radically different approach:

**No Fixed Arena**
- Nouns live on Rust's heap
- No size limit (grows with actual usage)
- Standard Rust allocator

**Hash-Consing**
- Every noun gets a unique hash
- Duplicate nouns share memory automatically
- Efficient deduplication

**Indirect References**
- Nouns reference each other via handles/IDs, not raw pointers
- Relocatable (for GC)

**Persistence**
- Uses jam/cue serialization
- Snapshot = jam the entire state
- Resume = cue from disk
- Slower than memcpy but more flexible

### Sword's Wins

- ✅ **No Memory Limit**: ships can grow arbitrarily large
- ✅ **Simpler**: Rust's allocator handles the complexity
- ✅ **Safer**: no manual memory-management bugs
- ✅ **Still Deterministic**: hash-consing ensures consistency

### Sword's Tradeoffs

- ❌ **Slower Snapshots**: jam/cue vs a raw memory copy
- ❌ **Larger Memory Overhead**: Rust allocator + hashes
- ❌ **More Complex Jets**: can't use raw pointers

But the wins far outweigh the costs.

## OCaml's Position: Best of Both Worlds?

OCaml gives us unique advantages:

### What We Already Have

**1. Automatic GC**
- OCaml's generational GC manages memory
- No manual allocation/deallocation
- Mature, well-tested, fast

**2. Zarith (GMP) for Atoms**
- Arbitrary-precision integers
- Efficient representation
- Automatic memory management

**3. Jam/Cue Serialization**
- Already implemented and working!
- Can serialize the entire kernel
- Proven approach (Sword uses it)

**4. No Segfaults**
- Memory safety by default
- The type system prevents many bugs
- Runtime checks where needed

### Three Possible Approaches

#### Option 1: Pure OCaml Heap (Current - Recommended!)
**Implementation:**

```ocaml
type noun =
  | Atom of Z.t          (* Zarith handles memory *)
  | Cell of noun * noun  (* OCaml GC handles cells *)
```

**Pros:**
- ✅ Simple, already working
- ✅ No memory limit
- ✅ Automatic memory management
- ✅ Type-safe
- ✅ Persistence via jam/cue (working!)

**Cons:**
- ❌ Larger memory overhead than the loom
- ❌ Slower snapshots (jam/cue vs memcpy)
- ❌ GC pauses (though the generational GC helps)

**Status:** This is what we have now, and it works well.

#### Option 2: OCaml Loom (Not Recommended)

**Implementation:**

```ocaml
(* Allocate a 2GB Bigarray as the arena *)
let loom =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout
    (2 * 1024 * 1024 * 1024)

type noun_offset = int  (* Offset into the loom *)
```

**Pros:**
- ✅ Matches Vere's architecture
- ✅ Fast snapshots (write the Bigarray to disk)
- ✅ Compact representation

**Cons:**
- ❌ 2GB hard limit (or whatever size we choose)
- ❌ Manual memory management (complex!)
- ❌ Fights OCaml's GC
- ❌ Unsafe (an out-of-bounds index is a crash)
- ❌ Fragmentation issues
- ❌ Complex to implement correctly

**Status:** Possible, but not worth it: we'd be reimplementing malloc in OCaml!

#### Option 3: Hybrid with Hash-Consing

**Implementation:**

```ocaml
(* Note: polymorphic (=) and Hashtbl.hash work on Z.t because
   Zarith installs custom comparison and hash hooks. *)
module NounTable = Hashtbl.Make (struct
  type t = noun
  let equal = (=)
  let hash = Hashtbl.hash
end)

let noun_cache : noun NounTable.t = NounTable.create 100_000

let make_cell h t =
  let candidate = Cell (h, t) in
  match NounTable.find_opt noun_cache candidate with
  | Some existing -> existing  (* Reuse the shared noun *)
  | None ->
    NounTable.add noun_cache candidate candidate;
    candidate
```

**Pros:**
- ✅ Automatic deduplication
- ✅ Memory savings (like Sword)
- ✅ Still no size limit
- ✅ Deterministic (same nouns = same structure)

**Cons:**
- ❌ Hash-table overhead
- ❌ GC pressure from large hash tables
- ❌ Need to decide when to clear the cache

**Status:** Optional future optimization; not needed yet.

## Recommendation: Stay Loom-Free

**Thesis:** The loom is a C-ism that doesn't make sense in OCaml (or Rust).

### Why No Loom?
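The heart of the argument fits in a few lines. A minimal, self-contained sketch (using `int` atoms in place of `Z.t` so it runs without Zarith): build a large noun tree on the ordinary OCaml heap and walk it. There is no arena, offset table, or free list to maintain; the GC reclaims whatever becomes unreachable.

```ocaml
(* Sketch: nouns on the plain OCaml heap, no arena.
   `int` stands in for `Z.t` here so the sketch has no dependencies. *)
type noun =
  | Atom of int
  | Cell of noun * noun

(* Build the list-shaped noun [n n-1 ... 1 0]. *)
let rec build n =
  if n = 0 then Atom 0
  else Cell (Atom n, build (n - 1))

(* Count cells by walking the tail spine; no manual frees anywhere. *)
let rec cells = function
  | Atom _ -> 0
  | Cell (_, t) -> 1 + cells t

let () =
  let big = build 10_000 in
  assert (cells big = 10_000)
```

The same property carries the rest of the argument: memory management is entirely the GC's job, so none of the loom's machinery has to be reimplemented.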
1. **OCaml has a GC** - using it is simpler and safer than fighting it
2. **No size limit** - ships can grow as large as system RAM allows
3. **Jam/cue works** - we already have working persistence
4. **Type safety** - OCaml prevents many memory bugs automatically
5. **Simpler code** - no custom allocator to maintain

### Counter: "But Vere snapshots are instant!"

True, but:
- Our jam is fast enough (1.2s for the solid pill)
- Most ships snapshot infrequently
- Correctness > raw speed for snapshots
- We can optimize jam/cue later if needed

### Counter: "But the loom is deterministic!"

OCaml is deterministic too:
- Same nouns = same structure (referential transparency)
- Jam output is deterministic (same noun = same bytes)
- The GC doesn't affect logical structure
- Hash-consing can ensure physical equality if needed

### Counter: "But jets need raw pointers!"

Not in modern languages:
- Sword jets work with indirect references
- OCaml jets can operate on the `noun` type directly
- Type safety > raw speed (until a bottleneck is proven)
- We can drop to unsafe code where absolutely necessary

## Practical Implications

### Current Architecture (Keep It!)

**Allocation:**

```ocaml
(* Atoms: Zarith manages memory *)
let a = Atom (Z.of_int 42)

(* Cells: the OCaml GC manages memory *)
let c = Cell (a, a)
```

**Persistence:**

```ocaml
(* Snapshot: jam to bytes, write to a file.
   Uses the standard In_channel/Out_channel modules (OCaml >= 4.14). *)
let snapshot kernel =
  let bytes = Serial.jam kernel in
  Out_channel.with_open_bin "kernel.jam" (fun oc ->
    Out_channel.output_bytes oc bytes)

(* Resume: read the file, cue back to a noun *)
let resume () =
  let s = In_channel.with_open_bin "kernel.jam" In_channel.input_all in
  Serial.cue (Bytes.of_string s)
```

**Memory Growth:**
- Automatic: the GC expands the heap as needed
- System RAM is the only limit
- Large ships work fine

### Future Optimizations (If Needed)

**1. Hash-Consing** (memory deduplication)
- Add a global noun cache
- Reuse identical nouns
- Reduces the memory footprint

**2. Incremental Snapshots**
- Snapshot diffs instead of the full kernel
- Faster save/resume
- More complex, but doable

**3. Memory-Mapped I/O**
- Use mmap for large nouns
- The OS handles paging
- Transparent to our code

**4. Generational Snapshots**
- Keep old/new state separated
- Only snapshot the changed parts
- Faster incremental saves

## Comparison Table

| Feature | Vere Loom | Sword Heap | Neovere (Current) |
|---------|-----------|------------|-------------------|
| Size limit | 2-8GB | None | None |
| Memory management | Manual | Rust allocator | OCaml GC |
| Snapshot speed | Instant (memcpy) | Slow (jam/cue) | Slow (jam/cue) |
| Memory safety | Unsafe (C) | Safe (Rust) | Safe (OCaml) |
| Complexity | High | Medium | Low |
| Deduplication | Manual | Hash-consing | Optional |
| Growing ships | Hard limit | ✓ Works | ✓ Works |

## Conclusion

**We should NOT implement a loom in OCaml.** Instead:

1. ✅ Use OCaml's GC (already working)
2. ✅ Use jam/cue for persistence (already working)
3. ✅ Let memory grow naturally (no limits)
4. ⏳ Add hash-consing later if needed (optimization)
5. ⏳ Optimize jam/cue if snapshots become a bottleneck

The loom made sense for C in 2013. It doesn't make sense for OCaml in 2025.

**Follow Sword's lead:** modern languages with GC don't need manual memory arenas.

## Open Questions

1. **Should we add hash-consing now or later?**
   - Later. The current approach works fine.
   - Add it when memory usage becomes a concern.

2. **How do we handle very large ships (100GB+ kernel)?**
   - Cross that bridge when we come to it
   - System RAM is cheap
   - Incremental snapshots if needed

3. **What about jets that need performance?**
   - Start with safe OCaml
   - Profile to find bottlenecks
   - Use unsafe code/`Obj.magic` only where proven necessary
   - Still better than C's unsafety

4. **Can we beat Vere's snapshot speed?**
   - Probably not for a raw memcpy
   - But we can get close with optimized jam/cue
   - And we win on flexibility/safety

## References

- Vere loom: `pkg/noun/allocate.c`, `pkg/noun/manage.c`
- Sword architecture: `rust/sword/src/mem.rs`, `rust/sword/src/noun.rs`
- OCaml GC: https://v2.ocaml.org/manual/gc.html
- Zarith: https://github.com/ocaml/Zarith

## Decision

**Status: LOOM-FREE ARCHITECTURE ✅**

We will continue using OCaml's heap and GC. No loom needed.