Diffstat (limited to 'ocaml/NEED_FOR_SPEED.md')
-rw-r--r-- | ocaml/NEED_FOR_SPEED.md | 68
1 file changed, 68 insertions, 0 deletions
diff --git a/ocaml/NEED_FOR_SPEED.md b/ocaml/NEED_FOR_SPEED.md
new file mode 100644
index 0000000..3b9ae94
--- /dev/null
+++ b/ocaml/NEED_FOR_SPEED.md
@@ -0,0 +1,68 @@
Let's make this the fastest freaking runtime ever

Claude's advice:

  Why it's slow:

  1. Pure interpreter - We're executing every single Nock instruction one by one. No jets (hand-optimized stdlib functions)
  2. 8 events processing HUGE state - That 8MB pill expands into gigabytes of noun trees in memory. Each event modifies massive state
  3. Parallelism doesn't help much - All 32 cores are fighting over the same shared event_list noun, causing GC contention. This is memory-bound, not CPU-bound
  4. No optimization - C Vere has:
      - Jets (bypass Nock for stdlib)
      - Hash-consing (deduplicate nouns)
      - Years of tuning

  How slow is it really?

  Let's see single-core performance:

      dune exec test/test_solid_sequential.exe

  This will show the actual speed of nock.ml, nock_iter.ml, and nock_tail.ml without parallel overhead. My guess: each lifecycle takes several seconds even on one core.

  The real solutions:

  1. Jets - Implement fast paths for common operations (Hoon stdlib)
  2. Hash-consing - Deduplicate nouns to save memory
  3. Better GC tuning - Reduce allocation pressure
  4. Compile hot paths - JIT common formulas

  Throwing 32 cores at an interpreted workload with heavy memory contention won't magically make it fast. The interpreter itself needs optimization first! 🤷

> Then I guess we'd better get working on those real solutions. Which is more approachable in our current state? Hash-consing?

● Hash-consing is definitely more approachable! Here's why:

  What you already have:

  - ✅ Mug (hash) computation working perfectly
  - ✅ Clean noun type definition
  - ✅ All noun construction centralized through Noun.atom and Noun.cell

  What hash-consing does:

      (* Without hash-consing *)
      let a = Noun.cell (Noun.atom 42) (Noun.atom 1)
      let b = Noun.cell (Noun.atom 42) (Noun.atom 1)
      (* a and b are different objects in memory *)

      (* With hash-consing *)
      let a = Noun.cell (Noun.atom 42) (Noun.atom 1)
      let b = Noun.cell (Noun.atom 42) (Noun.atom 1)
      (* a and b point to THE SAME object in memory! *)

  Benefits:

  1. Massive memory savings - That 8MB solid pill has tons of repeated structure
  2. O(1) equality - Just pointer comparison instead of tree traversal
  3. Better cache locality - Shared structures stay hot
  4. Less GC pressure - Fewer objects to track

  Implementation approach:

  1. Add a global weak hash table via Weak.Make (weak references, so the GC can still collect unused nouns)
  2. Modify Noun.atom and Noun.cell to check the table first
  3. Return the existing noun if found, or insert the new one
  4. Use mug as the hash key

  Want me to implement hash-consing? It's maybe 100-200 lines of code and will give you huge wins!
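As a rough idea of what those four steps could look like, here is a minimal, self-contained OCaml sketch. It assumes a simplified noun type (Atom of int rather than arbitrary-precision atoms) and uses Hashtbl.hash as a stand-in for mug; the actual Noun module's definitions and the real mug function will differ.

    (* Hash-consing sketch, NOT the project's real Noun module. *)
    type noun =
      | Atom of int            (* assumption: real atoms are arbitrary-precision *)
      | Cell of noun * noun

    module NounKey = struct
      type t = noun

      (* Shallow equality: because every noun is built through the smart
         constructors below, structurally equal children are already the
         same physical object, so (==) on children is sufficient. *)
      let equal a b =
        match a, b with
        | Atom x, Atom y -> x = y
        | Cell (h1, t1), Cell (h2, t2) -> h1 == h2 && t1 == t2
        | _, _ -> false

      (* Stand-in for mug; any hash consistent with [equal] works here. *)
      let hash = Hashtbl.hash
    end

    (* Weak set of interned nouns: the GC can still reclaim nouns that
       nothing else references. *)
    module Pool = Weak.Make (NounKey)

    let pool = Pool.create 4096

    (* Smart constructors: always return the canonical copy of a noun. *)
    let atom n = Pool.merge pool (Atom n)
    let cell h t = Pool.merge pool (Cell (h, t))

    let () =
      let a = cell (atom 42) (atom 1) in
      let b = cell (atom 42) (atom 1) in
      assert (a == b);   (* physical equality: one shared object *)
      print_endline "hash-consing sketch OK"

The shallow equality in NounKey is the main design choice: since children are interned before a parent cell is ever constructed, lookups only compare pointers one level deep, which keeps insertion cheap even for huge nouns.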