ocaml/NEED_FOR_SPEED.md
Let's make this the fastest freaking runtime ever.

Claude advises:
+
 Why it's slow:

 1. Pure interpreter - We're executing every single Nock instruction one by one, with no jets (hand-optimized stdlib functions)
 2. 8 events processing HUGE state - The 8MB pill expands into gigabytes of noun trees in memory, and each event modifies that massive state
 3. Parallelism doesn't help much - All 32 cores fight over the same shared event_list noun, causing GC contention. This workload is memory-bound, not CPU-bound
 4. No optimization - C Vere has:
   - Jets (bypass Nock for the stdlib)
   - Hash-consing (deduplicate nouns)
   - Years of tuning
+
 How slow is it really?

 Let's see single-core performance:

 dune exec test/test_solid_sequential.exe

 This will show the actual speed of nock.ml, nock_iter.ml, and nock_tail.ml without parallel overhead. My guess: each lifecycle takes several seconds even on one core.
+
 The real solutions:

 1. Jets - Implement fast paths for common operations (the Hoon stdlib)
 2. Hash-consing - Deduplicate nouns to save memory
 3. Better GC tuning - Reduce allocation pressure
 4. Compile hot paths - JIT-compile common formulas

 Throwing 32 cores at an interpreted workload with heavy memory contention won't magically make it fast. The interpreter itself needs optimization first! 🤷
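 To make the jet idea concrete, here's a toy sketch — not the repo's actual interpreter. The noun type, the `dec_formula_hash` placeholder, and the shape of `eval` are all assumptions for illustration: a registry maps a formula's hash to a native OCaml implementation, and evaluation consults the registry before falling back to slow, instruction-by-instruction execution.

```ocaml
(* Toy jet sketch. All names here are hypothetical, not the repo's API. *)

type noun = Atom of int | Cell of noun * noun

(* Slow path: decrement by counting up from 0, the way naive Nock code
   computes dec (Nock only has increment). O(n) for an atom n > 0. *)
let slow_dec n =
  let rec go i = if i + 1 = n then i else go (i + 1) in
  go 0

(* Jet registry: formula hash -> native fast path. *)
let jets : (int, noun -> noun) Hashtbl.t = Hashtbl.create 16

(* Placeholder standing in for the real formula's mug. *)
let dec_formula_hash = 0xdec

let () =
  Hashtbl.replace jets dec_formula_hash (function
    | Atom n -> Atom (n - 1)                 (* native decrement: O(1) *)
    | Cell _ -> failwith "dec expects an atom")

(* Evaluate a formula against a subject, preferring a jet when one exists. *)
let eval formula_hash subject =
  match Hashtbl.find_opt jets formula_hash with
  | Some jet -> jet subject                  (* jet hit: bypass the interpreter *)
  | None ->
    (match subject with                      (* stand-in for the slow interpreter *)
     | Atom n -> Atom (slow_dec n)
     | Cell _ -> failwith "dec expects an atom")
```

 A jet hit skips interpretation entirely; this is what C Vere does for most of the Hoon stdlib.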
+
> Then I guess we'd better get working on those real solutions. Which is more approachable in our current state? Hash-consing?
+
● Hash-consing is definitely more approachable! Here's why:

 What you already have:

 - ✅ Mug (hash) computation working perfectly
 - ✅ Clean noun type definition
 - ✅ All noun construction centralized through Noun.atom and Noun.cell
+
 What hash-consing does:

 (* Without hash-consing *)
 let a = Noun.cell (Noun.atom 42) (Noun.atom 1)
 let b = Noun.cell (Noun.atom 42) (Noun.atom 1)
 (* a and b are distinct objects in memory: a == b is false *)

 (* With hash-consing *)
 let a = Noun.cell (Noun.atom 42) (Noun.atom 1)
 let b = Noun.cell (Noun.atom 42) (Noun.atom 1)
 (* a and b point to THE SAME object in memory: a == b is true *)
+
 Benefits:

 1. Massive memory savings - That 8MB solid pill has tons of repeated structure
 2. O(1) equality - Just a pointer comparison instead of a tree traversal
 3. Better cache locality - Shared structures stay hot
 4. Less GC pressure - Fewer objects to track
+
 Implementation approach:

 1. Add a global weak hash table via Weak.Make (weak references, so the GC can still collect unused nouns)
 2. Modify Noun.atom and Noun.cell to check the table first
 3. Return the existing noun if found, or insert the new one
 4. Use mug as the hash key
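 Those steps can be sketched with the stdlib's Weak.Make functor. This is a minimal sketch under assumptions: a simplified noun type with int atoms, and Hashtbl.hash standing in for the real mug.

```ocaml
(* Minimal hash-consing sketch. The noun type is simplified for illustration;
   the real implementation would use the existing mug as the hash. *)

type noun =
  | Atom of int
  | Cell of noun * noun

(* Weak set: the GC may still collect nouns that nothing else references. *)
module Pool = Weak.Make (struct
  type t = noun

  (* Children are canonicalized before their parent is built, so shallow
     comparison using physical equality (==) on them is sufficient. *)
  let equal a b =
    match a, b with
    | Atom x, Atom y -> x = y
    | Cell (h1, t1), Cell (h2, t2) -> h1 == h2 && t1 == t2
    | _ -> false

  let hash = Hashtbl.hash  (* stand-in for mug *)
end)

let pool = Pool.create 4096

(* Canonicalize: return the existing copy if present, else insert this one. *)
let atom i = Pool.merge pool (Atom i)
let cell h t = Pool.merge pool (Cell (h, t))
```

 Pool.merge returns the canonical copy if the table already holds one, otherwise inserts and returns its argument — so structurally equal nouns built through atom and cell always come back as the same physical object.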
+
 Want me to implement hash-consing? It's maybe 100-200 lines of code and will give you huge wins!