## on nock.c
CLAUDE:
The key is that call_count is static (shared across ALL function calls), while my_call is local (each function call gets its own copy on the stack).

  Here's what happens without my_call:

  1. Call #0 ENTERS → logs "ENTER call #0", call_count becomes 1
  2. Call #1 ENTERS (recursive) → logs "ENTER call #1", call_count becomes 2
  3. Call #2 ENTERS (recursive) → logs "ENTER call #2", call_count becomes 3
  4. Call #2 EXITS → logs "EXIT call #2" using (call_count-1 = 2) ✓ Correct!
  5. Call #1 EXITS → logs "EXIT call #1" using (call_count-1 = 2) ✗ Wrong! Should be 1
  6. Call #0 EXITS → logs "EXIT call #0" using (call_count-1 = 2) ✗ Wrong! Should be 0

  The problem: by the time call #0 exits, call_count has already been incremented by all the recursive calls.

  With my_call (a local variable):

  1. Call #0: my_call=0 (saved on stack), call_count++
  2. Call #1: my_call=1 (saved on stack), call_count++
  3. Call #2: my_call=2 (saved on stack), call_count++
  4. Call #2 EXITS → uses its own my_call=2 ✓
  5. Call #1 EXITS → uses its own my_call=1 ✓
  6. Call #0 EXITS → uses its own my_call=0 ✓

  Each recursive call preserves its own my_call value on the stack, so the EXIT log always matches the ENTER log!

  This is a classic recursion pattern in C: use static variables for things you want to share across calls (like a counter), but capture their values in local variables if you need them to survive recursion.
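
  A minimal standalone sketch of the pattern (a hypothetical trace_demo function, not the actual nock.c instrumentation):

```c
#include <stdio.h>

/* Hypothetical recursive function instrumented the same way:
   call_count is static (one counter shared by every invocation), while
   my_call captures this invocation's id on the stack, so the EXIT line
   still matches the ENTER line after the recursive calls return. */
static void
trace_demo(int depth)
{
  static int call_count = 0;
  int        my_call    = call_count++;   /* this call's own id */

  printf("%*s>>> ENTER call #%d\n", depth * 2, "", my_call);

  if ( depth < 2 ) {
    trace_demo(depth + 1);                /* recursion bumps call_count */
  }

  /* Printing (call_count - 1) here instead of my_call would show 2 on
     every EXIT line, exactly the bug walked through above. */
  printf("%*s<<< EXIT  call #%d\n", depth * 2, "", my_call);
}

int
main(void)
{
  trace_demo(0);   /* produces matching ENTER/EXIT pairs for calls #0, #1, #2 */
  return 0;
}
```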


## after first round on vere
  CLAUDE:
 Current Logging Features

  The C Nock interpreter now logs:
  - Call number: Unique ID for each Nock evaluation (call #0, #1, etc.)
  - Recursion depth: Indented to show call hierarchy
  - Opcode: The Nock rule being executed (0-12) or CELL for distribution
  - Subject type: Whether the subject (bus) is a cell or atom
  - Mug: Hash of the subject/result for tracking identical nouns
  - Return value type: Cell or atom with its mug

  Example Pattern from the Log

  >>> ENTER call #8 depth=5 opcode=8 bus=cell[mug=0x5cc2e3ff]
    >>> ENTER call #9 depth=6 opcode=8 bus=cell[mug=0x528e9b65]
    <<< EXIT  call #9 depth=6 returns=cell[mug=0x6d2bc01f]
    >>> ENTER call #10 depth=6 opcode=8 bus=cell[mug=0x479b42af]
    <<< EXIT  call #10 depth=6 returns=cell[mug=0x6d2bc01f]
  <<< EXIT  call #8 depth=5 returns=cell[mug=0x6d2bc01f]

  This shows: opcode 8 (Nock Eight = extend subject) calling itself recursively twice, both returning the same noun (mug 0x6d2bc01f).
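
  A hedged sketch of how lines like these could be emitted - hypothetical _trace_enter/_trace_exit helpers, not the actual patch; the only vere calls assumed are u3du() (cell test) and u3r_mug():

```c
#include <stdio.h>

/* Illustrative trace helpers, not the real nock.c code.
   Assumes the u3 noun headers used by nock.c are already included. */
static void
_trace_enter(c3_w num_w, c3_w dep_w, c3_w op_w, u3_noun bus)
{
  fprintf(stderr, "%*s>>> ENTER call #%u depth=%u opcode=%u bus=%s[mug=0x%x]\n",
          (int)(dep_w * 2), "",                      /* indent by depth */
          (unsigned)num_w, (unsigned)dep_w, (unsigned)op_w,
          (c3y == u3du(bus)) ? "cell" : "atom",      /* subject type    */
          (unsigned)u3r_mug(bus));                   /* subject's mug   */
}

static void
_trace_exit(c3_w num_w, c3_w dep_w, u3_noun pro)
{
  fprintf(stderr, "%*s<<< EXIT  call #%u depth=%u returns=%s[mug=0x%x]\n",
          (int)(dep_w * 2), "",
          (unsigned)num_w, (unsigned)dep_w,
          (c3y == u3du(pro)) ? "cell" : "atom",
          (unsigned)u3r_mug(pro));
}
```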


## on vere mugs
CLAUDE:

How can C output the mugs *so fast*, even for huge nouns like the huge pill? You say it's caching them, but it still has to compute them at first.

● Excellent question! The answer is that C doesn't recompute mugs on demand - each mug is computed once and stored inline in the noun structure itself!

  In C's u3 system, every noun (atom or cell) has a mug_w field built into its memory layout. The first time a noun's mug is needed, u3r_mug() computes it and writes it into that field, filling in the mugs of the sub-nouns it visits along the way. After that, reading any of those mugs is just a memory read - instant!
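
  The idea in isolation - a toy sketch with made-up names and a made-up hash (not vere's code, and not murmur3): the hash field lives in the node itself, so it is computed at most once, and every later lookup is a plain field read.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct node {
  uint32_t     mug;    /* 0 means "not computed yet"       */
  struct node* head;   /* NULL for an atom                 */
  struct node* tail;
  uint64_t     value;  /* atom payload (illustrative only) */
} node;

static uint32_t mix(uint32_t a, uint32_t b) {      /* toy combiner, not murmur3   */
  uint32_t m = (a * 0x9e3779b1u) ^ b;
  return m ? m : 1;                                /* keep 0 reserved for "unset"  */
}

static uint32_t get_mug(node* n) {
  if ( n->mug ) return n->mug;                     /* cached: just a memory read   */
  n->mug = n->head
    ? mix(get_mug(n->head), get_mug(n->tail))      /* cell: combine child mugs     */
    : mix((uint32_t)n->value, (uint32_t)(n->value >> 32));   /* atom payload       */
  return n->mug;
}

int main(void) {
  node a = { 0, NULL, NULL, 42 };
  node b = { 0, NULL, NULL, 7  };
  node c = { 0, &a, &b, 0 };
  printf("first:  0x%x\n", (unsigned)get_mug(&c)); /* computes and stores          */
  printf("second: 0x%x\n", (unsigned)get_mug(&c)); /* pure field read              */
  return 0;
}
```

  vere's real implementation is more involved (reference-counted nouns, very deep trees, murmur3), but the caching shape is the same.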

  Here's the key difference:

  C (vere/pkg/noun/allocate.h):
  // Nouns store mugs inline:
  typedef struct {
    c3_w mug_w;  // ← Mug stored here!
    c3_w len_w;
    c3_w buf_w[];
  } u3a_atom;

  // Once the mug is cached, u3r_mug() is just a field read (simplified):
  c3_w u3r_mug(u3_noun veb) {
    return ((u3a_atom*)u3a_to_ptr(veb))->mug_w;  // Just a memory read!
  }

  OCaml (our current approach):
  type noun =
    | Atom of Z.t            (* No mug field! *)
    | Cell of noun * noun    (* No mug field! *)

  (* We compute mugs on-demand with caching *)
  let mug noun = (* ... expensive recursive computation ... *)

  To match C's performance, we'd need to change OCaml's noun type to:
  type noun =
    | Atom of { z: Z.t; mutable mug: int32 option }
    | Cell of { h: noun; t: noun; mutable mug: int32 option }

  This would let us compute once and store inline like C. Should we implement this now, or continue with the current cache approach?