summaryrefslogtreecommitdiff
path: root/ocaml/CLAUDE.md
blob: 491c73cfe8102f450cf5fb75b0dad5586383a234 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
# Overe - OCaml Port of Vere (Urbit Runtime)

## Project Goal

Port the Vere (Urbit C runtime) to OCaml piece by piece, starting with core components and comparing performance against the original C implementation.

## Completed Work

### 1. Nock Interpreter (`lib/nock.ml`)

**Status**: ✅ Complete and tested

- Implemented complete Nock interpreter with all 12 opcodes
- Direct port of `_n_nock_on` from C implementation in `pkg/noun/nock.c`
- Uses Zarith (GMP-backed) for arbitrary-precision integers
- All test cases pass (`test/test_nock.ml`)

**Key Functions**:
- `nock_on`: Main interpreter loop
- `slot`: Tree addressing using bit notation
- All 12 opcodes: 0 (slot), 1 (constant), 2 (nock), 3 (cell test), 4 (increment), 5 (equality), 6 (if-then-else), 7 (compose), 8 (push), 9 (call), 10 (hint), 11 (wish/deprecated)

### 2. Performance Benchmarking

**Status**: ✅ Complete with detailed results

Created comprehensive benchmark suite comparing C vs OCaml:
- `bench_nock.ml`: OCaml benchmarks
- `bench_simple.c`: Standalone C implementation for comparison
- `compare.sh`: Comparison script
- `BENCHMARKS.md`: Full results documentation

**Key Findings**:
- C is 2-5x faster on average
- OCaml is competitive or faster on allocation-heavy operations
- OCaml shows more consistent performance (less variance)
- For decrement: C 1.39x faster
- For tree navigation: C 4.87x faster
- For nock(decrement): OCaml 1.12x faster
- For deep recursion: C 2.41x faster

### 3. Noun Type System (`lib/noun.ml`)

**Status**: ✅ Complete

```ocaml
type noun =
  | Atom of Z.t
  | Cell of noun * noun
```

Helper functions:
- `atom`, `cell`: Constructors
- `is_atom`, `is_cell`: Type predicates
- `head`, `tail`: Cell accessors
- `slot`: Tree addressing (axis addressing)
- `atom_to_int`, `noun_to_string`: Conversions

### 4. Bitstream Utilities (`lib/bitstream.ml`)

**Status**: ✅ Complete

Implements bit-level reading/writing for jam/cue serialization:

**Writer**:
```ocaml
type writer = {
  buf: bytes ref;
  mutable bit_pos: int;
}
```
- `writer_create()`: Initialize writer
- `write_bit`: Write single bit
- `write_bits`: Write multiple bits from a Z.t value
- `writer_to_bytes`: Extract final bytes
- `writer_ensure`: Dynamic buffer resizing

**Reader**:
```ocaml
type reader = {
  bytes: bytes;
  mutable bit_pos: int;
}
```
- `reader_create`: Initialize from bytes
- `read_bit`: Read single bit
- `read_bits`: Read multiple bits into Z.t
- `reader_pos`: Get current bit position

**Critical Fix**: OCaml operator precedence issue
- Problem: `!(w.buf)` parsed as `(!w).buf`
- Solution: Use intermediate variable pattern:
  ```ocaml
  let buf_ref : bytes ref = w.buf in
  let buf : bytes = !buf_ref in
  ```

### 5. Jam/Cue Serialization (`lib/serial.ml`)

**Status**: ✅ Complete and tested

Implementation of noun serialization with backreference compression.

**Jam Encoding Format**:
- Atoms: tag bit `0`, then mat-encoded value
- Cells: tag bits `01`, then recursively encode head and tail
- Backrefs: tag bits `11`, then mat-encoded position

**Mat Encoding** (variable-length integers):
- For 0: single `0` bit
- For n > 0:
  - a = bit-width of n (`Z.numbits n`)
  - b = bit-width of a
  - Write b `1`-bits, then one `0`-bit
  - Write a in b-1 bits
  - Write n in a bits

**Current Implementation**:
```ocaml
let mat_encode w n =
  if Z.equal n Z.zero then
    write_bit w false
  else begin
    let a = Z.numbits n in
    let b = Z.numbits (Z.of_int a) in
    for _i = 1 to b do write_bit w true done;
    write_bit w false;
    write_bits w (Z.of_int a) (b - 1);  (* Write a, not a-1 *)
    write_bits w n a
  end
```

**Key Fixes**:
1. ✅ Mat encoding uses **leading 0 bits** followed by 1 bit (not the reverse!)
2. ✅ Mat decoding formula: `a = 2^(b-1) + bits_read` (not just bits_read)
3. ✅ Buffer initialization: Must use `Bytes.make` with zeros, not `Bytes.create`

**Test Status** (`test/test_serial.ml`):
- ✅ All atom roundtrip tests pass (0, 1, 2, 42, 255, 256, 65535, 1000000)
- ✅ All cell roundtrip tests pass
- ✅ All tree structure tests pass
- ✅ Backref tests pass
- ✅ Edge case tests pass (large atoms, deep nesting, wide trees)

**Performance** (`test/bench_serial.ml`):
- Small atom jam/cue: ~1 µs
- Balanced tree: ~3 µs
- Deep nesting (100 levels): ~76 µs
- Comparable or faster than C for simple cases
- See `BENCHMARKS_SERIAL.md` for detailed comparison

### 6. Python Urbit Interface (`urb.py`)

**Status**: ✅ Updated to Python 3

Script to communicate with running Urbit ship via lens port. Can be used to get correct jam/cue test vectors.

**Changes Made**:
- Shebang: `#!/usr/bin/env python3`
- String decoding: `.encode().decode('unicode_escape')`
- File write: mode `'wb'` for binary
- Logging: `warn` → `warning`
- Idiom: `'X' not in dict`

**Usage** (needs running Urbit ship):
```bash
python3 urb.py -d "(jam 42)"    # Get jam encoding of 42
python3 urb.py -d "(cue 42)"    # Decode jam encoding
```

## Project Structure

```
ocaml/
├── dune-project          # Project configuration (package: overe)
├── lib/
│   ├── dune             # Library stanza
│   ├── noun.ml          # Noun type system
│   ├── nock.ml          # Nock interpreter
│   ├── bitstream.ml     # Bit-level I/O
│   └── serial.ml        # Jam/cue serialization
├── test/
│   ├── dune             # Test executables
│   ├── test_nock.ml     # Nock tests (all passing)
│   └── test_serial.ml   # Jam/cue tests (failing at 4)
├── tmp/                 # Temporary test files
├── urb.py              # Python script to query Urbit ship
├── BENCHMARKS.md       # Performance comparison results
└── CLAUDE.md           # This file
```

## Next Steps

### Completed: Jam/Cue Serialization ✅

All jam/cue functionality is now complete and tested!

### Future Work

1. **Memory Management**:
   - Implement noun hash-consing/deduplication
   - Add reference counting or GC integration
   - Study u3 memory system from C implementation

3. **Jets**:
   - Port jet dashboard system
   - Implement critical jets for performance
   - Add jet registration and lookup

4. **Persistence**:
   - Implement snapshot system
   - Port event log handling
   - Add checkpoint/restore functionality

## Build Instructions

```bash
# Build library and tests
dune build

# Run Nock tests (all passing)
dune exec test/test_nock.exe

# Run jam/cue tests (currently failing at 4)
dune exec test/test_serial.exe

# Query running Urbit (requires ship running on port 12323 or with .http.ports)
python urb.py -d "(jam 42)"
```

## Technical Notes

### OCaml Challenges Encountered

1. **Operator precedence with refs**: `!(r.field)` doesn't work, need intermediate variable
2. **Zarith numbits**: Returns actual bit count, not bit index (e.g., 4 needs 3 bits, `numbits` returns 3)
3. **Mutable fields**: Need to use `mutable` keyword and assignment operator `:=`
4. **Bytes are mutable**: Use `Bytes.sub` to copy, not share

### Mat Encoding Details from C Code

From `pkg/noun/serial.c` line 126:
```c
// Write a_w (the width) in b_w-1 bits
u3r_chop(0, 0, b_w - 1, 0, &a_w, &bits);
```

This writes `a` (the width), NOT `a-1`. The OCaml implementation has been updated to match this.

### References

- Vere C implementation: `/home/y/code/urbit/vere/pkg/`
- Nock specification: https://urbit.org/docs/nock/reference/
- Jam/cue format: See official Urbit docs (provided in conversation)
- u3 system architecture: See docs on king/serf, jets, persistence

## Performance Philosophy

Goal is not to beat C in raw speed, but to:
- Provide maintainable, type-safe implementation
- Enable experimentation with runtime features
- Achieve competitive performance on real workloads
- Leverage OCaml's strengths (allocation, GC, pattern matching)

Current results show this is achievable: OCaml is within 2-5x of C, and actually faster for some allocation-heavy operations.