Nock Interpreter Engineer
Expert guidance for implementing Nock virtual machines across multiple programming languages with proper evaluation loops, tree operations, and runtime optimization.
Implementation Architecture
Core Components
1. Noun Representation
Atoms: Natural numbers with arbitrary precision
- C: GMP (GNU Multiple Precision) or custom BigInt
- Python: Built-in
int(unlimited precision) - Rust:
num-bigintcrate orbigintcrate - Haskell:
Integertype (native) - JavaScript:
BigInt(ES2020+)
Cells: Ordered pairs (cons cells)
rust1pub enum Noun { 2 Atom(BigUint), 3 Cell(Box<Noun>, Box<Noun>), 4}
2. Parser & Printer
Parse text format to internal representation:
[a b c] -> Cell(Atom(a), Cell(Atom(b), Atom(c)))
123 -> Atom(123)
[a [b c]] -> Cell(Atom(a), Cell(Atom(b), Atom(c)))
Print back to readable format.
3. Evaluation Loop
Pattern matching on formula opcodes (0-12):
python1def evaluate(subject, formula): 2 while True: 3 if not is_cell(formula): 4 raise NockCrash("formula must be cell") 5 6 op, rest = formula 7 result = dispatch(op, subject, rest) 8 if is_crash(result): 9 return result 10 subject, formula = result, rest 11 if is_constant(formula): # Rule 1 12 return formula
Nock Rules Implementation
Rule 0: Slot ([a b] /)
Binary tree addressing for efficient axis traversal:
[subject slot] / = value_at_slot
Implementation: Convert slot number to axis path (left/right sequence), traverse subject tree.
Rule 1: Constant ([a] *)
[subject] * = a
Stops evaluation, returns constant a.
Rule 2: Evaluate ([a b c] +)
[subject [a b]] + = [subject a] c
Recurse: evaluate a against subject, then evaluate result as formula with c.
Rule 3: Cell Test ([a] ?)
[subject a] ? = 0 if a is cell, 1 if a is atom
Type discrimination for atoms vs cells.
Rule 4: Increment ([a] +)
[subject a] + = a + 1
Natural number addition.
Rule 5: Equality ([a b] =)
[subject a b] = = a b
Structural equality: 0 if identical, 1 if different.
Rule 6-9: Composition
- Rule 6:
[[a b] c] /=[[a b] / c] - Rule 7:
[a b c] ==a b c(three-element cell) - Rule 8:
[a b] +=a + b - Rule 9:
[a b] +=a + b
Formulas 2-9 are composites of rules 0-5.
Rule 10: Edit ([a b c])
[subject [a b c]] = nock(subject, b)
Tree editing; replaces part of the subject at axis a.
Rule 11: Hint ([a b])
[subject [a b]] = nock(subject, b)
Optimization hint; evaluates b and may use a as a hint to the runtime.
Rule 12: Scry ([a b] ^)
[subject [a b]] ^ = memory[a][b]
Referentially transparent read of memory slot b within noun a.
Performance Patterns
Tail Call Optimization (TCO)
Detect and optimize recursive formulas:
python1# Before: creates stack frames 2[subject [[a b] c]] + = ... 3 4# After: if formula ends with recursive call, reuse stack 5if ends_with_recursive_call(formula): 6 return tco_evaluate(subject, formula)
Memoization
Cache evaluation results for repeated sub-formulas:
python1_memo = {} 2 3def memo_evaluate(subject, formula): 4 key = (subject, formula) 5 if key in _memo: 6 return _memo[key] 7 result = evaluate(subject, formula) 8 _memo[key] = result 9 return result
Lazy Evaluation
Defer computation until value needed:
python1class Thunk: 2 def __init__(self, subject, formula): 3 self.subject = subject 4 self.formula = formula 5 self._cached = None 6 7 def force(self): 8 if self._cached is None: 9 self._cached = evaluate(self.subject, self.formula) 10 return self._cached
Language-Specific Patterns
C Implementation
c1typedef struct { 2 mpz_t atom; 3 struct Noun *cell; 4} Noun; 5 6Noun* slot(Noun* subject, Noun* axis); 7Noun* evaluate(Noun* subject, Noun* formula);
Pros: Maximum control, pointer efficiency Cons: Manual memory management, complexity
Python Implementation
python1from typing import Union 2 3Noun = Union[int, tuple] 4 5def evaluate(subject: Noun, formula: Noun) -> Noun: 6 if not isinstance(formula, tuple): 7 raise NockCrash("formula must be cell") 8 ...
Pros: Simplicity, fast prototyping Cons: Performance overhead
Rust Implementation
rust1pub enum Noun { 2 Atom(BigUint), 3 Cell(Box<Noun>, Box<Noun>), 4} 5 6impl Noun { 7 pub fn evaluate(&self, formula: &Noun) -> Result<Noun, NockCrash> { 8 match formula { 9 Noun::Cell(op, rest) => dispatch(op, self, rest), 10 _ => Ok(formula.clone()), 11 } 12 } 13}
Pros: Memory safety, zero-cost abstractions Cons: Borrow checker complexity
Haskell Implementation
haskell1data Noun = Atom Integer | Cell Noun Noun 2 3evaluate :: Noun -> Noun -> Noun 4evaluate subject formula = case formula of 5 Cell (Atom 0, rest) -> slot subject rest 6 Cell (Atom 1, rest) -> rest 7 Cell (Atom 2, Cell(a, Cell(b, c))) -> evaluate (evaluate subject a) c 8 ...
Pros: Type safety, pattern matching Cons: Learning curve
JavaScript Implementation
javascript1class Noun { 2 constructor(value) { 3 this.isCell = Array.isArray(value); 4 this.value = value; 5 } 6} 7 8function evaluate(subject, formula) { 9 if (!formula.isCell) { 10 throw new NockCrash("formula must be cell"); 11 } 12 const [op, ...rest] = formula.value; 13 return dispatch(op, subject, rest); 14}
Pros: Web compatibility Cons: Performance, type safety
Testing & Validation
Reference Test: Decrement
python1def test_decrement(): 2 subject = 0 3 formula = [8 [1 0] 8 [1 6 [5 [0 7]] 4 0 6] [0 6] 9 2 [0 2] [4 0 6] 0 7] 9 2 0 1] 4 result = evaluate(subject, formula) 5 assert result == 0, f"Expected 0, got {result}"
Edge Cases
- Empty subject:
[~ [1 0]]should work - Large atoms: Test with 64-bit, 128-bit, larger
- Deep trees: Stack overflow prevention
- Memory pressure: Graceful handling, not crash
Benchmarking
Compare performance against reference implementations:
Implementation | Decrement (ms) | Formula 2 (ms) | Memory (MB)
-------------|-----------------|---------------|--------------
Python | 150 | 50 | 25
Rust | 5 | 0.5 | 3
C (GMP) | 1 | 0.1 | 1
Common Pitfalls
1. Incorrect Slot Calculation
Slot 1 = axis [0 1] // Correct
Slot 2 = axis [1 0] // Correct, NOT [0 1 1]
2. Missing TCO
Deep recursion crashes stack without trampoline or TCO.
3. Inefficient Noun Copying
Deep clones of nouns create O(n²) behavior. Use reference counting or structural sharing.
4. Incorrect Subject Handling
Forgetting to update subject after evaluation changes context.