A single-page reference for the rough edges of the current self-hosted
compiler (Grammar v1.7). Each entry is a real footgun that recurs often
enough to be worth memorising — most have an inline workaround that is
already used throughout stdlib/. If you are an LLM writing NURL, read
this first: it will save you several compile-test cycles.
This page covers active quirks of the current compiler. For deliberate
scope limitations (no GC, no sized types, etc.) see the Known
Limitations table in ../README.md.
| # | Gotcha | One-line fix |
|---|---|---|
| 1 | & and \| are binary: & A B C is a parse-arity error | Nest explicitly: & A & B C, or extract a __cond helper |
| 2 | Mutating a multi-field struct's i field via = does not propagate through closures | Back the state with a Vec[i] (heap handle); mutate via vec_set |
| 3 | : ~ MyEnum x … (mutable enum binding) miscompiles — i64 stored without insertvalue | Use a sentinel-flag pattern: : ~ b had_err F + last-known plain field |
| 4 | vec_get [MultiFieldStruct] returns a wrong default when index is out of bounds | Iterate via vec_data [T] + pointer indexing (*T data, . data k) |
| 5 | Bare @-fn names don't auto-coerce to a (@ R P*) closure parameter | Wrap in \ P* → R { ( fn args ) } (same as eq_int / c_int) |
| 6 | Multi-field structs can't ride the T arm of ! T E (Result) | Use a tagged-struct workaround ({ T value, b ok, E err }) — see ParsedHead |
| 7 | Direct shadowing in the same : line: : i z + z 719468 shadows the parameter z from that line forward | Rename to zz (or any new name) |
| 8 | Function calls require parens — __i_mod a b parses as register-then-loose-tokens | Always: ( __i_mod a b ) |
| 9 | Ternary nesting is arity-strict — a missing/extra operand cascades into "unexpected token" at the next statement | Count operands left-to-right; one wrong arity poisons the line + a few after |
| 10 | vec_clone is deliberately absent — bitwise clone would alias owned heap buffers | Roll your own: vec_each + vec_push + per-element clone closure |
& and | are binary, not n-ary
// ✗ Looks like 4 conditions but parses as `& a b`, then `c d` are stray
? & a b c d { ... }
// ✓ Explicit pairing
? & a & b & c d { ... }
// ✓ Or extract a helper (preferred when expressions are long)
@ __ok i status → b {
^ & >= status 200 & < status 300 != status 226
}
Why: the parser greedily binds exactly two operands per & / |
operator. There is no variadic form. Trailing operands become bare
expressions in the surrounding context — and the parse error is reported
at the next token, not at the operator, which is why this hides easily.
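The greedy two-operand rule can be modelled in a few lines. The sketch below is a toy fixed-arity prefix parser written in Python, not the NURL parser itself; `ARITY`, `parse_expr`, and `parse_condition` are hypothetical names chosen for illustration. It shows the flat form leaving `c d` unconsumed while the nested form uses every token.

```python
# Toy model of a fixed-arity prefix parser (an illustration, not NURL's real parser).
ARITY = {"&": 2, "|": 2}  # both operators take exactly two operands


def parse_expr(tokens, pos=0):
    """Parse one prefix expression starting at pos; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("ran out of tokens while reading an operand")
    tok = tokens[pos]
    if tok not in ARITY:
        return tok, pos + 1  # plain identifier: a leaf
    operands = []
    pos += 1
    for _ in range(ARITY[tok]):
        node, pos = parse_expr(tokens, pos)
        operands.append(node)
    return (tok, *operands), pos


def parse_condition(src):
    """Return the parsed tree plus whatever tokens were left unconsumed."""
    tokens = src.split()
    tree, end = parse_expr(tokens)
    return tree, tokens[end:]


flat_tree, flat_stray = parse_condition("& a b c d")
nested_tree, nested_stray = parse_condition("& a & b & c d")
print(flat_tree, flat_stray)      # ('&', 'a', 'b') ['c', 'd']
print(nested_tree, nested_stray)  # ('&', 'a', ('&', 'b', ('&', 'c', 'd'))) []
```

In this model the stray tokens are simply returned; in the compiler they are handed to the surrounding context, which is why the report lands on a later token.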
Real example: stdlib/ext/http_middleware.nu:54 chains four range
checks via four separate ? arms instead of one big & expression.
// ✗ Counter struct — `=` updates the local copy, not the captured one
: Counter { i n, i max }
@ make_inc Counter c → ( @ v ) {
^ \ → v { = . c n + . c n 1 } // .n stays at 0 from the caller's view
}
// ✓ Back state with a Vec[i] handle (heap-allocated, mutations survive)
: Counter { ( Vec i ) slots } // slot 0 = n, slot 1 = max
@ make_inc Counter c → ( @ v ) {
^ \ → v {
: *i d ( vec_data [i] . c slots )
( vec_set [i] . c slots 0 + . d 0 1 )
}
}
Why: multi-field structs are passed by value. Closures snapshot
captured values at construction. A = write to a struct field hits the
captured copy on the closure's environment frame, not the caller's
binding. Vec[T] is a 1-pointer handle to a heap buffer — the handle is
copied, but the buffer is shared, so vec_set is observed by everyone.
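The difference between a captured copy and a shared handle can be sketched outside NURL. The Python model below is an analogy only: Python closures normally capture by reference, so the by-value capture is simulated with an explicit `copy.copy`, and `make_inc_by_value` / `make_inc_by_handle` are hypothetical names.

```python
import copy
from dataclasses import dataclass


@dataclass
class Counter:
    n: int
    max: int


def make_inc_by_value(c):
    """Model of the failing shape: the closure owns a private copy of the struct."""
    snapshot = copy.copy(c)  # stands in for "captured by value at construction"

    def inc():
        snapshot.n += 1  # updates the closure's copy only
        return snapshot.n

    return inc


def make_inc_by_handle(slots):
    """Model of the workaround: the closure holds a handle to a shared buffer."""
    def inc():
        slots[0] += 1  # slot 0 = n; the write lands in the shared buffer
        return slots[0]

    return inc


caller_struct = Counter(n=0, max=10)
inc_copy = make_inc_by_value(caller_struct)
inc_copy()
inc_copy()

caller_slots = [0, 10]  # slot 0 = n, slot 1 = max
inc_shared = make_inc_by_handle(caller_slots)
inc_shared()
inc_shared()

print(caller_struct.n)  # 0: the caller never sees the closure's writes
print(caller_slots[0])  # 2: both sides share one buffer
```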
Real example: stdlib/ext/http_middleware.nu:86–125 — Metrics is
a single-field { ( Vec i ) counters } heap-backed struct precisely
because the original { i requests, i in_flight, ... } shape was lost
across the with_metrics closure boundary.
: | NetErr { NetClosed | NetAccept | NetOther }
// ✗ codegen emits an i64 store without the `insertvalue` wrapper —
// subsequent reads see garbage
: ~ NetErr last_err NetOther
// ✓ Sentinel-flag pattern: track *whether* we have an error in a bool,
// then re-derive the variant on the recovery path
: ~ b had_err F
: ~ b done F
~ ! done {
: ! v NetErr r ( server_run_once s )
?? r {
T _ → {}
F e → { = had_err T = done T }
}
}
? had_err {
// re-issue the failing call to recover the variant cheaply
: ! v NetErr r2 ( server_run_once s )
^ r2
} { ... }
Why: the codegen path for : ~ (mutable binding) on enum values
stores the i64 tag directly into the alloca instead of wrapping it in
insertvalue for the enum's {i64, [N x i8]} LLVM shape. Reads then
extract the wrong field.
Real example: stdlib/ext/http_server.nu:329–360 — server_run's
loop uses had_err: b + a deliberate re-issue rather than carrying the
last NetErr directly.
vec_get [MultiFieldStruct] default is wrong
: Header { String name, String value }
// ✗ When `k` is out-of-bounds, the default `# Header 0` emits an
// `i64 0` into the first %String slot — caller sees an invalid handle
?? ( vec_get [Header] hs k ) {
T h → { ... }
F → { ... }
}
// ✓ Iterate via raw pointer + length
: i n ( vec_len [Header] hs )
: *Header data ( vec_data [Header] hs )
: ~ i k 0
~ < k n {
: Header h . data k
...
= k + k 1
}
Why: vec_get [T] returns ? T, and the None branch synthesises
a zero T for the payload slot. For single-int / single-pointer T this
is fine; for multi-field structs the synthesised zero is a flat i64 0
where a {%String, %String} is expected.
Real example: stdlib/ext/http_request.nu:291–305 (header_get)
and stdlib/ext/http_router.nu:69 both note the workaround.
@-fn names don't auto-coerce to closure params
@ eq_int i a i b → b { ^ == a b }
: ( Vec i ) v ( vec_with_cap [i] 4 )
// ✗ `eq_int` parses as a local register lookup → link-time miss
( vec_contains [i] v 42 eq_int )
// ✓ Wrap in a thin closure
( vec_contains [i] v 42 \ i a i b → b { ^ ( eq_int a b ) } )
Why: the parser treats bare identifiers in argument position as
local-symbol lookups, not as references to @-defined functions. The
@-fn name only resolves at the call site ( name args ). A closure
literal makes the intent explicit and adopts the right LLVM shape
({ fn-ptr, env-ptr }).
Helper convention: a few ergonomic wrappers exist already —
eq_int, eq_string, cmp_int, cmp_string, c_int (sort
comparator) — but they all still need \ ... { ( eq_int a b ) } style
wrapping when handed to a closure parameter.
Real example: any sort_by / vec_contains / binary_search
call site in stdlib/.
Multi-field structs can't ride ! T E Ok arms
// ✗ Multi-field T: `! ParsedHead HttpReqErr` works at the type level
// but the Ok payload codegen emits a wrong `extractvalue`
@ parse_request_head ( Vec u ) buf → ! ParsedHead HttpReqErr { ... }
// ✓ Tagged-struct workaround — both payloads always present
: ParsedHead {
HttpRequest head
i consumed
b ok
HttpReqErr err
}
@ parse_request_head ( Vec u ) buf → ParsedHead { ... }
Why: the ! T E enum is encoded as {i1, i64} — payloads must
fit in an i64. Single ints, single pointers, opaque-handle
({ s ctl } 1-field) structs work; multi-field structs don't.
Standard escape hatch: wrap the multi-field type in an
opaque handle ({ s raw } or { s ctl }) backed by a heap-allocated
impl struct. This is what Regex, Channel, Mutex, TcpListener,
McpClient all do — see stdlib/ext/regex.nu for the canonical
pattern.
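The opaque-handle idea can be sketched as follows. This is a Python model under stated assumptions, not the stdlib implementation: `ParsedHeadImpl`, its fields, `box`, and `make_result` are all hypothetical, and the model rejects an oversized payload outright where the real compiler miscompiles it. The point is only that a one-word handle fits the payload slot while the multi-field data lives elsewhere.

```python
from dataclasses import dataclass

I64_MIN, I64_MAX = -(2**63), 2**63 - 1


@dataclass
class ParsedHeadImpl:
    """Hypothetical multi-field payload that cannot ride the Ok arm directly."""
    method: str
    path: str
    consumed: int


_impls = {}       # stands in for heap storage, keyed by handle
_next_handle = 1


def box(impl):
    """Store the impl 'on the heap' and return a one-word handle to it."""
    global _next_handle
    handle = _next_handle
    _next_handle += 1
    _impls[handle] = impl
    return handle


def make_result(ok, payload):
    """Model of a {i1, i64} result: the payload must be a single 64-bit word."""
    if not isinstance(payload, int) or not I64_MIN <= payload <= I64_MAX:
        raise TypeError("payload does not fit in one i64 slot")
    return (ok, payload)


head = ParsedHeadImpl(method="GET", path="/", consumed=16)

try:
    make_result(True, head)  # multi-field payload: rejected by the model
    direct_ok = True
except TypeError:
    direct_ok = False

ok, handle = make_result(True, box(head))  # the handle fits in the slot
recovered = _impls[handle]
print(direct_ok, ok, recovered.path)  # False True /
```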
@ days_since_epoch i z → Time {
// ✗ The `+ z 719468` reads parameter z, then the `:` introduces a
// NEW immutable z that shadows it from this line forward
: i z + z 719468
// any further read of `z` now sees the era-shifted value
}
// ✓ Rename
@ days_since_epoch i z → Time {
: i zz + z 719468 // fine — original `z` still in scope
...
}
Why: : introduces a new binding immediately after the right-hand
side is evaluated. The new name is in scope for the rest of the
function, including any subsequent reads — which silently rebind to
the new value. No warning.
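The effect on later reads can be reproduced in any language that lets a name be rebound. The Python sketch below is an analogy (Python reassigns rather than introducing a new immutable binding, but subsequent reads behave the same way); the weekday formula and the function names are illustrative assumptions, not code from stdlib/std/time.nu.

```python
EPOCH_SHIFT = 719468  # the constant from the example above


def weekday_buggy(z):
    """z = days since 1970-01-01. Returns 0..6 with 0 = Sunday."""
    z = z + EPOCH_SHIFT   # models `: i z + z 719468`: same name, new value
    # ... era arithmetic using the shifted z would go here ...
    return (z + 4) % 7    # meant the ORIGINAL z, silently reads the shifted one


def weekday_fixed(z):
    zz = z + EPOCH_SHIFT  # models `: i zz + z 719468`
    # ... era arithmetic would use zz ...
    return (z + 4) % 7    # the original z is still reachable


print(weekday_fixed(0))  # 4: 1970-01-01 was a Thursday
print(weekday_buggy(0))  # 5: off by one because 719468 % 7 == 1
```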
Real example: caught during time_from_unix development — see
stdlib/std/time.nu.
// ✗ `__i_mod a b` parses as: register `__i_mod` plus stray tokens
: i d __i_mod a b
// ✓ Always
: i d ( __i_mod a b )
Why: there is no implicit-call form. ( fn args ) is the only
call syntax. A bare identifier is always a name lookup — and the
following tokens are then parsed in the surrounding context (often as
operator operands), so the error surfaces several tokens later as
"unexpected token".
// ✗ Missing one operand from a nested ternary
: i x ? cond1 a ? cond2 b // missing else for the inner ?
// → "unexpected token" at the *next* statement, not at the missing
// operand. The inner `?` takes `b` as its then-value and consumes the
// first token of the next statement as its else.
Why: prefix notation has no closing token. The parser counts operands left-to-right, so a missing operand silently consumes the token that should have started the next statement. The diagnostic therefore points at a later line than the one containing the mistake.
Debugging tip: when you see "unexpected token" on a line that
looks fine, count operands on every ?, &, |, !, =, + etc.
on the previous line.
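Why the report lands one line late can be seen in a toy model. The Python sketch below is not the NURL parser: it assumes every statement has the shape `: type name expr`, that `?` takes three operands and `+` two, and that newlines are invisible to the expression parser. `first_error_line` is a hypothetical helper.

```python
# Toy model (not NURL's real parser): `?` takes cond, then-value, else-value.
ARITY = {"?": 3, "+": 2}


def tokenize(src):
    """Return (token, line_number) pairs; line numbers start at 1."""
    return [(tok, n) for n, line in enumerate(src.splitlines(), 1)
            for tok in line.split()]


def parse_expr(tokens, pos):
    """Consume one prefix expression; newlines are invisible to the parser."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok, _ = tokens[pos]
    pos += 1
    for _ in range(ARITY.get(tok, 0)):
        pos = parse_expr(tokens, pos)
    return pos


def first_error_line(src):
    """Parse `: type name expr` statements; return the line of the first error."""
    tokens, pos = tokenize(src), 0
    while pos < len(tokens):
        tok, line = tokens[pos]
        if tok != ":":
            return line  # "unexpected token": a statement must start with `:`
        pos = parse_expr(tokens, pos + 3)  # skip `:`, the type, and the name
    return None


good = ": i x ? c1 a ? c2 b c\n: i y + x 1"
bad = ": i x ? c1 a ? c2 b\n: i y + x 1"   # inner `?` lacks its else operand
print(first_error_line(good))  # None
print(first_error_line(bad))   # 2
```

In the `bad` input the inner `?` swallows the `:` that opens line 2, so the first token the statement loop sees is `i`, and the error is attributed to line 2 even though line 1 is the one that is short an operand.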
vec_clone is intentionally absent
// ✗ No such function: it would alias owned heap buffers
: ( Vec String ) copy ( vec_clone [String] src )
// ✓ Element-wise clone via vec_each + per-element clone closure
: ( Vec String ) copy ( vec_with_cap [String] ( vec_len [String] src ) )
( vec_each [String] src \ String s → v {
( vec_push [String] copy ( string_clone s ) )
} )
Why: vec_clone would do a bitwise copy of the underlying buffer,
duplicating owned-pointer fields without telling the auto-drop
machinery. The result would double-free every owned element at scope
exit. Per-element clone is explicit and respects ownership.
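The double-free hazard can be sketched with a toy allocator. The Python model below is an illustration under assumptions, not NURL's runtime: `Heap`, `drop_vec`, `bitwise_clone`, and `elementwise_clone` are hypothetical names, a vector is modelled as a list of buffer handles, and auto-drop is modelled as an explicit call.

```python
class Heap:
    """Toy allocator that records live buffers and rejects a second free."""

    def __init__(self):
        self.live = {}
        self.next_id = 1

    def alloc(self, data):
        handle = self.next_id
        self.next_id += 1
        self.live[handle] = data
        return handle

    def free(self, handle):
        if handle not in self.live:
            raise RuntimeError(f"double free of buffer {handle}")
        del self.live[handle]


def drop_vec(heap, vec):
    """Model of auto-drop at scope exit: free every owned element."""
    for handle in vec:
        heap.free(handle)


def bitwise_clone(vec):
    return list(vec)  # copies the handles, not the buffers they own


def elementwise_clone(heap, vec):
    return [heap.alloc(heap.live[h]) for h in vec]  # fresh buffer per element


heap = Heap()
src = [heap.alloc("GET"), heap.alloc("/index")]

alias = bitwise_clone(src)
deep = elementwise_clone(heap, src)

drop_vec(heap, deep)   # fine: deep owns its own buffers
drop_vec(heap, src)    # fine: first free of the originals
try:
    drop_vec(heap, alias)  # same handles as src: already freed
    outcome = "ok"
except RuntimeError as err:
    outcome = str(err)
print(outcome)  # double free of buffer 1
```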
stdlib/ext/http_request.nu, http_response.nu, http_router.nu,
http_server.nu, http_middleware.nu, regex.nu, mcp_client.nu
../ROADMAP.md
§1 (multi-field Option Some-arm, generic propagation through
closures, multi-field struct mutability, mutable enum bindings).