+++
title = "§18 What's Next: Extensions and Further Reading"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++
§18 What's Next: Extensions and Further Reading — Stub to fill
File: edu/src/lisp-compiler.md, section ### 18. What's Next: Extensions and Further Reading
Replace the stub line with full content. Target 600–800 words. Survey the directions the compiler can be taken and provide a curated reading list. Reading-only, no code.
Learning objectives
- Understand what limitations the current compiler has and why
- Know the conceptual approaches for each major extension
- Have a reading list for going deeper into compiler theory and Lisp implementation
Content to write
Congratulations — and what you skipped
Open by acknowledging what the reader has built: a complete compiler with a lexer, parser, semantic analyser, code generator, and test suite. Then honestly catalog what was left out:
Extension 1: Closures and Lambda Lifting
The current compiler does not support closures: lambdas cannot capture variables from enclosing functions. One way to add them is lambda lifting, which transforms each lambda that captures free variables into a top-level function that takes those variables as extra parameters. This is a classical technique, but on its own it only covers lambdas whose call sites are all known at compile time. Lambdas that escape (returned from a function or stored in a variable) need closure records: a struct containing the function pointer and the captured values, allocated on the heap, which is what real Lisp runtimes use.
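For whoever drafts this paragraph, the difference between lambda lifting and a closure record can be sketched in a few lines of Rust. This is an illustration only: the names add_lifted, Closure, adder_code and make_adder are hypothetical, only integer values are modelled, and the real code generator would emit C rather than Rust.

```rust
// Source program (MiniLisp-style):
//   (define (make-adder n) (lambda (x) (+ x n)))

// Lambda lifting: the lambda becomes a top-level function and its
// free variable `n` becomes an extra parameter.
fn add_lifted(n: i64, x: i64) -> i64 {
    x + n
}

// Closure record: a code pointer plus the captured values, so the
// lambda can be returned from make_adder and called later.
struct Closure {
    code: fn(&[i64], i64) -> i64, // lifted code; receives the environment first
    env: Vec<i64>,                // captured values, heap-allocated
}

impl Closure {
    fn call(&self, arg: i64) -> i64 {
        (self.code)(&self.env, arg)
    }
}

// The lifted body reads its free variable out of the environment.
fn adder_code(env: &[i64], x: i64) -> i64 {
    x + env[0]
}

fn make_adder(n: i64) -> Closure {
    Closure { code: adder_code, env: vec![n] }
}
```

Calling make_adder(10) and then call(5) on the result adds the captured 10 to the argument. The lifted function alone suffices only when the caller can supply n itself.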
Extension 2: Tail-Call Optimisation (TCO)
(define (loop n) (if (= n 0) n (loop (- n 1)))) will overflow the stack for large n in the current compiler because each recursive call pushes a new C stack frame. TCO transforms tail calls into jumps. In C, this can be approximated with the GCC function attribute __attribute__((optimize("O2"))) (an attribute, not a pragma), which lets the C compiler apply sibling-call optimisation but does not guarantee it, or by using a trampoline pattern. A proper solution requires detecting tail-call position during code generation and emitting a loop (for example, a goto back to the top of the function) for self tail calls.
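The trampoline pattern mentioned above can be sketched as follows. This is a simplified Rust illustration, not the emitted C: Step, loop_step and run_loop are hypothetical names, and the sketch handles only a function that tail-calls itself, whereas a general trampoline would return a thunk for any callee.

```rust
// One step of the computation: either a final value, or a request
// to make the tail call again with a new argument.
enum Step {
    Done(u64),
    Call(u64),
}

// Body of (define (loop n) (if (= n 0) n (loop (- n 1)))).
// Instead of calling itself, it returns a description of the tail call.
fn loop_step(n: u64) -> Step {
    if n == 0 {
        Step::Done(n)
    } else {
        Step::Call(n - 1)
    }
}

// The trampoline: a driver loop that performs the tail calls one at a
// time, so the stack depth stays constant however large `start` is.
fn run_loop(start: u64) -> u64 {
    let mut n = start;
    loop {
        match loop_step(n) {
            Step::Done(value) => return value,
            Step::Call(next) => n = next,
        }
    }
}
```

The cost is an extra return and dispatch per call, which is why emitting a direct jump during code generation is the preferred fix where it applies.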
Extension 3: A Type System
Add type inference (Hindley-Milner) or a simpler bidirectional type checker so that type errors are caught before C is emitted. This would allow the code generator to choose the correct ml_display_* variant and generate proper C function signatures for string-returning functions.
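As a starting point well short of Hindley-Milner, a type check before code generation could look like the sketch below. It assumes a hypothetical two-type language (Int and Str) and a hypothetical Term AST with no variables or functions, so it shows only the idea of rejecting a program before any C is emitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Int,
    Str,
}

// Hypothetical, much-reduced AST used only for this sketch.
enum Term {
    Num(i64),
    Text(String),
    Add(Box<Term>, Box<Term>),
}

// Computes the type of a term bottom-up, or reports the first mismatch.
fn type_of(term: &Term) -> Result<Type, String> {
    match term {
        Term::Num(_) => Ok(Type::Int),
        Term::Text(_) => Ok(Type::Str),
        Term::Add(a, b) => match (type_of(a)?, type_of(b)?) {
            (Type::Int, Type::Int) => Ok(Type::Int),
            (left, right) => Err(format!(
                "+ expects Int operands, found {:?} and {:?}",
                left, right
            )),
        },
    }
}
```

With the type of every expression known, the code generator has what it needs to pick a display routine or a C return type.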
Extension 4: Pairs, Lists, and a Runtime
(cons a b), (car p), (cdr p) require heap-allocated pair objects, each represented as a C struct. This opens the door to real Lisp list processing. Once you have heap allocation, you need a garbage collector. The simplest GC is reference counting, which cannot reclaim cyclic structures; a more robust approach is mark-and-sweep.
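The shape of a pair runtime with reference counting can be sketched in Rust, where std::rc::Rc plays the role of the hand-written reference counter a C runtime would need. The Value enum and the cons, car and cdr helpers are hypothetical names for illustration, not the course's runtime.

```rust
use std::rc::Rc;

// A MiniLisp value: an integer, the empty list, or a heap-allocated pair.
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Nil,
    Pair(Rc<Value>, Rc<Value>),
}

// Allocates a new pair on the heap; both fields are shared references.
fn cons(a: Rc<Value>, b: Rc<Value>) -> Rc<Value> {
    Rc::new(Value::Pair(a, b))
}

// car and cdr return None for a non-pair instead of crashing.
// Cloning an Rc only bumps the reference count; it does not copy the value.
fn car(p: &Value) -> Option<Rc<Value>> {
    match p {
        Value::Pair(a, _) => Some(Rc::clone(a)),
        _ => None,
    }
}

fn cdr(p: &Value) -> Option<Rc<Value>> {
    match p {
        Value::Pair(_, b) => Some(Rc::clone(b)),
        _ => None,
    }
}
```

A pair is freed as soon as its last reference is dropped, which is the appeal of reference counting. Once pairs become mutable and can point back at themselves, counts never reach zero, and that is the case a tracing collector handles.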
Extension 5: Macros
Lisp macros transform code before compilation. A simple approach is syntax transformers: functions that run at compile time and return transformed AST nodes. This requires a small interpreter for the macro language. Hygienic macros (as in Scheme) are significantly more complex.
Extension 6: A REPL
A read-eval-print loop compiles and runs one expression at a time. This requires either an interpreter (easier) or incremental native code emission (harder). An interpreter over the AST is a natural extension once the parser is complete — it's essentially the code generator replaced with a recursive evaluator.
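The claim that an interpreter is the code generator with evaluation in place of emission can be illustrated with a small recursive evaluator. The Expr type below is a hypothetical, much-reduced AST (integers, addition, subtraction, and an if that treats non-zero as true); the course's real AST will differ.

```rust
// Hypothetical, much-reduced AST used only for this sketch.
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    If(Box<Expr>, Box<Expr>, Box<Expr>), // condition counts as true when non-zero
}

// Where a code generator would emit C for each node, the evaluator
// computes the node's value directly.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Sub(a, b) => eval(a) - eval(b),
        Expr::If(cond, then_branch, else_branch) => {
            if eval(cond) != 0 {
                eval(then_branch)
            } else {
                eval(else_branch)
            }
        }
    }
}
```

A REPL would wrap this in a loop that reads a line, parses it with the existing parser, calls eval, and prints the result.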
Extension 7: Self-Hosting
The ultimate milestone: rewrite the MiniLisp compiler in MiniLisp itself. This requires the language to be expressive enough (strings, I/O, some form of list processing) and the compiler to be complete enough to compile itself. Self-hosting is the proof that you've really built something.
Further Reading
Compiler theory:
- Crafting Interpreters by Robert Nystrom — free online; builds a language in two complete implementations (tree-walking and bytecode)
- Modern Compiler Implementation in ML/Java/C by Andrew Appel — classic academic compiler textbook
- Engineering a Compiler by Cooper & Torczon — comprehensive modern treatment
Lisp implementation:
- Structure and Interpretation of Computer Programs (SICP) — chapters 4 and 5 cover interpreters and compilers for Scheme
- Lisp in Small Pieces by Christian Queinnec — 11 different Lisp implementations, from interpreter to compiler
- Build Your Own Lisp by Daniel Holden — free online, C implementation
Parsing:
- Parsing Techniques by Grune & Jacobs — comprehensive reference (free PDF)
- nom documentation and recipes: https://github.com/rust-bakery/nom/tree/main/doc
Rust and compilers:
- The cranelift crate — a code generator backend (Rust, used in Wasmtime)
- The inkwell crate — safe Rust bindings to LLVM for native code generation
Style notes
- Open warmly — the reader has accomplished something real
- Each extension should be a paragraph: what it is, why it is non-trivial, and the key technique
- The reading list is the most durable part of this section; keep it current and annotated
- Close the course with encouragement: the concepts learned here (parsing, AST manipulation, code generation) apply to every compiler, transpiler, and language tool the reader will ever build