+++
title = "§18 What's Next: Extensions and Further Reading"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++

§18 What's Next: Extensions and Further Reading — Stub to fill

File: edu/src/lisp-compiler.md, section ### 18. What's Next: Extensions and Further Reading

Replace the stub line with full content. Target 600-800 words. Survey the directions in which the compiler can be taken and provide a curated reading list. Reading-only, no code.

Learning objectives

  • Understand what limitations the current compiler has and why
  • Know the conceptual approaches for each major extension
  • Have a reading list for going deeper into compiler theory and Lisp implementation

Content to write

Congratulations — and what you skipped

Open by acknowledging what the reader has built: a complete compiler with a lexer, parser, semantic analyser, code generator, and test suite. Then honestly catalog what was left out:

Extension 1: Closures and Lambda Lifting

The current compiler does not support closures: lambdas cannot capture variables from enclosing functions. One way to add them is lambda lifting, a classical technique that transforms each lambda with free variables into a top-level function taking those variables as extra parameters. Lifting alone is enough only when the lambda is called where its variables are still in scope; for lambdas that are returned or stored, real Lisp runtimes use closure records (a struct containing the function pointer and the captured values) allocated on the heap.
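As a rough illustration of the closure-record idea, the C sketch below shows what a code generator might emit for `(define (make-adder n) (lambda (x) (+ x n)))`. The names `Closure`, `adder_code`, `make_adder`, and `call_closure` are hypothetical and do not come from the MiniLisp codebase; values are assumed to be plain `long` integers.

```c
#include <stdlib.h>

/* Hypothetical closure record: a code pointer plus the captured values. */
typedef struct Closure {
    long (*code)(struct Closure *self, long arg); /* the lifted function */
    long captured[1];                             /* its free variables  */
} Closure;

/* (lambda (x) (+ x n)) after lifting: n is read from the record. */
long adder_code(Closure *self, long x) {
    return x + self->captured[0];
}

/* (define (make-adder n) (lambda (x) (+ x n))) */
Closure *make_adder(long n) {
    Closure *c = malloc(sizeof *c);
    if (c == NULL) abort();   /* out of memory: give up in this sketch */
    c->code = adder_code;
    c->captured[0] = n;       /* capture n by value */
    return c;
}

/* Every closure call passes the record as a hidden first argument. */
long call_closure(Closure *c, long arg) {
    return c->code(c, arg);
}
```

Because the record is heap-allocated, it can outlive the call to `make_adder`, which is exactly the case that plain lambda lifting cannot handle. In this sketch the caller is responsible for calling `free` on the record.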

Extension 2: Tail-Call Optimisation (TCO)

(define (loop n) (if (= n 0) n (loop (- n 1)))) will overflow the stack for large n in the current compiler because each recursive call pushes a new C stack frame. TCO transforms tail calls into jumps. C offers no portable way to force this: GCC and Clang often eliminate tail calls when optimising at -O2, but the C standard does not guarantee it, and a function attribute such as __attribute__((optimize("O2"))) only requests an optimisation level for that function. A trampoline pattern is one workaround. A proper solution requires detecting tail-call position during code generation and emitting a goto loop.
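The difference between the two translations can be sketched in C as follows. This is an illustration of the goto-loop idea for the `loop` function above, not the output of the actual code generator; the function names `loop_recursive` and `loop_tco` are made up for the example.

```c
/* Naive translation: the recursive call may use one C stack frame per step. */
long loop_recursive(long n) {
    if (n == 0) return n;
    return loop_recursive(n - 1);
}

/* Tail call rewritten as parameter reassignment plus a jump. */
long loop_tco(long n) {
entry:
    if (n == 0) return n;
    n = n - 1;   /* evaluate the new argument into the parameter */
    goto entry;  /* reuse the current frame instead of calling */
}
```

With more than one parameter, the new argument values would need to be computed into temporaries before any parameter is overwritten, since a later argument expression may still refer to an earlier parameter.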

Extension 3: A Type System

Add type inference (Hindley-Milner, or a simpler bidirectional checker) so that type errors are caught before C is emitted. This would allow the code generator to choose the correct ml_display_* variant and generate proper C function signatures for string-returning functions.

Extension 4: Pairs, Lists, and a Runtime

(cons a b), (car p), (cdr p) require heap-allocated pair objects, each represented as a C struct. This opens the door to proper Lisp list processing. Once you have heap allocation, you need a garbage collector. The simplest GC is reference counting, which cannot reclaim cyclic structures; a more robust approach is mark-and-sweep.
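A minimal sketch of such a pair representation in C is given below, assuming a tagged `Value` that is either nil, an integer, or a pointer to a pair. All type and function names here are hypothetical; the real runtime may choose a different value representation.

```c
#include <stdlib.h>

typedef enum { VAL_NIL, VAL_INT, VAL_PAIR } Tag;

typedef struct Pair Pair;

/* A tagged value: the tag says which union member is valid. */
typedef struct {
    Tag tag;
    union { long i; Pair *pair; } as;
} Value;

struct Pair { Value car; Value cdr; };

Value make_nil(void)   { Value v; v.tag = VAL_NIL; v.as.i = 0; return v; }
Value make_int(long i) { Value v; v.tag = VAL_INT; v.as.i = i; return v; }

/* (cons a b): allocate a pair on the heap. Nothing in this sketch ever
   frees it, which is the job a garbage collector would take over. */
Value cons(Value a, Value b) {
    Value v;
    Pair *p = malloc(sizeof *p);
    if (p == NULL) abort();
    p->car = a;
    p->cdr = b;
    v.tag = VAL_PAIR;
    v.as.pair = p;
    return v;
}

/* (car p) and (cdr p): abort on a non-pair instead of reading garbage. */
Value car(Value v) { if (v.tag != VAL_PAIR) abort(); return v.as.pair->car; }
Value cdr(Value v) { if (v.tag != VAL_PAIR) abort(); return v.as.pair->cdr; }

/* Count the pairs along the cdr chain of a list. */
long list_length(Value v) {
    long n = 0;
    while (v.tag == VAL_PAIR) { n++; v = cdr(v); }
    return n;
}
```

For example, `cons(make_int(1), cons(make_int(2), make_nil()))` builds the list `(1 2)`.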

Extension 5: Macros

Lisp macros transform code before compilation. A simple approach is syntax transformers: functions that run at compile time and return transformed AST nodes. This requires a small interpreter for the macro language. Hygienic macros (as in Scheme) are significantly more complex.

Extension 6: A REPL

A read-eval-print loop compiles and runs one expression at a time. This requires either an interpreter (easier) or incremental native code emission (harder). An interpreter over the AST is a natural extension once the parser is complete — it's essentially the code generator replaced with a recursive evaluator.
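The shape of a recursive evaluator can be sketched as below. C is used only for illustration, and the `Node` type with its four node kinds is an assumption made for this example; the actual evaluator would walk the compiler's own AST in its implementation language.

```c
#include <stddef.h>

/* Hypothetical AST: numbers, addition, subtraction, and if. */
typedef enum { N_NUM, N_ADD, N_SUB, N_IF } Kind;

typedef struct Node {
    Kind kind;
    long value;                   /* used by N_NUM only */
    const struct Node *a, *b, *c; /* operands; unused ones are NULL */
} Node;

/* Where the code generator would emit C text for each node,
   the evaluator computes the node's value directly. */
long eval(const Node *n) {
    switch (n->kind) {
    case N_NUM: return n->value;
    case N_ADD: return eval(n->a) + eval(n->b);
    case N_SUB: return eval(n->a) - eval(n->b);
    case N_IF:  return eval(n->a) != 0 ? eval(n->b) : eval(n->c);
    }
    return 0; /* unreachable for well-formed trees */
}
```

A REPL would then loop over reading a line, parsing it into such a tree, calling `eval`, and printing the result. This sketch treats 0 as false in `N_IF`, which is a simplification rather than a statement about MiniLisp's semantics.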

Extension 7: Self-Hosting

The ultimate milestone: rewrite the MiniLisp compiler in MiniLisp itself. This requires the language to be expressive enough (strings, I/O, some form of list processing) and the compiler to be complete enough to compile itself. Self-hosting is the proof that you've really built something.

Further Reading

Compiler theory:

  • Crafting Interpreters by Robert Nystrom — free online; builds a language in two complete implementations (tree-walking and bytecode)
  • Modern Compiler Implementation in ML/Java/C by Andrew Appel — classic academic compiler textbook
  • Engineering a Compiler by Cooper & Torczon — comprehensive modern treatment

Lisp implementation:

  • Structure and Interpretation of Computer Programs (SICP) — chapters 4 and 5 cover interpreters and compilers for Scheme
  • Lisp in Small Pieces by Christian Queinnec — 11 different Lisp implementations, from interpreter to compiler
  • Build Your Own Lisp by Daniel Holden — free online, C implementation

Parsing:

Rust and compilers:

  • The cranelift crate — a code generator backend (Rust, used in Wasmtime)
  • inkwell crate — safe Rust bindings to LLVM for native code generation

Style notes

  • Open warmly — the reader has accomplished something real
  • Each extension should be a paragraph: what it is, why it is non-trivial, and the key technique
  • The reading list is the most durable part of this section; keep it current and annotated
  • Close the course with encouragement: the concepts learned here (parsing, AST manipulation, code generation) apply to every compiler, transpiler, and language tool the reader will ever build