---
# edu-n9ap
title: '§3 Compiler Architecture: The Pipeline'
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:00Z
updated_at: 2026-03-10T23:30:00Z
---

## §3 Compiler Architecture: The Pipeline (stub to fill)

File: `edu/src/lisp-compiler.md`, section `### 3. Compiler Architecture: The Pipeline`

Replace the stub line with full content. Target 500–700 words. Design overview: ASCII diagrams, brief stage descriptions, Rust module layout, error philosophy. No code yet.

## Learning objectives

- Understand the three compilation stages and what each produces
- Know the Rust type that flows between each stage
- Understand where errors originate and how they are reported
- See the module structure before writing any code

## Content to write

### Pipeline Diagram

```
Source text (&str)
       │
       ▼
  ┌──────────┐
  │  Parser  │  src/parser.rs
  └──────────┘
       │  Vec<Expr>
       ▼
  ┌───────────────────┐
  │ Semantic Analyser │  src/analyser.rs
  └───────────────────┘
       │  Vec<Expr> (validated)
       ▼
  ┌──────────────────┐
  │  Code Generator  │  src/codegen.rs
  └──────────────────┘
       │  String (C source)
       ▼
  stdout / output file
```

### Stage Descriptions

**Parser** (`src/parser.rs`). Accepts `&str` and produces `Vec<Expr>`. Uses nom combinators. Fails on syntax errors: unmatched parentheses, invalid tokens, unexpected EOF.

**Semantic Analyser** (`src/analyser.rs`). Walks `Vec<Expr>` and checks that every symbol reference resolves to a definition, every special form has the correct shape and arity, and lambda bodies are non-empty. Returns the same `Vec<Expr>` on success; returns `CompileError` on failure. Does not do type inference; type errors surface as C compiler errors.

**Code Generator** (`src/codegen.rs`). Walks the validated `Vec<Expr>` and produces a `String` of C source. This stage is pure: it cannot fail for valid input. Emits the preamble, forward declarations, and top-level forms in order.

**Error type** (`src/error.rs`). A `CompileError` enum with variants for each stage, giving uniform error handling across the pipeline.
Each variant carries enough context for a useful message (e.g., the undefined symbol name).

### Module Layout

```
src/
├── main.rs       # CLI: read input, call compile(), write output
├── ast.rs        # Expr enum and Display impl
├── parser.rs     # nom parsers → Vec<Expr>
├── analyser.rs   # scope checking and form validation
├── codegen.rs    # AST → C string
└── error.rs      # CompileError enum
```

### The `compile` Function

Show the top-level function signature the reader will implement in §16:

```rust
pub fn compile(source: &str) -> Result<String, CompileError> {
    let exprs = parser::parse(source)?;
    let exprs = analyser::analyse(exprs)?;
    let c_source = codegen::generate(exprs);
    Ok(c_source)
}
```

This makes explicit that parsing and analysis are fallible but code generation is not.

### Error Reporting Philosophy

The compiler reports the first error it encounters and stops. It does not attempt to recover and continue after a syntax error. nom's `cut` combinator is used at commit points to produce better error messages. A production compiler would collect multiple errors; this is a deliberate simplification.

Errors include enough context to be actionable:

- Syntax errors: what character was unexpected and approximately where
- Semantic errors: the name of the undefined symbol or the malformed form

### How Sections Map to the Diagram

Tell the reader: Sections 4–9 fill in the parser box. Sections 10–11 fill in the analyser box. Sections 12–15 fill in the code generator box. Section 16 wires them together.

## Style notes

- Open with the pipeline diagram; it's the most information-dense single element in the section
- Keep prose tight; the diagram does the heavy lifting
- The `compile` function signature is the key insight: two fallible stages, one infallible stage
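
For orientation while drafting, the error type and the report-the-first-error-and-stop policy described under "Error Reporting Philosophy" could look roughly like the sketch below. The variant names, the `check_parens` helper, and the message wording are assumptions made for illustration, not the design fixed for `src/error.rs`, and the real parser is meant to use nom rather than a hand-written scan.

```rust
use std::fmt;

/// Hypothetical shape for the error enum; the spec only requires
/// "variants for each stage" that carry enough context for a message.
#[derive(Debug, PartialEq)]
pub enum CompileError {
    /// Parser stage: the unexpected character and its byte offset.
    Syntax { unexpected: char, offset: usize },
    /// Analyser stage: a symbol that resolves to no definition.
    UndefinedSymbol(String),
    /// Analyser stage: a special form with the wrong shape or arity.
    MalformedForm(String),
}

impl fmt::Display for CompileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CompileError::Syntax { unexpected, offset } => write!(
                f,
                "syntax error: unexpected '{}' near offset {}",
                unexpected, offset
            ),
            CompileError::UndefinedSymbol(name) => {
                write!(f, "undefined symbol: {}", name)
            }
            CompileError::MalformedForm(form) => {
                write!(f, "malformed form: {}", form)
            }
        }
    }
}

impl std::error::Error for CompileError {}

/// Toy stand-in for a fallible stage (not the nom parser): it returns
/// the first stray ')' it finds and stops, with no attempt at recovery.
/// Unclosed '(' are deliberately ignored to keep the sketch short.
pub fn check_parens(source: &str) -> Result<(), CompileError> {
    let mut depth: usize = 0;
    for (offset, ch) in source.char_indices() {
        match ch {
            '(' => depth += 1,
            ')' => {
                if depth == 0 {
                    return Err(CompileError::Syntax { unexpected: ch, offset });
                }
                depth -= 1;
            }
            _ => {}
        }
    }
    Ok(())
}
```

Because every fallible stage returns the same error type, a caller can chain stages with `?` and the first `Err` short-circuits the rest, which is the behaviour the `compile` function relies on.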