---
# edu-n9ap
title: '§3 Compiler Architecture: The Pipeline'
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:00Z
updated_at: 2026-03-10T23:30:00Z
---

## §3 Compiler Architecture: The Pipeline (stub to fill)

File: `edu/src/lisp-compiler.md`, section `### 3. Compiler Architecture: The Pipeline`

Replace the stub line with full content. Target 500–700 words. Design overview: ASCII diagrams, brief stage descriptions, Rust module layout, error philosophy. No implementation code yet, apart from the top-level `compile` function.

## Learning objectives

- Understand the three compilation stages and what each produces
- Know the Rust type that flows between each stage
- Understand where errors originate and how they are reported
- See the module structure before writing any code

## Content to write

### Pipeline Diagram

```
    Source text (&str)
          │
          ▼
     ┌──────────┐
     │  Parser  │           src/parser.rs
     └──────────┘
          │ Vec<Expr>
          ▼
┌───────────────────┐
│ Semantic Analyser │      src/analyser.rs
└───────────────────┘
          │ Vec<Expr> (validated)
          ▼
┌──────────────────┐
│  Code Generator  │       src/codegen.rs
└──────────────────┘
          │ String (C source)
          ▼
  stdout / output file
```

### Stage Descriptions

**Parser** (`src/parser.rs`). Accepts `&str` and produces `Vec<Expr>`. Uses nom combinators. Fails on syntax errors: unmatched parentheses, invalid tokens, unexpected EOF.

**Semantic Analyser** (`src/analyser.rs`). Walks `Vec<Expr>` and checks that every symbol reference resolves to a definition, every special form has the correct shape and arity, and lambda bodies are non-empty. Returns the same `Vec<Expr>` on success and a `CompileError` on failure. Does not do type inference, so type errors surface later as C compiler errors.
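
The scope check described above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's analyser: the three-variant `Expr`, the two-variant `CompileError`, the built-in operator list, and the `(define name value)` shape are all assumptions made for the example, since the real definitions live in `ast.rs` and `error.rs`. It does show the contract from the description: the input `Vec<Expr>` is handed back unchanged on success, and the first problem found is returned as an error. As a simplification, a name only becomes visible after its `define`, so recursive definitions are not handled.

```rust
use std::collections::HashSet;

// Assumed minimal AST; the real `Expr` in ast.rs may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(i64),
    Symbol(String),
    List(Vec<Expr>),
}

// Stand-in error type with hypothetical variant names.
#[derive(Debug, PartialEq)]
pub enum CompileError {
    UndefinedSymbol(String),
    MalformedForm(String),
}

/// Validates the forms and returns the same vector on success.
pub fn analyse(exprs: Vec<Expr>) -> Result<Vec<Expr>, CompileError> {
    // Assumed built-ins that are always in scope.
    let mut scope: HashSet<String> = ["+", "-", "*"].iter().map(|s| s.to_string()).collect();
    for expr in &exprs {
        check(expr, &mut scope)?; // `?` stops at the first error
    }
    Ok(exprs)
}

fn check(expr: &Expr, scope: &mut HashSet<String>) -> Result<(), CompileError> {
    match expr {
        Expr::Number(_) => Ok(()),
        Expr::Symbol(name) => {
            if scope.contains(name) {
                Ok(())
            } else {
                Err(CompileError::UndefinedSymbol(name.clone()))
            }
        }
        Expr::List(items) => match items.as_slice() {
            // (define name value): exactly a symbol followed by one value.
            [Expr::Symbol(head), rest @ ..] if head.as_str() == "define" => match rest {
                [Expr::Symbol(name), value] => {
                    check(value, scope)?;
                    scope.insert(name.clone());
                    Ok(())
                }
                _ => Err(CompileError::MalformedForm("define".to_string())),
            },
            // Any other list: check every element, including the head.
            _ => {
                for item in items {
                    check(item, scope)?;
                }
                Ok(())
            }
        },
    }
}
```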

**Code Generator** (`src/codegen.rs`). Walks the validated `Vec<Expr>` and produces a `String` of C source. This stage is infallible for validated input, so it returns a plain `String` rather than a `Result`. Emits the preamble, forward declarations, and top-level forms in order.
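
To make the "infallible stage" idea concrete, here is a small sketch of a generator whose return type is `String` rather than `Result`. Everything about the emitted C is an assumption for illustration: the `Expr` shape, wrapping each top-level form in a `printf` inside `main`, and treating `+`, `-`, `*` as infix operators are not taken from the tutorial. Forward declarations and name mangling (Lisp symbols such as `foo-bar` are not valid C identifiers) are left out.

```rust
// Assumed minimal AST; the real `Expr` in ast.rs may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(i64),
    Symbol(String),
    List(Vec<Expr>),
}

/// Turns validated forms into C source. No `Result`: every input shape
/// has some output, which is what makes the stage infallible.
pub fn generate(exprs: Vec<Expr>) -> String {
    // Preamble (assumed): one include and the start of `main`.
    let mut out = String::from("#include <stdio.h>\n\nint main(void) {\n");
    // Top-level forms, in source order.
    for expr in &exprs {
        out.push_str("    printf(\"%ld\\n\", (long)(");
        out.push_str(&emit(expr));
        out.push_str("));\n");
    }
    out.push_str("    return 0;\n}\n");
    out
}

fn emit(expr: &Expr) -> String {
    match expr {
        Expr::Number(n) => n.to_string(),
        Expr::Symbol(name) => name.clone(),
        Expr::List(items) => match items.as_slice() {
            // (+ a b) becomes (a + b); likewise for - and *.
            [Expr::Symbol(op), lhs, rhs] if matches!(op.as_str(), "+" | "-" | "*") => {
                format!("({} {} {})", emit(lhs), op, emit(rhs))
            }
            // Any other non-empty list becomes a C call: (f a b) -> f(a, b).
            [head, args @ ..] => {
                let args: Vec<String> = args.iter().map(emit).collect();
                format!("{}({})", emit(head), args.join(", "))
            }
            // Empty list: placeholder value chosen for this sketch.
            [] => String::from("0"),
        },
    }
}
```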

**Error type** (`src/error.rs`). A `CompileError` enum with variants for each fallible stage, giving uniform error handling across the pipeline. Each variant carries enough context for a useful message (e.g., the undefined symbol name).
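
One possible shape for such an enum is sketched below. Only the name `CompileError` comes from this brief; the variant names, their fields, and the message wording are hypothetical. The point is that each variant stores the detail its message needs, and a `Display` impl turns that into the text the user sees.

```rust
use std::fmt;

// Hypothetical variants: one for the parser, two for the analyser.
#[derive(Debug, Clone, PartialEq)]
pub enum CompileError {
    /// Parser: something unexpected near a byte offset in the source.
    Syntax { message: String, offset: usize },
    /// Analyser: a symbol with no visible definition.
    UndefinedSymbol(String),
    /// Analyser: a special form with the wrong shape or arity.
    MalformedForm { form: String, reason: String },
}

impl fmt::Display for CompileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CompileError::Syntax { message, offset } => {
                write!(f, "syntax error at byte {offset}: {message}")
            }
            CompileError::UndefinedSymbol(name) => {
                write!(f, "undefined symbol `{name}`")
            }
            CompileError::MalformedForm { form, reason } => {
                write!(f, "malformed `{form}` form: {reason}")
            }
        }
    }
}

// Lets the type work with `Box<dyn Error>` and similar plumbing.
impl std::error::Error for CompileError {}
```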

### Module Layout

```
src/
├── main.rs      # CLI: read input, call compile(), write output
├── ast.rs       # Expr enum and Display impl
├── parser.rs    # nom parsers → Vec<Expr>
├── analyser.rs  # scope checking and form validation
├── codegen.rs   # AST → C string
└── error.rs     # CompileError enum
```

### The `compile` Function

Show the top-level function signature the reader will implement in §16:

```rust
pub fn compile(source: &str) -> Result<String, CompileError> {
    let exprs = parser::parse(source)?;
    let exprs = analyser::analyse(exprs)?;
    let c_source = codegen::generate(exprs);
    Ok(c_source)
}
```

This makes explicit that parsing and analysis are fallible but code generation is not.

### Error Reporting Philosophy

The compiler reports the first error it encounters and stops; it does not attempt to recover and continue after a syntax error. nom's `cut` combinator is used at commit points to produce better error messages: it turns a recoverable parse error into a hard failure, so nom stops backtracking into other alternatives and reports the error where the parse actually went wrong. A production compiler would collect multiple errors; stopping at the first is a deliberate simplification.

Errors include enough context to be actionable:

- Syntax errors: what character was unexpected and approximately where
- Semantic errors: the name of the undefined symbol or the malformed form
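
For the "approximately where" part of a syntax error, one option is to convert a byte offset into a 1-based line and column before printing. The helper below is a sketch under that assumption; the name `line_col` and the idea that the parser reports byte offsets are not taken from the tutorial. For the source `"(+ 1 2)\n(foo))"`, offset 13 (the stray closing parenthesis) maps to line 2, column 6.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes; offsets past the end of the
/// source map to the position just after the last character.
pub fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let mut line = 1;
    let mut col = 1;
    for (i, ch) in source.char_indices() {
        if i >= offset {
            break;
        }
        if ch == '\n' {
            // A newline before the offset starts a fresh line.
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}
```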

### How Sections Map to the Diagram

Tell the reader: Sections 4–9 fill in the parser box. Sections 10–11 fill in the analyser box. Sections 12–15 fill in the code generator box. Section 16 wires them together.

## Style notes

- Open with the pipeline diagram; it's the most information-dense single element in the section
- Keep prose tight; the diagram does the heavy lifting
- The `compile` function signature is the key insight: two fallible stages, one infallible stage