+++
title = "§15 Generating C: Control Flow and Sequencing"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++

## §15 Generating C: Control Flow and Sequencing (stub to fill)
File: `edu/src/lisp-compiler.md`, section `### 15. Generating C: Control Flow and Sequencing`

Replace the stub line with full content. Target 700–900 words. Cover the remaining forms (`let`, `begin`, and the side-effecting built-ins `display`, `newline`, and `error`) and introduce the expression-vs-statement distinction in code generation.

## Learning objectives

- Understand when to emit C expressions vs. C statements
- Implement `gen_stmt` for side-effecting expressions
- Generate `let` as a C block with local variable declarations
- Generate `begin` as a sequence of statements with the last value forwarded
- Generate `display`, `newline`, and `error` as C function calls

## Content to write
### The expression-vs-statement problem

`gen_expr` from §13 generates C *expressions*: code that produces a value. But some MiniLisp constructs are used mainly for their *side effects* or their structure: `display` prints something, `begin` sequences multiple expressions, and `let` introduces a new scope. These map more naturally to C *statements*.

The solution is to introduce `gen_stmt(expr: &Expr) -> String`, which generates a C statement (terminated with `;` or wrapped in `{}`) for a form in statement position, where its value is discarded. `gen_expr` continues to handle forms in expression position, where the value is needed. Some forms (like `if`) can appear in either position and need both paths.
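To make the two paths concrete, here is a minimal sketch of `if` generated both ways. The pared-down `Expr` enum, the `ml_` prefix standing in for `mangle`, and the use of a C conditional expression in expression position are assumptions made for illustration; the real definitions come from the earlier sections and may differ.

```rust
// Hypothetical, pared-down AST: the real `Expr` has more variants.
#[derive(Debug)]
enum Expr {
    Int(i64),
    Symbol(String),
    If(Box<Expr>, Box<Expr>, Box<Expr>),
}

// Expression position: a value is required, so `if` becomes `c ? t : f`.
fn gen_expr(e: &Expr) -> String {
    match e {
        Expr::Int(n) => n.to_string(),
        // Stand-in for the real `mangle` helper (assumption).
        Expr::Symbol(s) => format!("ml_{}", s),
        Expr::If(c, t, f) => {
            format!("({} ? {} : {})", gen_expr(c), gen_expr(t), gen_expr(f))
        }
    }
}

// Statement position: the value is discarded, so `if` becomes if/else.
fn gen_stmt(e: &Expr) -> String {
    match e {
        Expr::If(c, t, f) => format!(
            "if ({}) {{ {} }} else {{ {} }}",
            gen_expr(c),
            gen_stmt(t),
            gen_stmt(f)
        ),
        // Any other form: evaluate it and throw the result away.
        other => format!("(void)({});", gen_expr(other)),
    }
}
```

For `(if c 1 2)`, this sketch yields `(ml_c ? 1 : 2)` from `gen_expr` and `if (ml_c) { (void)(1); } else { (void)(2); }` from `gen_stmt`. The same source form produces two different C shapes depending only on where it appears.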
### `gen_stmt`: the statement generator

```rust
/// Generate a C statement from a MiniLisp expression.
///
/// Used for: body expressions in functions, let bodies, begin sequences.
/// The returned string always carries its own terminating `;`.
pub fn gen_stmt(expr: &Expr) -> String {
    match expr {
        // Side-effecting built-ins
        Expr::Call { func, args } if is_builtin_stmt(func) => gen_display_stmt(func, args),
        // Everything else: evaluate as an expression and discard the value.
        // The extra parentheses make the cast apply to the whole expression.
        _ => format!("(void)({});", gen_expr(expr)),
    }
}

/// True for the built-ins that are emitted as statements.
fn is_builtin_stmt(func: &Expr) -> bool {
    matches!(func, Expr::Symbol(s) if matches!(s.as_str(), "display" | "newline" | "error"))
}
```
### Generating `display`, `newline`, `error`

```rust
fn gen_display_stmt(func: &Expr, args: &[Expr]) -> String {
    match func {
        Expr::Symbol(s) => match s.as_str() {
            "display" => {
                // String and boolean literals get their own runtime function;
                // every other argument form falls back to ml_display_int.
                // A type-aware compiler would choose the variant from the expression's type.
                // Indexing args[0] panics on `(display)`; this assumes arity was
                // validated before code generation.
                let arg = gen_expr(&args[0]);
                match &args[0] {
                    Expr::Str(_) => format!("ml_display_str({});", arg),
                    Expr::Bool(_) => format!("ml_display_bool({});", arg),
                    _ => format!("ml_display_int({});", arg),
                }
            }
            "newline" => "ml_newline();".to_string(),
            "error" => format!("ml_error({});", gen_expr(&args[0])),
            // gen_stmt only calls this function when is_builtin_stmt(func) holds.
            _ => unreachable!(),
        },
        _ => unreachable!(),
    }
}
```

Note the simplification: `display` picks the C variant based on the *static* form of the argument. `(display x)` where `x` is a symbol always emits `ml_display_int(ml_x)`, even if `x` holds a boolean or a string at runtime. For the programs in this course, this is acceptable. A production compiler would use a tagged union or a format-string approach.
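The question "what if I display a boolean stored in a variable?" can be answered with a small standalone sketch of the same dispatch rule. The `Expr` enum here is a hypothetical, pared-down version used only for illustration, and `display_fn` is an illustrative name, not part of the compiler.

```rust
// Hypothetical, pared-down AST for illustration only.
enum Expr {
    Int(i64),
    Bool(bool),
    Str(String),
    Symbol(String),
}

/// Choose the runtime function from the argument's syntactic form alone,
/// mirroring the match inside `gen_display_stmt`.
fn display_fn(arg: &Expr) -> &'static str {
    match arg {
        Expr::Str(_) => "ml_display_str",
        Expr::Bool(_) => "ml_display_bool",
        // Symbols, integers, calls, ...: nothing is known about the
        // runtime type, so the integer routine is the fallback.
        _ => "ml_display_int",
    }
}
```

With this rule, `(display #t)` selects `ml_display_bool`, but `(display flag)` selects `ml_display_int` no matter what `flag` was bound to, because the generator only sees a symbol.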
### Generating `let`

`let` compiles to a C block with local variable declarations:

```lisp
(let ((x 1) (y 2)) (+ x y))
```

→

```c
({
    ml_int ml_x = 1;
    ml_int ml_y = 2;
    (ml_x + ml_y);
})
```

This uses GCC's *statement expression* extension: `({ ... })` is a block whose value is that of its last expression statement. GCC and Clang support the extension, but it is not part of standard C, so the generated code is not portable to every compiler. Discuss the trade-off and the alternative (using a helper function per `let`). One pitfall is worth mentioning: C processes the declarations in order, so `(let ((x (+ x 1))) ...)` would emit something like `ml_int ml_x = (ml_x + 1);`, where the initializer reads the new, uninitialized `ml_x` rather than the outer one.
```rust
fn gen_let(bindings: &[(String, Expr)], body: &[Expr]) -> String {
    // The last body form supplies the value of the whole statement expression.
    let (last, init) = body
        .split_last()
        .expect("let body must contain at least one expression");
    let mut out = String::from("({\n");
    for (name, val) in bindings {
        out.push_str(&format!("    ml_int {} = {};\n", mangle(name), gen_expr(val)));
    }
    for expr in init {
        // gen_stmt already terminates its output with `;`.
        out.push_str(&format!("    {}\n", gen_stmt(expr)));
    }
    out.push_str(&format!("    {};\n", gen_expr(last)));
    out.push_str("})");
    out
}
```
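For comparison, the helper-function alternative to statement expressions might look roughly like the sketch below. Everything in it is hypothetical: the `LiftedLet` struct, the `lift_let` name, the `ml_let_<id>` naming scheme, and the assumption that the caller has already computed the free variables and generated the C fragments. It also assumes every value is an `ml_int` and that no binding shares a name with a free variable.

```rust
/// Result of lifting one `let`: a helper definition to emit at file scope,
/// plus the call expression that replaces the `let` where it appeared.
struct LiftedLet {
    helper: String,
    call: String,
}

/// `free_vars`, `bindings`, and `body` are already-mangled C fragments.
fn lift_let(
    id: usize,
    free_vars: &[&str],
    bindings: &[(&str, &str)],
    body: &str,
) -> LiftedLet {
    let name = format!("ml_let_{}", id);
    // Outer variables used inside the `let` become parameters.
    let params = if free_vars.is_empty() {
        "void".to_string()
    } else {
        free_vars
            .iter()
            .map(|v| format!("ml_int {}", v))
            .collect::<Vec<_>>()
            .join(", ")
    };
    let mut helper = format!("static ml_int {}({}) {{\n", name, params);
    for (var, init) in bindings {
        helper.push_str(&format!("    ml_int {} = {};\n", var, init));
    }
    helper.push_str(&format!("    return {};\n}}\n", body));
    let call = format!("{}({})", name, free_vars.join(", "));
    LiftedLet { helper, call }
}
```

The output is plain standard C, at the cost of a free-variable analysis and one extra function per `let`. The statement-expression version needs neither, which is the trade-off the section should spell out.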
### Generating `begin`

`begin` in expression position uses the C comma operator; in statement position it is a sequence of statements. Every element of an expression-position `begin` goes through `gen_expr`, so each one must come out as a valid C expression:

```rust
fn gen_begin_expr(exprs: &[Expr]) -> String {
    // Comma operator: (e1, e2, ..., eN) evaluates all in order, returns eN.
    // Assumes at least one form; an empty `begin` would produce `()`, which is not valid C.
    let parts: Vec<String> = exprs.iter().map(gen_expr).collect();
    format!("({})", parts.join(", "))
}
```

In `gen_expr`, add:

```rust
Expr::Begin(exprs) => gen_begin_expr(exprs),
Expr::Let { bindings, body } => gen_let(bindings, body),
```
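The statement-position path, with the last value forwarded, could be sketched as follows. The reduced `Expr` enum (with `display` modelled as its own variant) and the `gen_return` helper are assumptions for illustration; the real function-body generator from the earlier sections may be organised differently.

```rust
// Hypothetical, pared-down AST for illustration only.
enum Expr {
    Int(i64),
    Display(Box<Expr>),
    Begin(Vec<Expr>),
}

fn gen_expr(e: &Expr) -> String {
    match e {
        Expr::Int(n) => n.to_string(),
        Expr::Display(arg) => format!("ml_display_int({})", gen_expr(arg)),
        // Expression position: comma operator, as in `gen_begin_expr`.
        Expr::Begin(exprs) => {
            let parts: Vec<String> = exprs.iter().map(gen_expr).collect();
            format!("({})", parts.join(", "))
        }
    }
}

fn gen_stmt(e: &Expr) -> String {
    match e {
        // Statement position: one C statement per form, in order.
        Expr::Begin(exprs) => exprs.iter().map(gen_stmt).collect::<Vec<_>>().join("\n"),
        Expr::Display(_) => format!("{};", gen_expr(e)),
        other => format!("(void)({});", gen_expr(other)),
    }
}

/// Tail position of a function body: leading forms become statements,
/// and the final form is forwarded through `return`.
fn gen_return(e: &Expr) -> String {
    match e {
        Expr::Begin(exprs) => {
            let (last, init) = exprs.split_last().expect("begin needs at least one form");
            let mut lines: Vec<String> = init.iter().map(gen_stmt).collect();
            lines.push(gen_return(last));
            lines.join("\n")
        }
        other => format!("return {};", gen_expr(other)),
    }
}
```

For `(begin (display 1) (display 2) 3)`, `gen_return` in this sketch produces two `ml_display_int` statements followed by `return 3;`, while `gen_expr` produces the single expression `(ml_display_int(1), ml_display_int(2), 3)`.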
### Tests

```rust
#[test]
fn test_gen_let() {
    let src = "(define (f) (let ((x 1) (y 2)) (+ x y)))";
    let c = generate(parse(src).unwrap());
    assert!(c.contains("ml_int ml_x = 1"));
    assert!(c.contains("ml_int ml_y = 2"));
}

#[test]
fn test_gen_begin() {
    let src = "(define (f) (begin (display 1) (display 2) 3))";
    let c = generate(parse(src).unwrap());
    assert!(c.contains("ml_display_int(1)"));
    assert!(c.contains("ml_display_int(2)"));
    // Holds only if the function-body generator emits a tail-position `begin`
    // as statements followed by `return 3;` rather than `return (..., 3);`.
    assert!(c.contains("return 3"));
}
```
## Style notes

- The expression-vs-statement distinction is the key concept here: explain it at the top, before any code
- The statement expression `({...})` extension for `let` is a real trade-off; acknowledge it honestly
- The `display` type-dispatch simplification should be called out clearly; readers will ask "what if I display a boolean stored in a variable?"
- End with a checkpoint: generate C for the complete factorial example; it should be correct and compilable