+++
title = "§15 Generating C: Control Flow and Sequencing"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++

## §15 Generating C: Control Flow and Sequencing (stub to fill)

File: `edu/src/lisp-compiler.md`, section `### 15. Generating C: Control Flow and Sequencing`

Replace the stub line with full content. Target 700–900 words.

Handle the remaining forms: `let`, `begin`, and `display`/`newline`/`error` as statements. Introduces the expression-vs-statement distinction in code generation.

## Learning objectives

- Understand when to emit C expressions vs. C statements
- Implement `gen_stmt` for side-effecting expressions
- Generate `let` as a C block with local variable declarations
- Generate `begin` as a sequence of statements with the last value forwarded
- Generate `display`, `newline`, `error` as C function calls

## Content to write

### The expression-vs-statement problem

`gen_expr` from §13 generates C *expressions*, that is, code that produces a value. But some MiniLisp constructs are used for their *side effects*: `display` prints something; `begin` sequences multiple expressions; `let` introduces a new scope. These map more naturally to C *statements*.

The solution: introduce `gen_stmt(expr: &Expr) -> String` that generates a C statement (terminated with `;` or wrapped in `{}`) for forms that are used in statement position. `gen_expr` handles forms in expression position. Some forms (like `if`) can appear in either position and need both paths.

### `gen_stmt`: the statement generator

```rust
/// Generate a C statement from a MiniLisp expression.
///
/// Used for: body expressions in functions, let bodies, begin sequences.
pub fn gen_stmt(expr: &Expr) -> String {
    match expr {
        // Side-effecting built-ins
        Expr::Call { func, args } if is_builtin_stmt(func) => gen_display_stmt(func, args),
        // Everything else: evaluate as an expression and discard the value
        _ => format!("(void){};", gen_expr(expr)),
    }
}

fn is_builtin_stmt(func: &Expr) -> bool {
    matches!(func, Expr::Symbol(s) if matches!(s.as_str(), "display" | "newline" | "error"))
}
```

### Generating `display`, `newline`, `error`

```rust
fn gen_display_stmt(func: &Expr, args: &[Expr]) -> String {
    match func {
        Expr::Symbol(s) => match s.as_str() {
            "display" => {
                // String and boolean literals get their own runtime call;
                // every other argument (including variables) falls back to
                // ml_display_int. A type-aware compiler would pick the
                // variant from the expression's type instead.
                let arg = gen_expr(&args[0]);
                match &args[0] {
                    Expr::Str(_) => format!("ml_display_str({});", arg),
                    Expr::Bool(_) => format!("ml_display_bool({});", arg),
                    _ => format!("ml_display_int({});", arg),
                }
            }
            "newline" => "ml_newline();".to_string(),
            "error" => format!("ml_error({});", gen_expr(&args[0])),
            _ => unreachable!(),
        },
        _ => unreachable!(),
    }
}
```

Note the simplification: `display` picks the C variant based on the *static* form of the argument. `(display x)` where `x` is a symbol always emits `ml_display_int(ml_x)`, even if `x` holds a boolean at runtime. For the programs in this course, this is acceptable. A production compiler would use a tagged union or a format string approach.

### Generating `let`

`let` compiles to a C block with local variable declarations:

```lisp
(let ((x 1) (y 2)) (+ x y))
```

→

```c
({
    ml_int ml_x = 1;
    ml_int ml_y = 2;
    (ml_x + ml_y);
})
```

This uses GCC's *statement expression* extension: `({ ... })` is a block that returns the value of its last statement. The extension is supported by GCC and Clang but is not part of ISO C (C99 or later), so the generated code is not portable to every C compiler. Discuss the trade-off and the alternative (using a helper function per `let`).
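The helper-function alternative mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the trimmed-down `Expr`, the `ml_` mangling, the `ml_let_N` naming scheme, and `gen_let_helper` itself are hypothetical and not taken from the course code. Each binding becomes a parameter of a generated `static` function, and each initializer becomes an argument at the call site, so the output stays within standard C.

```rust
// Hypothetical, trimmed-down AST for this sketch only; the real `Expr`
// type in the course has more variants.
enum Expr {
    Int(i64),
    Symbol(String),
    Add(Box<Expr>, Box<Expr>),
}

// Assumed mangling scheme, matching the `ml_` prefix used in the text.
fn mangle(name: &str) -> String {
    format!("ml_{}", name)
}

fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Symbol(s) => mangle(s),
        Expr::Add(a, b) => format!("({} + {})", gen_expr(a), gen_expr(b)),
    }
}

/// Lower a `let` to a pair: (helper definition, call expression).
/// `id` is a caller-supplied counter that keeps helper names unique.
fn gen_let_helper(id: usize, bindings: &[(String, Expr)], body: &Expr) -> (String, String) {
    // Each binding name becomes a parameter of the helper.
    let params: Vec<String> = bindings
        .iter()
        .map(|(name, _)| format!("ml_int {}", mangle(name)))
        .collect();
    // Each initializer is evaluated at the call site and passed as an argument.
    let args: Vec<String> = bindings.iter().map(|(_, val)| gen_expr(val)).collect();
    // C spells an empty parameter list as `(void)`.
    let params = if params.is_empty() {
        "void".to_string()
    } else {
        params.join(", ")
    };
    let def = format!(
        "static ml_int ml_let_{}({}) {{ return {}; }}",
        id,
        params,
        gen_expr(body)
    );
    let call = format!("ml_let_{}({})", id, args.join(", "));
    (def, call)
}
```

For the `(let ((x 1) (y 2)) (+ x y))` example, this sketch produces the definition `static ml_int ml_let_0(ml_int ml_x, ml_int ml_y) { return (ml_x + ml_y); }` and the call `ml_let_0(1, 2)`. The cost of portability is extra machinery: helper definitions have to be hoisted to file scope before the function that uses them, and any variable the body reads from an enclosing scope would have to be passed as an additional parameter, which this sketch does not attempt.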
```rust
fn gen_let(bindings: &[(String, Expr)], body: &[Expr]) -> String {
    let mut out = String::from("({\n");
    for (name, val) in bindings {
        out.push_str(&format!("    ml_int {} = {};\n", mangle(name), gen_expr(val)));
    }
    // Split off the final expression: it supplies the value of the block.
    let (last, init) = body.split_last().expect("let body must not be empty");
    for expr in init {
        // gen_stmt already terminates its output with `;`
        out.push_str(&format!("    {}\n", gen_stmt(expr)));
    }
    out.push_str(&format!("    {};\n", gen_expr(last)));
    out.push_str("})");
    out
}
```

### Generating `begin`

`begin` in expression position uses the C comma operator; in statement position it is a sequence of statements:

```rust
fn gen_begin_expr(exprs: &[Expr]) -> String {
    // Comma operator: (e1, e2, ..., eN) evaluates all, returns eN
    let parts: Vec<String> = exprs.iter().map(gen_expr).collect();
    format!("({})", parts.join(", "))
}
```

In `gen_expr`, add:

```rust
Expr::Begin(exprs) => gen_begin_expr(exprs),
Expr::Let { bindings, body } => gen_let(bindings, body),
```

### Tests

```rust
#[test]
fn test_gen_let() {
    let src = "(define (f) (let ((x 1) (y 2)) (+ x y)))";
    let c = generate(parse(src).unwrap());
    assert!(c.contains("ml_int ml_x = 1"));
    assert!(c.contains("ml_int ml_y = 2"));
}

#[test]
fn test_gen_begin() {
    let src = "(define (f) (begin (display 1) (display 2) 3))";
    let c = generate(parse(src).unwrap());
    assert!(c.contains("ml_display_int(1)"));
    assert!(c.contains("ml_display_int(2)"));
    assert!(c.contains("return 3"));
}
```

## Style notes

- The expression-vs-statement distinction is the key concept here; explain it at the top before any code
- The statement expression `({...})` extension for `let` is a real trade-off; acknowledge it honestly
- The `display` type dispatch simplification should be called out clearly, because readers will ask "what if I display a boolean stored in a variable?"
- End with a checkpoint: generate C for the complete factorial example; it should be correct and compilable
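
The statement-position case of `begin`, and the `return 3` that `test_gen_begin` asserts on, have no generator code in this ticket. One possible shape is sketched below, under assumptions: the trimmed-down `Expr` (with `display` as its own variant) and the `gen_return` helper are hypothetical names introduced for illustration, not part of the course code.

```rust
// Hypothetical, trimmed-down AST for this sketch only.
enum Expr {
    Int(i64),
    Display(Box<Expr>),
    Begin(Vec<Expr>),
}

fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Display(arg) => format!("ml_display_int({})", gen_expr(arg)),
        // Expression position: comma operator, as in gen_begin_expr.
        Expr::Begin(exprs) => {
            let parts: Vec<String> = exprs.iter().map(gen_expr).collect();
            format!("({})", parts.join(", "))
        }
    }
}

fn gen_stmt(expr: &Expr) -> String {
    match expr {
        Expr::Display(arg) => format!("ml_display_int({});", gen_expr(arg)),
        // Statement position: a C block holding one statement per form.
        Expr::Begin(exprs) => {
            let stmts: Vec<String> = exprs.iter().map(gen_stmt).collect();
            format!("{{ {} }}", stmts.join(" "))
        }
        // Anything else: evaluate and discard the value.
        _ => format!("(void){};", gen_expr(expr)),
    }
}

/// Emit the lines of a function body whose value is `expr`.
/// A `begin` is flattened: every form but the last becomes a statement,
/// and the last form is returned. An empty `begin` is assumed to be
/// rejected earlier (by the parser), so it is not handled here.
fn gen_return(expr: &Expr) -> Vec<String> {
    if let Expr::Begin(exprs) = expr {
        if let Some((last, init)) = exprs.split_last() {
            let mut lines: Vec<String> = init.iter().map(gen_stmt).collect();
            lines.extend(gen_return(last));
            return lines;
        }
    }
    vec![format!("return {};", gen_expr(expr))]
}
```

For `(begin (display 1) (display 2) 3)` as a function body, this sketch yields the three lines `ml_display_int(1);`, `ml_display_int(2);`, and `return 3;`. Flattening in tail position keeps the comma operator out of the common case, which tends to make the generated C easier to read and debug.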