+++
title = "§14 Generating C: Definitions and Functions"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++

## §14 Generating C: Definitions and Functions (stub to fill)

File: `edu/src/lisp-compiler.md`, section `### 14. Generating C: Definitions and Functions`

Replace the stub line with full content. Target 700–900 words.

Implement code generation for top-level `define` forms and `lambda` expressions, including forward declarations for mutual recursion.

## Learning objectives

- Emit forward declarations for all functions before their definitions
- Generate a correct C function signature from a `Lambda` with named parameters
- Handle variable `define` vs. function `define`
- Understand C's requirement for forward declarations and why MiniLisp needs them

## Content to write

### Why forward declarations?

In C, a function must be declared before it is called. If `even?` calls `odd?` and `odd?` calls `even?`, whichever is defined first will try to call a symbol that has not yet been declared. Forward declarations (the function signature alone, with no body) solve this by giving the C compiler the signature before the definition appears. MiniLisp makes no guarantees about definition order, so we emit forward declarations for every top-level function before any definition.

### Two-pass code generation

The code generator makes two passes over the top-level `Vec<Expr>`:

1. **Forward declaration pass**: emit `ml_int ml_name(ml_int param1, ...);` for every top-level `define` that wraps a `lambda`.
2. **Definition pass**: emit the full function body (or variable initializer) for every top-level `define`.

### Type signatures

MiniLisp has no type annotations. All values compile to `ml_int` (which is `int64_t`). This includes:

- Integers: trivially `ml_int`
- Booleans: stored as `ml_int` (0 or 1)
- Strings: a limitation. String-returning functions are declared as `ml_int` too, which is technically wrong but will compile for our simple programs.
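
The two-pass ordering and the uniform `ml_int` signature can be sketched in isolation. This is a minimal illustration, not the guide's generator: `Func`, `signature`, and `emit` are hypothetical names, the functions are given as ready-made C text rather than an `Expr` tree, `ml_is_even` and `ml_is_odd` are assumed stand-ins for whatever `mangle` produces for `even?` and `odd?`, and the two-line preamble is an assumption about what `PREAMBLE` contains.

```rust
/// Hypothetical stand-in for a top-level function: a C-safe name, its
/// parameter names, and an already-generated C expression for the result.
struct Func {
    name: &'static str,
    params: &'static [&'static str],
    body: &'static str,
}

/// Builds `ml_int name(ml_int p1, ml_int p2)`; both passes share it so the
/// prototype and the definition can never disagree.
fn signature(f: &Func) -> String {
    let params: Vec<String> = f.params.iter().map(|p| format!("ml_int {}", p)).collect();
    format!("ml_int {}({})", f.name, params.join(", "))
}

fn emit(funcs: &[Func]) -> String {
    // Assumed preamble: the source only states that ml_int is int64_t.
    let mut out = String::from("#include <stdint.h>\ntypedef int64_t ml_int;\n\n");
    // Pass 1: prototypes only, for every function.
    for f in funcs {
        out.push_str(&format!("{};\n", signature(f)));
    }
    out.push('\n');
    // Pass 2: full definitions, in source order.
    for f in funcs {
        out.push_str(&format!("{} {{\n    return {};\n}}\n", signature(f), f.body));
    }
    out
}

fn main() {
    let funcs = [
        Func { name: "ml_is_even", params: &["ml_n"], body: "ml_n == 0 ? 1 : ml_is_odd(ml_n - 1)" },
        Func { name: "ml_is_odd", params: &["ml_n"], body: "ml_n == 0 ? 0 : ml_is_even(ml_n - 1)" },
    ];
    print!("{}", emit(&funcs));
}
```

In the emitted text, both prototypes precede both bodies, so the call to `ml_is_odd` inside `ml_is_even` refers to an already-declared function. Every parameter and return type is `ml_int`, whatever the Lisp value conceptually is.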

Acknowledge this simplification explicitly. A more honest approach would be to use `void*` or a tagged union; note this in the "What's Next" section.

### Generating a forward declaration

```rust
fn gen_forward_decl(name: &str, lambda: &Expr) -> String {
    if let Expr::Lambda { params, .. } = lambda {
        let c_name = mangle(name);
        let param_list: Vec<String> = params
            .iter()
            .map(|p| format!("ml_int {}", mangle(p)))
            .collect();
        format!("ml_int {}({});\n", c_name, param_list.join(", "))
    } else {
        String::new() // variable define; no forward declaration needed
    }
}
```

### Generating a function definition

```rust
fn gen_function_def(name: &str, params: &[String], body: &[Expr]) -> String {
    let c_name = mangle(name);
    let param_list: Vec<String> = params
        .iter()
        .map(|p| format!("ml_int {}", mangle(p)))
        .collect();
    let mut out = format!("ml_int {}({}) {{\n", c_name, param_list.join(", "));

    // All body expressions except the last are statements (side effects).
    // Note: this slice panics on an empty body, so the parser is assumed
    // to guarantee at least one body expression.
    for expr in &body[..body.len() - 1] {
        out.push_str(&format!("    {};\n", gen_stmt(expr)));
    }

    // Last body expression is the return value
    let last = body.last().unwrap();
    out.push_str(&format!("    return {};\n", gen_expr(last)));
    out.push_str("}\n");
    out
}
```

Explain the idiom: all but the last body expression are evaluated as statements (for side effects like `display`); the last is used as the return value. This mirrors Lisp's implicit return of the last expression.

### Generating a variable definition

```rust
fn gen_variable_def(name: &str, value: &Expr) -> String {
    format!("ml_int {} = {};\n", mangle(name), gen_expr(value))
}
```

Variable definitions at top level become global C variables.
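
One caveat about globals is worth a sketch. In C, an object at file scope may only be initialized with a constant expression, so a line such as `ml_int ml_area = ml_square(3);` would be rejected by a C compiler. The snippet below shows one possible workaround, under the assumption that MiniLisp allows arbitrary expressions in a top-level variable `define`: keep the initializer at file scope only when it is an integer literal, and otherwise emit a bare declaration plus an assignment to run at the start of `main()`. `GlobalVar` and `gen_global` are hypothetical names, and the function works on already-generated C text instead of the `Expr` type.

```rust
/// Hypothetical result of lowering one top-level variable `define`.
struct GlobalVar {
    /// Text emitted at file scope.
    decl: String,
    /// Assignment to run at the start of `main()`, present only when the
    /// initializer is not an integer literal.
    init: Option<String>,
}

/// `c_name` is the already-mangled name; `c_init` is the C text that
/// `gen_expr` would have produced for the initializer.
fn gen_global(c_name: &str, c_init: &str) -> GlobalVar {
    if c_init.parse::<i64>().is_ok() {
        // An integer literal is a valid constant initializer in C.
        GlobalVar {
            decl: format!("ml_int {} = {};\n", c_name, c_init),
            init: None,
        }
    } else {
        // Anything else is deferred: declare now, assign inside main().
        GlobalVar {
            decl: format!("ml_int {};\n", c_name),
            init: Some(format!("{} = {};", c_name, c_init)),
        }
    }
}

fn main() {
    for (name, init) in [("ml_limit", "10"), ("ml_area", "ml_square(3)")] {
        let g = gen_global(name, init);
        print!("file scope: {}", g.decl);
        if let Some(stmt) = g.init {
            println!("in main():  {}", stmt);
        }
    }
}
```

The literal check is deliberately conservative: some non-literal C expressions are also constant, but deferring them to `main()` is still correct.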

### The full `generate` function

```rust
pub fn generate(exprs: Vec<Expr>) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);

    // Pass 1: forward declarations for all top-level functions
    for expr in &exprs {
        if let Expr::Define { name, value } = expr {
            out.push_str(&gen_forward_decl(name, value));
        }
    }
    out.push('\n');

    // Pass 2: definitions
    for expr in &exprs {
        match expr {
            Expr::Define { name, value } => match value.as_ref() {
                Expr::Lambda { params, body } => {
                    out.push_str(&gen_function_def(name, params, body))
                }
                _ => out.push_str(&gen_variable_def(name, value)),
            },
            // Top-level non-define expressions: emit in main()
            _ => {} // handled in §16
        }
    }
    out
}
```

### Tests

```rust
#[test]
fn test_simple_function() {
    let src = "(define (square x) (* x x))";
    let exprs = parse(src).unwrap();
    let c = generate(exprs);
    assert!(c.contains("ml_int ml_square(ml_int ml_x)"));
    assert!(c.contains("return (ml_x * ml_x)"));
}

#[test]
fn test_forward_decl_present() {
    let src = "(define (f x) (g x))\n(define (g x) x)";
    let c = generate(parse(src).unwrap());
    // f calls g, so g's forward declaration must appear before f's definition
    let fwd_pos = c.find("ml_int ml_g(ml_int ml_x);").unwrap();
    let def_pos = c.find("ml_int ml_f(ml_int ml_x) {").unwrap();
    assert!(fwd_pos < def_pos);
}
```

## Style notes

- Lead with the forward declaration problem: it's the "aha" moment of this section
- The two-pass structure is conceptually important; diagram it clearly
- Acknowledge the "everything is ml_int" simplification explicitly; readers will notice it
- The `body[..body.len()-1]` slice for all-but-last is a small Rust trick worth calling out
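
The all-but-last slice from the last style note can be shown on its own, next to the standard library's `split_last`, which performs the same split without panicking on an empty body. This is a small sketch for the write-up; `split_with_slice` and `split_checked` are hypothetical helper names, and plain strings stand in for `Expr` values.

```rust
/// Splits a body into (statements, return expression) using the slice idiom.
/// Panics if `body` is empty: `body.len() - 1` overflows in debug builds,
/// and the resulting out-of-range slice panics in release builds.
fn split_with_slice<T>(body: &[T]) -> (&[T], &T) {
    (&body[..body.len() - 1], body.last().unwrap())
}

/// Same split via `split_last`, which returns `None` for an empty body.
/// `split_last` yields (last, rest), so the pair is swapped here.
fn split_checked<T>(body: &[T]) -> Option<(&[T], &T)> {
    body.split_last().map(|(last, rest)| (rest, last))
}

fn main() {
    let body = ["(display x)", "(display y)", "(+ x y)"];
    let (stmts, ret) = split_with_slice(&body);
    println!("statements: {:?}", stmts);
    println!("return value: {}", ret);

    let empty: [&str; 0] = [];
    println!("empty body: {:?}", split_checked(&empty));
}
```

Whether the section uses the slice or `split_last` is a presentation choice; the slice reads closer to the prose ("all but the last"), while `split_last` makes the empty-body case explicit.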