+++
title = "§14 Generating C: Definitions and Functions"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++
§14 Generating C: Definitions and Functions — Stub to fill
File: edu/src/lisp-compiler.md, section ### 14. Generating C: Definitions and Functions
Replace the stub line with full content. Target 700–900 words. Implement code generation for top-level define forms and lambda expressions, including forward declarations for mutual recursion.
Learning objectives
- Emit forward declarations for all functions before their definitions
- Generate a correct C function signature from a `Lambda` with named parameters
- Handle variable `define` vs. function `define`
- Understand C's requirement for forward declarations and why MiniLisp needs them
Content to write
Why forward declarations?
In C, a function must be declared before it is called. If `even?` calls `odd?` and `odd?` calls `even?`, whichever is defined first calls a function that has not yet been declared. A forward declaration (the function signature with no body) solves this by telling the C compiler the function's name and types before its definition appears.
MiniLisp makes no guarantees about definition order, so we emit forward declarations for every top-level function before any definition.
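To make the ordering problem concrete, here is a sketch of the C we would want for the `even?`/`odd?` pair, held in a Rust string the way a generator would build it. The mangled names `ml_even_p` and `ml_odd_p`, the ternary encoding of the conditional, and the helper `appears_before` are assumptions for illustration only; the real output depends on `mangle` and `gen_expr`.

```rust
/// C output we would like for a mutually recursive pair.
/// `ml_even_p` / `ml_odd_p` are hypothetical manglings of `even?` / `odd?`.
const EVEN_ODD_C: &str = "\
ml_int ml_even_p(ml_int ml_n);
ml_int ml_odd_p(ml_int ml_n);

ml_int ml_even_p(ml_int ml_n) {
    return (ml_n == 0) ? 1 : ml_odd_p(ml_n - 1);
}
ml_int ml_odd_p(ml_int ml_n) {
    return (ml_n == 0) ? 0 : ml_even_p(ml_n - 1);
}
";

/// Returns true when `needle_a` first occurs before `needle_b` in `text`.
/// Hypothetical helper, handy for asserting declaration order in tests.
fn appears_before(text: &str, needle_a: &str, needle_b: &str) -> bool {
    match (text.find(needle_a), text.find(needle_b)) {
        (Some(a), Some(b)) => a < b,
        _ => false,
    }
}
```

Both prototypes precede both bodies, so the order of the two definitions no longer matters.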
Two-pass code generation
The code generator uses two passes over the top-level Vec<Expr>:
- Forward declaration pass: emit `ml_int ml_name(ml_int param1, ...);` for every top-level `define` that wraps a `lambda`.
- Definition pass: emit the full function body (or variable initializer) for every top-level `define`.
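The two passes can be sketched on a simplified input before involving the real AST. The `Item` enum, `c_params`, and `emit_two_pass` below are hypothetical stand-ins (the actual generator walks `Expr` values), and the function bodies are placeholders.

```rust
/// Simplified stand-in for a top-level `define` (hypothetical; the real
/// generator walks `Expr` values instead). Names are assumed pre-mangled.
enum Item {
    Function { name: String, params: Vec<String> },
    Variable { name: String },
}

/// Builds the C parameter list, e.g. `ml_int a, ml_int b`.
fn c_params(params: &[String]) -> String {
    params
        .iter()
        .map(|p| format!("ml_int {}", p))
        .collect::<Vec<_>>()
        .join(", ")
}

/// Two passes over the same slice: signatures first, then definitions.
fn emit_two_pass(items: &[Item]) -> String {
    let mut out = String::new();

    // Pass 1: one prototype per function; variables are skipped.
    for item in items {
        if let Item::Function { name, params } = item {
            out.push_str(&format!("ml_int {}({});\n", name, c_params(params)));
        }
    }
    out.push('\n');

    // Pass 2: every item gets its definition, in source order.
    for item in items {
        match item {
            Item::Function { name, params } => {
                out.push_str(&format!(
                    "ml_int {}({}) {{ /* body */ }}\n",
                    name,
                    c_params(params)
                ));
            }
            Item::Variable { name } => {
                // Placeholder initializer; the real value comes from the AST.
                out.push_str(&format!("ml_int {} = 0;\n", name));
            }
        }
    }
    out
}
```

Because pass 1 finishes before pass 2 starts, every prototype lands above every body regardless of source order.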
Type signatures
MiniLisp has no type annotations. All values compile to ml_int (which is int64_t). This includes:
- Integers: trivially `ml_int`
- Booleans: stored as `ml_int` (0 or 1)
- Strings: a limitation. String-returning functions are declared as `ml_int` too, which is technically wrong but will compile for our simple programs. Acknowledge this simplification.
A more honest approach would be to use void* or a tagged union — note this in the "What's Next" section.
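For reference, the "everything is `ml_int`" convention could be set up in the emitted preamble roughly as follows. This is a sketch: the only fact taken from the text is that `ml_int` is `int64_t`; the constant name `PREAMBLE_SKETCH`, the choice of headers, and the helper `c_bool_literal` are assumptions, and the real `PREAMBLE` may contain more.

```rust
/// Hypothetical preamble: the real `PREAMBLE` may include more runtime
/// helpers. The only fact taken from the text is that `ml_int` is `int64_t`.
const PREAMBLE_SKETCH: &str = "\
#include <stdint.h>
#include <stdio.h>

typedef int64_t ml_int;
";

/// Renders a boolean the way the text describes: as an `ml_int` of 0 or 1.
fn c_bool_literal(value: bool) -> &'static str {
    if value { "1" } else { "0" }
}
```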
Generating a forward declaration
fn gen_forward_decl(name: &str, lambda: &Expr) -> String {
    if let Expr::Lambda { params, .. } = lambda {
        let c_name = mangle(name);
        let param_list: Vec<String> = params.iter()
            .map(|p| format!("ml_int {}", mangle(p)))
            .collect();
        format!("ml_int {}({});\n", c_name, param_list.join(", "))
    } else {
        String::new() // variable define; no forward declaration needed
    }
}
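A self-contained version of the function above, with a stand-in `mangle`, shows what the emitted prototypes look like. The `ml_` prefix matches the tests in this section; the character replacements inside `mangle` and the two-variant `Expr` are assumptions made so the example compiles on its own.

```rust
/// Minimal stand-in for the AST; only the variants this example needs.
enum Expr {
    Num(i64),
    Lambda { params: Vec<String>, body: Vec<Expr> },
}

/// Hypothetical mangler: the `ml_` prefix matches the tests in this section,
/// but the character replacements below are assumptions.
fn mangle(name: &str) -> String {
    let mut out = String::from("ml_");
    for ch in name.chars() {
        match ch {
            '?' => out.push_str("_p"),
            '!' => out.push_str("_bang"),
            '-' => out.push('_'),
            c if c.is_ascii_alphanumeric() || c == '_' => out.push(c),
            // Anything else becomes its code point so the result stays a valid C identifier.
            other => out.push_str(&format!("_u{:x}", other as u32)),
        }
    }
    out
}

/// Same logic as the function above: a prototype for lambdas, nothing otherwise.
fn gen_forward_decl(name: &str, lambda: &Expr) -> String {
    if let Expr::Lambda { params, .. } = lambda {
        let param_list: Vec<String> = params
            .iter()
            .map(|p| format!("ml_int {}", mangle(p)))
            .collect();
        format!("ml_int {}({});\n", mangle(name), param_list.join(", "))
    } else {
        String::new()
    }
}
```

With this mangler, `gen_forward_decl("even?", ...)` for a one-parameter lambda over `n` yields `ml_int ml_even_p(ml_int ml_n);`. One detail to consider: before C23, an empty parameter list `()` in C declares a function with unspecified parameters, so a generator may prefer to emit `(void)` for zero-parameter lambdas.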
Generating a function definition
fn gen_function_def(name: &str, params: &[String], body: &[Expr]) -> String {
    // Precondition: a lambda has at least one body expression.
    // Without this check, the slice below would panic with a less helpful message.
    assert!(!body.is_empty(), "lambda body must not be empty");
    let c_name = mangle(name);
    let param_list: Vec<String> = params.iter()
        .map(|p| format!("ml_int {}", mangle(p)))
        .collect();
    let mut out = format!("ml_int {}({}) {{\n", c_name, param_list.join(", "));
    // All body expressions except the last are statements (side effects)
    for expr in &body[..body.len() - 1] {
        out.push_str(&format!("    {};\n", gen_stmt(expr)));
    }
    // Last body expression is the return value
    let last = body.last().unwrap();
    out.push_str(&format!("    return {};\n", gen_expr(last)));
    out.push_str("}\n");
    out
}
Explain the idiom: all but the last body expression are evaluated as statements (for side effects like display); the last is used as the return value. This mirrors Lisp's implicit return of the last expression.
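The loop above assumes a non-empty body. As an alternative worth mentioning, `split_last` expresses "all but the last, then the last" in one call and returns `None` for an empty slice. The sketch below uses a tiny illustration-only `Expr` and `gen_expr`, and the name `ml_display` is hypothetical.

```rust
/// Tiny expression type for illustration only (not the tutorial's `Expr`).
enum Expr {
    Num(i64),
    Var(String),
    Call(String, Vec<Expr>),
}

/// Renders an expression as C source text.
fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Num(n) => n.to_string(),
        Expr::Var(name) => name.clone(),
        Expr::Call(f, args) => {
            let args: Vec<String> = args.iter().map(gen_expr).collect();
            format!("{}({})", f, args.join(", "))
        }
    }
}

/// Emits a C block: leading expressions as statements, the last as `return`.
/// Returns `None` for an empty body instead of panicking.
fn gen_body(body: &[Expr]) -> Option<String> {
    let (last, init) = body.split_last()?;
    let mut out = String::from("{\n");
    for expr in init {
        out.push_str(&format!("    {};\n", gen_expr(expr)));
    }
    out.push_str(&format!("    return {};\n", gen_expr(last)));
    out.push_str("}\n");
    Some(out)
}
```

A body of `(display x)` followed by `42` would come out as a block containing `ml_display(ml_x);` and then `return 42;`. Whether an empty body should be a parse error or a generator error is a design choice this sketch leaves open.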
Generating a variable definition
fn gen_variable_def(name: &str, value: &Expr) -> String {
    format!("ml_int {} = {};\n", mangle(name), gen_expr(value))
}
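Variable definitions at top level become global C variables. C requires file-scope initializers to be constant expressions, so this form is only valid when the value compiles to a constant (for example, a literal).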
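A global whose value is a function call cannot be initialized in place, since a C file-scope initializer has to be a constant expression. One possible way to handle that, sketched here under the assumption that the generator can collect startup statements and run them from `main()`, is to split the declaration from the assignment. The function `gen_global` and the simplified `Expr` are hypothetical and not part of the section's code.

```rust
/// Illustration-only expression type (not the tutorial's `Expr`).
enum Expr {
    Num(i64),
    Call(String, Vec<Expr>),
}

/// Renders an expression as C source text.
fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Num(n) => n.to_string(),
        Expr::Call(f, args) => {
            let args: Vec<String> = args.iter().map(gen_expr).collect();
            format!("{}({})", f, args.join(", "))
        }
    }
}

/// Returns (file-scope text, optional statement to run at startup).
/// Literals initialize the global directly; anything else is declared
/// without an initializer and assigned later.
fn gen_global(c_name: &str, value: &Expr) -> (String, Option<String>) {
    match value {
        Expr::Num(_) => (format!("ml_int {} = {};\n", c_name, gen_expr(value)), None),
        _ => (
            format!("ml_int {};\n", c_name),
            Some(format!("{} = {};\n", c_name, gen_expr(value))),
        ),
    }
}
```

The second element of the tuple would be appended to whatever startup code the generator emits; where exactly that lives is outside this sketch.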
The full generate function
pub fn generate(exprs: Vec<Expr>) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);
    // Pass 1: forward declarations for all top-level functions
    for expr in &exprs {
        if let Expr::Define { name, value } = expr {
            out.push_str(&gen_forward_decl(name, value));
        }
    }
    out.push('\n');
    // Pass 2: definitions
    for expr in &exprs {
        match expr {
            Expr::Define { name, value } => match value.as_ref() {
                Expr::Lambda { params, body } =>
                    out.push_str(&gen_function_def(name, params, body)),
                _ =>
                    out.push_str(&gen_variable_def(name, value)),
            },
            // Top-level non-define expressions: emit in main()
            _ => {} // handled in §16
        }
    }
    out
}
Tests
#[test]
fn test_simple_function() {
    let src = "(define (square x) (* x x))";
    let exprs = parse(src).unwrap();
    let c = generate(exprs);
    assert!(c.contains("ml_int ml_square(ml_int ml_x)"));
    assert!(c.contains("return (ml_x * ml_x)"));
}
#[test]
fn test_forward_decl_present() {
    let src = "(define (f x) (g x))\n(define (g x) x)";
    let c = generate(parse(src).unwrap());
    // f calls g, so g's forward declaration must appear before f's definition
    let fwd_pos = c.find("ml_int ml_g(ml_int ml_x);").unwrap();
    let def_pos = c.find("ml_int ml_f(ml_int ml_x) {").unwrap();
    assert!(fwd_pos < def_pos);
}
Style notes
- Lead with the forward declaration problem — it's the "aha" moment of this section
- The two-pass structure is conceptually important; diagram it clearly
- Acknowledge the "everything is ml_int" simplification explicitly; readers will notice it
- The `body[..body.len()-1]` slice for all-but-last is a small Rust trick worth calling out