---
# edu-pyue
title: '§14 Generating C: Definitions and Functions'
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:02Z
updated_at: 2026-03-10T23:30:02Z
---

## §14 Generating C: Definitions and Functions (stub to fill)

File: `edu/src/lisp-compiler.md`, section `### 14. Generating C: Definitions and Functions`

Replace the stub line with full content. Target 700–900 words. The section implements code generation for top-level `define` forms and `lambda` expressions, including forward declarations for mutual recursion.

## Learning objectives

- Emit forward declarations for all functions before their definitions
- Generate a correct C function signature from a `Lambda` with named parameters
- Distinguish a variable `define` from a function `define` and generate the right C for each
- Understand C's requirement for forward declarations and why MiniLisp needs them

## Content to write

### Why forward declarations?

In C, a function must be declared before it is called. If `even?` calls `odd?` and `odd?` calls `even?`, whichever is defined first calls a function that has not yet been declared, and no ordering of the two definitions avoids this. A forward declaration (the function signature followed by a semicolon, with no body) solves the problem by telling the C compiler the function's name, parameter types, and return type before the definition appears.

MiniLisp makes no guarantees about definition order, so we emit forward declarations for every top-level function before any definition.

### Two-pass code generation

The code generator makes two passes over the top-level `Vec<Expr>`:

1. **Forward declaration pass**: emit `ml_int ml_name(ml_int param1, ...);` for every top-level `define` that wraps a `lambda`.
2. **Definition pass**: emit the full function body (or variable initializer) for every top-level `define`.

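
The ordering produced by the two passes can be sketched in isolation. The `FnDef` struct, the pre-rendered `body_c` strings, and the names `ml_is_even` / `ml_is_odd` are all hypothetical stand-ins chosen for illustration; the real generator works on `Expr` values instead.

```rust
/// Hypothetical, simplified stand-in for a top-level function `define`.
/// The body is pre-rendered C text so the sketch can focus on ordering.
struct FnDef {
    c_name: &'static str,
    params: &'static [&'static str],
    body_c: &'static str,
}

/// Renders `ml_int name(ml_int a, ml_int b)` with no trailing `;` or `{`.
fn signature(f: &FnDef) -> String {
    let params: Vec<String> = f.params.iter().map(|p| format!("ml_int {}", p)).collect();
    format!("ml_int {}({})", f.c_name, params.join(", "))
}

/// Pass 1 emits every prototype; pass 2 emits every definition.
fn emit_two_pass(defs: &[FnDef]) -> String {
    let mut out = String::new();
    for f in defs {
        out.push_str(&format!("{};\n", signature(f)));
    }
    out.push('\n');
    for f in defs {
        out.push_str(&format!("{} {{\n    return {};\n}}\n", signature(f), f.body_c));
    }
    out
}

/// Mutually recursive pair used as sample input.
fn sample_defs() -> Vec<FnDef> {
    vec![
        FnDef {
            c_name: "ml_is_even",
            params: &["ml_n"],
            body_c: "(ml_n == 0) ? 1 : ml_is_odd(ml_n - 1)",
        },
        FnDef {
            c_name: "ml_is_odd",
            params: &["ml_n"],
            body_c: "(ml_n == 0) ? 0 : ml_is_even(ml_n - 1)",
        },
    ]
}
```

Calling `emit_two_pass(&sample_defs())` places both prototypes ahead of both bodies, so the call to `ml_is_odd` inside `ml_is_even` refers to a name the C compiler has already seen.
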
### Type signatures

MiniLisp has no type annotations. All values compile to `ml_int` (which is `int64_t`). This includes:

- Integers: trivially `ml_int`
- Booleans: stored as `ml_int` (0 or 1)
- Strings: a limitation. String-returning functions are declared as `ml_int` too, which is technically wrong but will compile for our simple programs. Acknowledge this simplification.

A more honest approach would be to use `void*` or a tagged union; note this in the "What's Next" section.

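
As a rough illustration of where `ml_int` could come from, the sketch below shows one possible shape for the preamble text that the generator writes first. The exact contents of `PREAMBLE` are an assumption here (the real one may pull in more headers and runtime helpers), and `bool_literal` is a hypothetical helper that only demonstrates the 0/1 encoding of booleans.

```rust
/// Hypothetical preamble text; the real `PREAMBLE` may include more
/// headers and runtime helpers than shown here.
const PREAMBLE: &str = "#include <stdint.h>\n\ntypedef int64_t ml_int;\n\n";

/// Renders a MiniLisp boolean as the `ml_int` literal it compiles to.
fn bool_literal(value: bool) -> &'static str {
    if value { "1" } else { "0" }
}
```
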
### Generating a forward declaration

```rust
/// Emits a C prototype such as `ml_int ml_square(ml_int ml_x);` when `lambda`
/// is an `Expr::Lambda`, and an empty string for any other expression.
fn gen_forward_decl(name: &str, lambda: &Expr) -> String {
    if let Expr::Lambda { params, .. } = lambda {
        let c_name = mangle(name);
        // Every parameter is typed `ml_int`; see "Type signatures" above.
        let param_list: Vec<String> = params.iter()
            .map(|p| format!("ml_int {}", mangle(p)))
            .collect();
        format!("ml_int {}({});\n", c_name, param_list.join(", "))
    } else {
        String::new() // variable define; no forward declaration needed
    }
}
```

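
The snippet above depends on `mangle`, which this section does not define. The tests later in this ticket imply only that `square` becomes `ml_square` and `x` becomes `ml_x`; everything else in the sketch below, in particular how `?`, `!`, and `-` are rewritten, is an assumption made for illustration. `prototype` is likewise a hypothetical helper that mimics the string `gen_forward_decl` builds.

```rust
/// Hypothetical sketch of `mangle`: prefixes `ml_` and rewrites characters
/// that are not legal in a C identifier. The character mapping is an
/// assumption for illustration only.
fn mangle(name: &str) -> String {
    let mut out = String::from("ml_");
    for ch in name.chars() {
        match ch {
            'a'..='z' | 'A'..='Z' | '0'..='9' | '_' => out.push(ch),
            '?' => out.push_str("_p"),    // predicate suffix, e.g. `even?`
            '!' => out.push_str("_bang"), // mutator suffix, e.g. `set!`
            _ => out.push('_'),           // `-`, `*`, and anything else
        }
    }
    out
}

/// Builds a prototype from a name and parameter names, mangling each one.
fn prototype(name: &str, params: &[&str]) -> String {
    let list: Vec<String> = params.iter().map(|p| format!("ml_int {}", mangle(p))).collect();
    format!("ml_int {}({});\n", mangle(name), list.join(", "))
}
```

A mapping this simple can collide (for example, `a-b` and `a*b` both become `ml_a_b`), so a production mangler would likely need a more careful encoding.
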
### Generating a function definition

```rust
fn gen_function_def(name: &str, params: &[String], body: &[Expr]) -> String {
    // `body.len() - 1` below would underflow on an empty slice, so fail early
    // with a clear message. This assumes the parser rejects empty bodies.
    assert!(!body.is_empty(), "lambda body must contain at least one expression");

    let c_name = mangle(name);
    let param_list: Vec<String> = params.iter()
        .map(|p| format!("ml_int {}", mangle(p)))
        .collect();
    let mut out = format!("ml_int {}({}) {{\n", c_name, param_list.join(", "));

    // All body expressions except the last are statements (side effects)
    for expr in &body[..body.len() - 1] {
        out.push_str(&format!(" {};\n", gen_stmt(expr)));
    }
    // Last body expression is the return value
    let last = body.last().unwrap();
    out.push_str(&format!(" return {};\n", gen_expr(last)));
    out.push_str("}\n");
    out
}
```

Explain the idiom: all but the last body expression are emitted as statements (for side effects like `display`); the last becomes the `return` value. This mirrors Lisp's implicit return of the last expression. The slice `&body[..body.len() - 1]` panics on an empty body, which is why the function checks for that case first.

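
An alternative way to express "all but the last, then the last" is the standard library's `split_last`, which also covers the empty case without a manual length check. The sketch below is a simplified illustration, not the generator's code: it takes pre-rendered C expression strings in place of `Expr` values, and `ml_display` is a hypothetical runtime function name.

```rust
/// Renders a function body from pre-rendered C expressions (a stand-in for
/// the output of `gen_stmt` / `gen_expr`). Returns `None` for an empty body.
fn render_body(rendered: &[&str]) -> Option<String> {
    // `split_last` yields (last, everything_before), or `None` if empty.
    let (last, init) = rendered.split_last()?;
    let mut out = String::new();
    for stmt in init {
        // Earlier expressions run only for their side effects.
        out.push_str(&format!("    {};\n", stmt));
    }
    // The final expression is the function's result.
    out.push_str(&format!("    return {};\n", last));
    Some(out)
}
```

With the two-element input `["ml_display(ml_x)", "(ml_x * ml_x)"]`, the first entry becomes a plain statement and the second becomes the `return` line.
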
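### Generating a variable definition

```rust
/// Emits a file-scope C variable initialized from `value`.
fn gen_variable_def(name: &str, value: &Expr) -> String {
    format!("ml_int {} = {};\n", mangle(name), gen_expr(value))
}
```

Variable definitions at top level become global C variables. C requires the initializer of a file-scope variable to be a constant expression, so this works for literals and constant arithmetic but not for initializers that call a function.
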
### The full `generate` function

```rust
pub fn generate(exprs: Vec<Expr>) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);

    // Pass 1: forward declarations for all top-level functions
    for expr in &exprs {
        if let Expr::Define { name, value } = expr {
            out.push_str(&gen_forward_decl(name, value));
        }
    }
    out.push('\n');

    // Pass 2: definitions
    for expr in &exprs {
        match expr {
            Expr::Define { name, value } => match value.as_ref() {
                Expr::Lambda { params, body } =>
                    out.push_str(&gen_function_def(name, params, body)),
                _ =>
                    out.push_str(&gen_variable_def(name, value)),
            },
            // Top-level non-define expressions: emit in main()
            _ => {} // handled in §16
        }
    }

    out
}
```

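
The patterns matched in `generate` imply a particular shape for `Expr`: `value.as_ref()` suggests a boxed value inside `Define`, and the arguments passed to `gen_function_def` suggest `Vec<String>` parameters and a `Vec<Expr>` body. The sketch below reconstructs that shape as an assumption; the `Int` and `Symbol` variants and the helper functions are hypothetical and exist only to make the example self-contained.

```rust
/// Hypothetical subset of the AST, inferred from the patterns matched in
/// `generate`. The real `Expr` from earlier sections has more variants.
#[derive(Debug)]
enum Expr {
    Int(i64),       // assumed literal variant, for illustration
    Symbol(String), // assumed identifier variant, for illustration
    Lambda { params: Vec<String>, body: Vec<Expr> },
    Define { name: String, value: Box<Expr> },
}

/// Returns true when `expr` is a `define` whose value is a `lambda`,
/// i.e. the case that needs a forward declaration.
fn is_function_define(expr: &Expr) -> bool {
    match expr {
        Expr::Define { value, .. } => matches!(value.as_ref(), Expr::Lambda { .. }),
        _ => false,
    }
}

/// `(define (id x) x)` after desugaring, built by hand.
fn sample_function_define() -> Expr {
    Expr::Define {
        name: "id".to_string(),
        value: Box::new(Expr::Lambda {
            params: vec!["x".to_string()],
            body: vec![Expr::Symbol("x".to_string())],
        }),
    }
}

/// `(define limit 10)`, built by hand.
fn sample_variable_define() -> Expr {
    Expr::Define { name: "limit".to_string(), value: Box::new(Expr::Int(10)) }
}
```

Under this shape, both passes branch on the same question: is the value inside the `Define` a `Lambda` or something else?
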
### Tests

```rust
#[test]
fn test_simple_function() {
    let src = "(define (square x) (* x x))";
    let exprs = parse(src).unwrap();
    let c = generate(exprs);
    assert!(c.contains("ml_int ml_square(ml_int ml_x)"));
    assert!(c.contains("return (ml_x * ml_x)"));
}

#[test]
fn test_forward_decl_present() {
    let src = "(define (f x) (g x))\n(define (g x) x)";
    let c = generate(parse(src).unwrap());
    // f calls g, so g's forward declaration must appear before f's definition
    let fwd_pos = c.find("ml_int ml_g(ml_int ml_x);").unwrap();
    let def_pos = c.find("ml_int ml_f(ml_int ml_x) {").unwrap();
    assert!(fwd_pos < def_pos);
}
```

## Style notes

- Lead with the forward declaration problem; it's the "aha" moment of this section
- The two-pass structure is conceptually important; diagram it clearly
- Acknowledge the "everything is ml_int" simplification explicitly; readers will notice it
- The `body[..body.len()-1]` slice for all-but-last is a small Rust trick worth calling out