+++
title = "§14 Generating C: Definitions and Functions"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++
## §14 Generating C: Definitions and Functions — Stub to fill
File: `edu/src/lisp-compiler.md`, section `### 14. Generating C: Definitions and Functions`
Replace the stub line with full content. Target 700 to 900 words. Implement code generation for top-level `define` forms and `lambda` expressions, including forward declarations for mutual recursion.
## Learning objectives
- Emit forward declarations for all functions before their definitions
- Generate a correct C function signature from a `Lambda` with named parameters
- Handle variable `define` vs. function `define`
- Understand C's requirement for forward declarations and why MiniLisp needs them
## Content to write
### Why forward declarations?
In C (since C99), a function must be declared before it is called. If `even?` calls `odd?` and `odd?` calls `even?`, whichever is defined first calls a function that has not yet been declared. A forward declaration (the function signature with no body, terminated by a semicolon) solves this by giving the C compiler the function's name and signature before the definition appears.
MiniLisp makes no guarantees about definition order, so we emit forward declarations for every top-level function before any definition.
### Two-pass code generation
The code generator uses two passes over the top-level `Vec<Expr>`:
1. **Forward declaration pass**: emit `ml_int ml_name(ml_int param1, ...);` for every top-level `define` that wraps a `lambda`.
2. **Definition pass**: emit the full function body (or variable initializer) for every top-level `define`.
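The ordering the two passes produce can be sketched in isolation. This is a minimal illustration, not the section's actual generator: `FuncSketch`, `emit_two_pass`, and `even_odd_sketch` are hypothetical names, the C names `ml_even_p` and `ml_odd_p` assume a mangling scheme that this ticket does not specify, and the return expressions are hand-written C rather than `gen_expr` output.

```rust
/// Hypothetical, simplified stand-in for a top-level function: an
/// already-mangled C name, parameter names, and a ready-made C return expression.
pub struct FuncSketch {
    pub c_name: &'static str,
    pub params: &'static [&'static str],
    pub return_expr: &'static str,
}

/// Shared by both passes so the prototype and the definition cannot drift apart.
fn signature(f: &FuncSketch) -> String {
    let params: Vec<String> = f.params.iter().map(|p| format!("ml_int {}", p)).collect();
    format!("ml_int {}({})", f.c_name, params.join(", "))
}

/// Pass 1 emits every prototype; pass 2 emits every body.
pub fn emit_two_pass(funcs: &[FuncSketch]) -> String {
    let mut out = String::new();
    // Pass 1: forward declarations
    for f in funcs {
        out.push_str(&signature(f));
        out.push_str(";\n");
    }
    out.push('\n');
    // Pass 2: definitions
    for f in funcs {
        out.push_str(&format!("{} {{\n    return {};\n}}\n", signature(f), f.return_expr));
    }
    out
}

/// Mutually recursive parity checks (assumed mangled names).
pub fn even_odd_sketch() -> Vec<FuncSketch> {
    vec![
        FuncSketch {
            c_name: "ml_even_p",
            params: &["ml_n"],
            return_expr: "(ml_n == 0) ? 1 : ml_odd_p(ml_n - 1)",
        },
        FuncSketch {
            c_name: "ml_odd_p",
            params: &["ml_n"],
            return_expr: "(ml_n == 0) ? 0 : ml_even_p(ml_n - 1)",
        },
    ]
}
```

Calling `emit_two_pass(&even_odd_sketch())` yields both prototypes first, so the body of `ml_even_p` can call `ml_odd_p` even though its definition comes later in the file.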
### Type signatures
MiniLisp has no type annotations. All values compile to `ml_int` (which is `int64_t`). This includes:
- Integers: trivially `ml_int`
- Booleans: stored as `ml_int` (0 or 1)
- Strings: a known limitation. String-returning functions are also declared as returning `ml_int`, which is technically wrong but compiles for our simple programs. Acknowledge this simplification.
A more honest approach would be to use `void*` or a tagged union; note this in the "What's Next" section.
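To make the "everything is `ml_int`" rule concrete, here is a small sketch. `PREAMBLE_SKETCH`, `ValueKind`, `c_type_of`, and `bool_to_c` are hypothetical names; in particular, the real `PREAMBLE` constant is defined elsewhere and may contain more than this typedef.

```rust
/// Assumed minimal preamble: only the typedef this section relies on.
pub const PREAMBLE_SKETCH: &str = "#include <stdint.h>\n\ntypedef int64_t ml_int;\n\n";

/// The kinds of value MiniLisp programs produce (illustrative only).
pub enum ValueKind {
    Int,
    Bool,
    Str,
}

/// Every kind maps to the same C type. `Str` is the lossy case that the
/// text above asks the chapter to acknowledge.
pub fn c_type_of(_kind: ValueKind) -> &'static str {
    "ml_int"
}

/// Booleans share the integer representation: false is 0, true is 1.
pub fn bool_to_c(b: bool) -> &'static str {
    if b { "1" } else { "0" }
}
```

Because `c_type_of` ignores its argument, the generator never needs type inference; that is the whole appeal of the simplification, and also why strings do not fit it.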
### Generating a forward declaration
```rust
fn gen_forward_decl(name: &str, lambda: &Expr) -> String {
    if let Expr::Lambda { params, .. } = lambda {
        let c_name = mangle(name);
        let param_list: Vec<String> = params.iter()
            .map(|p| format!("ml_int {}", mangle(p)))
            .collect();
        format!("ml_int {}({});\n", c_name, param_list.join(", "))
    } else {
        String::new() // variable define; no forward declaration needed
    }
}
```
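`gen_forward_decl` depends on `mangle`, which is not shown in this ticket. The sketch below assumes one plausible scheme (an `ml_` prefix, consistent with the `ml_square` and `ml_x` names in the tests, plus replacements for characters C does not allow in identifiers) to show what the emitted prototype looks like. `mangle_sketch` and `forward_decl_sketch` are hypothetical names, and the character mapping is an assumption, not the compiler's actual rule. The sketch also emits `(void)` for a zero-parameter function, which differs from the generator above: before C23, an empty `()` in C declares a function with unspecified parameters rather than none.

```rust
/// Hypothetical mangler: prefix with `ml_` and rewrite characters that are
/// not valid in C identifiers. Not collision-free (`a-b` and `a_b` both map
/// to `ml_a_b`), which a real implementation would need to address.
pub fn mangle_sketch(name: &str) -> String {
    let mut out = String::from("ml_");
    for ch in name.chars() {
        match ch {
            'a'..='z' | 'A'..='Z' | '0'..='9' | '_' => out.push(ch),
            '-' => out.push('_'),
            '?' => out.push_str("_p"),
            '!' => out.push_str("_x"),
            // Fallback: encode any other character as its hex code point
            other => out.push_str(&format!("_u{:x}", other as u32)),
        }
    }
    out
}

/// Builds the prototype text for a function with the given Lisp-level names.
pub fn forward_decl_sketch(name: &str, params: &[&str]) -> String {
    let param_list = if params.is_empty() {
        // An explicit `void` makes this a true prototype in every C dialect
        String::from("void")
    } else {
        params
            .iter()
            .map(|p| format!("ml_int {}", mangle_sketch(p)))
            .collect::<Vec<String>>()
            .join(", ")
    };
    format!("ml_int {}({});\n", mangle_sketch(name), param_list)
}
```

Under these assumptions, `forward_decl_sketch("even?", &["n"])` produces `ml_int ml_even_p(ml_int ml_n);` followed by a newline.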
### Generating a function definition
```rust
fn gen_function_def(name: &str, params: &[String], body: &[Expr]) -> String {
    let c_name = mangle(name);
    let param_list: Vec<String> = params.iter()
        .map(|p| format!("ml_int {}", mangle(p)))
        .collect();
    let mut out = format!("ml_int {}({}) {{\n", c_name, param_list.join(", "));
    // An empty body would make the slice below underflow, so fail with a clear message first
    let last = body.last().expect("lambda body must contain at least one expression");
    // All body expressions except the last are statements (side effects)
    for expr in &body[..body.len() - 1] {
        out.push_str(&format!(" {};\n", gen_stmt(expr)));
    }
    // Last body expression is the return value
    out.push_str(&format!(" return {};\n", gen_expr(last)));
    out.push_str("}\n");
    out
}
```
Explain the idiom: all but the last body expression are evaluated as statements (for side effects like `display`); the last is used as the return value. This mirrors Lisp's implicit return of the last expression.
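The all-but-last split can also be written with the standard `split_last` method on slices, which returns `None` for an empty slice and so handles the empty-body case without indexing. A minimal sketch, assuming the body expressions have already been turned into C text; `emit_body_sketch` is a hypothetical helper and `ml_display` is an assumed runtime function name used only for illustration.

```rust
/// Takes already-generated C expressions (stand-ins for `gen_stmt` / `gen_expr`
/// output) and lays them out as a function body. Returns `None` if the body is empty.
pub fn emit_body_sketch(c_exprs: &[String]) -> Option<String> {
    // `split_last` yields (last element, everything before it)
    let (last, init) = c_exprs.split_last()?;
    let mut out = String::new();
    // Leading expressions are evaluated only for their side effects
    for stmt in init {
        out.push_str(&format!("    {};\n", stmt));
    }
    // The final expression becomes the return value, mirroring Lisp's implicit return
    out.push_str(&format!("    return {};\n", last));
    Some(out)
}
```

For a body of `ml_display(ml_x)` followed by `(ml_x * ml_x)`, this produces a statement line and then a `return` line. Whether the chapter presents the slice form or `split_last` is a style choice; the slice form is shorter to explain, while `split_last` makes the empty case explicit in the type.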
### Generating a variable definition
```rust
fn gen_variable_def(name: &str, value: &Expr) -> String {
    format!("ml_int {} = {};\n", mangle(name), gen_expr(value))
}
```
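Variable definitions at top level become global C variables. C requires file-scope initializers to be constant expressions, so this produces valid C only when `gen_expr(value)` yields a literal or other constant expression; an initializer that calls a function would have to be assigned inside `main()` instead.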
### The full `generate` function
```rust
pub fn generate(exprs: Vec<Expr>) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);
    // Pass 1: forward declarations for all top-level functions
    for expr in &exprs {
        if let Expr::Define { name, value } = expr {
            out.push_str(&gen_forward_decl(name, value));
        }
    }
    out.push('\n');
    // Pass 2: definitions
    for expr in &exprs {
        match expr {
            Expr::Define { name, value } => match value.as_ref() {
                Expr::Lambda { params, body } =>
                    out.push_str(&gen_function_def(name, params, body)),
                _ =>
                    out.push_str(&gen_variable_def(name, value)),
            }
            // Top-level non-define expressions: emit in main()
            _ => {} // handled in §16
        }
    }
    out
}
```
### Tests
```rust
#[test]
fn test_simple_function() {
    let src = "(define (square x) (* x x))";
    let exprs = parse(src).unwrap();
    let c = generate(exprs);
    assert!(c.contains("ml_int ml_square(ml_int ml_x)"));
    assert!(c.contains("return (ml_x * ml_x)"));
}

#[test]
fn test_forward_decl_present() {
    let src = "(define (f x) (g x))\n(define (g x) x)";
    let c = generate(parse(src).unwrap());
    // g's forward declaration must appear before f's definition, which calls g
    let fwd_pos = c.find("ml_int ml_g(ml_int ml_x);").unwrap();
    let def_pos = c.find("ml_int ml_f(ml_int ml_x) {").unwrap();
    assert!(fwd_pos < def_pos);
}
```
## Style notes
- Lead with the forward declaration problem — it's the "aha" moment of this section
- The two-pass structure is conceptually important; diagram it clearly
- Acknowledge the "everything is ml_int" simplification explicitly; readers will notice it
- The `body[..body.len()-1]` slice for all-but-last is a small Rust trick worth calling out