+++
title = "§13 Generating C: Atoms and Expressions"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++

## §13 Generating C: Atoms and Expressions (stub to fill)

File: `edu/src/lisp-compiler.md`, section `### 13. Generating C: Atoms and Expressions`

Replace the stub line with full content. Target 700–900 words. Implement the expression code generator: the recursive function that turns an expression-level `Expr` into a C expression string.

## Learning objectives

- Implement `gen_expr` as a recursive function over `Expr`
- Know how each atom type maps to a C literal
- Understand how binary operator calls map to C infix expressions
- Handle `display`, `newline`, `error` as special cases in call generation
- Understand why all output is C *expressions* (not statements) at this level

## Content to write

### Expressions, not statements

In C, anything that produces a value is an expression. At this stage the code generator works entirely with expressions: `gen_expr` always returns a C expression string that can appear on the right-hand side of an assignment or as a function argument. Statement generation (for sequencing and side effects) comes in §15.

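As a rough illustration of why an expression string is convenient, the sketch below drops one generated string into two different C contexts without changing it. Both helper names (`emit_assignment`, `emit_call_arg`) are hypothetical and exist only for this example; they are not part of the compiler described here.

```rust
/// Hypothetical helper: use a generated C expression as the right-hand
/// side of an assignment statement.
fn emit_assignment(var: &str, expr_c: &str) -> String {
    format!("{} = {};", var, expr_c)
}

/// Hypothetical helper: use a generated C expression as a call argument.
fn emit_call_arg(func: &str, expr_c: &str) -> String {
    format!("{}({})", func, expr_c)
}
```

The same string, for example `(c ? 1 : 0)`, works in both positions. A C statement such as `if (c) { ... }` could not be embedded this way, which is why statement forms are deferred to §15.
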
### `gen_expr`: the core function

```rust
/// Generate a C expression from a MiniLisp `Expr`.
///
/// Returns a `String` containing valid C code that evaluates to the
/// same value as the original expression.
pub fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Bool(b) => if *b { "ML_TRUE".to_string() } else { "ML_FALSE".to_string() },
        // `escape_for_c` is defined under "String escaping"
        Expr::Str(s) => format!("\"{}\"", escape_for_c(s)),
        Expr::Symbol(name) => mangle(name),
        Expr::If { cond, then, else_ } =>
            format!("({} ? {} : {})", gen_expr(cond), gen_expr(then), gen_expr(else_)),
        Expr::Call { func, args } => gen_call(func, args),
        // These should not appear at expression level; handled as statements in §15
        Expr::Begin(_) | Expr::Define { .. } | Expr::Lambda { .. } | Expr::Let { .. } =>
            panic!("gen_expr called on a statement-level form"),
    }
}
```

Walk through each arm:

**`Int(n)`** → decimal string. `42` → `"42"`, `-7` → `"-7"`.

**`Bool(b)`** → `"ML_TRUE"` or `"ML_FALSE"` (the `#define`s from the preamble).

**`Str(s)`** → a C string literal. Escape the contents with `escape_for_c` (see "String escaping"), then wrap the result in double quotes. This handles embedded quotes, backslashes, newlines, and tabs. Rust's `escape_default()` is not suitable here because it emits `\u{...}` escapes, which are not valid C.

**`Symbol(name)`** → `mangle(name)`. A symbol in expression position is a variable reference; mangling produces the correct C identifier.

**`If { cond, then, else_ }`** → C ternary: `(cond ? then : else)`. Parenthesised so the result nests safely inside larger expressions regardless of operator precedence.

**`Call { func, args }`** → delegate to `gen_call`.

**`Begin`, `Define`, `Lambda`, `Let`** → `panic!`. These are statement-level forms, so reaching this arm means the compiler itself has a bug; it is a programming error guard, not a user-facing error.

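To see how the recursion composes, here is a cut-down, self-contained sketch of the same idea. It assumes a reduced `Expr` with only four variants, and its `mangle` is a placeholder that merely swaps `-` for `_`; the real type and the real mangling rules are defined elsewhere and may differ.

```rust
// Reduced stand-in for the real `Expr`; only the forms needed for this sketch.
enum Expr {
    Int(i64),
    Bool(bool),
    Symbol(String),
    If { cond: Box<Expr>, then: Box<Expr>, else_: Box<Expr> },
}

// Placeholder for the real `mangle` (an assumption): replaces `-` with `_`.
fn mangle(name: &str) -> String {
    name.replace('-', "_")
}

// Each arm returns a complete C expression, so sub-results can be
// spliced into a parent expression without further processing.
fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Bool(true) => "ML_TRUE".to_string(),
        Expr::Bool(false) => "ML_FALSE".to_string(),
        Expr::Symbol(name) => mangle(name),
        Expr::If { cond, then, else_ } => format!(
            "({} ? {} : {})",
            gen_expr(cond),
            gen_expr(then),
            gen_expr(else_)
        ),
    }
}
```

A nested `if` becomes a nested ternary, and the parentheses added at each level keep the inner ternary grouped correctly inside the outer one.
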
### `gen_call`: operator and function calls

```rust
fn gen_call(func: &Expr, args: &[Expr]) -> String {
    // Built-in operators
    if let Expr::Symbol(op) = func {
        match op.as_str() {
            // Arithmetic operators are spelled the same in C
            "+" | "-" | "*" | "/" => {
                let a = gen_expr(&args[0]);
                let b = gen_expr(&args[1]);
                return format!("({} {} {})", a, op, b);
            }
            // MiniLisp `=` is equality, which C spells `==`
            "=" => return format!("({} == {})", gen_expr(&args[0]), gen_expr(&args[1])),
            "<" | ">" | "<=" | ">=" => {
                return format!("({} {} {})", gen_expr(&args[0]), op, gen_expr(&args[1]));
            }
            "not" => return format!("(!{})", gen_expr(&args[0])),
            // display / newline / error are statement-level; handled in gen_stmt.
            // When called in expression position (inside an if branch, etc.),
            // emit as a comma expression: (side_effect, 0)
            "display" | "error" => {
                return format!("({}, 0)", gen_display_stmt(&args[0]));
            }
            // Assuming `newline` takes no arguments, there is no args[0] to index
            "newline" => return "(printf(\"\\n\"), 0)".to_string(),
            _ => {}
        }
    }
    // General function call
    let func_c = gen_expr(func);
    let args_c: Vec<String> = args.iter().map(gen_expr).collect();
    format!("{}({})", func_c, args_c.join(", "))
}
```

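Explain the "comma expression" trick for `display` in expression position: `(printf(...), 0)` is valid C. The comma operator evaluates its left operand for its side effect, discards that result, then evaluates the right operand and yields its value (0 here, which acts as a placeholder integer). For this to work, `gen_display_stmt` must return the bare call, with no trailing semicolon.
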
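The sketch below shows the shape of the generated text when a side-effecting call ends up inside a ternary branch. The helper names `as_comma_expr` and `ternary` are hypothetical, and the `printf("%s", ...)` format is an assumption about what the display helper emits.

```rust
/// Hypothetical helper: wrap a side-effecting C call so it can sit in
/// expression position. `call_c` must be a bare call, no trailing semicolon.
fn as_comma_expr(call_c: &str) -> String {
    // The comma operator runs the call, discards its result, and yields 0.
    format!("({}, 0)", call_c)
}

/// Hypothetical helper: build a parenthesised C ternary from three
/// already-generated expression strings.
fn ternary(cond: &str, then: &str, else_: &str) -> String {
    format!("({} ? {} : {})", cond, then, else_)
}
```

Because the wrapped call is itself an expression with an integer value, both branches of the ternary have a value and the C compiler accepts the result.
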
Note that the arity guarantees from §11 mean indexing `args[0]` and `args[1]` cannot panic, so no explicit length checks are needed.

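One way to make both the operator translation and the arity assumption visible is to write them as a small lookup plus a slice pattern. This is only a sketch: `c_binary_op` and `gen_binary` are hypothetical names, and returning `None` on a wrong operand count is an illustrative alternative to relying on the §11 checks, not the approach the section prescribes.

```rust
/// Map a MiniLisp binary operator to its C spelling.
/// Only `=` changes; every other operator is spelled identically in C.
fn c_binary_op(op: &str) -> Option<&'static str> {
    match op {
        "+" => Some("+"),
        "-" => Some("-"),
        "*" => Some("*"),
        "/" => Some("/"),
        "=" => Some("=="),
        "<" => Some("<"),
        ">" => Some(">"),
        "<=" => Some("<="),
        ">=" => Some(">="),
        _ => None,
    }
}

/// Build a parenthesised infix expression from already-generated operands.
/// Returns `None` for an unknown operator or for anything other than
/// exactly two operands.
fn gen_binary(op: &str, operands: &[String]) -> Option<String> {
    let c_op = c_binary_op(op)?;
    match operands {
        [a, b] => Some(format!("({} {} {})", a, c_op, b)),
        _ => None,
    }
}
```

With operands `x` and `1`, the operator `=` yields `(x == 1)`, while a single operand or a non-operator name such as `display` yields `None`.
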
### String escaping

Show the `escape_for_c` helper that the string code path uses:

```rust
/// Escape a string's contents for use inside a C string literal.
fn escape_for_c(s: &str) -> String {
    s.chars().flat_map(|c| match c {
        '"' => vec!['\\', '"'],   // double quote -> \"
        '\\' => vec!['\\', '\\'], // backslash    -> \\
        '\n' => vec!['\\', 'n'],  // newline      -> \n
        '\t' => vec!['\\', 't'],  // tab          -> \t
        c => vec![c],             // everything else passes through unchanged
    }).collect()
}
```

Use this instead of `escape_default()`, which emits Rust escape syntax (`\u{...}`) that is not valid C.

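The difference is easiest to see on a non-ASCII character. The sketch below repeats `escape_for_c` so that it is self-contained and adds two hypothetical wrappers, `c_string_literal` and `rust_default_escape`, purely for comparison. It assumes the C compiler reads the generated source as UTF-8, so unescaped non-ASCII characters are acceptable inside a literal.

```rust
// Same helper as above, repeated so this sketch compiles on its own.
fn escape_for_c(s: &str) -> String {
    s.chars().flat_map(|c| match c {
        '"' => vec!['\\', '"'],
        '\\' => vec!['\\', '\\'],
        '\n' => vec!['\\', 'n'],
        '\t' => vec!['\\', 't'],
        c => vec![c],
    }).collect()
}

/// Hypothetical wrapper: escape the contents and add the surrounding quotes.
fn c_string_literal(s: &str) -> String {
    format!("\"{}\"", escape_for_c(s))
}

/// For comparison only: what Rust's own `escape_default` would produce.
fn rust_default_escape(s: &str) -> String {
    s.escape_default().to_string()
}
```

For the input `é`, `rust_default_escape` yields `\u{e9}`, which a C compiler would reject, while `escape_for_c` leaves the character untouched.
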
### Tests

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_gen_int() {
        assert_eq!(gen_expr(&Expr::Int(42)), "42");
        assert_eq!(gen_expr(&Expr::Int(-7)), "-7");
    }

    #[test]
    fn test_gen_add() {
        let expr = Expr::Call {
            func: Box::new(Expr::Symbol("+".into())),
            args: vec![Expr::Int(1), Expr::Int(2)],
        };
        assert_eq!(gen_expr(&expr), "(1 + 2)");
    }

    #[test]
    fn test_gen_if() {
        let expr = Expr::If {
            cond: Box::new(Expr::Bool(true)),
            then: Box::new(Expr::Int(1)),
            else_: Box::new(Expr::Int(0)),
        };
        assert_eq!(gen_expr(&expr), "(ML_TRUE ? 1 : 0)");
    }
}
```

## Style notes

- Emphasise "C expressions only" at the top; this is the key architectural decision for this section
- Walk through each operator conversion explicitly; readers need to see the `=` → `==` translation noted
- The comma-expression trick for `display` in expression position is an interesting C technique; explain it clearly
- Note that the `panic!` for statement-level forms is a programming error guard, not a user-facing error