---
# edu-63ze
title: '§13 Generating C: Atoms and Expressions'
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:00Z
updated_at: 2026-03-10T23:30:00Z
---
## §13 Generating C: Atoms and Expressions — Stub to fill
File: `edu/src/lisp-compiler.md`, section `### 13. Generating C: Atoms and Expressions`
Replace the stub line with full content. Target 700 to 900 words. Implement the expression code generator: the recursive function that turns any `Expr` into a C expression string.
## Learning objectives
- Implement `gen_expr` as a recursive function over `Expr`
- Know how each atom type maps to a C literal
- Understand how binary operator calls map to C infix expressions
- Handle `display`, `newline`, `error` as special cases in call generation
- Understand why all output is C *expressions* (not statements) at this level
## Content to write
### Expressions, not statements
In C, everything that produces a value is an expression. At this stage, the code generator works entirely with expressions — `gen_expr` always returns a C expression string that can appear on the right-hand side of an assignment or as a function argument. Statement generation (for sequencing and side effects) comes in §15.
### `gen_expr` — the core function
```rust
/// Generate a C expression from a MiniLisp `Expr`.
///
/// Returns a `String` containing valid C code that evaluates to the
/// same value as the original expression.
pub fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Bool(b) => if *b { "ML_TRUE".to_string() } else { "ML_FALSE".to_string() },
        // Escape with `escape_for_c`; Rust's `escape_default()` can emit `\u{...}`, which C rejects.
        Expr::Str(s) => format!("\"{}\"", escape_for_c(s)),
        Expr::Symbol(name) => mangle(name),
        Expr::If { cond, then, else_ } =>
            format!("({} ? {} : {})", gen_expr(cond), gen_expr(then), gen_expr(else_)),
        Expr::Call { func, args } => gen_call(func, args),
        // These should not appear at expression level; they are handled as statements in §15
        Expr::Begin(_) | Expr::Define { .. } | Expr::Lambda { .. } | Expr::Let { .. } =>
            panic!("gen_expr called on a statement-level form"),
    }
}
```
Walk through each arm:
**`Int(n)`** → decimal string. `42` → `"42"`, `-7` → `"-7"`.
**`Bool(b)`** → `"ML_TRUE"` or `"ML_FALSE"` (the `#define`s from the preamble).
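**`Str(s)`** → a C string literal. Escape the contents with the `escape_for_c` helper (see "String escaping"), then wrap in double quotes. This handles embedded quotes, backslashes, newlines, and tabs. Do not use Rust's `escape_default()` here, because it can emit `\u{...}` escapes that C does not accept.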
**`Symbol(name)`** → `mangle(name)`. A symbol in expression position is a variable reference; mangling produces the correct C identifier.
**`If { cond, then, else_ }`** → C ternary: `(cond ? then : else)`. Parenthesised to avoid operator precedence issues.
**`Call { func, args }`** → delegate to `gen_call`.
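A worked example could follow the walkthrough to show the recursion on a nested form. The sketch below is self-contained and rests on assumptions: `Expr` is cut down to the variants it needs, `mangle` is a hypothetical stand-in that only replaces `-` with `_` (the real mangler may do more), and `sample` is an illustrative helper that builds `(if (< x-val 10) (+ x-val 1) 0)`.

```rust
/// Trimmed-down stand-in for the MiniLisp AST (the real `Expr` has more variants).
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Expr {
    Int(i64),
    Bool(bool),
    Symbol(String),
    If { cond: Box<Expr>, then: Box<Expr>, else_: Box<Expr> },
    Call { func: Box<Expr>, args: Vec<Expr> },
}

/// Hypothetical mangler: assumed here to replace `-` with `_` and nothing else.
fn mangle(name: &str) -> String {
    name.replace('-', "_")
}

fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Bool(b) => (if *b { "ML_TRUE" } else { "ML_FALSE" }).to_string(),
        Expr::Symbol(name) => mangle(name),
        Expr::If { cond, then, else_ } => format!(
            "({} ? {} : {})",
            gen_expr(cond),
            gen_expr(then),
            gen_expr(else_)
        ),
        Expr::Call { func, args } => {
            // Two-argument calls to a known operator become parenthesised infix.
            if let (Expr::Symbol(op), [a, b]) = (&**func, args.as_slice()) {
                let c_op = match op.as_str() {
                    "=" => Some("=="), // Lisp equality is spelled `==` in C
                    "+" | "-" | "*" | "/" | "<" | ">" | "<=" | ">=" => Some(op.as_str()),
                    _ => None,
                };
                if let Some(c_op) = c_op {
                    return format!("({} {} {})", gen_expr(a), c_op, gen_expr(b));
                }
            }
            // Anything else is an ordinary C function call.
            let args_c: Vec<String> = args.iter().map(gen_expr).collect();
            format!("{}({})", gen_expr(func), args_c.join(", "))
        }
    }
}

/// Builds the AST for `(if (< x-val 10) (+ x-val 1) 0)`.
fn sample() -> Expr {
    let x = || Expr::Symbol("x-val".to_string());
    let call = |op: &str, a: Expr, b: Expr| Expr::Call {
        func: Box::new(Expr::Symbol(op.to_string())),
        args: vec![a, b],
    };
    Expr::If {
        cond: Box::new(call("<", x(), Expr::Int(10))),
        then: Box::new(call("+", x(), Expr::Int(1))),
        else_: Box::new(Expr::Int(0)),
    }
}
```

Under these assumptions, `gen_expr(&sample())` yields `((x_val < 10) ? (x_val + 1) : 0)`. Every sub-expression carries its own parentheses, so the result can be nested again without precedence concerns.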
### `gen_call` — operator and function calls
```rust
fn gen_call(func: &Expr, args: &[Expr]) -> String {
    // Built-in operators are recognised by name
    if let Expr::Symbol(op) = func {
        match op.as_str() {
            // Arithmetic: the Lisp operator name is also the C operator
            "+" | "-" | "*" | "/" => {
                let a = gen_expr(&args[0]);
                let b = gen_expr(&args[1]);
                return format!("({} {} {})", a, op, b);
            }
            // Lisp `=` is equality, which C spells `==`
            "=" => return format!("({} == {})", gen_expr(&args[0]), gen_expr(&args[1])),
            "<" | ">" | "<=" | ">=" => {
                return format!("({} {} {})", gen_expr(&args[0]), op, gen_expr(&args[1]));
            }
            "not" => return format!("(!{})", gen_expr(&args[0])),
            // display / error are statement-level; handled in gen_stmt
            "display" | "error" => {
                // When called in expression position (inside an if branch, etc.),
                // emit as a comma expression: (side_effect, 0)
                return format!("({}, 0)", gen_display_stmt(&args[0]));
            }
            // `newline` takes no arguments, so it must not index `args`.
            // Assumption: it lowers to printf("\n"), in the same comma-expression form.
            "newline" => return "(printf(\"\\n\"), 0)".to_string(),
            _ => {}
        }
    }
    // General function call
    let func_c = gen_expr(func);
    let args_c: Vec<String> = args.iter().map(gen_expr).collect();
    format!("{}({})", func_c, args_c.join(", "))
}
```
Explain the "comma expression" trick for `display` in expression position: `(printf(...), 0)` is valid C. The comma operator evaluates its left operand for its side effect, discards that result, then evaluates the right operand and yields its value (0 here, which acts as a placeholder integer).
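A small sketch can make the trick concrete by showing the generated text when a side-effecting call sits inside a ternary branch. Both helper names, `comma_expr` and `ternary`, are hypothetical and work on already-generated C strings. The `printf` call passed in is an assumed lowering of `display`, not the section's confirmed output.

```rust
/// Wraps a C call that is run for its side effect so it can appear in
/// expression position: `(call, 0)`.
fn comma_expr(side_effect: &str) -> String {
    format!("({}, 0)", side_effect)
}

/// Builds a parenthesised C ternary from already-generated operand strings.
fn ternary(cond: &str, then: &str, else_: &str) -> String {
    format!("({} ? {} : {})", cond, then, else_)
}
```

For example, `ternary("(x > 0)", &comma_expr("printf(\"%d\\n\", x)"), "0")` produces `((x > 0) ? (printf("%d\n", x), 0) : 0)`. Both branches of the ternary now have an integer value, which is what lets the C compiler accept the expression.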
Note that the arity guarantees from §11 mean the operator arms can index `args[0]` and `args[1]` without risk of an out-of-bounds panic. `newline` takes no arguments, so its arm must not touch `args` at all.
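If the section wants to show a more defensive alternative to raw indexing, a slice pattern makes the arity assumption explicit. This is an optional sketch, not part of the planned generator: `two_args` is a hypothetical helper, written generically so it compiles without the real `Expr` type.

```rust
/// Destructures exactly two arguments. A mismatch means the analysis pass
/// let a wrong-arity call through, so it is reported as an internal error.
fn two_args<'a, T>(op: &str, args: &'a [T]) -> (&'a T, &'a T) {
    match args {
        [a, b] => (a, b),
        _ => panic!(
            "internal error: `{}` expects 2 arguments, got {}",
            op,
            args.len()
        ),
    }
}
```

An operator arm could then start with `let (a, b) = two_args(op, args);`, which turns a generic index panic into a message that names the operator.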
### String escaping
Show the `escape_for_c` helper that the string code path uses:
```rust
fn escape_for_c(s: &str) -> String {
    s.chars().flat_map(|c| match c {
        '"' => vec!['\\', '"'],
        '\\' => vec!['\\', '\\'],
        '\n' => vec!['\\', 'n'],
        '\t' => vec!['\\', 't'],
        c => vec![c],
    }).collect()
}
```
Use this instead of `escape_default()`, which emits Rust escape syntax (`\u{...}`) that is not valid C.
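A short comparison can back up this point. The sketch below repeats `escape_for_c` so that it compiles on its own, and adds `c_string_literal`, a hypothetical wrapper that supplies the surrounding quotes in the same way the `Str` arm does.

```rust
/// Escapes the characters that would break a C string literal.
fn escape_for_c(s: &str) -> String {
    s.chars().flat_map(|c| match c {
        '"' => vec!['\\', '"'],
        '\\' => vec!['\\', '\\'],
        '\n' => vec!['\\', 'n'],
        '\t' => vec!['\\', 't'],
        c => vec![c],
    }).collect()
}

/// Hypothetical wrapper: escapes `s` and adds the surrounding double quotes.
fn c_string_literal(s: &str) -> String {
    format!("\"{}\"", escape_for_c(s))
}
```

For the input `café`, `escape_default()` produces `caf\u{e9}`, which a C compiler rejects, while `escape_for_c` passes the `é` through unchanged. That relies on the generated C file being compiled as UTF-8, which is an assumption worth stating in the section.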
### Tests
```rust
#[test]
fn test_gen_int() {
    assert_eq!(gen_expr(&Expr::Int(42)), "42");
    assert_eq!(gen_expr(&Expr::Int(-7)), "-7");
}

#[test]
fn test_gen_add() {
    let expr = Expr::Call {
        func: Box::new(Expr::Symbol("+".into())),
        args: vec![Expr::Int(1), Expr::Int(2)],
    };
    assert_eq!(gen_expr(&expr), "(1 + 2)");
}

#[test]
fn test_gen_if() {
    let expr = Expr::If {
        cond: Box::new(Expr::Bool(true)),
        then: Box::new(Expr::Int(1)),
        else_: Box::new(Expr::Int(0)),
    };
    assert_eq!(gen_expr(&expr), "(ML_TRUE ? 1 : 0)");
}
```
## Style notes
- Emphasise "C expressions only" at the top — this is the key architectural decision for this section
- Walk through each operator conversion explicitly; readers need to see the `=` → `==` translation noted
- The comma-expression trick for `display` in expression position is an interesting C technique — explain it clearly
- Note that the `panic!` for statement-level forms is a programming error guard, not a user-facing error