+++
title = "§13 Generating C: Atoms and Expressions"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++
§13 Generating C: Atoms and Expressions — Stub to fill
File: edu/src/lisp-compiler.md, section ### 13. Generating C: Atoms and Expressions
Replace the stub line with full content. Target 700–900 words. Implement the expression code generator — the recursive function that turns any Expr into a C expression string.
Learning objectives
- Implement `gen_expr` as a recursive function over `Expr`
- Know how each atom type maps to a C literal
- Understand how binary operator calls map to C infix expressions
- Handle `display`, `newline`, `error` as special cases in call generation
- Understand why all output is C expressions (not statements) at this level
Content to write
Expressions, not statements
In C, everything that produces a value is an expression. At this stage, the code generator works entirely with expressions — gen_expr always returns a C expression string that can appear on the right-hand side of an assignment or as a function argument. Statement generation (for sequencing and side effects) comes in §15.
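To make the consequence concrete, here is a minimal sketch of two hypothetical helpers, `wrap_as_assignment` and `wrap_as_argument`. The names and the `long` type are illustrative assumptions, not part of the compiler. They show that the same expression string can be spliced into either context unchanged.

```rust
/// Hypothetical helper: splice an expression string into a C declaration.
/// The `long` type is an assumption made for illustration only.
fn wrap_as_assignment(var: &str, expr_c: &str) -> String {
    format!("long {} = {};", var, expr_c)
}

/// Hypothetical helper: splice the same string into a call's argument list.
fn wrap_as_argument(func: &str, expr_c: &str) -> String {
    format!("{}({})", func, expr_c)
}
```

Because the generated string is already parenthesised where needed, neither helper has to inspect or rewrite it.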
gen_expr — the core function
/// Generate a C expression from a MiniLisp `Expr`.
///
/// Returns a `String` containing valid C code that evaluates to the
/// same value as the original expression.
pub fn gen_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Bool(b) => if *b { "ML_TRUE".to_string() } else { "ML_FALSE".to_string() },
        // Escape with `escape_for_c` (see "String escaping"); `escape_default`
        // would emit Rust-only `\u{...}` escapes that C does not accept.
        Expr::Str(s) => format!("\"{}\"", escape_for_c(s)),
        Expr::Symbol(name) => mangle(name),
        Expr::If { cond, then, else_ } =>
            format!("({} ? {} : {})", gen_expr(cond), gen_expr(then), gen_expr(else_)),
        Expr::Call { func, args } => gen_call(func, args),
        // These should not appear at expression level; they are handled as statements in §15
        Expr::Begin(_) | Expr::Define { .. } | Expr::Lambda { .. } | Expr::Let { .. } =>
            panic!("gen_expr called on a statement-level form"),
    }
}
Walk through each arm:
Int(n) → decimal string. 42 → "42", -7 → "-7".
Bool(b) → "ML_TRUE" or "ML_FALSE" (the #defines from the preamble).
Str(s) → a C string literal. Escape the contents with the escape_for_c helper described under "String escaping" below, then wrap the result in double quotes. This handles embedded newlines, tabs, quotes, and backslashes.
Symbol(name) → mangle(name). A symbol in expression position is a variable reference; mangling produces the correct C identifier.
If { cond, then, else_ } → C ternary: (cond ? then : else). Parenthesised to avoid operator precedence issues.
Call { func, args } → delegate to gen_call.
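The recursion is easiest to see on a nested form. The following is a minimal, self-contained sketch rather than the section's actual code. `MiniExpr` is a trimmed-down stand-in for `Expr`, and `mangle_stub` is a placeholder that assumes mangling only replaces `-` with `_`; the real `mangle` may do more.

```rust
// Trimmed-down stand-in for the real `Expr`; only the variants used here.
enum MiniExpr {
    Int(i64),
    Bool(bool),
    Symbol(String),
    If { cond: Box<MiniExpr>, then: Box<MiniExpr>, else_: Box<MiniExpr> },
}

// Placeholder for the real `mangle`: assumed to replace `-` with `_`.
fn mangle_stub(name: &str) -> String {
    name.replace('-', "_")
}

fn gen_mini(expr: &MiniExpr) -> String {
    match expr {
        MiniExpr::Int(n) => n.to_string(),
        MiniExpr::Bool(b) => (if *b { "ML_TRUE" } else { "ML_FALSE" }).to_string(),
        MiniExpr::Symbol(name) => mangle_stub(name),
        // Each sub-expression is generated recursively, then spliced in.
        MiniExpr::If { cond, then, else_ } => format!(
            "({} ? {} : {})",
            gen_mini(cond),
            gen_mini(then),
            gen_mini(else_)
        ),
    }
}

/// Builds an `if` whose then-branch is itself an `if`, and generates C for it.
fn nested_if_demo() -> String {
    let inner = MiniExpr::If {
        cond: Box::new(MiniExpr::Bool(false)),
        then: Box::new(MiniExpr::Int(1)),
        else_: Box::new(MiniExpr::Int(2)),
    };
    let outer = MiniExpr::If {
        cond: Box::new(MiniExpr::Symbol("is-ready".into())),
        then: Box::new(inner),
        else_: Box::new(MiniExpr::Int(3)),
    };
    gen_mini(&outer)
}
```

Under these assumptions `nested_if_demo()` yields `(is_ready ? (ML_FALSE ? 1 : 2) : 3)`. The parentheses around each ternary are what keep the inner conditional from being re-associated by C's precedence rules.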
gen_call — operator and function calls
fn gen_call(func: &Expr, args: &[Expr]) -> String {
    // Built-in operators
    if let Expr::Symbol(op) = func {
        match op.as_str() {
            "+" | "-" | "*" | "/" => {
                let a = gen_expr(&args[0]);
                let b = gen_expr(&args[1]);
                return format!("({} {} {})", a, op, b);
            }
            // MiniLisp `=` tests equality, which is spelled `==` in C
            "=" => return format!("({} == {})", gen_expr(&args[0]), gen_expr(&args[1])),
            "<" | ">" | "<=" | ">=" => {
                return format!("({} {} {})", gen_expr(&args[0]), op, gen_expr(&args[1]));
            }
            "not" => return format!("(!{})", gen_expr(&args[0])),
            // display / newline / error are statement-level; handled in gen_stmt
            "display" | "error" => {
                // When called in expression position (inside an if branch, etc.),
                // emit as a comma expression: (side_effect, 0)
                return format!("({}, 0)", gen_display_stmt(&args[0]));
            }
            "newline" => {
                // `newline` is assumed to take no arguments, so it must not index args[0].
                // Assumption: a bare printf("\n") matches what gen_stmt emits in §15.
                return "(printf(\"\\n\"), 0)".to_string();
            }
            _ => {}
        }
    }
    // General function call
    let func_c = gen_expr(func);
    let args_c: Vec<String> = args.iter().map(gen_expr).collect();
    format!("{}({})", func_c, args_c.join(", "))
}
Explain the "comma expression" trick for display in expression position: (printf(...), 0) is valid C. The comma operator evaluates its left operand, discards the result, then evaluates the right operand and yields its value (0 here, which acts as a placeholder integer).
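A small sketch of how the wrapped call composes with the ternary from `gen_expr`. `display_call_stub` is a hypothetical stand-in for `gen_display_stmt`, and it assumes values are printed with `printf("%ld", ...)`; the real helper may pick a different format or function.

```rust
/// Hypothetical stand-in for `gen_display_stmt`: assumes integers are printed
/// with `printf("%ld", ...)`. The real helper may choose a different format.
fn display_call_stub(arg_c: &str) -> String {
    format!("printf(\"%ld\", {})", arg_c)
}

/// Wraps a side-effecting C call so it can sit where a value is expected.
fn as_comma_expr(side_effect_c: &str) -> String {
    format!("({}, 0)", side_effect_c)
}

/// Places the wrapped call in the then-branch of a C ternary.
fn display_in_branch(cond_c: &str, arg_c: &str, else_c: &str) -> String {
    format!(
        "({} ? {} : {})",
        cond_c,
        as_comma_expr(&display_call_stub(arg_c)),
        else_c
    )
}
```

With these stubs, `display_in_branch("ok", "x", "0")` produces `(ok ? (printf("%ld", x), 0) : 0)`. The inner parentheses matter: the comma operator has the lowest precedence in C, and inside a function's argument list an unparenthesised comma would be read as an argument separator instead.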
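Note that the arity guarantees from §11 mean we can index args[0] and args[1] without explicit length checks. Rust still bounds-checks each access, so a violated guarantee would panic rather than read out of bounds.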
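If the dependence on §11 should be visible in the code itself, one option is to destructure the slice with a pattern instead of indexing. This is a sketch only; `binary_operands` is a hypothetical helper, not something the section's code defines.

```rust
/// Hypothetical helper: returns both operands only when exactly two are present.
/// A slice pattern makes the arity assumption explicit instead of relying on
/// `args[0]` / `args[1]` panicking when it is violated.
fn binary_operands<T>(args: &[T]) -> Option<(&T, &T)> {
    match args {
        [a, b] => Some((a, b)),
        _ => None,
    }
}
```

A caller could turn the `None` case into an internal-error panic with a descriptive message, which is easier to diagnose than a bare index-out-of-bounds failure.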
String escaping
Show the escape_for_c helper that the string code path uses:
fn escape_for_c(s: &str) -> String {
    s.chars().flat_map(|c| match c {
        '"' => vec!['\\', '"'],
        '\\' => vec!['\\', '\\'],
        '\n' => vec!['\\', 'n'],
        '\t' => vec!['\\', 't'],
        c => vec![c],
    }).collect()
}
Use this instead of escape_default(), which emits Rust escape syntax (\u{...}) for non-ASCII and control characters; that syntax is not valid C.
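The difference shows up as soon as a string contains a non-ASCII character. The sketch below repeats the helper (written with `push_str`, same behaviour) so that it compiles on its own. `c_string_literal` and `rust_escaped` are hypothetical wrappers added only for the comparison.

```rust
/// Same behaviour as the helper above, repeated so this sketch is self-contained.
fn escape_for_c(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\t' => out.push_str("\\t"),
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical wrapper: escape, then add the surrounding double quotes.
fn c_string_literal(s: &str) -> String {
    format!("\"{}\"", escape_for_c(s))
}

/// Hypothetical wrapper: what `str::escape_default` produces for the same input.
fn rust_escaped(s: &str) -> String {
    s.escape_default().to_string()
}
```

For the input `é`, `rust_escaped` gives `\u{e9}`, while `c_string_literal` gives `"é"` with the character passed through unchanged. Passing it through assumes the C compiler accepts UTF-8 bytes inside string literals. The helper also leaves `\r` and other control characters unescaped, which may need handling if MiniLisp strings can contain them.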
Tests
#[test]
fn test_gen_int() {
    assert_eq!(gen_expr(&Expr::Int(42)), "42");
    assert_eq!(gen_expr(&Expr::Int(-7)), "-7");
}

#[test]
fn test_gen_add() {
    let expr = Expr::Call {
        func: Box::new(Expr::Symbol("+".into())),
        args: vec![Expr::Int(1), Expr::Int(2)],
    };
    assert_eq!(gen_expr(&expr), "(1 + 2)");
}

#[test]
fn test_gen_if() {
    let expr = Expr::If {
        cond: Box::new(Expr::Bool(true)),
        then: Box::new(Expr::Int(1)),
        else_: Box::new(Expr::Int(0)),
    };
    assert_eq!(gen_expr(&expr), "(ML_TRUE ? 1 : 0)");
}
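The tests above cover atoms, one operator, and `if`. A few further cases worth considering are the `=` to `==` translation, `not`, nested operators, and the general call path. The sketch below uses reduced stand-ins (`TinyExpr`, `gen_tiny`, `gen_tiny_call`, and the `call` constructor are all hypothetical) so that it compiles without the rest of the compiler. In the real test module the same assertions would target `Expr` and `gen_expr`.

```rust
// Reduced stand-ins so the checks compile on their own; the real `Expr`,
// `gen_expr`, and `mangle` come from the surrounding sections.
enum TinyExpr {
    Int(i64),
    Symbol(String),
    Call { func: Box<TinyExpr>, args: Vec<TinyExpr> },
}

fn gen_tiny(expr: &TinyExpr) -> String {
    match expr {
        TinyExpr::Int(n) => n.to_string(),
        // Assumption: identity mangling is enough for these plain names.
        TinyExpr::Symbol(name) => name.clone(),
        TinyExpr::Call { func, args } => gen_tiny_call(func, args),
    }
}

fn gen_tiny_call(func: &TinyExpr, args: &[TinyExpr]) -> String {
    if let TinyExpr::Symbol(op) = func {
        match op.as_str() {
            "+" | "-" | "*" | "/" | "<" | ">" | "<=" | ">=" => {
                return format!("({} {} {})", gen_tiny(&args[0]), op, gen_tiny(&args[1]));
            }
            "=" => return format!("({} == {})", gen_tiny(&args[0]), gen_tiny(&args[1])),
            "not" => return format!("(!{})", gen_tiny(&args[0])),
            _ => {}
        }
    }
    let args_c: Vec<String> = args.iter().map(gen_tiny).collect();
    format!("{}({})", gen_tiny(func), args_c.join(", "))
}

/// Small constructor to keep the checks readable.
fn call(name: &str, args: Vec<TinyExpr>) -> TinyExpr {
    TinyExpr::Call { func: Box::new(TinyExpr::Symbol(name.into())), args }
}

#[test]
fn test_gen_eq_and_not() {
    let eq = call("=", vec![TinyExpr::Symbol("x".into()), TinyExpr::Int(0)]);
    assert_eq!(gen_tiny(&eq), "(x == 0)");
    let negated = call("not", vec![eq]);
    assert_eq!(gen_tiny(&negated), "(!(x == 0))");
}

#[test]
fn test_gen_nested_and_user_call() {
    let product = call("*", vec![TinyExpr::Int(2), TinyExpr::Int(3)]);
    let sum = call("+", vec![TinyExpr::Int(1), product]);
    assert_eq!(gen_tiny(&sum), "(1 + (2 * 3))");
    let user = call("f", vec![TinyExpr::Int(1), TinyExpr::Int(2)]);
    assert_eq!(gen_tiny(&user), "f(1, 2)");
}
```

The nested case is the one that exercises the recursion: the inner call is generated first and its already-parenthesised string is spliced into the outer one.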
Style notes
- Emphasise "C expressions only" at the top — this is the key architectural decision for this section
- Walk through each operator conversion explicitly; readers need to see the `=` → `==` translation noted
- The comma-expression trick for `display` in expression position is an interesting C technique; explain it clearly
- Note that the `panic!` for statement-level forms is a programming error guard, not a user-facing error