+++
title = "§7 The Abstract Syntax Tree"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++

§7 The Abstract Syntax Tree — Stub to fill

File: edu/src/lisp-compiler.md, section ### 7. The Abstract Syntax Tree

Replace the stub line with full content. Target 600 to 800 words. Define the complete Expr enum, explain the design choices, implement Display, and show how a MiniLisp program maps to the AST.

Learning objectives

  • Understand what an AST is and why it is separate from the concrete syntax
  • Know the complete Expr enum and all its variants
  • Understand the design trade-off between a generic List variant and specific special-form variants
  • Implement Display for Expr to enable debugging

Content to write

What is an AST?

An Abstract Syntax Tree strips away syntactic noise (parentheses, whitespace, comments) and keeps only the structure of the program: which expressions exist and how they nest. Two programs that differ only in whitespace or comment placement produce identical ASTs. From the parser onward, the AST is the compiler's internal representation of the program.
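To make the "identical ASTs" claim concrete, here is a minimal sketch of the step that discards the noise. It assumes that `;` starts a comment running to the end of the line and it ignores string literals; `significant_tokens` is a hypothetical helper, not part of the tutorial's lexer. If two sources yield the same token sequence, any parser working from those tokens builds the same tree.

```rust
/// Hypothetical helper: split MiniLisp source into tokens, discarding
/// whitespace and `;` line comments. String literals are not handled.
fn significant_tokens(src: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    for line in src.lines() {
        // Assumption: `;` begins a comment that runs to the end of the line.
        let code = line.split(';').next().unwrap_or("");
        // Pad parentheses with spaces so split_whitespace isolates them.
        let spaced = code.replace('(', " ( ").replace(')', " ) ");
        tokens.extend(spaced.split_whitespace().map(String::from));
    }
    tokens
}

fn main() {
    let compact = "(define x (+ 1 2))";
    let spread = "(define x   ; bind x\n   (+ 1\n      2))";
    // Layout and comments differ, but the parser sees the same input.
    assert_eq!(significant_tokens(compact), significant_tokens(spread));
    println!("{:?}", significant_tokens(compact));
}
```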

Design Decision: Generic vs. Specific Variants

Two approaches for representing Lisp forms in the AST:

Option A — Generic list: everything is either an atom or a List(Vec<Expr>). Special forms are recognized during semantic analysis or code generation.

Option B — Specific variants: each special form (Define, If, Lambda, etc.) gets its own enum variant, recognized during parsing.

We use Option B. It means the parser does more work, but the analyser and code generator deal with well-typed data rather than raw lists. Exhaustive pattern matching catches missed cases at compile time.
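The trade-off can be sketched side by side. The two enums below are reduced illustrations, not the full definitions: `Sexp` stands in for Option A, and the three-variant `Expr` stands in for Option B. The helper names `is_if_form` and `describe` are hypothetical.

```rust
// Option A (sketch): one generic list variant. Special forms are found
// later by inspecting the head symbol and the list length.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Sexp {
    Int(i64),
    Symbol(String),
    List(Vec<Sexp>),
}

// Option B (reduced sketch): each special form has its own variant.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Symbol(String),
    If { cond: Box<Expr>, then: Box<Expr>, else_: Box<Expr> },
}

/// With Option A, every consumer re-checks the shape at run time.
fn is_if_form(s: &Sexp) -> bool {
    match s {
        Sexp::List(items) => {
            items.len() == 4 && matches!(&items[0], Sexp::Symbol(name) if name == "if")
        }
        _ => false,
    }
}

/// With Option B, the match is exhaustive: adding a variant to `Expr`
/// makes this function fail to compile until the new case is handled.
fn describe(e: &Expr) -> &'static str {
    match e {
        Expr::Int(_) => "integer literal",
        Expr::Symbol(_) => "symbol",
        Expr::If { .. } => "conditional",
    }
}

fn main() {
    let generic = Sexp::List(vec![
        Sexp::Symbol("if".into()),
        Sexp::Symbol("c".into()),
        Sexp::Int(1),
        Sexp::Int(2),
    ]);
    let specific = Expr::If {
        cond: Box::new(Expr::Symbol("c".into())),
        then: Box::new(Expr::Int(1)),
        else_: Box::new(Expr::Int(2)),
    };
    println!("{} / {}", is_if_form(&generic), describe(&specific));
}
```

In the Option A version a malformed `(if c)` is still a valid `Sexp` value and is only rejected when some later pass looks at it. In the Option B version the parser would have to reject it before an `Expr::If` could exist.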

The Expr Enum

Define in src/ast.rs:

/// A MiniLisp expression — the core AST node type.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// Integer literal: `42`, `-7`
    Int(i64),
    /// Boolean literal: `#t`, `#f`
    Bool(bool),
    /// String literal: `"hello"`
    Str(String),
    /// Symbol (variable name or operator): `x`, `+`, `my-var`
    Symbol(String),
    /// Variable binding: `(define name expr)`
    Define {
        name: String,
        value: Box<Expr>,
    },
    // Function definition shorthand: `(define (name params...) body...)`
    // Desugared by the parser into a `Define` wrapping a `Lambda`.
    // No separate variant is needed. These are plain `//` comments so that
    // rustdoc does not attach them to the `Lambda` variant below.

    /// Anonymous function: `(lambda (params...) body...)`
    Lambda {
        params: Vec<String>,
        body: Vec<Expr>,
    },
    /// Conditional: `(if cond then else)`
    If {
        cond: Box<Expr>,
        then: Box<Expr>,
        else_: Box<Expr>,
    },
    /// Local bindings: `(let ((x 1) (y 2)) body...)`
    Let {
        bindings: Vec<(String, Expr)>,
        body: Vec<Expr>,
    },
    /// Sequencing: `(begin expr1 expr2 ...)`
    Begin(Vec<Expr>),
    /// Function or operator call: `(f arg1 arg2 ...)`
    Call {
        func: Box<Expr>,
        args: Vec<Expr>,
    },
}

For each variant, explain:

  • What MiniLisp syntax it represents
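  • Why Box<Expr> is needed for recursive fields (Rust must know the size of an enum at compile time; a variant that held an Expr directly would make the type infinitely large, while a Box is a fixed-size pointer)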
  • Why body in Lambda and Let is Vec<Expr> (multiple expressions, last one is the return value)
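The last two points can be checked with a small sketch. The enum here is a reduced stand-in for the real `Expr`, and `result_expr` is a hypothetical helper used only to illustrate that the last body expression is the return value.

```rust
use std::mem::size_of;

// Without indirection the type would not compile (error E0072,
// "recursive type has infinite size"):
//
//     enum Bad { Int(i64), Define { name: String, value: Bad } }

// Reduced stand-in for the real Expr; not every field is read below.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Define { name: String, value: Box<Expr> },
    Lambda { params: Vec<String>, body: Vec<Expr> },
}

/// Hypothetical helper: the value of a body is the value of its last
/// expression. An empty body has no value.
fn result_expr(body: &[Expr]) -> Option<&Expr> {
    body.last()
}

fn main() {
    // A Box is a single pointer, so its size is fixed however deep the tree is.
    assert_eq!(size_of::<Box<Expr>>(), size_of::<usize>());
    println!("Expr occupies {} bytes", size_of::<Expr>());

    // Vec<Expr> also stores its elements on the heap, so `body` needs no Box.
    let f = Expr::Lambda {
        params: vec!["x".to_string()],
        body: vec![Expr::Int(1), Expr::Int(2)],
    };
    if let Expr::Lambda { body, .. } = &f {
        assert_eq!(result_expr(body), Some(&Expr::Int(2)));
    }
}
```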

Display implementation

Implement Display for Expr so you can print ASTs during development:

impl std::fmt::Display for Expr {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Expr::Int(n)    => write!(f, "{}", n),
            Expr::Bool(b)   => write!(f, "{}", if *b { "#t" } else { "#f" }),
            Expr::Str(s)    => write!(f, "\"{}\"", s.escape_default()),
            Expr::Symbol(s) => write!(f, "{}", s),
            Expr::Define { name, value } => write!(f, "(define {} {})", name, value),
            Expr::Lambda { params, body } => {
                write!(f, "(lambda ({}) ", params.join(" "))?;
                for (i, e) in body.iter().enumerate() {
                    if i > 0 { write!(f, " ")?; }
                    write!(f, "{}", e)?;
                }
                write!(f, ")")
            }
            Expr::If { cond, then, else_ } => write!(f, "(if {} {} {})", cond, then, else_),
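            Expr::Let { bindings, body } => {
                write!(f, "(let (")?;
                for (i, (name, val)) in bindings.iter().enumerate() {
                    // Separate bindings with a space: ((x 1) (y 2))
                    if i > 0 { write!(f, " ")?; }
                    write!(f, "({} {})", name, val)?;
                }
                write!(f, ")")?;
                // Each body expression gets a leading space so that
                // consecutive expressions do not run together.
                for e in body { write!(f, " {}", e)?; }
                write!(f, ")")
            }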
            Expr::Begin(exprs) => {
                write!(f, "(begin ")?;
                for (i, e) in exprs.iter().enumerate() {
                    if i > 0 { write!(f, " ")?; }
                    write!(f, "{}", e)?;
                }
                write!(f, ")")
            }
            Expr::Call { func, args } => {
                write!(f, "({}", func)?;
                for a in args { write!(f, " {}", a)?; }
                write!(f, ")")
            }
        }
    }
}
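A short usage sketch shows how the two output formats differ. To stay self-contained it repeats a reduced three-variant `Expr` with the matching `Display` arms; the real enum has more variants.

```rust
use std::fmt;

// Reduced stand-in for the real Expr, enough to show the output format.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Symbol(String),
    Call { func: Box<Expr>, args: Vec<Expr> },
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Int(n) => write!(f, "{}", n),
            Expr::Symbol(s) => write!(f, "{}", s),
            Expr::Call { func, args } => {
                write!(f, "({}", func)?;
                for a in args {
                    write!(f, " {}", a)?;
                }
                write!(f, ")")
            }
        }
    }
}

fn main() {
    // AST for (+ 1 (* 2 3))
    let e = Expr::Call {
        func: Box::new(Expr::Symbol("+".into())),
        args: vec![
            Expr::Int(1),
            Expr::Call {
                func: Box::new(Expr::Symbol("*".into())),
                args: vec![Expr::Int(2), Expr::Int(3)],
            },
        ],
    };
    // Display prints the Lisp-like form; Debug prints the Rust structure.
    println!("{}", e);
    println!("{:?}", e);
    assert_eq!(e.to_string(), "(+ 1 (* 2 3))");
}
```

Because `Display` provides `to_string()`, the same impl can back quick assertions in parser tests, even though its output format is only a debugging aid.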

Mapping Example

Show how the factorial program from §1 maps to AST values. Write out the Expr tree for (define (factorial n) (if (= n 0) 1 (* n (factorial (- n 1))))) in Rust Expr literal notation. This makes the structure concrete.
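One possible rendering of that tree, as a starting point for the section. It assumes the parser desugars the `(define (name params...) body...)` shorthand into `Define` wrapping `Lambda`, as described in the enum comments. The `sym` helper and the `factorial_ast` function name are hypothetical and exist only to keep the literal readable.

```rust
// Compact copy of the Expr enum so the sketch compiles on its own.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Int(i64),
    Bool(bool),
    Str(String),
    Symbol(String),
    Define { name: String, value: Box<Expr> },
    Lambda { params: Vec<String>, body: Vec<Expr> },
    If { cond: Box<Expr>, then: Box<Expr>, else_: Box<Expr> },
    Let { bindings: Vec<(String, Expr)>, body: Vec<Expr> },
    Begin(Vec<Expr>),
    Call { func: Box<Expr>, args: Vec<Expr> },
}

/// Hypothetical helper: shorthand for a symbol node.
fn sym(name: &str) -> Expr {
    Expr::Symbol(name.to_string())
}

/// AST for `(define (factorial n) (if (= n 0) 1 (* n (factorial (- n 1)))))`.
fn factorial_ast() -> Expr {
    Expr::Define {
        name: "factorial".to_string(),
        // The shorthand becomes a Define whose value is a Lambda.
        value: Box::new(Expr::Lambda {
            params: vec!["n".to_string()],
            body: vec![Expr::If {
                // (= n 0)
                cond: Box::new(Expr::Call {
                    func: Box::new(sym("=")),
                    args: vec![sym("n"), Expr::Int(0)],
                }),
                // 1
                then: Box::new(Expr::Int(1)),
                // (* n (factorial (- n 1)))
                else_: Box::new(Expr::Call {
                    func: Box::new(sym("*")),
                    args: vec![
                        sym("n"),
                        Expr::Call {
                            func: Box::new(sym("factorial")),
                            args: vec![Expr::Call {
                                func: Box::new(sym("-")),
                                args: vec![sym("n"), Expr::Int(1)],
                            }],
                        },
                    ],
                }),
            }],
        }),
    }
}

fn main() {
    // Pretty-printed Debug output shows the nesting one level per indent.
    println!("{:#?}", factorial_ast());
}
```

Operators such as `=`, `*` and `-` appear as ordinary `Symbol` nodes in the `func` position of a `Call`; nothing in the AST distinguishes them from user-defined functions like `factorial`.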

Style notes

  • The design-decision discussion (generic vs. specific) should come before the code — readers should understand why we chose specific variants
  • Every variant should have a comment showing the corresponding MiniLisp syntax
  • The Display impl is a debugging aid; note that it is not tested for correctness beyond "it does not panic"