+++
title = "§10 Symbol Tables and Scope"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++

## §10 Symbol Tables and Scope: Stub to fill

File: `edu/src/lisp-compiler.md`, section `### 10. Symbol Tables and Scope`

Replace the stub line with full content. Target 700–900 words. Build the environment chain that represents lexical scope, then write the scope-checking traversal. Reading-heavy with moderate code.

## Learning objectives

- Understand what a symbol table is and why it is needed
- Implement an environment chain (linked scope structure) in Rust
- Write an AST traversal that resolves all symbol references
- Produce clear `SemanticError` messages for undefined names

## Content to write

### What is a symbol table?

A symbol table maps names to information about them: where they are defined, their type (if we had types), and any other metadata. Our symbol table is simple: a set of names that are currently in scope. We only need to know *whether* a name is defined; we do not need to know *what* it is (there is no type information).

### Lexical scope

In MiniLisp, scope is lexical (also called static): a name's binding is determined by the syntactic structure of the program, not by the runtime call stack. When you write `(lambda (x) x)`, `x` is in scope inside the lambda body regardless of what `x` means in the surrounding context.

### The environment chain

Represent scope as a chain of `HashSet<String>`, one per scope level. Looking up a name means searching from the innermost scope to the outermost.

```rust
use std::collections::HashSet;

/// A chain of scopes representing the current lexical environment.
pub struct Env<'a> {
    /// Names defined directly in this scope.
    names: HashSet<String>,
    /// Enclosing scope, or `None` for the global scope.
    parent: Option<&'a Env<'a>>,
}

impl<'a> Env<'a> {
    /// Creates an empty root scope with no parent.
    pub fn new() -> Self {
        Env { names: HashSet::new(), parent: None }
    }

    /// Creates an empty scope nested inside `self`.
    pub fn child(&'a self) -> Env<'a> {
        Env { names: HashSet::new(), parent: Some(self) }
    }

    /// Adds `name` to this scope.
    pub fn define(&mut self, name: &str) {
        self.names.insert(name.to_string());
    }

    /// Searches this scope first, then each enclosing scope in turn.
    pub fn is_defined(&self, name: &str) -> bool {
        self.names.contains(name)
            || self.parent.map_or(false, |p| p.is_defined(name))
    }
}
```

Explain the lifetime `'a`: the child env borrows the parent env. Because a child never outlives its parent (it goes out of scope at the closing `)` of a `lambda` or `let`), the borrow is safe.

### Pre-populating the global environment

Built-in operators and functions (`+`, `-`, `*`, `/`, `=`, `<`, `>`, `<=`, `>=`, `not`, `display`, `newline`, `error`) must be defined in the global env from the start, because they are always available without a `define`.

```rust
pub fn global_env() -> Env<'static> {
    let mut env = Env::new();
    for name in ["+", "-", "*", "/", "=", "<", ">", "<=", ">=",
                 "not", "display", "newline", "error"] {
        env.define(name);
    }
    env
}
```

### The scope-checking traversal

Walk the `Vec<Expr>` and call `check_expr` on each expression. The `check_expr` function pattern-matches on each `Expr` variant:

```rust
pub fn check_expr(expr: &Expr, env: &Env) -> Result<(), CompileError> {
    match expr {
        Expr::Symbol(name) => {
            if !env.is_defined(name) {
                return Err(CompileError::SemanticError(
                    format!("undefined symbol: `{}`", name)
                ));
            }
            Ok(())
        }
        Expr::Define { value, .. } => {
            check_expr(value, env)?;
            // Note: we don't add `name` to env here because top-level defines
            // are processed in a first pass (see below).
            Ok(())
        }
        Expr::Lambda { params, body } => {
            // Parameters live in a fresh scope nested inside `env`.
            let mut child = env.child();
            for p in params {
                child.define(p);
            }
            for e in body {
                check_expr(e, &child)?;
            }
            Ok(())
        }
        Expr::If { cond, then, else_ } => {
            check_expr(cond, env)?;
            check_expr(then, env)?;
            check_expr(else_, env)
        }
        // ...
        // Let, Begin, Call, atoms (atoms other than Symbol always pass)
        _ => Ok(())
    }
}
```

### Two-pass processing for mutual recursion

Top-level `define` forms can reference each other mutually (e.g., `even?` calling `odd?` and vice versa). A single left-to-right pass would reject the body of the first function, because the second function's name has not been registered yet when that body is checked.

Solution: a two-pass approach.

1. First pass: scan all top-level `Expr::Define` forms and add their names to the global env.
2. Second pass: check every expression with the fully populated global env.

Show this in the `analyse` entry point:

```rust
pub fn analyse(exprs: Vec<Expr>) -> Result<Vec<Expr>, CompileError> {
    let mut env = global_env();

    // First pass: register all top-level names
    for expr in &exprs {
        if let Expr::Define { name, .. } = expr {
            env.define(name);
        }
    }

    // Second pass: check all expressions
    for expr in &exprs {
        check_expr(expr, &env)?;
    }

    Ok(exprs)
}
```

### Unit tests

Test the following cases:

- an undefined symbol is rejected
- mutually recursive defines are accepted
- lambda scope is isolated (parameters are not visible outside the lambda)
- `let` bindings are in scope inside the body

## Style notes

- Motivate the environment chain before defining it. Readers who have not seen this technique before will find it conceptually elegant once explained
- The two-pass trick is a genuine insight, so give it appropriate emphasis
- Note that we return `Ok(exprs)` unchanged: the analyser is purely a checker and does not transform the AST
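
As a starting point for the four cases listed under "Unit tests", here is a minimal self-contained sketch. Everything not shown in this ticket is an assumption made for illustration: the trimmed-down `Expr` and `CompileError` definitions, the shapes of the `Let` and `Call` variants, the rule that `let` initialisers are checked in the outer scope, and the helper functions `sym`, `call`, and `def_fn`. The real AST in `lisp-compiler.md` may differ, so treat this as a template rather than the final test suite.

```rust
use std::collections::HashSet;

// Hypothetical, trimmed-down AST; the variant shapes are assumptions for this sketch.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Number(f64),
    Symbol(String),
    Define { name: String, value: Box<Expr> },
    Lambda { params: Vec<String>, body: Vec<Expr> },
    Let { bindings: Vec<(String, Expr)>, body: Vec<Expr> },
    Call { func: Box<Expr>, args: Vec<Expr> },
}
#[derive(Debug, PartialEq)]
pub enum CompileError { SemanticError(String) }

pub struct Env<'a> { names: HashSet<String>, parent: Option<&'a Env<'a>> }

impl<'a> Env<'a> {
    pub fn new() -> Self { Env { names: HashSet::new(), parent: None } }
    pub fn child(&'a self) -> Env<'a> { Env { names: HashSet::new(), parent: Some(self) } }
    pub fn define(&mut self, name: &str) { self.names.insert(name.to_string()); }
    pub fn is_defined(&self, name: &str) -> bool {
        self.names.contains(name) || self.parent.map_or(false, |p| p.is_defined(name))
    }
}

pub fn check_expr(expr: &Expr, env: &Env) -> Result<(), CompileError> {
    match expr {
        Expr::Symbol(name) if !env.is_defined(name) => {
            Err(CompileError::SemanticError(format!("undefined symbol: `{}`", name)))
        }
        Expr::Define { value, .. } => check_expr(value, env),
        Expr::Lambda { params, body } => {
            let mut child = env.child();
            for p in params { child.define(p); }
            body.iter().try_for_each(|e| check_expr(e, &child))
        }
        // Assumed `let` semantics: initialisers see only the outer scope.
        Expr::Let { bindings, body } => {
            let mut child = env.child();
            for (name, init) in bindings {
                check_expr(init, env)?;
                child.define(name);
            }
            body.iter().try_for_each(|e| check_expr(e, &child))
        }
        Expr::Call { func, args } => {
            check_expr(func, env)?;
            args.iter().try_for_each(|a| check_expr(a, env))
        }
        _ => Ok(()), // numbers and symbols that are in scope
    }
}

pub fn analyse(exprs: Vec<Expr>) -> Result<Vec<Expr>, CompileError> {
    let mut env = Env::new();
    env.define("+"); // one built-in is enough for these tests
    for expr in &exprs {
        if let Expr::Define { name, .. } = expr { env.define(name); } // first pass
    }
    for expr in &exprs { check_expr(expr, &env)?; } // second pass
    Ok(exprs)
}

// Hypothetical helpers that keep the test bodies short.
pub fn sym(s: &str) -> Expr { Expr::Symbol(s.to_string()) }
pub fn call(f: &str, args: Vec<Expr>) -> Expr { Expr::Call { func: Box::new(sym(f)), args } }
pub fn def_fn(name: &str, param: &str, body: Expr) -> Expr {
    let lambda = Expr::Lambda { params: vec![param.to_string()], body: vec![body] };
    Expr::Define { name: name.to_string(), value: Box::new(lambda) }
}

#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn undefined_symbol_rejected() {
        let err = analyse(vec![sym("y")]).unwrap_err();
        assert_eq!(err, CompileError::SemanticError("undefined symbol: `y`".into()));
    }
    #[test]
    fn mutually_recursive_defines_accepted() {
        let even = def_fn("even?", "n", call("odd?", vec![sym("n")]));
        let odd = def_fn("odd?", "n", call("even?", vec![sym("n")]));
        assert!(analyse(vec![even, odd]).is_ok());
    }
    #[test]
    fn lambda_scope_is_isolated() {
        let lambda = Expr::Lambda { params: vec!["x".into()], body: vec![sym("x")] };
        assert!(analyse(vec![lambda, sym("x")]).is_err());
    }
    #[test]
    fn let_binding_visible_in_body() {
        let bindings = vec![("a".to_string(), Expr::Number(1.0))];
        let body = vec![call("+", vec![sym("a"), sym("a")])];
        assert!(analyse(vec![Expr::Let { bindings, body }]).is_ok());
    }
}
```

Building the AST by hand keeps these tests independent of the parser, so a failure points at the analyser alone. If the real `Expr` type differs, only the helper functions should need to change.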