+++
title = "§10 Symbol Tables and Scope"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++
§10 Symbol Tables and Scope — Stub to fill
File: edu/src/lisp-compiler.md, section ### 10. Symbol Tables and Scope
Replace the stub line with full content. Target 700–900 words. Build the environment chain that represents lexical scope, then write the scope-checking traversal. Reading-heavy with moderate code.
Learning objectives
- Understand what a symbol table is and why it is needed
- Implement an environment chain (linked scope structure) in Rust
- Write an AST traversal that resolves all symbol references
- Produce clear SemanticError messages for undefined names
Content to write
What is a symbol table?
A symbol table maps names to information about them: where they are defined, their type (if the language has types), and any other metadata. MiniLisp has no type information to track, so our symbol table only needs to answer one question: is this name currently in scope? A set of names is enough.
Lexical scope
In MiniLisp, scope is lexical (also called static): a name's binding is determined by the syntactic structure of the program, not by the runtime call stack. When you write (lambda (x) x), x is in scope inside the lambda body regardless of what x means in the surrounding context.
The environment chain
Represent scope as a chain of HashSet<String>, one per scope level. Looking up a name means searching from innermost to outermost.
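use std::collections::HashSet;

/// A chain of scopes representing the current lexical environment.
pub struct Env<'a> {
    /// Names defined directly in this scope.
    names: HashSet<String>,
    /// The enclosing scope, or `None` for the global scope.
    parent: Option<&'a Env<'a>>,
}

impl<'a> Env<'a> {
    /// Creates an empty root scope with no parent.
    pub fn new() -> Self {
        Env { names: HashSet::new(), parent: None }
    }

    /// Creates an empty scope nested inside `self`.
    pub fn child(&'a self) -> Env<'a> {
        Env { names: HashSet::new(), parent: Some(self) }
    }

    /// Adds `name` to this scope only.
    pub fn define(&mut self, name: &str) {
        self.names.insert(name.to_string());
    }

    /// Searches this scope first, then each enclosing scope in turn.
    pub fn is_defined(&self, name: &str) -> bool {
        self.names.contains(name) || self.parent.map_or(false, |p| p.is_defined(name))
    }
}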
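Explain the lifetime 'a: a child Env holds a shared reference to its parent, so the borrow checker requires the parent to outlive the child. That always holds here, because a child is created when the traversal enters a lambda or let and is dropped when the traversal leaves that form (its closing parenthesis in the source), while the parent is still alive further up the call stack.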
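A small sketch may help when explaining the borrow. It repeats the Env definition so that it compiles on its own; scope_demo is a hypothetical helper introduced only for illustration. It models checking a lambda with parameter x under a global scope that already defines y.

```rust
use std::collections::HashSet;

/// Same shape as the `Env` above, repeated so the sketch compiles on its own.
pub struct Env<'a> {
    names: HashSet<String>,
    parent: Option<&'a Env<'a>>,
}

impl<'a> Env<'a> {
    pub fn new() -> Self {
        Env { names: HashSet::new(), parent: None }
    }
    pub fn child(&'a self) -> Env<'a> {
        Env { names: HashSet::new(), parent: Some(self) }
    }
    pub fn define(&mut self, name: &str) {
        self.names.insert(name.to_string());
    }
    pub fn is_defined(&self, name: &str) -> bool {
        self.names.contains(name) || self.parent.map_or(false, |p| p.is_defined(name))
    }
}

/// Hypothetical helper (not part of the tutorial code).
/// Returns (x visible inside, y visible inside, x visible outside).
pub fn scope_demo() -> (bool, bool, bool) {
    let mut global = Env::new();
    global.define("y");

    let (x_inside, y_inside) = {
        // `inner` borrows `global` immutably for as long as it lives.
        let mut inner = global.child();
        inner.define("x");
        (inner.is_defined("x"), inner.is_defined("y"))
    }; // `inner` is dropped here, which ends the borrow of `global`.

    // With the borrow over, `global` can be mutated again.
    global.define("z");
    (x_inside, y_inside, global.is_defined("x"))
}
```

Inside the inner scope both x and y resolve (y through the parent link); once the inner scope is gone, x is no longer visible from the global scope.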
Pre-populating the global environment
Built-in operators and functions (+, -, *, /, =, <, >, <=, >=, not, display, newline, error) must be defined in the global env from the start — they are always available without a define.
pub fn global_env() -> Env<'static> {
    let mut env = Env::new();
    for name in ["+", "-", "*", "/", "=", "<", ">", "<=", ">=",
                 "not", "display", "newline", "error"] {
        env.define(name);
    }
    env
}
The scope-checking traversal
Walk the Vec<Expr> and call check_expr on each. The check_expr function pattern-matches on each Expr variant:
pub fn check_expr(expr: &Expr, env: &Env) -> Result<(), CompileError> {
    match expr {
        Expr::Symbol(name) => {
            if !env.is_defined(name) {
                return Err(CompileError::SemanticError(
                    format!("undefined symbol: `{}`", name)
                ));
            }
            Ok(())
        }
        Expr::Define { name, value } => {
            check_expr(value, env)?;
            // Note: we don't add `name` to env here because top-level defines
            // are processed in a first pass (see below).
            Ok(())
        }
        Expr::Lambda { params, body } => {
            let mut child = env.child();
            for p in params { child.define(p); }
            for e in body { check_expr(e, &child)?; }
            Ok(())
        }
        Expr::If { cond, then, else_ } => {
            check_expr(cond, env)?;
            check_expr(then, env)?;
            check_expr(else_, env)
        }
        // ... Let, Begin, Call, atoms (atoms other than Symbol always pass)
        _ => Ok(())
    }
}
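The elided Let, Begin and Call arms could look roughly like the sketch below. The variant shapes (bindings as a Vec of name/initialiser pairs, Begin as a tuple variant, Call with func and args) and the Number atom are assumptions made for this example, not the tutorial's actual AST, and the Let arm assumes plain let semantics, where initialisers are checked in the outer scope and only the body sees the new names.

```rust
use std::collections::HashSet;

// Hypothetical, trimmed-down AST: the shapes of Let, Begin and Call are
// assumptions for this sketch, not the tutorial's definitions.
#[derive(Debug)]
pub enum Expr {
    Number(i64),
    Symbol(String),
    Let { bindings: Vec<(String, Expr)>, body: Vec<Expr> },
    Begin(Vec<Expr>),
    Call { func: Box<Expr>, args: Vec<Expr> },
}

#[derive(Debug, PartialEq)]
pub enum CompileError {
    SemanticError(String),
}

pub struct Env<'a> {
    names: HashSet<String>,
    parent: Option<&'a Env<'a>>,
}

impl<'a> Env<'a> {
    pub fn new() -> Self {
        Env { names: HashSet::new(), parent: None }
    }
    pub fn child(&'a self) -> Env<'a> {
        Env { names: HashSet::new(), parent: Some(self) }
    }
    pub fn define(&mut self, name: &str) {
        self.names.insert(name.to_string());
    }
    pub fn is_defined(&self, name: &str) -> bool {
        self.names.contains(name) || self.parent.map_or(false, |p| p.is_defined(name))
    }
}

pub fn check_expr(expr: &Expr, env: &Env) -> Result<(), CompileError> {
    match expr {
        Expr::Symbol(name) => {
            if env.is_defined(name) {
                Ok(())
            } else {
                Err(CompileError::SemanticError(format!("undefined symbol: `{}`", name)))
            }
        }
        Expr::Let { bindings, body } => {
            // Assumed plain `let`: initialisers see only the outer scope.
            for (_, init) in bindings {
                check_expr(init, env)?;
            }
            // The bound names are visible in the body only.
            let mut child = env.child();
            for (name, _) in bindings {
                child.define(name);
            }
            for e in body {
                check_expr(e, &child)?;
            }
            Ok(())
        }
        Expr::Begin(exprs) => {
            // A sequence introduces no new scope.
            for e in exprs {
                check_expr(e, env)?;
            }
            Ok(())
        }
        Expr::Call { func, args } => {
            // The callee is an ordinary expression, so it is checked too.
            check_expr(func, env)?;
            for a in args {
                check_expr(a, env)?;
            }
            Ok(())
        }
        // Atoms other than Symbol always pass.
        Expr::Number(_) => Ok(()),
    }
}
```

Matching every variant explicitly, rather than ending with a catch-all arm, means the compiler flags any variant added later that the checker does not yet handle.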
Two-pass processing for mutual recursion
Top-level define forms can reference each other mutually (e.g., even? calling odd? and vice versa). A single left-to-right pass would reject the first function, because its body refers to the second, which has not been defined yet.
Solution: a two-pass approach.
- First pass: scan all top-level Expr::Define forms and add their names to the global env.
- Second pass: check every expression with the fully-populated global env.
Show this in the analyse entry point:
pub fn analyse(exprs: Vec<Expr>) -> Result<Vec<Expr>, CompileError> {
    let mut env = global_env();
    // First pass: register all top-level names
    for expr in &exprs {
        if let Expr::Define { name, .. } = expr {
            env.define(name);
        }
    }
    // Second pass: check all expressions
    for expr in &exprs {
        check_expr(expr, &env)?;
    }
    Ok(exprs)
}
Unit tests
Test: undefined symbol rejected, mutually recursive defines accepted, lambda scope is isolated, let bindings are in scope inside body.
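One possible shape for three of these tests is sketched below, against a trimmed copy of the analyser so that it compiles on its own. The Call variant, the helpers sym, call and define_fn, and the single built-in "not" standing in for the full global environment are all assumptions made for this example. The let test is left out because it depends on how Expr::Let is defined.

```rust
use std::collections::HashSet;

// Trimmed AST for the sketch; the `Call` shape is an assumption.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Symbol(String),
    Define { name: String, value: Box<Expr> },
    Lambda { params: Vec<String>, body: Vec<Expr> },
    Call { func: Box<Expr>, args: Vec<Expr> },
}

#[derive(Debug, PartialEq)]
pub enum CompileError {
    SemanticError(String),
}

pub struct Env<'a> {
    names: HashSet<String>,
    parent: Option<&'a Env<'a>>,
}

impl<'a> Env<'a> {
    pub fn new() -> Self { Env { names: HashSet::new(), parent: None } }
    pub fn child(&'a self) -> Env<'a> { Env { names: HashSet::new(), parent: Some(self) } }
    pub fn define(&mut self, name: &str) { self.names.insert(name.to_string()); }
    pub fn is_defined(&self, name: &str) -> bool {
        self.names.contains(name) || self.parent.map_or(false, |p| p.is_defined(name))
    }
}

pub fn check_expr(expr: &Expr, env: &Env) -> Result<(), CompileError> {
    match expr {
        Expr::Symbol(name) if env.is_defined(name) => Ok(()),
        Expr::Symbol(name) => {
            Err(CompileError::SemanticError(format!("undefined symbol: `{}`", name)))
        }
        Expr::Define { value, .. } => check_expr(value, env),
        Expr::Lambda { params, body } => {
            let mut child = env.child();
            for p in params { child.define(p); }
            for e in body { check_expr(e, &child)?; }
            Ok(())
        }
        Expr::Call { func, args } => {
            check_expr(func, env)?;
            for a in args { check_expr(a, env)?; }
            Ok(())
        }
    }
}

pub fn analyse(exprs: Vec<Expr>) -> Result<Vec<Expr>, CompileError> {
    let mut env = Env::new();
    env.define("not"); // stand-in for the full built-in list
    for expr in &exprs {
        if let Expr::Define { name, .. } = expr { env.define(name); }
    }
    for expr in &exprs { check_expr(expr, &env)?; }
    Ok(exprs)
}

// Hypothetical helpers that keep the test programs short.
pub fn sym(s: &str) -> Expr { Expr::Symbol(s.to_string()) }
pub fn call(f: &str, arg: &str) -> Expr {
    Expr::Call { func: Box::new(sym(f)), args: vec![sym(arg)] }
}
/// Builds `(define name (lambda (param) body))`.
pub fn define_fn(name: &str, param: &str, body: Expr) -> Expr {
    let lambda = Expr::Lambda { params: vec![param.to_string()], body: vec![body] };
    Expr::Define { name: name.to_string(), value: Box::new(lambda) }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn undefined_symbol_rejected() {
        let err = analyse(vec![sym("nope")]).unwrap_err();
        assert_eq!(err, CompileError::SemanticError("undefined symbol: `nope`".to_string()));
    }

    #[test]
    fn mutually_recursive_defines_accepted() {
        // (define even? (lambda (n) (odd? n))) (define odd? (lambda (n) (even? n)))
        let prog = vec![
            define_fn("even?", "n", call("odd?", "n")),
            define_fn("odd?", "n", call("even?", "n")),
        ];
        assert!(analyse(prog).is_ok());
    }

    #[test]
    fn lambda_params_do_not_leak() {
        // (define f (lambda (x) (not x))) followed by a bare top-level x
        let prog = vec![define_fn("f", "x", call("not", "x")), sym("x")];
        assert!(analyse(prog).is_err());
    }
}
```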
Style notes
- Motivate the environment chain before defining it — readers who have not seen this technique before will find it conceptually elegant once explained
- The two-pass trick is a genuine insight — give it appropriate emphasis
- Note that we return Ok(exprs) unchanged: the analyser is purely a checker; it does not transform the AST