title: §8 Parsing Atoms with nom
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:02Z
updated_at: 2026-03-10T23:30:02Z

§8 Parsing Atoms with nom — Stub to fill

File: edu/src/lisp-compiler.md, section ### 8. Parsing Atoms with nom

Replace the stub line with full content. Target 600 to 800 words. This section takes the individual atom parsers from §6 and the AST from §7 and combines them into a single parse_atom function that returns IResult<&str, Expr>. Include tests.

Learning objectives

  • Combine individual atom parsers into a single alt using map to produce Expr values
  • Understand how to add src/parser.rs to the project properly
  • Write comprehensive unit tests for atom parsing
  • Handle the ordering constraint in alt: integers before symbols

Content to write

The parse_atom function

In src/parser.rs, next to the atom parsers from §6, import the Expr type from src/ast.rs, then combine the parsers:

use nom::{IResult, branch::alt, combinator::map};
use crate::ast::Expr;

/// Parse any MiniLisp atom: integer, boolean, string, or symbol.
pub fn parse_atom(input: &str) -> IResult<&str, Expr> {
    alt((
        map(parse_integer, Expr::Int),
        map(parse_bool,    Expr::Bool),
        map(parse_string,  Expr::Str),
        map(parse_symbol,  |s: &str| Expr::Symbol(s.to_string())),
    ))(input)
}
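
The map calls above work because a tuple variant such as Expr::Int is itself a function from its payload to Expr. The following is a minimal, dependency-free sketch of that idea, not the project's code: the Expr enum is a stand-in (the i64 payload is an assumption), and lift_int, lift_symbol, and lift_result are hypothetical helpers that only imitate what nom's map does to a successful result.

```rust
// Stand-in for the Expr type from src/ast.rs. Only the atom variants used
// in this section are reproduced, and the i64 payload is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Int(i64),
    Bool(bool),
    Str(String),
    Symbol(String),
}

/// A tuple variant is an ordinary function: `Expr::Int` coerces to
/// `fn(i64) -> Expr`, which is why it can be passed to `map` directly.
pub fn lift_int(n: i64) -> Expr {
    let ctor: fn(i64) -> Expr = Expr::Int;
    ctor(n)
}

/// `Expr::Symbol` wants an owned `String`, so a borrowed `&str` needs a
/// closure that allocates first. This is the closure used for symbols.
pub fn lift_symbol(s: &str) -> Expr {
    let ctor = |s: &str| Expr::Symbol(s.to_string());
    ctor(s)
}

/// Hypothetical helper: rewrite the value half of a successful parse and
/// leave the remaining input untouched; failures pass through unchanged.
pub fn lift_result<'a, T>(
    parsed: Option<(&'a str, T)>,
    ctor: impl Fn(T) -> Expr,
) -> Option<(&'a str, Expr)> {
    parsed.map(|(rest, value)| (rest, ctor(value)))
}
```

In this sketch, lift_result(Some((" rest", 7)), Expr::Int) yields Some((" rest", Expr::Int(7))), which is the shape of result the integer branch of parse_atom produces.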

Explain the ordering:

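  1. Integer before symbol: -7 must parse as an integer, not as a symbol starting with -. alt tries its branches in order and returns the first success, so parse_integer consumes the full -7 before parse_symbol is ever tried. A lone - has no digits, so parse_integer fails and alt falls through to parse_symbol.
  2. Boolean before symbol: #t and #f are not valid symbols (# is not a symbol-start character), so the relative order of these two does not change the result. Trying booleans first simply reads more cleanly.
  3. String position is free: strings start with ", which no other atom can start with, so parse_string can sit anywhere in the list. Here it comes third, just before the symbol fallback.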
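
The effect of the ordering can be seen in a small model that does not use nom. The sketch below hand-rolls two toy parsers and combines them in both orders. The Atom type, the Parsed alias, and the symbol grammar (a letter or one of +-*/<>=!?_ to start, digits allowed afterwards) are assumptions made for illustration, not the definitions from §6.

```rust
#[derive(Debug, PartialEq)]
pub enum Atom {
    Int(i64),
    Symbol(String),
}

/// Remaining input plus the parsed value, or None on failure.
type Parsed<'a> = Option<(&'a str, Atom)>;

/// Optional '-' followed by at least one ASCII digit.
pub fn integer(input: &str) -> Parsed<'_> {
    let digits_start = if input.starts_with('-') { 1 } else { 0 };
    let digits_len = input[digits_start..]
        .bytes()
        .take_while(|b| b.is_ascii_digit())
        .count();
    if digits_len == 0 {
        return None; // a lone "-" is not an integer
    }
    let end = digits_start + digits_len;
    let value = input[..end].parse::<i64>().ok()?;
    Some((&input[end..], Atom::Int(value)))
}

// Assumed symbol grammar, for illustration only.
fn is_symbol_start(c: char) -> bool {
    c.is_ascii_alphabetic() || "+-*/<>=!?_".contains(c)
}

pub fn symbol(input: &str) -> Parsed<'_> {
    let first = input.chars().next()?;
    if !is_symbol_start(first) {
        return None;
    }
    // The symbol ends at the first character that cannot continue it.
    let end = input
        .find(|c: char| !(is_symbol_start(c) || c.is_ascii_digit()))
        .unwrap_or(input.len());
    Some((&input[end..], Atom::Symbol(input[..end].to_string())))
}

/// First match wins: integer is tried before symbol.
pub fn integer_first(input: &str) -> Parsed<'_> {
    integer(input).or_else(|| symbol(input))
}

/// The wrong order: symbol swallows "-7" before integer gets a chance.
pub fn symbol_first(input: &str) -> Parsed<'_> {
    symbol(input).or_else(|| integer(input))
}
```

Under these assumptions, integer_first("-7") gives Atom::Int(-7), while symbol_first("-7") gives Atom::Symbol("-7"). With either order, a lone - still comes out as a symbol, because the integer parser rejects it.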

Module organisation

Show the complete src/parser.rs header at this point:

//! Parser for MiniLisp source code.
//!
//! Entry point: [`parse`] which accepts a `&str` and returns `Vec<Expr>`.

use nom::{
    IResult,
    branch::alt,
    bytes::complete::{escaped_transform, is_not, tag, take_while, take_while_m_n},
    character::complete::{char, digit1, multispace0, line_ending},
    combinator::{map, map_res, opt, recognize, value},
    sequence::{delimited, pair},
};

use crate::ast::Expr;

Whitespace-aware atom parser

Wrap parse_atom in the ws combinator so callers do not have to think about surrounding whitespace:

pub fn parse_atom_ws(input: &str) -> IResult<&str, Expr> {
    ws(parse_atom)(input)
}
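
For readers who have not seen a wrapper combinator before, here is a dependency-free sketch of the shape ws might take: a function that accepts a parser and returns a new parser. It is not the ws from the project. It assumes ws strips whitespace on both sides, uses Option instead of IResult, and uses a toy digits parser as the inner parser. Note that trim_start removes any Unicode whitespace, which is broader than nom's multispace0.

```rust
/// Wrap `inner` so that it skips whitespace before and after its match.
/// Hypothetical stand-in for the project's `ws` combinator.
pub fn ws<'a, T>(
    inner: impl Fn(&'a str) -> Option<(&'a str, T)>,
) -> impl Fn(&'a str) -> Option<(&'a str, T)> {
    move |input: &'a str| {
        // Skip leading whitespace, run the inner parser, then skip
        // whatever whitespace follows the match.
        let (rest, value) = inner(input.trim_start())?;
        Some((rest.trim_start(), value))
    }
}

/// Toy inner parser: one or more ASCII digits.
pub fn digits(input: &str) -> Option<(&str, u32)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    // An empty digit run fails to parse, so this returns None for it.
    let value = input[..end].parse::<u32>().ok()?;
    Some((&input[end..], value))
}
```

In this sketch, ws(digits)("  42  ") returns Some(("", 42)): the caller sees neither the leading nor the trailing spaces, which is the convenience parse_atom_ws is meant to provide.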

Unit tests

Write a #[cfg(test)] module in src/parser.rs testing every atom type with multiple cases:

#[cfg(test)]
mod tests {
    use super::*;
    use crate::ast::Expr;

    #[test]
    fn test_integer_atom() {
        assert_eq!(parse_atom("42"), Ok(("", Expr::Int(42))));
        assert_eq!(parse_atom("-7 "), Ok((" ", Expr::Int(-7))));
        assert_eq!(parse_atom("0"), Ok(("", Expr::Int(0))));
    }

    #[test]
    fn test_bool_atom() {
        assert_eq!(parse_atom("#t"), Ok(("", Expr::Bool(true))));
        assert_eq!(parse_atom("#f"), Ok(("", Expr::Bool(false))));
    }

    #[test]
    fn test_string_atom() {
        assert_eq!(parse_atom(r#""hello""#), Ok(("", Expr::Str("hello".into()))));
        assert_eq!(parse_atom(r#""a\nb""#), Ok(("", Expr::Str("a\nb".into()))));
    }

    #[test]
    fn test_symbol_atom() {
        assert_eq!(parse_atom("my-var"), Ok(("", Expr::Symbol("my-var".into()))));
        assert_eq!(parse_atom("+"), Ok(("", Expr::Symbol("+".into()))));
        assert_eq!(parse_atom("factorial rest"), Ok((" rest", Expr::Symbol("factorial".into()))));
    }

    #[test]
    fn test_negative_integer_vs_symbol() {
        // -7 must be an integer, not a symbol
        assert_eq!(parse_atom("-7"), Ok(("", Expr::Int(-7))));
        // lone - is a symbol
        assert_eq!(parse_atom("- "), Ok((" ", Expr::Symbol("-".into()))));
    }
}

Run the tests

cargo test parser

All tests should pass before proceeding to §9.

Style notes

  • The ordering section is the most important teaching moment here — make it explicit
  • Show how map is used to lift a primitive value into an Expr variant
  • The test for -7 vs - (lone minus) is critical — flag it as something to get right