+++
title = "§12 The C Runtime Preamble"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++

§12 The C Runtime Preamble — Stub to fill

File: edu/src/lisp-compiler.md, section ### 12. The C Runtime Preamble

Replace the stub line with full content. Target 500-700 words. Design and write the C preamble string. Explain every line so the reader understands what the generated C file contains before their code begins.

Learning objectives

  • Understand what the preamble provides and why each part is needed
  • Know the C type system used for MiniLisp values
  • See the complete runtime helper functions for display, newline, error
  • Understand name mangling: why all generated names are prefixed with ml_

Content to write

What the preamble does

Every MiniLisp program compiles to a single C file. The preamble is a fixed block of C text emitted at the top of that file before any user-defined code. It provides:

  1. Standard library includes
  2. Type definitions
  3. Boolean constants
  4. Runtime helper functions for built-ins

The complete preamble

Present as a Rust const string:

pub const PREAMBLE: &str = r#"#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

/* MiniLisp runtime types */
typedef int64_t  ml_int;
typedef int      ml_bool;
typedef char*    ml_str;

#define ML_TRUE  1
#define ML_FALSE 0

/* ml_display: print a value to stdout */
static void ml_display_int(ml_int v)  { printf("%lld", (long long)v); } /* int64_t is not long on every platform */
static void ml_display_bool(ml_bool v) { printf("%s", v ? "true" : "false"); }
static void ml_display_str(ml_str v)  { printf("%s", v); }

/* ml_newline: print a newline */
static void ml_newline(void) { printf("\n"); }

/* ml_error: print a message to stderr and exit */
static void ml_error(ml_str msg) {
    fprintf(stderr, "error: %s\n", msg);
    exit(1);
}
"#;

Explain each section:

Type definitions. ml_int is int64_t (a 64-bit signed integer). ml_bool is int: C only gained a native boolean type in C99 (_Bool, via <stdbool.h>), and a plain int keeps the preamble simple; 0 is false, non-zero is true, and we use 1 for true. ml_str is char*. The ml_ prefix makes collisions with C standard library names unlikely.

Boolean constants. ML_TRUE and ML_FALSE are #define macros. Arithmetic operations on booleans (e.g., (+ #t 1)) are undefined in MiniLisp but will silently work in the generated C — this is acceptable for a minimal compiler.

Display functions. There are three variants because C has no function overloading: one function name cannot accept several argument types. The code generator picks the variant based on the expression being displayed; since we have no type inference, it may emit ml_display_int for every expression that is not a string or boolean. Note: this design decision belongs to §13; here, only mention that the code generator chooses the variant.
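To make the variant choice concrete, here is a minimal sketch of how a code generator might pick a display helper. The `Expr` enum and the `display_helper` function are hypothetical stand-ins for illustration; the real expression type and the actual selection logic belong to §13 and may look different.

```rust
/// Hypothetical, simplified expression type (an assumption for this sketch;
/// the compiler's real `Expr` is defined elsewhere).
#[allow(dead_code)]
#[derive(Debug)]
enum Expr {
    Int(i64),
    Bool(bool),
    Str(String),
    Symbol(String),
}

/// Return the name of the preamble helper to call for `(display expr)`.
/// Without type inference, everything that is not a string or boolean
/// literal falls back to `ml_display_int`.
fn display_helper(expr: &Expr) -> &'static str {
    match expr {
        Expr::Str(_) => "ml_display_str",
        Expr::Bool(_) => "ml_display_bool",
        Expr::Int(_) | Expr::Symbol(_) => "ml_display_int",
    }
}
```

Under these assumptions, a string literal selects `ml_display_str`, while a bare symbol such as `x` selects `ml_display_int` regardless of what `x` holds at run time, which is exactly the simplification the section should call out.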

ml_error. Writes to stderr and calls exit(1). The message is always a string literal in MiniLisp.

static linkage. The helper functions are declared static to prevent symbol conflicts if the generated C file is ever linked with other files.

Name mangling

All generated C identifiers for user-defined symbols are prefixed with ml_ and have hyphens replaced with underscores (since - is not valid in a C identifier). For example:

  • factorial → ml_factorial
  • my-var → ml_my_var
  • even? → ml_even_p (using _p suffix for ?)
  • set! → ml_set_e (using _e suffix for !)

Present the mangling rules as a table and show the Rust function that performs the translation:

/// Mangle a MiniLisp symbol name into a valid C identifier.
pub fn mangle(name: &str) -> String {
    let mut result = String::from("ml_");
    for c in name.chars() {
        match c {
            '-' => result.push('_'),
            '?' => result.push_str("_p"),
            '!' => result.push_str("_e"),
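            // Only ASCII letters, digits, and '_' pass through unchanged;
            // non-ASCII letters are not portable in C identifiers.
            c if c.is_ascii_alphanumeric() || c == '_' => result.push(c),
            // Anything else becomes '_' plus its code point in hex, e.g. '*' -> "_2a".
            _ => result.push_str(&format!("_{:x}", c as u32)),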
        }
    }
    result
}

Explain: this function is used throughout codegen.rs whenever a symbol reference or definition needs to be emitted.
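The rules can be checked against a few sample symbols. The sketch below repeats `mangle` so that it stands alone; restricting the pass-through arm to ASCII is an assumption made here for portability. `mangling_row` is a hypothetical helper that formats one row for the table.

```rust
/// Mangle a MiniLisp symbol name into a C identifier.
/// Assumption in this sketch: only ASCII alphanumerics pass through unchanged.
pub fn mangle(name: &str) -> String {
    let mut result = String::from("ml_");
    for c in name.chars() {
        match c {
            '-' => result.push('_'),
            '?' => result.push_str("_p"),
            '!' => result.push_str("_e"),
            c if c.is_ascii_alphanumeric() || c == '_' => result.push(c),
            // Fallback: '_' followed by the code point in lowercase hex.
            _ => result.push_str(&format!("_{:x}", c as u32)),
        }
    }
    result
}

/// Hypothetical helper: format one "symbol -> identifier" table row.
pub fn mangling_row(name: &str) -> String {
    format!("{} -> {}", name, mangle(name))
}
```

With this definition, `mangling_row("even?")` yields `even? -> ml_even_p`, and the operator `*` takes the fallback arm and becomes `ml__2a`. One caveat worth mentioning in the section: the mapping is not injective, since `my-var` and `my_var` both mangle to `ml_my_var`.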

Emitting the preamble

In codegen.rs:

pub fn generate(exprs: Vec<Expr>) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);
    // ... rest of generation in §13-15
    out
}
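To show what a finished file could look like, here is a rough sketch that places a hand-written `main` after the preamble. The shortened `PREAMBLE` and the `emit_program` function are assumptions for illustration only; how the body is actually generated is the subject of §13-15 and may differ.

```rust
/// Shortened stand-in for the real preamble, so the sketch is self-contained.
const PREAMBLE: &str = r#"#include <stdio.h>
#include <stdint.h>

typedef int64_t ml_int;

static void ml_display_int(ml_int v) { printf("%lld", (long long)v); }
static void ml_newline(void) { printf("\n"); }
"#;

/// Hypothetical: emit the preamble, then wrap already-generated C
/// statements in a `main` function.
fn emit_program(body_statements: &[&str]) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);
    out.push_str("\nint main(void) {\n");
    for stmt in body_statements {
        out.push_str("    ");
        out.push_str(stmt);
        out.push('\n');
    }
    out.push_str("    return 0;\n}\n");
    out
}
```

Calling `emit_program(&["ml_display_int(42);", "ml_newline();"])` produces a C file whose first bytes are the preamble, which is the property this section wants the reader to take away: every helper is defined before any user code refers to it.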

Style notes

  • Show the full preamble string as a verbatim block — readers need to see exactly what gets emitted
  • The name-mangling table is the most referenceable thing in this section; present it prominently
  • Note that the display overloading decision (three variants) is a simplification — a real Lisp runtime would use tagged unions or polymorphism