---
# edu-4kkb
title: §12 The C Runtime Preamble
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:00Z
updated_at: 2026-03-10T23:30:00Z
---
## §12 The C Runtime Preamble — Stub to fill
File: `edu/src/lisp-compiler.md`, section `### 12. The C Runtime Preamble`
Replace the stub line with full content. Target 500–700 words. Design and write the C preamble string. Explain every line so the reader understands what the generated C file contains before their code begins.
## Learning objectives
- Understand what the preamble provides and why each part is needed
- Know the C type system used for MiniLisp values
- See the complete runtime helper functions for `display`, `newline`, `error`
- Understand name mangling: why all generated names are prefixed with `ml_`
## Content to write
### What the preamble does
Every MiniLisp program compiles to a single C file. The preamble is a fixed block of C text emitted at the top of that file before any user-defined code. It provides:
1. Standard library includes
2. Type definitions
3. Boolean constants
4. Runtime helper functions for built-ins
### The complete preamble
Present as a Rust `const` string:
```rust
pub const PREAMBLE: &str = r#"#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
/* MiniLisp runtime types */
typedef int64_t ml_int;
typedef int ml_bool;
typedef char* ml_str;
#define ML_TRUE 1
#define ML_FALSE 0
/* ml_display: print a value to stdout */
static void ml_display_int(ml_int v) { printf("%lld", (long long)v); }
static void ml_display_bool(ml_bool v) { printf("%s", v ? "true" : "false"); }
static void ml_display_str(ml_str v) { printf("%s", v); }
/* ml_newline: print a newline */
static void ml_newline(void) { printf("\n"); }
/* ml_error: print a message to stderr and exit */
static void ml_error(ml_str msg) {
    fprintf(stderr, "error: %s\n", msg);
    exit(1);
}
"#;
```
Explain each section:
**Type definitions.** `ml_int` is `int64_t` (64-bit signed integer). `ml_bool` is `int`: the preamble uses a plain `int` rather than C99's `_Bool` from `<stdbool.h>`, with 0 as false and any non-zero value as true (we use 1 for true). `ml_str` is `char*`. The `ml_` prefix makes name collisions with C standard library names unlikely.
**Boolean constants.** `ML_TRUE` and `ML_FALSE` are `#define` macros. Arithmetic operations on booleans (e.g., `(+ #t 1)`) are undefined in MiniLisp but will silently work in the generated C — this is acceptable for a minimal compiler.
**Display functions.** There are three variants because C has no function overloading and MiniLisp values carry no runtime type tag to dispatch on. The code generator picks the right variant based on the expression being displayed (or, since we have no type inference, it may emit `ml_display_int` for all non-string, non-bool expressions). Note: this design decision belongs to §13; here, mention only that the code generator chooses the variant.
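As a rough illustration of "the code generator chooses the variant", the sketch below picks a helper from the kind of literal being displayed. The `Literal` enum and `emit_display` function are hypothetical names invented for this example; the real AST and the actual selection logic belong to §13.

```rust
/// Hypothetical literal kinds, for illustration only.
/// The compiler's real AST type is defined elsewhere.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
    Bool(bool),
    Str(String),
}

/// Emit a C statement that displays a literal using the matching
/// preamble helper (`ml_display_int`, `ml_display_bool`, `ml_display_str`).
pub fn emit_display(lit: &Literal) -> String {
    match lit {
        Literal::Int(n) => format!("ml_display_int({});", n),
        Literal::Bool(b) => {
            // Booleans are emitted through the preamble's macros.
            let c = if *b { "ML_TRUE" } else { "ML_FALSE" };
            format!("ml_display_bool({});", c)
        }
        // Assumption: the string contains nothing that needs escaping in C.
        Literal::Str(s) => format!("ml_display_str(\"{}\");", s),
    }
}
```

For instance, `emit_display(&Literal::Bool(true))` yields the C statement `ml_display_bool(ML_TRUE);`.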
**`ml_error`.** Writes to stderr and calls `exit(1)`. The message is always a string literal in MiniLisp.
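Because the message reaches C as a string literal, the code generator has to quote and escape it at some point. The following is a minimal sketch of one way to do that; `c_string_literal` and `emit_error` are hypothetical helpers, not functions the compiler is known to define, and only a few common escapes are handled.

```rust
/// Render a string as a C string literal, surrounding quotes included.
/// Hypothetical helper: handles backslash, double quote, newline and tab only.
pub fn c_string_literal(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\t' => out.push_str("\\t"),
            _ => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Emit a call to the preamble's `ml_error` with a literal message.
pub fn emit_error(msg: &str) -> String {
    format!("ml_error({});", c_string_literal(msg))
}
```

With these helpers, `emit_error("bad input")` produces `ml_error("bad input");`.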
**`static` linkage.** The helper functions are declared `static` to prevent symbol conflicts if the generated C file is ever linked with other files.
### Name mangling
All generated C identifiers for user-defined symbols are prefixed with `ml_` and have hyphens replaced with underscores (since `-` is not valid in a C identifier). For example:
- `factorial` → `ml_factorial`
- `my-var` → `ml_my_var`
- `even?` → `ml_even_p` (using `_p` suffix for `?`)
- `set!` → `ml_set_e` (using `_e` suffix for `!`)
Present the mangling rules as a table and show the Rust function that performs the translation:
```rust
/// Mangle a MiniLisp symbol name into a valid C identifier.
pub fn mangle(name: &str) -> String {
    let mut result = String::from("ml_");
    for c in name.chars() {
        match c {
            '-' => result.push('_'),
            '?' => result.push_str("_p"),
            '!' => result.push_str("_e"),
            // ASCII only: non-ASCII letters are not portable in C identifiers.
            c if c.is_ascii_alphanumeric() || c == '_' => result.push(c),
            // Anything else becomes `_` plus its code point in hex.
            _ => result.push_str(&format!("_{:x}", c as u32)),
        }
    }
    result
}
```
Explain: this function is used throughout `codegen.rs` whenever a symbol reference or definition needs to be emitted.
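The rules can be checked against the examples listed earlier. The sketch below repeats the function so that it is self-contained, using an ASCII-only alphanumeric check (an assumption made here to stay within basic C identifier characters), and adds a hypothetical `mangle_all` helper purely for demonstration.

```rust
/// Self-contained copy of the mangling function for this sketch.
pub fn mangle(name: &str) -> String {
    let mut result = String::from("ml_");
    for c in name.chars() {
        match c {
            '-' => result.push('_'),
            '?' => result.push_str("_p"),
            '!' => result.push_str("_e"),
            c if c.is_ascii_alphanumeric() || c == '_' => result.push(c),
            // Fallback: underscore followed by the code point in hex.
            _ => result.push_str(&format!("_{:x}", c as u32)),
        }
    }
    result
}

/// Hypothetical helper: mangle several names at once, pairing each
/// source symbol with its C identifier.
pub fn mangle_all(names: &[&str]) -> Vec<(String, String)> {
    names
        .iter()
        .map(|n| (n.to_string(), mangle(n)))
        .collect()
}
```

Operator symbols go through the hex fallback, so `+` (U+002B) becomes `ml__2b`. The scheme is not injective: `a-b` and `a_b` both mangle to `ml_a_b`, and a user symbol such as `display-int` would mangle to the same name as the preamble helper `ml_display_int`, so the text may want to acknowledge that limitation.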
### Emitting the preamble
In `codegen.rs`:
```rust
pub fn generate(exprs: Vec<Expr>) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);
    // ... rest of generation in §13–15
    out
}
```
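To make the ordering concrete, here is a small self-contained sketch in which the preamble is emitted first and everything else follows. It uses a shortened stand-in for `PREAMBLE` and a placeholder `Expr` type; wrapping the generated code in `main` and emitting marker comments are assumptions made only for illustration, since the real generation is covered in later sections.

```rust
/// Shortened stand-in for the full preamble string, to keep the sketch small.
pub const PREAMBLE: &str = "#include <stdio.h>\n";

/// Placeholder AST node; the real `Expr` type comes from the parser.
pub struct Expr;

/// Emit the preamble, then a `main` function containing one marker
/// comment per top-level expression (assumed layout, not the real codegen).
pub fn generate(exprs: Vec<Expr>) -> String {
    let mut out = String::new();
    out.push_str(PREAMBLE);
    out.push_str("int main(void) {\n");
    for _expr in &exprs {
        // Real code generation would translate the expression here.
        out.push_str("    /* expression */\n");
    }
    out.push_str("    return 0;\n}\n");
    out
}
```

Whatever the later stages emit, the output of `generate` always begins with the preamble text, which is the property this section relies on.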
## Style notes
- Show the full preamble string as a verbatim block — readers need to see exactly what gets emitted
- The name-mangling table is the most referenceable thing in this section; present it prominently
- Note that the `display` overloading decision (three variants) is a simplification — a real Lisp runtime would use tagged unions or polymorphism