A main interface
We have a minimal working program now, but we need to package it in a way that the end user can build
safe programs on top of it. In this section, we'll implement a main interface like the one standard
Rust programs use.
First, we'll convert our binary crate into a library crate:
$ mv src/main.rs src/lib.rs
Then we rename the crate to rt, which stands for "runtime":
$ sed -i s/app/rt/ Cargo.toml
$ head -n4 Cargo.toml
[package]
edition = "2024"
name = "rt" # <-
version = "0.1.0"
The first change is to have the reset handler call an external main function:
$ head -n13 src/lib.rs
#![no_std]

use core::panic::PanicInfo;

// CHANGED!
#[unsafe(no_mangle)]
pub unsafe extern "C" fn Reset() -> ! {
    unsafe extern "Rust" {
        safe fn main() -> !;
    }

    main()
}
We also drop the #![no_main] attribute as it has no effect on library crates.
There's an orthogonal question that arises at this stage: Should the rt library provide a standard
panicking behavior, or should it not provide a #[panic_handler] function and leave the end user to
choose the panicking behavior? This document won't delve into that question and for simplicity will
leave the dummy #[panic_handler] function in the rt crate. However, we wanted to inform the reader
that there are other options.
The second change involves providing the linker script we wrote before to the application crate. The linker will search for linker scripts in the library search path (-L) and in the directory
from which it's invoked. The application crate shouldn't need to carry around a copy of link.x so
we'll have the rt crate put the linker script in the library search path using a build script.
$ # create a build.rs file in the root of `rt` with these contents
$ cat build.rs
use std::{env, error::Error, fs::File, io::Write, path::PathBuf};

fn main() -> Result<(), Box<dyn Error>> {
    // build directory for this crate
    let out_dir = PathBuf::from(env::var_os("OUT_DIR").unwrap());

    // extend the library search path
    println!("cargo:rustc-link-search={}", out_dir.display());

    // put `link.x` in the build directory
    File::create(out_dir.join("link.x"))?.write_all(include_bytes!("link.x"))?;

    Ok(())
}
Now the user can write an application that exposes the main symbol and link it to the rt crate.
The rt crate will take care of giving the program the right memory layout.
$ cd ..
$ cargo new --edition 2024 --bin app
$ cd app
$ # modify Cargo.toml to include the `rt` crate as a dependency
$ tail -n2 Cargo.toml
[dependencies]
rt = { path = "../rt" }
$ # copy over the config file that sets a default target and tweaks the linker invocation
$ cp -r ../rt/.cargo .
$ # change the contents of `main.rs` to
$ cat src/main.rs
#![no_std]
#![no_main]

extern crate rt;

#[unsafe(no_mangle)]
pub fn main() -> ! {
    let _x = 42;

    loop {}
}
The disassembly will be similar but will now include the user main function.
$ cargo objdump --bin app -- -d --no-show-raw-insn
app: file format elf32-littlearm
Disassembly of section .text:
<main>:
push {r7, lr}
mov r7, sp
sub sp, #0x4
movs r0, #0x2a
str r0, [sp]
b 0x14 <main+0xc> @ imm = #-0x2
b 0x14 <main+0xc> @ imm = #-0x4
<Reset>:
push {r7, lr}
mov r7, sp
bl 0x8 <main> @ imm = #-0x16
Making it type safe
The main interface works, but it's easy to get it wrong. For example, the user could write main
as a non-divergent function, and they would get no compile time error and undefined behavior (the
compiler will misoptimize the program).
We can add type safety by exposing a macro to the user instead of the symbol interface. In the
rt crate, we can write this macro:
$ tail -n12 ../rt/src/lib.rs
#[macro_export]
macro_rules! entry {
    ($path:path) => {
        #[unsafe(export_name = "main")]
        pub unsafe fn __main() -> ! {
            // type check the given path
            let f: fn() -> ! = $path;

            f()
        }
    };
}
Then the application writers can invoke it like this:
$ cat src/main.rs
#![no_std]
#![no_main]

use rt::entry;

entry!(main);

fn main() -> ! {
    let _x = 42;

    loop {}
}
Now the author will get a compile-time error if they change the signature of main to that of a
non-divergent function, e.g. fn().
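The coercion trick inside entry! can be tried on a desktop as well. The sketch below is a host-side analogue, not part of rt: it uses a returning signature, fn() -> u32, instead of fn() -> ! so that the result can be observed, and the names checked_entry!, call_entry and answer are hypothetical.

```rust
// Host-side analogue of `entry!`. Same type-checking trick, but with a
// returning signature so the call can be observed on a desktop.
// `checked_entry!`, `call_entry` and `answer` are hypothetical names.
macro_rules! checked_entry {
    ($path:path) => {
        pub fn call_entry() -> u32 {
            // type check the given path: this line only compiles if
            // `$path` coerces to the function pointer type `fn() -> u32`
            let f: fn() -> u32 = $path;

            f()
        }
    };
}

fn answer() -> u32 {
    42
}

checked_entry!(answer);
```

If answer is changed to return a different type, for example u8, the let binding in the macro expansion is rejected with a mismatched types error, which is the same mechanism that rejects a non-divergent main in entry!.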
Life before main
rt is looking good but it's not feature complete! Applications written against it can't use
static variables or string literals because rt's linker script doesn't define the standard
.bss, .data and .rodata sections. Let's fix that!
The first step is to define these sections in the linker script:
$ # showing just a fragment of the file
$ sed -n 25,46p ../rt/link.x
.text :
{
*(.text .text.*);
} > FLASH
/* NEW! */
.rodata :
{
*(.rodata .rodata.*);
} > FLASH
.bss :
{
*(.bss .bss.*);
} > RAM
.data :
{
*(.data .data.*);
} > RAM
/DISCARD/ :
They just re-export the input sections and specify in which memory region each output section will go.
With these changes, the following program will compile:
#![no_std]
#![no_main]

use rt::entry;

entry!(main);

static RODATA: &[u8] = b"Hello, world!";
static mut BSS: u8 = 0;
static mut DATA: u16 = 1;

#[allow(static_mut_refs)]
fn main() -> ! {
    let _x = RODATA;
    let _y = unsafe { &BSS };
    let _z = unsafe { &DATA };

    loop {}
}
However if you run this program on real hardware and debug it, you'll observe that the static
variables BSS and DATA don't have the values 0 and 1 by the time main has been reached.
Instead, these variables will have junk values. The problem is that the contents of RAM are
random after powering up the device. You won't be able to observe this effect if you run the
program in QEMU.
As things stand, if your program reads any static variable before writing to it, then
your program has undefined behavior. Let's fix that by initializing all static variables before
calling main.
We'll need to tweak the linker script a bit more to do the RAM initialization:
$ # showing just a fragment of the file
$ sed -n 25,52p ../rt/link.x
.text :
{
*(.text .text.*);
} > FLASH
/* CHANGED! */
.rodata :
{
*(.rodata .rodata.*);
} > FLASH
.bss :
{
_sbss = .;
*(.bss .bss.*);
_ebss = .;
} > RAM
.data : AT(ADDR(.rodata) + SIZEOF(.rodata))
{
_sdata = .;
*(.data .data.*);
_edata = .;
} > RAM
_sidata = LOADADDR(.data);
/DISCARD/ :
Let's go into the details of these changes:
_sbss = .;
_ebss = .;
_sdata = .;
_edata = .;
We associate symbols to the start and end addresses of the .bss and .data sections, which we'll
later use to initialize them.
.data : AT(ADDR(.rodata) + SIZEOF(.rodata))
We set the Load Memory Address (LMA) of the .data section to the end of the .rodata
section. The .data contains static variables with a non-zero initial value; the Virtual Memory
Address (VMA) of the .data section is somewhere in RAM -- this is where the static variables are
located. The initial values of those static variables, however, must be allocated in non volatile
memory (Flash); the LMA is where in Flash those initial values are stored.
_sidata = LOADADDR(.data);
Finally, we associate a symbol to the LMA of .data.
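To make the VMA / LMA distinction concrete, here is a small worked example written as host-side Rust. It is only a sketch: the Placement type, both function names, and all addresses are made up for illustration, and it ignores any alignment padding the linker might insert.

```rust
/// Where an output section lives at run time (VMA), where its initial
/// contents are stored (LMA), and how big it is. Hypothetical helper type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Placement {
    pub vma: u32,
    pub lma: u32,
    pub size: u32,
}

/// Mirrors `AT(ADDR(.rodata) + SIZEOF(.rodata))`: the load address of
/// `.data` is the first byte after the end of `.rodata`.
pub fn data_load_address(rodata_addr: u32, rodata_size: u32) -> u32 {
    rodata_addr + rodata_size
}

/// Values the linker script symbols would take for a given `.data`
/// placement, returned as (`_sdata`, `_edata`, `_sidata`).
pub fn data_symbols(data: Placement) -> (u32, u32, u32) {
    // `_sdata` and `_edata` are VMAs (RAM); `_sidata` is the LMA (Flash)
    (data.vma, data.vma + data.size, data.lma)
}
```

With made-up numbers, a .rodata section at 0x0000_0100 that is 0x20 bytes long gives data_load_address(0x0000_0100, 0x20) == 0x0000_0120. A 2-byte .data section placed at 0x2000_0004 in RAM then has _sdata = 0x2000_0004, _edata = 0x2000_0006 and _sidata = 0x0000_0120.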
Using our initialization code, we zero the .bss section and initialize the .data section. We can reference
the symbols we created in the linker script from the code. The addresses of these symbols are
the boundaries of the .bss and .data sections.
We could write the .bss and .data initialization code in pure Rust. In fact, earlier
versions of this book did so. However, several soundness questions have been raised over time,
and it is no longer considered good practice to initialize these sections in Rust code. See the
"Why don't we initialize .data and .bss using Rust" section of the book for more details.
We will write the initialization code using the global_asm! macro to define our reset handler.
The updated reset handler, now written in Thumb-2 assembly, is shown below:
$ head -n53 ../rt/src/lib.rs
#![no_std]

use core::panic::PanicInfo;
use core::arch::global_asm;

global_asm!(
    ".text
    .syntax unified
    .global _sbss
    .global _ebss
    .global _sdata
    .global _edata
    .global _sidata
    .global main
    .global Reset
    .type Reset,%function
    .thumb_func
    Reset:
    _init_bss:
        movs r2, #0
        ldr r0, =_sbss
        ldr r1, =_ebss
    1:
        cmp r1, r0
        beq _init_data
        strb r2, [r0]
        add r0, #1
        b 1b
    _init_data:
        ldr r0, =_sdata
        ldr r1, =_edata
        ldr r2, =_sidata
    1:
        cmp r0, r1
        beq _main_trampoline
        ldrb r3, [r2]
        strb r3, [r0]
        add r0, #1
        add r2, #1
        b 1b
    _main_trampoline:
        ldr r0, =main
        bx r0"
);
Now end users can directly and indirectly make use of static variables without running into
undefined behavior!
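For readers less familiar with Thumb-2 assembly, the two loops of the reset handler can be modelled on a host machine. The sketch below is only an illustration of the logic and is not meant to replace the assembly: ram and flash are plain byte slices standing in for the two memory regions, the symbol arguments are offsets into those slices rather than real addresses, and init_ram and demo are hypothetical names.

```rust
/// Host-side model of the `_init_bss` and `_init_data` loops.
/// All "symbols" are offsets into the `ram` / `flash` slices.
pub fn init_ram(
    ram: &mut [u8],
    flash: &[u8],
    sbss: usize,
    ebss: usize,
    sdata: usize,
    edata: usize,
    sidata: usize,
) {
    // _init_bss: store a zero byte to every location in [sbss, ebss)
    let mut dst = sbss;
    while dst != ebss {
        ram[dst] = 0;
        dst += 1;
    }

    // _init_data: copy bytes from the load address in flash
    // to every location in [sdata, edata)
    let mut dst = sdata;
    let mut src = sidata;
    while dst != edata {
        ram[dst] = flash[src];
        dst += 1;
        src += 1;
    }
}

/// Made-up layout: a 1-byte `.bss` at RAM offset 0 and a 2-byte `.data`
/// at RAM offset 2 whose initial value (1u16, little-endian) is stored
/// at flash offset 2.
pub fn demo() -> [u8; 8] {
    let mut ram = [0xAA; 8]; // junk, as after power-up
    let flash = [0xFF, 0xFF, 0x01, 0x00];
    init_ram(&mut ram, &flash, 0, 1, 2, 4, 2);
    ram
}
```

After demo() runs, the modelled BSS byte reads 0 and the modelled DATA halfword reads 1, while every byte outside the two sections still holds its junk value.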
In the code above we performed the memory initialization in a bytewise fashion. It's possible to
force the .bss and .data sections to be aligned to, say, 4 bytes. This fact can then be used in the
initialization code to perform the initialization wordwise while omitting alignment checks. If you
are interested in learning how this can be achieved, check the cortex-m-rt crate.
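As a rough illustration of what word-wise initialization buys, the following host-side sketch copies a modelled .data section four bytes at a time. It is an assumption-laden model, not how cortex-m-rt is implemented: the function name is hypothetical, memory is modelled as byte slices, and the symbol arguments are slice offsets that are assumed to be multiples of 4.

```rust
/// Host-side model of a word-wise `.data` copy. Returns the number of
/// loop iterations so it can be compared with the byte-wise version.
pub fn init_data_wordwise(
    ram: &mut [u8],
    flash: &[u8],
    sdata: usize,
    edata: usize,
    sidata: usize,
) -> usize {
    // On the target the linker script would guarantee this alignment;
    // in this model we simply check it.
    assert!(
        sdata % 4 == 0 && edata % 4 == 0 && sidata % 4 == 0,
        "section boundaries must be 4-byte aligned"
    );

    let len = edata - sdata;
    let dst = ram[sdata..edata].chunks_exact_mut(4);
    let src = flash[sidata..sidata + len].chunks_exact(4);

    let mut iterations = 0;
    for (d, s) in dst.zip(src) {
        // one 4-byte word per iteration instead of one byte
        d.copy_from_slice(s);
        iterations += 1;
    }
    iterations
}
```

Copying an 8-byte section this way takes 2 iterations instead of 8, which is the saving that the alignment guarantee makes possible.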