
Rust Patterns & Engineering How-Tos

Speaker Intro

  • Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
  • Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems
  • Started programming in Rust in 2017 (@AWS EC2), and have been in love with the language ever since

A practical guide to intermediate-and-above Rust patterns that arise in real codebases. This is not a language tutorial — it assumes you can write basic Rust and want to level up. Each chapter isolates one concept, explains when and why to use it, and provides compilable examples with inline exercises.

Who This Is For

  • Developers who have finished The Rust Programming Language but struggle with “how do I actually design this?”
  • C++/C# engineers translating production systems into Rust
  • Anyone who has hit a wall with generics, trait bounds, or lifetime errors and wants a systematic toolkit

Prerequisites

Before starting, you should be comfortable with:

  • Ownership, borrowing, and lifetimes (basic level)
  • Enums, pattern matching, and Option/Result
  • Structs, methods, and basic traits (Display, Debug, Clone)
  • Cargo basics: cargo build, cargo test, cargo run

How to Use This Book

Difficulty Legend

Each chapter is tagged with a difficulty level:

| Symbol | Level | Meaning |
|--------|-------|---------|
| 🟢 | Fundamentals | Core concepts every Rust developer needs |
| 🟡 | Intermediate | Patterns used in production codebases |
| 🔴 | Advanced | Deep language mechanics — revisit as needed |

Pacing Guide

| Chapters | Topic | Suggested Time | Checkpoint |
|----------|-------|----------------|------------|
| **Part I: Type-Level Patterns** | | | |
| 1. Generics 🟢 | Monomorphization, const generics, const fn | 1–2 hours | Can explain when dyn Trait beats generics |
| 2. Traits 🟡 | Associated types, GATs, blanket impls, vtables | 3–4 hours | Can design a trait with associated types |
| 3. Newtype & Type-State 🟡 | Zero-cost safety, compile-time FSMs | 2–3 hours | Can build a type-state builder pattern |
| 4. PhantomData 🔴 | Lifetime branding, variance, drop check | 2–3 hours | Can explain why PhantomData<fn(T)> differs from PhantomData<T> |
| **Part II: Concurrency & Runtime** | | | |
| 5. Channels 🟢 | mpsc, crossbeam, select!, actors | 1–2 hours | Can implement a channel-based worker pool |
| 6. Concurrency 🟡 | Threads, rayon, Mutex, RwLock, atomics | 2–3 hours | Can pick the right sync primitive for a scenario |
| 7. Closures 🟢 | Fn/FnMut/FnOnce, combinators | 1–2 hours | Can write a higher-order function that accepts closures |
| 8. Smart Pointers 🟡 | Box, Rc, Arc, RefCell, Cow, Pin | 2–3 hours | Can explain when to use each smart pointer |
| **Part III: Systems & Production** | | | |
| 9. Error Handling 🟢 | thiserror, anyhow, ? operator | 1–2 hours | Can design an error type hierarchy |
| 10. Serialization 🟡 | serde, zero-copy, binary data | 2–3 hours | Can write a custom serde deserializer |
| 11. Unsafe 🔴 | Superpowers, FFI, UB pitfalls, allocators | 2–3 hours | Can wrap unsafe code in a sound safe API |
| 12. Macros 🟡 | macro_rules!, proc macros, syn/quote | 2–3 hours | Can write a declarative macro with tt munching |
| 13. Testing 🟢 | Unit/integration/doc tests, proptest, criterion | 1–2 hours | Can set up property-based tests |
| 14. API Design 🟡 | Module layout, ergonomic APIs, feature flags | 2–3 hours | Can apply the “parse, don’t validate” pattern |
| 15. Async 🔴 | Futures, Tokio, common pitfalls | 1–2 hours | Can identify async anti-patterns |
| **Appendices** | | | |
| Reference Card | Quick-look trait bounds, lifetimes, patterns | As needed | |
| Capstone Project | Type-safe task scheduler | 4–6 hours | Submit a working implementation |

Total estimated time: 30–45 hours for thorough study with exercises.

Working Through Exercises

Every chapter ends with a hands-on exercise. For maximum learning:

  1. Try it yourself first — spend at least 15 minutes before opening the solution
  2. Type the code — don’t copy-paste; typing builds muscle memory
  3. Modify the solution — add a feature, change a constraint, break something on purpose
  4. Check cross-references — most exercises combine patterns from multiple chapters

The capstone project (Appendix) ties together patterns from across the book into a single, production-quality system.

Table of Contents

Part I: Type-Level Patterns

1. Generics — The Full Picture 🟢 Monomorphization, code bloat trade-offs, generics vs enums vs trait objects, const generics, const fn.

2. Traits In Depth 🟡 Associated types, GATs, blanket impls, marker traits, vtables, HRTBs, extension traits, enum dispatch.

3. The Newtype and Type-State Patterns 🟡 Zero-cost type safety, compile-time state machines, builder patterns, config traits.

4. PhantomData — Types That Carry No Data 🔴 Lifetime branding, unit-of-measure pattern, drop check, variance.

Part II: Concurrency & Runtime

5. Channels and Message Passing 🟢 std::sync::mpsc, crossbeam, select!, backpressure, actor pattern.

6. Concurrency vs Parallelism vs Threads 🟡 OS threads, scoped threads, rayon, Mutex/RwLock/Atomics, Condvar, OnceLock, lock-free patterns.

7. Closures and Higher-Order Functions 🟢 Fn/FnMut/FnOnce, closures as parameters/return values, combinators, higher-order APIs.

8. Smart Pointers and Interior Mutability 🟡 Box, Rc, Arc, Weak, Cell/RefCell, Cow, Pin, ManuallyDrop.

Part III: Systems & Production

9. Error Handling Patterns 🟢 thiserror vs anyhow, #[from], .context(), ? operator, panics.

10. Serialization, Zero-Copy, and Binary Data 🟡 serde fundamentals, enum representations, zero-copy deserialization, repr(C), bytes::Bytes.

11. Unsafe Rust — Controlled Danger 🔴 Five superpowers, sound abstractions, FFI, UB pitfalls, arena/slab allocators.

12. Macros — Code That Writes Code 🟡 macro_rules!, when (not) to use macros, proc macros, derive macros, syn/quote.

13. Testing and Benchmarking Patterns 🟢 Unit/integration/doc tests, proptest, criterion, mocking strategies.

14. Crate Architecture and API Design 🟡 Module layout, API design checklist, ergonomic parameters, feature flags, workspaces.

15. Async/Await Essentials 🔴 Futures, Tokio quick-start, common pitfalls. (For deep async coverage, see our Async Rust Training.)

Appendices

Summary and Reference Card Pattern decision guide, trait bounds cheat sheet, lifetime elision rules, further reading.

Capstone Project: Type-Safe Task Scheduler Integrate generics, traits, typestate, channels, error handling, and testing into a complete system.


1. Generics — The Full Picture 🟢

What you’ll learn:

  • How monomorphization gives zero-cost generics — and when it causes code bloat
  • The decision framework: generics vs enums vs trait objects
  • Const generics for compile-time array sizes and const fn for compile-time evaluation
  • When to trade static dispatch for dynamic dispatch on cold paths

Monomorphization and Zero Cost

Generics in Rust are monomorphized — the compiler generates a specialized copy of each generic function for every concrete type it’s used with. This contrasts with Java, where generics are erased at runtime (C# sits in between: the CLR instantiates generic code lazily at runtime rather than at compile time).

fn max_of<T: PartialOrd>(a: T, b: T) -> T {
    if a >= b { a } else { b }
}

fn main() {
    max_of(3_i32, 5_i32);     // Compiler generates max_of_i32
    max_of(2.0_f64, 7.0_f64); // Compiler generates max_of_f64
    max_of("a", "z");         // Compiler generates max_of_str
}

What the compiler actually produces (conceptually):

#![allow(unused)]
fn main() {
// Three separate functions — no runtime dispatch, no vtable:
fn max_of_i32(a: i32, b: i32) -> i32 { if a >= b { a } else { b } }
fn max_of_f64(a: f64, b: f64) -> f64 { if a >= b { a } else { b } }
fn max_of_str<'a>(a: &'a str, b: &'a str) -> &'a str { if a >= b { a } else { b } }
}

Why does max_of_str need <'a> but max_of_i32 doesn’t? i32 and f64 are Copy types — the function returns an owned value. But &str is a reference, so the compiler must know the returned reference’s lifetime. The <'a> annotation says “the returned &str borrows from the inputs and stays valid only as long as both of them do.”

Advantages: Zero runtime cost — identical to hand-written specialized code. The optimizer can inline, vectorize, and specialize each copy independently.

Comparison with C++: Rust generics work like C++ templates but with one crucial difference — bounds checking happens at definition, not instantiation. In C++, a template compiles only when used with a specific type, leading to cryptic error messages deep in library code. In Rust, T: PartialOrd is checked when you define the function, so errors are caught early and messages are clear.

#![allow(unused)]
fn main() {
// Rust: error at definition site — "T doesn't implement Display"
fn broken<T>(val: T) {
    println!("{val}"); // ❌ Error: T doesn't implement Display
}

// Fix: add the bound
fn fixed<T: std::fmt::Display>(val: T) {
    println!("{val}"); // ✅
}
}

When Generics Hurt: Code Bloat

Monomorphization has a cost — binary size. Each unique instantiation duplicates the function body:

#![allow(unused)]
fn main() {
// This innocent function...
fn serialize<T: serde::Serialize>(value: &T) -> Vec<u8> {
    serde_json::to_vec(value).unwrap()
}

// ...used with 50 different types → 50 copies in the binary.
}

Mitigation strategies:

#![allow(unused)]
fn main() {
// 1. Extract the non-generic core ("outline" pattern)
fn serialize<T: serde::Serialize>(value: &T) -> Result<Vec<u8>, serde_json::Error> {
    // Generic part: only the serialization call
    let json_value = serde_json::to_value(value)?;
    // Non-generic part: extracted into a separate function
    serialize_value(json_value)
}

fn serialize_value(value: serde_json::Value) -> Result<Vec<u8>, serde_json::Error> {
    // This function exists only ONCE in the binary
    serde_json::to_vec(&value)
}

// 2. Use trait objects (dynamic dispatch) when inlining isn't critical
fn log_item(item: &dyn std::fmt::Display) {
    // One copy — uses vtable for dispatch
    println!("[LOG] {item}");
}
}

Rule of thumb: Use generics for hot paths where inlining matters. Use dyn Trait for cold paths (error handling, logging, configuration) where a vtable call is negligible.

Generics vs Enums vs Trait Objects — Decision Guide

Three ways to handle “different types, same interface” in Rust:

| Approach | Dispatch | Known at | Extensible? | Overhead |
|----------|----------|----------|-------------|----------|
| Generics (impl Trait / <T: Trait>) | Static (monomorphized) | Compile time | ✅ (open set) | Zero — inlined |
| Enum | Match arm | Compile time | ❌ (closed set) | Zero — no vtable |
| Trait object (dyn Trait) | Dynamic (vtable) | Runtime | ✅ (open set) | Vtable pointer + indirect call |

#![allow(unused)]
fn main() {
// --- GENERICS: Open set, zero cost, compile-time ---
fn process<H: Handler>(handler: H, request: Request) -> Response {
    handler.handle(request) // Monomorphized — one copy per H
}

// --- ENUM: Closed set, zero cost, exhaustive matching ---
enum Shape {
    Circle(f64),
    Rect(f64, f64),
    Triangle(f64, f64, f64),
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Circle(r) => std::f64::consts::PI * r * r,
            Shape::Rect(w, h) => w * h,
            Shape::Triangle(a, b, c) => {
                let s = (a + b + c) / 2.0;
                (s * (s - a) * (s - b) * (s - c)).sqrt()
            }
        }
    }
}
// Adding a new variant forces updating ALL match arms — the compiler
// enforces exhaustiveness. Great for "I control all the variants."

// --- TRAIT OBJECT: Open set, runtime cost, extensible ---
fn log_all(items: &[Box<dyn std::fmt::Display>]) {
    for item in items {
        println!("{item}"); // vtable dispatch
    }
}
}

Decision flowchart:

flowchart TD
    A["Do you know ALL<br>possible types at<br>compile time?"]
    A -->|"Yes, small<br>closed set"| B["Enum"]
    A -->|"Yes, but set<br>is open"| C["Generics<br>(monomorphized)"]
    A -->|"No — types<br>determined at runtime"| D["dyn Trait"]

    C --> E{"Hot path?<br>(millions of calls)"}
    E -->|Yes| F["Generics<br>(inlineable)"]
    E -->|No| G["dyn Trait<br>is fine"]

    D --> H{"Need mixed types<br>in one collection?"}
    H -->|Yes| I["Vec&lt;Box&lt;dyn Trait&gt;&gt;"]
    H -->|No| C

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#d4efdf,stroke:#27ae60,color:#000
    style C fill:#d4efdf,stroke:#27ae60,color:#000
    style D fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#d4efdf,stroke:#27ae60,color:#000
    style G fill:#fdebd0,stroke:#e67e22,color:#000
    style I fill:#fdebd0,stroke:#e67e22,color:#000
    style E fill:#fef9e7,stroke:#f1c40f,color:#000
    style H fill:#fef9e7,stroke:#f1c40f,color:#000

Const Generics

Since Rust 1.51, you can parameterize types and functions over constant values, not just types:

#![allow(unused)]
fn main() {
// Array wrapper parameterized over size
struct Matrix<const ROWS: usize, const COLS: usize> {
    data: [[f64; COLS]; ROWS],
}

impl<const ROWS: usize, const COLS: usize> Matrix<ROWS, COLS> {
    fn new() -> Self {
        Matrix { data: [[0.0; COLS]; ROWS] }
    }

    fn transpose(&self) -> Matrix<COLS, ROWS> {
        let mut result = Matrix::<COLS, ROWS>::new();
        for r in 0..ROWS {
            for c in 0..COLS {
                result.data[c][r] = self.data[r][c];
            }
        }
        result
    }
}

// The compiler enforces dimensional correctness:
fn multiply<const M: usize, const N: usize, const P: usize>(
    a: &Matrix<M, N>,
    b: &Matrix<N, P>, // N must match!
) -> Matrix<M, P> {
    let mut result = Matrix::<M, P>::new();
    for i in 0..M {
        for j in 0..P {
            for k in 0..N {
                result.data[i][j] += a.data[i][k] * b.data[k][j];
            }
        }
    }
    result
}

// Usage:
let a = Matrix::<2, 3>::new(); // 2×3
let b = Matrix::<3, 4>::new(); // 3×4
let c = multiply(&a, &b);      // 2×4 ✅

// let d = Matrix::<5, 5>::new();
// multiply(&a, &d); // ❌ Compile error: expected Matrix<3, _>, got Matrix<5, 5>
}

C++ comparison: This is similar to template<int N> in C++, but Rust const generics are type-checked eagerly and don’t suffer from SFINAE complexity.
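Const generics also apply to free functions, not just types. A minimal sketch (separate from the Matrix example) of a fixed-size dot product, where mismatched lengths are rejected at compile time:

```rust
// N is a const generic parameter inferred from the array arguments.
fn dot<const N: usize>(a: [f64; N], b: [f64; N]) -> f64 {
    let mut sum = 0.0;
    for i in 0..N {
        sum += a[i] * b[i];
    }
    sum
}

fn main() {
    let x = [1.0, 2.0, 3.0];
    let y = [4.0, 5.0, 6.0];
    println!("{}", dot(x, y)); // 32

    // let z = [1.0, 2.0];
    // dot(x, z); // ❌ Compile error: expected `[f64; 3]`, found `[f64; 2]`
}
```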

Const Functions (const fn)

const fn marks a function as evaluable at compile time — Rust’s equivalent of C++ constexpr. The result can be used in const and static contexts:

#![allow(unused)]
fn main() {
// Basic const fn — evaluated at compile time when used in const context
const fn celsius_to_fahrenheit(c: f64) -> f64 {
    c * 9.0 / 5.0 + 32.0
}

const BOILING_F: f64 = celsius_to_fahrenheit(100.0); // Computed at compile time
const FREEZING_F: f64 = celsius_to_fahrenheit(0.0);  // 32.0

// Const constructors — create statics without lazy_static!
struct BitMask(u32);

impl BitMask {
    const fn new(bit: u32) -> Self {
        BitMask(1 << bit)
    }

    const fn or(self, other: BitMask) -> Self {
        BitMask(self.0 | other.0)
    }

    const fn contains(&self, bit: u32) -> bool {
        self.0 & (1 << bit) != 0
    }
}

// Static lookup table — no runtime cost, no lazy initialization
const GPIO_INPUT:  BitMask = BitMask::new(0);
const GPIO_OUTPUT: BitMask = BitMask::new(1);
const GPIO_IRQ:    BitMask = BitMask::new(2);
const GPIO_IO:     BitMask = GPIO_INPUT.or(GPIO_OUTPUT);

// Register maps as const arrays:
const SENSOR_THRESHOLDS: [u16; 4] = {
    let mut table = [0u16; 4];
    table[0] = 50;   // Warning
    table[1] = 70;   // High
    table[2] = 85;   // Critical
    table[3] = 100;  // Shutdown
    table
};
// The entire table exists in the binary — no heap, no runtime init.
}

What you CAN do in const fn (as of Rust 1.79+):

  • Arithmetic, bit operations, comparisons
  • if/else, match, loop, while (control flow)
  • Creating and modifying local variables (let mut)
  • Calling other const fns
  • References (&, &mut — within the const context)
  • panic!() (becomes a compile error if reached at compile time)

What you CANNOT do (yet):

  • Heap allocation (Box, Vec, String)
  • Trait method calls (only inherent methods)
  • Most floating-point operations (basic arithmetic works in const contexts; math functions like sqrt do not)
  • I/O or side effects
#![allow(unused)]
fn main() {
// const fn with panic — becomes a compile-time error:
const fn checked_div(a: u32, b: u32) -> u32 {
    if b == 0 {
        panic!("division by zero"); // Compile error if b is 0 at const time
    }
    a / b
}

const RESULT: u32 = checked_div(100, 4);  // ✅ 25
// const BAD: u32 = checked_div(100, 0);  // ❌ Compile error: "division by zero"
}

C++ comparison: const fn is Rust’s constexpr. The key difference: Rust’s version is opt-in and the compiler rigorously verifies that only const-compatible operations are used. In C++, constexpr functions can silently fall back to runtime evaluation — in Rust, a const context requires compile-time evaluation or it’s a hard error.

Practical advice: Make constructors and simple utility functions const fn whenever possible — it costs nothing and enables callers to use them in const contexts. For hardware diagnostic code, const fn is ideal for register definitions, bitmask construction, and threshold tables.
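Pushing this further, const evaluation can enforce build-time invariants. A sketch using an anonymous `const` item as a compile-time assertion (the threshold values here are illustrative, not from a real register map):

```rust
const MAX_SENSORS: usize = 4;
const SENSOR_THRESHOLDS: [u16; MAX_SENSORS] = [50, 70, 85, 100];

// Evaluated during compilation; a violated invariant fails the build,
// not the running system.
const _: () = assert!(
    SENSOR_THRESHOLDS[MAX_SENSORS - 1] == 100,
    "shutdown threshold must be the last entry"
);

fn main() {
    println!("thresholds: {:?}", SENSOR_THRESHOLDS);
}
```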

Key Takeaways — Generics

  • Monomorphization gives zero-cost abstractions but can cause code bloat — use dyn Trait for cold paths
  • Const generics ([T; N]) replace C++ template tricks with compile-time–checked array sizes
  • const fn eliminates lazy_static! for compile-time–computable values

See also: Ch 2 — Traits In Depth for trait bounds, associated types, and trait objects. Ch 4 — PhantomData for zero-sized generic markers.


Exercise: Generic Cache with Eviction ★★ (~30 min)

Build a generic Cache<K, V> struct that stores key-value pairs with a configurable maximum capacity. When full, the oldest entry is evicted (FIFO). Requirements:

  • fn new(capacity: usize) -> Self
  • fn insert(&mut self, key: K, value: V) — evicts the oldest if at capacity
  • fn get(&self, key: &K) -> Option<&V>
  • fn len(&self) -> usize
  • Constrain K: Eq + Hash + Clone
🔑 Solution
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

struct Cache<K, V> {
    map: HashMap<K, V>,
    order: VecDeque<K>,
    capacity: usize,
}

impl<K: Eq + Hash + Clone, V> Cache<K, V> {
    fn new(capacity: usize) -> Self {
        Cache {
            map: HashMap::with_capacity(capacity),
            order: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    fn insert(&mut self, key: K, value: V) {
        if self.map.contains_key(&key) {
            self.map.insert(key, value);
            return;
        }
        if self.map.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.order.push_back(key.clone());
        self.map.insert(key, value);
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key)
    }

    fn len(&self) -> usize {
        self.map.len()
    }
}

fn main() {
    let mut cache = Cache::new(3);
    cache.insert("a", 1);
    cache.insert("b", 2);
    cache.insert("c", 3);
    assert_eq!(cache.len(), 3);

    cache.insert("d", 4); // Evicts "a"
    assert_eq!(cache.get(&"a"), None);
    assert_eq!(cache.get(&"d"), Some(&4));
    println!("Cache works! len = {}", cache.len());
}

2. Traits In Depth 🟡

What you’ll learn:

  • Associated types vs generic parameters — and when to use each
  • GATs, blanket impls, marker traits, and trait object safety rules
  • How vtables and fat pointers work under the hood
  • Extension traits, enum dispatch, and typed command patterns

Associated Types vs Generic Parameters

Both let a trait work with different types, but they serve different purposes:

#![allow(unused)]
fn main() {
// --- ASSOCIATED TYPE: One implementation per type ---
trait Iterator {
    type Item; // Each iterator produces exactly ONE kind of item

    fn next(&mut self) -> Option<Self::Item>;
}

// A custom iterator that always yields i32 — there's no choice
struct Counter { max: i32, current: i32 }

impl Iterator for Counter {
    type Item = i32; // Exactly one Item type per implementation
    fn next(&mut self) -> Option<i32> {
        if self.current < self.max {
            self.current += 1;
            Some(self.current)
        } else {
            None
        }
    }
}

// --- GENERIC PARAMETER: Multiple implementations per type ---
trait Convert<T> {
    fn convert(&self) -> T;
}

// A single type can implement Convert for MANY target types:
impl Convert<f64> for i32 {
    fn convert(&self) -> f64 { *self as f64 }
}
impl Convert<String> for i32 {
    fn convert(&self) -> String { self.to_string() }
}
}

When to use which:

| Use | When |
|-----|------|
| Associated type | There’s exactly ONE natural output/result per implementing type. Iterator::Item, Deref::Target, Add::Output |
| Generic parameter | A type can meaningfully implement the trait for MANY different types. From<T>, AsRef<T>, PartialEq<Rhs> |

Intuition: If it makes sense to ask “what is the Item of this iterator?”, use associated type. If it makes sense to ask “can this convert to f64? to String? to bool?”, use a generic parameter.

#![allow(unused)]
fn main() {
// Real-world example: std::ops::Add
trait Add<Rhs = Self> {
    type Output; // Associated type — addition has ONE result type
    fn add(self, rhs: Rhs) -> Self::Output;
}

// Rhs is a generic parameter — you can add different types to Meters:
struct Meters(f64);
struct Centimeters(f64);

impl Add<Meters> for Meters {
    type Output = Meters;
    fn add(self, rhs: Meters) -> Meters { Meters(self.0 + rhs.0) }
}
impl Add<Centimeters> for Meters {
    type Output = Meters;
    fn add(self, rhs: Centimeters) -> Meters { Meters(self.0 + rhs.0 / 100.0) }
}
}
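Putting the two kinds of parameters together, here is a self-contained usage check (it repeats the Meters/Centimeters definitions above in compact form, and derives PartialEq/Debug on Meters so the assertion works):

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);
struct Centimeters(f64);

impl Add for Meters {
    type Output = Meters; // one result type (associated type)
    fn add(self, rhs: Meters) -> Meters { Meters(self.0 + rhs.0) }
}
impl Add<Centimeters> for Meters {
    type Output = Meters; // Rhs varies (generic parameter)
    fn add(self, rhs: Centimeters) -> Meters { Meters(self.0 + rhs.0 / 100.0) }
}

fn main() {
    // Mixed-unit addition type-checks directly:
    let total = Meters(1.0) + Centimeters(50.0) + Meters(0.5);
    assert_eq!(total, Meters(2.0));
    println!("total = {total:?}");
}
```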

Generic Associated Types (GATs)

Since Rust 1.65, associated types can have generic parameters of their own. This enables lending iterators — iterators that return references tied to the iterator rather than to the underlying collection:

#![allow(unused)]
fn main() {
// Without GATs — impossible to express a lending iterator:
// trait LendingIterator {
//     type Item<'a>;  // ← This was rejected before 1.65
// }

// With GATs (Rust 1.65+):
trait LendingIterator {
    type Item<'a> where Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// Example: an iterator that yields overlapping windows
struct WindowIter<'data> {
    data: &'data [u8],
    pos: usize,
    window_size: usize,
}

impl<'data> LendingIterator for WindowIter<'data> {
    type Item<'a> = &'a [u8] where Self: 'a;

    fn next(&mut self) -> Option<&[u8]> {
        if self.pos + self.window_size <= self.data.len() {
            let window = &self.data[self.pos..self.pos + self.window_size];
            self.pos += 1;
            Some(window)
        } else {
            None
        }
    }
}
}

When you need GATs: Lending iterators, streaming parsers, or any trait where the associated type’s lifetime depends on the &self borrow. For most code, plain associated types are sufficient.
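To see a GAT in action, here is a self-contained driver (it repeats the trait and WindowIter from above so it runs standalone). Each yielded window borrows from the iterator, so it is dropped before the next call to next():

```rust
trait LendingIterator {
    type Item<'a> where Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// Yields overlapping windows borrowed from the underlying slice.
struct WindowIter<'data> {
    data: &'data [u8],
    pos: usize,
    window_size: usize,
}

impl<'data> LendingIterator for WindowIter<'data> {
    type Item<'a> = &'a [u8] where Self: 'a;

    fn next(&mut self) -> Option<&[u8]> {
        if self.pos + self.window_size <= self.data.len() {
            let window = &self.data[self.pos..self.pos + self.window_size];
            self.pos += 1;
            Some(window)
        } else {
            None
        }
    }
}

fn main() {
    let mut it = WindowIter { data: &[1, 2, 3, 4], pos: 0, window_size: 2 };
    while let Some(w) = it.next() {
        println!("{w:?}"); // [1, 2], then [2, 3], then [3, 4]
    }
}
```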

Supertraits and Trait Hierarchies

Traits can require other traits as prerequisites, forming hierarchies:

graph BT
    Display["Display"]
    Debug["Debug"]
    Error["Error"]
    Clone["Clone"]
    Copy["Copy"]
    PartialEq["PartialEq"]
    Eq["Eq"]
    PartialOrd["PartialOrd"]
    Ord["Ord"]

    Error --> Display
    Error --> Debug
    Copy --> Clone
    Eq --> PartialEq
    Ord --> Eq
    Ord --> PartialOrd
    PartialOrd --> PartialEq

    style Display fill:#e8f4f8,stroke:#2980b9,color:#000
    style Debug fill:#e8f4f8,stroke:#2980b9,color:#000
    style Error fill:#fdebd0,stroke:#e67e22,color:#000
    style Clone fill:#d4efdf,stroke:#27ae60,color:#000
    style Copy fill:#d4efdf,stroke:#27ae60,color:#000
    style PartialEq fill:#fef9e7,stroke:#f1c40f,color:#000
    style Eq fill:#fef9e7,stroke:#f1c40f,color:#000
    style PartialOrd fill:#fef9e7,stroke:#f1c40f,color:#000
    style Ord fill:#fef9e7,stroke:#f1c40f,color:#000

Arrows point from subtrait to supertrait: implementing Error requires Display + Debug.

A trait can require that implementors also implement other traits:

#![allow(unused)]
fn main() {
use std::fmt;

// Display is a supertrait of Error
trait Error: fmt::Display + fmt::Debug {
    fn source(&self) -> Option<&(dyn Error + 'static)> { None }
}
// Any type implementing Error MUST also implement Display and Debug

// Build your own hierarchies:
trait Identifiable {
    fn id(&self) -> u64;
}

trait Timestamped {
    fn created_at(&self) -> chrono::DateTime<chrono::Utc>;
}

// Entity requires both:
trait Entity: Identifiable + Timestamped {
    fn is_active(&self) -> bool;
}

// Implementing Entity forces you to implement all three:
struct User { id: u64, name: String, created: chrono::DateTime<chrono::Utc> }

impl Identifiable for User {
    fn id(&self) -> u64 { self.id }
}
impl Timestamped for User {
    fn created_at(&self) -> chrono::DateTime<chrono::Utc> { self.created }
}
impl Entity for User {
    fn is_active(&self) -> bool { true }
}
}
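A function bounded on the subtrait can call all supertrait methods. A compact, dependency-free sketch of the same hierarchy (it swaps chrono for a plain u64 timestamp, so the types here are simplified stand-ins):

```rust
trait Identifiable { fn id(&self) -> u64; }
trait Timestamped { fn created_at(&self) -> u64; } // unix seconds, simplified

trait Entity: Identifiable + Timestamped {
    fn is_active(&self) -> bool;
}

struct User { id: u64, created: u64 }

impl Identifiable for User { fn id(&self) -> u64 { self.id } }
impl Timestamped for User { fn created_at(&self) -> u64 { self.created } }
impl Entity for User { fn is_active(&self) -> bool { true } }

// The single Entity bound brings id() and created_at() into scope too:
fn audit<E: Entity>(e: &E) -> String {
    format!("entity {} created at {} active={}", e.id(), e.created_at(), e.is_active())
}

fn main() {
    let u = User { id: 7, created: 1_700_000_000 };
    println!("{}", audit(&u));
}
```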

Blanket Implementations

Implement a trait for ALL types that satisfy some bound:

#![allow(unused)]
fn main() {
// std does this: any type that implements Display automatically gets ToString
impl<T: fmt::Display> ToString for T {
    fn to_string(&self) -> String {
        format!("{self}")
    }
}
// Now i32, &str, your custom types — anything with Display — gets to_string() for free.

// Your own blanket impl:
trait Loggable {
    fn log(&self);
}

// Every Debug type is automatically Loggable:
impl<T: std::fmt::Debug> Loggable for T {
    fn log(&self) {
        eprintln!("[LOG] {self:?}");
    }
}

// Now ANY Debug type has .log():
// 42.log();              // [LOG] 42
// "hello".log();         // [LOG] "hello"
// vec![1, 2, 3].log();   // [LOG] [1, 2, 3]
}

Caution: Blanket impls are powerful but irreversible — you can’t add a more specific impl for a type that’s already covered by a blanket impl (orphan rules + coherence). Design them carefully.
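To make the coherence limitation concrete, here is a sketch of the conflict: once the blanket impl covers every Debug type, a hand-written impl for one of those types is rejected (the Event type is invented for illustration):

```rust
trait Loggable { fn log(&self); }

// Blanket impl: every Debug type is Loggable.
impl<T: std::fmt::Debug> Loggable for T {
    fn log(&self) { eprintln!("[LOG] {self:?}"); }
}

#[derive(Debug)]
struct Event { code: u32 }

// ❌ A more specific impl now conflicts with the blanket one:
// impl Loggable for Event {
//     fn log(&self) { eprintln!("[EVENT] {}", self.code); }
// }
// error[E0119]: conflicting implementations of trait `Loggable` for type `Event`

fn main() {
    Event { code: 42 }.log(); // uses the blanket impl
}
```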

Marker Traits

Traits with no methods — they mark a type as having some property:

#![allow(unused)]
fn main() {
// Standard library marker traits:
// Send    — safe to transfer between threads
// Sync    — safe to share (&T) between threads
// Unpin   — safe to move after pinning
// Sized   — has a known size at compile time
// Copy    — can be duplicated with memcpy

// Your own marker trait:
/// Marker: this sensor has been factory-calibrated
trait Calibrated {}

struct RawSensor { reading: f64 }
struct CalibratedSensor { reading: f64 }

impl Calibrated for CalibratedSensor {}

// Only calibrated sensors can be used in production:
fn record_measurement<S: Calibrated>(sensor: &S) {
    // ...
}
// record_measurement(&RawSensor { reading: 0.0 }); // ❌ Compile error
// record_measurement(&CalibratedSensor { reading: 0.0 }); // ✅
}

This connects directly to the type-state pattern in Chapter 3.

Trait Object Safety Rules

Not every trait can be used as dyn Trait. A trait is object-safe only if:

  1. No Self: Sized bound on the trait itself
  2. No generic type parameters on methods
  3. No use of Self in method signatures except as the receiver — even returning Box<Self> disqualifies a method (return Box<dyn Trait> instead)
  4. No associated functions without a where Self: Sized escape hatch — dispatchable methods need a receiver (&self, &mut self, or self)
#![allow(unused)]
fn main() {
// ✅ Object-safe — can be used as dyn Drawable
trait Drawable {
    fn draw(&self);
    fn bounding_box(&self) -> (f64, f64, f64, f64);
}

let shapes: Vec<Box<dyn Drawable>> = vec![/* ... */]; // ✅ Works

// ❌ NOT object-safe — uses Self in return position
trait Cloneable {
    fn clone_self(&self) -> Self;
    //                       ^^^^ Can't know the concrete size at runtime
}
// let items: Vec<Box<dyn Cloneable>> = ...; // ❌ Compile error

// ❌ NOT object-safe — generic method
trait Converter {
    fn convert<T>(&self) -> T;
    //        ^^^ The vtable can't contain infinite monomorphizations
}

// ❌ NOT object-safe — associated function (no self)
trait Factory {
    fn create() -> Self;
    // No &self — how would you call this through a trait object?
}
}

Workarounds:

#![allow(unused)]
fn main() {
// Add `where Self: Sized` to exclude a method from the vtable:
trait MyTrait {
    fn regular_method(&self); // Included in vtable

    fn generic_method<T>(&self) -> T
    where
        Self: Sized; // Excluded from vtable — can't be called via dyn MyTrait
}

// Now dyn MyTrait is valid, but generic_method can only be called
// when the concrete type is known.
}

Rule of thumb: If you plan to use dyn Trait, keep methods simple — no generic methods, no Self outside the receiver, and no Self: Sized bound on the trait itself. When in doubt, write let _: Box<dyn YourTrait>; and let the compiler tell you.
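One common casualty of these rules is cloning: fn clone(&self) -> Self is not object-safe. A widely used workaround (the dyn-clone crate packages this pattern) is to return Box<dyn Trait> instead; a minimal sketch:

```rust
trait Shape {
    fn area(&self) -> f64;
    fn clone_box(&self) -> Box<dyn Shape>; // object-safe: no bare Self
}

#[derive(Clone)]
struct Circle { radius: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
    fn clone_box(&self) -> Box<dyn Shape> { Box::new(self.clone()) }
}

// Box<dyn Shape> itself becomes cloneable (Box is a fundamental type,
// so this impl is allowed by coherence):
impl Clone for Box<dyn Shape> {
    fn clone(&self) -> Self { self.clone_box() }
}

fn main() {
    let original: Box<dyn Shape> = Box::new(Circle { radius: 1.0 });
    let copy = original.clone();
    assert!((original.area() - copy.area()).abs() < 1e-12);
    println!("areas match");
}
```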

Trait Objects Under the Hood — vtables and Fat Pointers

A &dyn Trait (or Box<dyn Trait>) is a fat pointer — two machine words:

┌──────────────────────────────────────────────────┐
│  &dyn Drawable (on 64-bit: 16 bytes total)       │
├──────────────┬───────────────────────────────────┤
│  data_ptr    │  vtable_ptr                       │
│  (8 bytes)   │  (8 bytes)                        │
│  ↓           │  ↓                                │
│  ┌─────────┐ │  ┌──────────────────────────────┐ │
│  │ Circle  │ │  │ vtable for <Circle as        │ │
│  │ {       │ │  │           Drawable>           │ │
│  │  r: 5.0 │ │  │                              │ │
│  │ }       │ │  │  drop_in_place: 0x7f...a0    │ │
│  └─────────┘ │  │  size:           8            │ │
│              │  │  align:          8            │ │
│              │  │  draw:          0x7f...b4     │ │
│              │  │  bounding_box:  0x7f...c8     │ │
│              │  └──────────────────────────────┘ │
└──────────────┴───────────────────────────────────┘

How a vtable call works (e.g., shape.draw()):

  1. Load vtable_ptr from the fat pointer (second word)
  2. Index into the vtable to find the draw function pointer
  3. Call it, passing data_ptr as the self argument

This is similar to C++ virtual dispatch in cost (one pointer indirection per call), but Rust stores the vtable pointer in the fat pointer rather than inside the object — so a plain Circle on the stack carries no vtable pointer at all.

trait Drawable {
    fn draw(&self);
    fn area(&self) -> f64;
}

struct Circle { radius: f64 }

impl Drawable for Circle {
    fn draw(&self) { println!("Drawing circle r={}", self.radius); }
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
}

struct Square { side: f64 }

impl Drawable for Square {
    fn draw(&self) { println!("Drawing square s={}", self.side); }
    fn area(&self) -> f64 { self.side * self.side }
}

fn main() {
    let shapes: Vec<Box<dyn Drawable>> = vec![
        Box::new(Circle { radius: 5.0 }),
        Box::new(Square { side: 3.0 }),
    ];

    // Each element is a fat pointer: (data_ptr, vtable_ptr)
    // The vtable for Circle and Square are DIFFERENT
    for shape in &shapes {
        shape.draw();  // vtable dispatch → Circle::draw or Square::draw
        println!("  area = {:.2}", shape.area());
    }

    // Size comparison:
    println!("size_of::<&Circle>()        = {}", std::mem::size_of::<&Circle>());
    // → 8 bytes (one pointer — the compiler knows the type)
    println!("size_of::<&dyn Drawable>()  = {}", std::mem::size_of::<&dyn Drawable>());
    // → 16 bytes (data_ptr + vtable_ptr)
}

Performance cost model:

| Aspect | Static dispatch (impl Trait / generics) | Dynamic dispatch (dyn Trait) |
|--------|------------------------------------------|-------------------------------|
| Call overhead | Zero — inlined by LLVM | One pointer indirection per call |
| Inlining | ✅ Compiler can inline | ❌ Opaque function pointer |
| Binary size | Larger (one copy per type) | Smaller (one shared function) |
| Pointer size | Thin (1 word) | Fat (2 words) |
| Heterogeneous collections | ❌ — needs an enum wrapper | ✅ Vec<Box<dyn Trait>> |

When vtable cost matters: In tight loops calling a trait method millions of times, the indirection and inability to inline can be significant (2-10× slower). For cold paths, configuration, or plugin architectures, the flexibility of dyn Trait is worth the small cost.

Higher-Ranked Trait Bounds (HRTBs)

Sometimes you need a function that works with references of any lifetime, not a specific one. This is where for<'a> syntax appears:

// Problem: this function needs a closure that can process
// references with ANY lifetime, not just one specific lifetime.

// ❌ This is too restrictive — 'a is fixed by the caller:
// fn apply<'a, F: Fn(&'a str) -> &'a str>(f: F, data: &'a str) -> &'a str

// ✅ HRTB: F must work for ALL possible lifetimes:
fn apply<F>(f: F, data: &str) -> &str
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    f(data)
}

fn main() {
    let result = apply(|s| s.trim(), "  hello  ");
    println!("{result}"); // "hello"
}

When you encounter HRTBs:

  • Fn(&T) -> &U traits — the compiler infers for<'a> automatically in most cases
  • Custom trait implementations that must work across different borrows
  • Deserialization with serde: for<'de> Deserialize<'de>
// serde's DeserializeOwned is defined as:
// trait DeserializeOwned: for<'de> Deserialize<'de> {}
// Meaning: "can be deserialized from data with ANY lifetime"
// (i.e., the result doesn't borrow from the input)

use serde::de::DeserializeOwned;

fn parse_json<T: DeserializeOwned>(input: &str) -> T {
    serde_json::from_str(input).unwrap()
}

Practical advice: You’ll rarely write for<'a> yourself. It mostly appears in trait bounds on closure parameters, where the compiler handles it implicitly. But recognizing it in error messages (“expected a for<'a> Fn(&'a ...) bound”) helps you understand what the compiler is asking for.

impl Trait — Argument Position vs Return Position

impl Trait appears in two positions with different semantics:

#![allow(unused)]
fn main() {
// --- Argument-Position impl Trait (APIT) ---
// "Caller chooses the type" — syntactic sugar for a generic parameter
fn print_all(items: impl Iterator<Item = i32>) {
    for item in items { println!("{item}"); }
}
// Equivalent to:
fn print_all_verbose<I: Iterator<Item = i32>>(items: I) {
    for item in items { println!("{item}"); }
}
// Caller decides: print_all(vec![1,2,3].into_iter())
//                 print_all(0..10)

// --- Return-Position impl Trait (RPIT) ---
// "Callee chooses the type" — the function picks one concrete type
fn evens(limit: i32) -> impl Iterator<Item = i32> {
    (0..limit).filter(|x| x % 2 == 0)
    // The concrete type is Filter<Range<i32>, Closure>
    // but the caller only sees "some Iterator<Item = i32>"
}
}

Key difference:

| | APIT (fn foo(x: impl T)) | RPIT (fn foo() -> impl T) |
|---|---|---|
| Who picks the type? | Caller | Callee (function body) |
| Monomorphized? | Yes — one copy per type | Yes — one concrete type |
| Turbofish? | No (foo::<X>() not allowed) | N/A |
| Equivalent to | fn foo<X: T>(x: X) | Existential type |

RPIT in Trait Definitions (RPITIT)

Since Rust 1.75, you can use -> impl Trait directly in trait definitions:

#![allow(unused)]
fn main() {
trait Container {
    fn items(&self) -> impl Iterator<Item = &str>;
    //                 ^^^^ Each implementor returns its own concrete type
}

struct CsvRow {
    fields: Vec<String>,
}

impl Container for CsvRow {
    fn items(&self) -> impl Iterator<Item = &str> {
        self.fields.iter().map(String::as_str)
    }
}

struct FixedFields;

impl Container for FixedFields {
    fn items(&self) -> impl Iterator<Item = &str> {
        ["host", "port", "timeout"].into_iter()
    }
}
}

Before Rust 1.75, you had to use Box<dyn Iterator> or an associated type to achieve this in traits. RPITIT removes the allocation.

impl Trait vs dyn Trait — Decision Guide

Do you know the concrete type at compile time?
├── YES → Use impl Trait or generics (zero cost, inlinable)
└── NO  → Do you need a heterogeneous collection?
     ├── YES → Use dyn Trait (Box<dyn T>, &dyn T)
     └── NO  → Do you need the SAME trait object across an API boundary?
          ├── YES → Use dyn Trait
          └── NO  → Use generics / impl Trait
| Feature | impl Trait | dyn Trait |
|---|---|---|
| Dispatch | Static (monomorphized) | Dynamic (vtable) |
| Performance | Best — inlinable | One indirection per call |
| Heterogeneous collections | ❌ | ✅ |
| Binary size per type | One copy each | Shared code |
| Trait must be object-safe? | No | Yes |
| Works in trait definitions | ✅ (Rust 1.75+) | ✅ Always |

Type Erasure with Any and TypeId

Sometimes you need to store values of unknown types and downcast them later — a pattern familiar from void* in C or object in C#. Rust provides this through std::any::Any:

use std::any::Any;

// Store heterogeneous values:
fn log_value(value: &dyn Any) {
    if let Some(s) = value.downcast_ref::<String>() {
        println!("String: {s}");
    } else if let Some(n) = value.downcast_ref::<i32>() {
        println!("i32: {n}");
    } else {
        // TypeId lets you inspect the type at runtime:
        println!("Unknown type: {:?}", value.type_id());
    }
}

// Useful for plugin systems, event buses, or ECS-style architectures:
struct AnyMap(std::collections::HashMap<std::any::TypeId, Box<dyn Any + Send>>);

impl AnyMap {
    fn new() -> Self { AnyMap(std::collections::HashMap::new()) }

    fn insert<T: Any + Send + 'static>(&mut self, value: T) {
        self.0.insert(std::any::TypeId::of::<T>(), Box::new(value));
    }

    fn get<T: Any + Send + 'static>(&self) -> Option<&T> {
        self.0.get(&std::any::TypeId::of::<T>())?
            .downcast_ref()
    }
}

fn main() {
    let mut map = AnyMap::new();
    map.insert(42_i32);
    map.insert(String::from("hello"));

    assert_eq!(map.get::<i32>(), Some(&42));
    assert_eq!(map.get::<String>().map(|s| s.as_str()), Some("hello"));
    assert_eq!(map.get::<f64>(), None); // Never inserted
}

When to use Any: Plugin/extension systems, type-indexed maps (typemap), error downcasting (anyhow::Error::downcast_ref). Prefer generics or trait objects when the set of types is known at compile time — Any is a last resort that trades compile-time safety for flexibility.


Extension Traits — Adding Methods to Types You Don’t Own

Rust’s orphan rule prevents you from implementing a foreign trait on a foreign type. Extension traits are the standard workaround: define a new trait in your crate whose methods have a blanket implementation for any type that meets a bound. The caller imports the trait and the new methods appear on existing types.

This pattern is pervasive in the Rust ecosystem: itertools::Itertools, futures::StreamExt, tokio::io::AsyncReadExt, tower::ServiceExt.

The Problem

#![allow(unused)]
fn main() {
// We want to add a .mean() method to all iterators that yield f64.
// But Iterator is defined in std and f64 is a primitive — orphan rule prevents:
//
// impl<I: Iterator<Item = f64>> I {   // ❌ Cannot add inherent methods to a foreign type
//     fn mean(self) -> f64 { ... }
// }
}

The Solution: An Extension Trait

#![allow(unused)]
fn main() {
/// Extension methods for iterators over numeric values.
pub trait IteratorExt: Iterator {
    /// Computes the arithmetic mean. Returns `None` for empty iterators.
    fn mean(self) -> Option<f64>
    where
        Self: Sized,
        Self::Item: Into<f64>;
}

// Blanket implementation — automatically applies to ALL iterators
impl<I: Iterator> IteratorExt for I {
    fn mean(self) -> Option<f64>
    where
        Self: Sized,
        Self::Item: Into<f64>,
    {
        let mut sum: f64 = 0.0;
        let mut count: u64 = 0;
        for item in self {
            sum += item.into();
            count += 1;
        }
        if count == 0 { None } else { Some(sum / count as f64) }
    }
}

// Usage — just import the trait:
use crate::IteratorExt;  // One import and the method appears on all iterators

fn analyze_temperatures(readings: &[f64]) -> Option<f64> {
    readings.iter().copied().mean()  // .mean() is now available!
}

fn analyze_sensor_data(data: &[i32]) -> Option<f64> {
    data.iter().copied().mean()  // Works on i32 too (i32: Into<f64>)
}
}

Real-World Example: Diagnostic Result Extensions

#![allow(unused)]
fn main() {
use std::collections::HashMap;

struct DiagResult {
    component: String,
    passed: bool,
    message: String,
}

/// Extension trait for Vec<DiagResult> — adds domain-specific analysis methods.
pub trait DiagResultsExt {
    fn passed_count(&self) -> usize;
    fn failed_count(&self) -> usize;
    fn overall_pass(&self) -> bool;
    fn failures_by_component(&self) -> HashMap<String, Vec<&DiagResult>>;
}

impl DiagResultsExt for Vec<DiagResult> {
    fn passed_count(&self) -> usize {
        self.iter().filter(|r| r.passed).count()
    }

    fn failed_count(&self) -> usize {
        self.iter().filter(|r| !r.passed).count()
    }

    fn overall_pass(&self) -> bool {
        self.iter().all(|r| r.passed)
    }

    fn failures_by_component(&self) -> HashMap<String, Vec<&DiagResult>> {
        let mut map = HashMap::new();
        for r in self.iter().filter(|r| !r.passed) {
            map.entry(r.component.clone()).or_default().push(r);
        }
        map
    }
}

// Now any Vec<DiagResult> has these methods:
fn report(results: Vec<DiagResult>) {
    if !results.overall_pass() {
        let failures = results.failures_by_component();
        for (component, fails) in &failures {
            eprintln!("{component}: {} failures", fails.len());
        }
    }
}
}

Naming Convention

The Rust ecosystem uses a consistent Ext suffix:

| Crate | Extension Trait | Extends |
|---|---|---|
| itertools | Itertools | Iterator |
| futures | StreamExt, FutureExt | Stream, Future |
| tokio | AsyncReadExt, AsyncWriteExt | AsyncRead, AsyncWrite |
| tower | ServiceExt | Service |
| bytes | BufMut (partial) | &mut [u8] |
| Your crate | DiagResultsExt | Vec<DiagResult> |

When to Use

| Situation | Use Extension Trait? |
|---|---|
| Adding convenience methods to a foreign type | ✅ |
| Grouping domain-specific logic on generic collections | ✅ |
| The method needs access to private fields | ❌ (use a wrapper/newtype) |
| The method logically belongs on a new type you control | ❌ (just add it to your type) |
| You want the method available without any import | ❌ (inherent methods only) |

Enum Dispatch — Static Polymorphism Without dyn

When you have a closed set of types implementing a trait, you can replace dyn Trait with an enum whose variants hold the concrete types. This eliminates the vtable indirection and heap allocation while preserving the same caller-facing interface.

The Problem with dyn Trait

#![allow(unused)]
fn main() {
trait Sensor {
    fn read(&self) -> f64;
    fn name(&self) -> &str;
}

struct Gps { lat: f64, lon: f64 }
struct Thermometer { temp_c: f64 }
struct Accelerometer { g_force: f64 }

impl Sensor for Gps {
    fn read(&self) -> f64 { self.lat }
    fn name(&self) -> &str { "GPS" }
}
impl Sensor for Thermometer {
    fn read(&self) -> f64 { self.temp_c }
    fn name(&self) -> &str { "Thermometer" }
}
impl Sensor for Accelerometer {
    fn read(&self) -> f64 { self.g_force }
    fn name(&self) -> &str { "Accelerometer" }
}

// Heterogeneous collection with dyn — works, but has costs:
fn read_all_dyn(sensors: &[Box<dyn Sensor>]) -> Vec<f64> {
    sensors.iter().map(|s| s.read()).collect()
    // Each .read() goes through a vtable indirection
    // Each Box allocates on the heap
}
}

The Enum Dispatch Solution

// Replace the trait object with an enum:
enum AnySensor {
    Gps(Gps),
    Thermometer(Thermometer),
    Accelerometer(Accelerometer),
}

impl AnySensor {
    fn read(&self) -> f64 {
        match self {
            AnySensor::Gps(s) => s.read(),
            AnySensor::Thermometer(s) => s.read(),
            AnySensor::Accelerometer(s) => s.read(),
        }
    }

    fn name(&self) -> &str {
        match self {
            AnySensor::Gps(s) => s.name(),
            AnySensor::Thermometer(s) => s.name(),
            AnySensor::Accelerometer(s) => s.name(),
        }
    }
}

// Now: no heap allocation, no vtable, stored inline
fn read_all(sensors: &[AnySensor]) -> Vec<f64> {
    sensors.iter().map(|s| s.read()).collect()
    // Each .read() is a match branch — compiler can inline everything
}

fn main() {
    let sensors = vec![
        AnySensor::Gps(Gps { lat: 47.6, lon: -122.3 }),
        AnySensor::Thermometer(Thermometer { temp_c: 72.5 }),
        AnySensor::Accelerometer(Accelerometer { g_force: 1.02 }),
    ];

    for sensor in &sensors {
        println!("{}: {:.2}", sensor.name(), sensor.read());
    }
}

Implement the Trait on the Enum

For interoperability, you can implement the original trait on the enum itself:

#![allow(unused)]
fn main() {
impl Sensor for AnySensor {
    fn read(&self) -> f64 {
        match self {
            AnySensor::Gps(s) => s.read(),
            AnySensor::Thermometer(s) => s.read(),
            AnySensor::Accelerometer(s) => s.read(),
        }
    }

    fn name(&self) -> &str {
        match self {
            AnySensor::Gps(s) => s.name(),
            AnySensor::Thermometer(s) => s.name(),
            AnySensor::Accelerometer(s) => s.name(),
        }
    }
}

// Now AnySensor works anywhere a Sensor is expected via generics:
fn report<S: Sensor>(s: &S) {
    println!("{}: {:.2}", s.name(), s.read());
}
}

Reducing Boilerplate with a Macro

The match-arm delegation is repetitive. A macro eliminates it:

#![allow(unused)]
fn main() {
macro_rules! dispatch_sensor {
    ($self:expr, $method:ident $(, $arg:expr)*) => {
        match $self {
            AnySensor::Gps(s) => s.$method($($arg),*),
            AnySensor::Thermometer(s) => s.$method($($arg),*),
            AnySensor::Accelerometer(s) => s.$method($($arg),*),
        }
    };
}

impl Sensor for AnySensor {
    fn read(&self) -> f64     { dispatch_sensor!(self, read) }
    fn name(&self) -> &str    { dispatch_sensor!(self, name) }
}
}

For larger projects, the enum_dispatch crate automates this entirely:

#![allow(unused)]
fn main() {
use enum_dispatch::enum_dispatch;

#[enum_dispatch]
trait Sensor {
    fn read(&self) -> f64;
    fn name(&self) -> &str;
}

#[enum_dispatch(Sensor)]
enum AnySensor {
    Gps,
    Thermometer,
    Accelerometer,
}
// All delegation code is generated automatically.
}

dyn Trait vs Enum Dispatch — Decision Guide

Is the set of types closed (known at compile time)?
├── YES → Prefer enum dispatch (faster, no heap allocation)
│         ├── Few variants (< ~20)?     → Manual enum
│         └── Many variants or growing? → enum_dispatch crate
└── NO  → Must use dyn Trait (plugins, user-provided types)
| Property | dyn Trait | Enum Dispatch |
|---|---|---|
| Dispatch cost | Vtable indirection (~2ns) | Branch prediction (~0.3ns) |
| Heap allocation | Usually (Box) | None (inline) |
| Cache-friendly | No (pointer chasing) | Yes (contiguous) |
| Open to new types | ✅ (anyone can impl) | ❌ (closed set) |
| Code size | Shared | One copy per variant |
| Trait must be object-safe | Yes | No |
| Adding a variant | No code changes | Update enum + match arms |

When to Use Enum Dispatch

| Scenario | Recommendation |
|---|---|
| Diagnostic test types (CPU, GPU, NIC, Memory, …) | ✅ Enum dispatch — closed set, known at compile time |
| Bus protocols (SPI, I2C, UART, …) | ✅ Enum dispatch or Config trait |
| Plugin system (user loads .so at runtime) | ❌ Use dyn Trait |
| 2-3 variants | ✅ Manual enum dispatch |
| 10+ variants with many methods | ✅ enum_dispatch crate |

Capability Mixins — Associated Types as Zero-Cost Composition

Ruby developers compose behaviour with mixins: include SomeModule injects methods into a class. Rust traits with associated types + default methods + blanket impls produce the same result, except:

  • Everything resolves at compile time — no method-missing surprises
  • Each associated type is a knob that changes what the default methods produce
  • The compiler monomorphises each combination — zero vtable overhead

The Problem: Cross-Cutting Bus Dependencies

Hardware diagnostic routines share common operations — read an IPMI sensor, toggle a GPIO rail, sample a temperature over SPI — but different diagnostics need different combinations. Inheritance hierarchies don’t exist in Rust. Passing every bus handle as a function argument creates unwieldy signatures. We need a way to mix in bus capabilities à la carte.

Step 1 — Define “Ingredient” Traits

Each ingredient provides one hardware capability via an associated type:

#![allow(unused)]
fn main() {
use std::io;

// ── Bus abstractions (traits the hardware team provides) ──────────
pub trait SpiBus {
    fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> io::Result<()>;
}

pub trait I2cBus {
    fn i2c_read(&self, addr: u8, reg: u8, buf: &mut [u8]) -> io::Result<()>;
    fn i2c_write(&self, addr: u8, reg: u8, data: &[u8]) -> io::Result<()>;
}

pub trait GpioPin {
    fn set_high(&self) -> io::Result<()>;
    fn set_low(&self) -> io::Result<()>;
    fn read_level(&self) -> io::Result<bool>;
}

pub trait IpmiBmc {
    fn raw_command(&self, net_fn: u8, cmd: u8, data: &[u8]) -> io::Result<Vec<u8>>;
    fn read_sensor(&self, sensor_id: u8) -> io::Result<f64>;
}

// ── Ingredient traits — one per bus, carries an associated type ───
pub trait HasSpi {
    type Spi: SpiBus;
    fn spi(&self) -> &Self::Spi;
}

pub trait HasI2c {
    type I2c: I2cBus;
    fn i2c(&self) -> &Self::I2c;
}

pub trait HasGpio {
    type Gpio: GpioPin;
    fn gpio(&self) -> &Self::Gpio;
}

pub trait HasIpmi {
    type Ipmi: IpmiBmc;
    fn ipmi(&self) -> &Self::Ipmi;
}
}

Each ingredient is tiny, generic, and testable in isolation.

Step 2 — Define “Mixin” Traits

A mixin trait declares its required ingredients as supertraits, then provides all its methods via defaults — implementors get them for free:

#![allow(unused)]
fn main() {
/// Mixin: fan diagnostics — needs I2C (tachometer) + GPIO (PWM enable)
pub trait FanDiagMixin: HasI2c + HasGpio {
    /// Read fan RPM from the tachometer IC over I2C.
    fn read_fan_rpm(&self, fan_id: u8) -> io::Result<u32> {
        let mut buf = [0u8; 2];
        self.i2c().i2c_read(0x48 + fan_id, 0x00, &mut buf)?;
        Ok(u16::from_be_bytes(buf) as u32 * 60) // tach counts → RPM
    }

    /// Enable or disable the fan PWM output via GPIO.
    fn set_fan_pwm(&self, enable: bool) -> io::Result<()> {
        if enable { self.gpio().set_high() }
        else      { self.gpio().set_low() }
    }

    /// Full fan health check — read RPM + verify within threshold.
    fn check_fan_health(&self, fan_id: u8, min_rpm: u32) -> io::Result<bool> {
        let rpm = self.read_fan_rpm(fan_id)?;
        Ok(rpm >= min_rpm)
    }
}

/// Mixin: temperature monitoring — needs SPI (thermocouple ADC) + IPMI (BMC sensors)
pub trait TempMonitorMixin: HasSpi + HasIpmi {
    /// Read a thermocouple via the SPI ADC (e.g. MAX31855).
    fn read_thermocouple(&self) -> io::Result<f64> {
        let mut rx = [0u8; 4];
        self.spi().spi_transfer(&[0x00; 4], &mut rx)?;
        let raw = i32::from_be_bytes(rx) >> 18; // 14-bit signed
        Ok(raw as f64 * 0.25)
    }

    /// Read a BMC-managed temperature sensor via IPMI.
    fn read_bmc_temp(&self, sensor_id: u8) -> io::Result<f64> {
        self.ipmi().read_sensor(sensor_id)
    }

    /// Cross-validate: thermocouple vs BMC must agree within delta.
    fn validate_temps(&self, sensor_id: u8, max_delta: f64) -> io::Result<bool> {
        let tc = self.read_thermocouple()?;
        let bmc = self.read_bmc_temp(sensor_id)?;
        Ok((tc - bmc).abs() <= max_delta)
    }
}

/// Mixin: power sequencing — needs GPIO (rail enable) + IPMI (event logging)
pub trait PowerSeqMixin: HasGpio + HasIpmi {
    /// Assert the power-good GPIO and verify via IPMI sensor.
    fn enable_power_rail(&self, sensor_id: u8) -> io::Result<bool> {
        self.gpio().set_high()?;
        std::thread::sleep(std::time::Duration::from_millis(50));
        let voltage = self.ipmi().read_sensor(sensor_id)?;
        Ok(voltage > 0.8) // above 80% nominal = good
    }

    /// De-assert power and log shutdown via IPMI OEM command.
    fn disable_power_rail(&self) -> io::Result<()> {
        self.gpio().set_low()?;
        // Log OEM "power rail disabled" event to BMC
        self.ipmi().raw_command(0x2E, 0x01, &[0x00, 0x01])?;
        Ok(())
    }
}
}

Step 3 — Blanket Impls Make It Truly “Mixin”

The magic line — provide the ingredients, get the methods:

#![allow(unused)]
fn main() {
impl<T: HasI2c + HasGpio>  FanDiagMixin    for T {}
impl<T: HasSpi  + HasIpmi>  TempMonitorMixin for T {}
impl<T: HasGpio + HasIpmi>  PowerSeqMixin   for T {}
}

Any struct that implements the right ingredient traits automatically gains every mixin method — no boilerplate, no forwarding, no inheritance.

Step 4 — Wire Up Production

#![allow(unused)]
fn main() {
// ── Concrete bus implementations (Linux platform) ────────────────
struct LinuxSpi  { dev: String }
struct LinuxI2c  { dev: String }
struct SysfsGpio { pin: u32 }
struct IpmiTool  { timeout_secs: u32 }

impl SpiBus for LinuxSpi {
    fn spi_transfer(&self, _tx: &[u8], _rx: &mut [u8]) -> io::Result<()> {
        // spidev ioctl — omitted for brevity
        Ok(())
    }
}
impl I2cBus for LinuxI2c {
    fn i2c_read(&self, _addr: u8, _reg: u8, _buf: &mut [u8]) -> io::Result<()> {
        // i2c-dev ioctl — omitted for brevity
        Ok(())
    }
    fn i2c_write(&self, _addr: u8, _reg: u8, _data: &[u8]) -> io::Result<()> { Ok(()) }
}
impl GpioPin for SysfsGpio {
    fn set_high(&self) -> io::Result<()>  { /* /sys/class/gpio */ Ok(()) }
    fn set_low(&self) -> io::Result<()>   { Ok(()) }
    fn read_level(&self) -> io::Result<bool> { Ok(true) }
}
impl IpmiBmc for IpmiTool {
    fn raw_command(&self, _nf: u8, _cmd: u8, _data: &[u8]) -> io::Result<Vec<u8>> {
        // shells out to ipmitool — omitted for brevity
        Ok(vec![])
    }
    fn read_sensor(&self, _id: u8) -> io::Result<f64> { Ok(25.0) }
}

// ── Production platform — all four buses ─────────────────────────
struct DiagPlatform {
    spi:  LinuxSpi,
    i2c:  LinuxI2c,
    gpio: SysfsGpio,
    ipmi: IpmiTool,
}

impl HasSpi  for DiagPlatform { type Spi  = LinuxSpi;  fn spi(&self)  -> &LinuxSpi  { &self.spi  } }
impl HasI2c  for DiagPlatform { type I2c  = LinuxI2c;  fn i2c(&self)  -> &LinuxI2c  { &self.i2c  } }
impl HasGpio for DiagPlatform { type Gpio = SysfsGpio; fn gpio(&self) -> &SysfsGpio { &self.gpio } }
impl HasIpmi for DiagPlatform { type Ipmi = IpmiTool;  fn ipmi(&self) -> &IpmiTool  { &self.ipmi } }

// DiagPlatform now has ALL mixin methods:
fn production_diagnostics(platform: &DiagPlatform) -> io::Result<()> {
    let rpm = platform.read_fan_rpm(0)?;       // from FanDiagMixin
    let tc  = platform.read_thermocouple()?;   // from TempMonitorMixin
    let ok  = platform.enable_power_rail(42)?;  // from PowerSeqMixin
    println!("Fan: {rpm} RPM, Temp: {tc}°C, Power: {ok}");
    Ok(())
}
}

Step 5 — Test With Mocks (No Hardware Required)

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use std::cell::Cell;

    struct MockSpi  { temp: Cell<f64> }
    struct MockI2c  { rpm: Cell<u32> }
    struct MockGpio { level: Cell<bool> }
    struct MockIpmi { sensor_val: Cell<f64> }

    impl SpiBus for MockSpi {
        fn spi_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> io::Result<()> {
            // Encode mock temp as MAX31855 format
            let raw = ((self.temp.get() / 0.25) as i32) << 18;
            rx.copy_from_slice(&raw.to_be_bytes());
            Ok(())
        }
    }
    impl I2cBus for MockI2c {
        fn i2c_read(&self, _addr: u8, _reg: u8, buf: &mut [u8]) -> io::Result<()> {
            let tach = (self.rpm.get() / 60) as u16;
            buf.copy_from_slice(&tach.to_be_bytes());
            Ok(())
        }
        fn i2c_write(&self, _: u8, _: u8, _: &[u8]) -> io::Result<()> { Ok(()) }
    }
    impl GpioPin for MockGpio {
        fn set_high(&self)  -> io::Result<()>   { self.level.set(true);  Ok(()) }
        fn set_low(&self)   -> io::Result<()>   { self.level.set(false); Ok(()) }
        fn read_level(&self) -> io::Result<bool> { Ok(self.level.get()) }
    }
    impl IpmiBmc for MockIpmi {
        fn raw_command(&self, _: u8, _: u8, _: &[u8]) -> io::Result<Vec<u8>> { Ok(vec![]) }
        fn read_sensor(&self, _: u8) -> io::Result<f64> { Ok(self.sensor_val.get()) }
    }

    // ── Partial platform: only fan-related buses ─────────────────
    struct FanTestRig {
        i2c:  MockI2c,
        gpio: MockGpio,
    }
    impl HasI2c  for FanTestRig { type I2c  = MockI2c;  fn i2c(&self)  -> &MockI2c  { &self.i2c  } }
    impl HasGpio for FanTestRig { type Gpio = MockGpio; fn gpio(&self) -> &MockGpio { &self.gpio } }
    // FanTestRig gets FanDiagMixin but NOT TempMonitorMixin or PowerSeqMixin

    #[test]
    fn fan_health_check_passes_above_threshold() {
        let rig = FanTestRig {
            i2c:  MockI2c  { rpm: Cell::new(6000) },
            gpio: MockGpio { level: Cell::new(false) },
        };
        assert!(rig.check_fan_health(0, 4000).unwrap());
    }

    #[test]
    fn fan_health_check_fails_below_threshold() {
        let rig = FanTestRig {
            i2c:  MockI2c  { rpm: Cell::new(2000) },
            gpio: MockGpio { level: Cell::new(false) },
        };
        assert!(!rig.check_fan_health(0, 4000).unwrap());
    }
}
}

Notice that FanTestRig only implements HasI2c + HasGpio — it gets FanDiagMixin automatically, but the compiler refuses rig.read_thermocouple() because HasSpi is not satisfied. This is mixin scoping enforced at compile time.

Conditional Methods — Beyond What Ruby Can Do

Add where bounds to individual default methods. The method only exists when the associated type satisfies the extra bound:

#![allow(unused)]
fn main() {
/// Marker trait for DMA-capable SPI controllers
pub trait DmaCapable: SpiBus {
    fn dma_transfer(&self, tx: &[u8], rx: &mut [u8]) -> io::Result<()>;
}

/// Marker trait for interrupt-capable GPIO pins
pub trait InterruptCapable: GpioPin {
    fn wait_for_edge(&self, timeout_ms: u32) -> io::Result<bool>;
}

pub trait AdvancedDiagMixin: HasSpi + HasGpio {
    // Always available
    fn basic_probe(&self) -> io::Result<bool> {
        let mut rx = [0u8; 1];
        self.spi().spi_transfer(&[0xFF], &mut rx)?;
        Ok(rx[0] != 0x00)
    }

    // Only exists when the SPI controller supports DMA
    fn bulk_sensor_read(&self, buf: &mut [u8]) -> io::Result<()>
    where
        Self::Spi: DmaCapable,
    {
        self.spi().dma_transfer(&vec![0x00; buf.len()], buf)
    }

    // Only exists when the GPIO pin supports interrupts
    fn wait_for_fault_signal(&self, timeout_ms: u32) -> io::Result<bool>
    where
        Self::Gpio: InterruptCapable,
    {
        self.gpio().wait_for_edge(timeout_ms)
    }
}

impl<T: HasSpi + HasGpio> AdvancedDiagMixin for T {}
}

If your platform’s SPI doesn’t support DMA, calling bulk_sensor_read() is a compile error, not a runtime crash. Ruby’s closest equivalent is a respond_to? check, but that runs at runtime, so the mismatch only surfaces when the code path actually executes, not when it compiles.

Composability: Stacking Mixins

Multiple mixins can share the same ingredient — no diamond problem:

┌─────────────┐    ┌───────────┐    ┌──────────────┐
│ FanDiagMixin│    │TempMonitor│    │ PowerSeqMixin│
│  (I2C+GPIO) │    │ (SPI+IPMI)│    │  (GPIO+IPMI) │
└──────┬──────┘    └─────┬─────┘    └──────┬───────┘
       │                 │                 │
       │   ┌─────────────┴─────────────┐   │
       └──►│      DiagPlatform         │◄──┘
           │ HasSpi+HasI2c+HasGpio     │
           │        +HasIpmi           │
           └───────────────────────────┘

DiagPlatform implements HasGpio once, and both FanDiagMixin and PowerSeqMixin use the same self.gpio(). In Ruby, this would be two modules both calling self.gpio_pin — but if they expected different pin numbers, you’d discover the conflict at runtime. In Rust, you can disambiguate at the type level.

Comparison: Ruby Mixins vs Rust Capability Mixins

| Dimension | Ruby Mixins | Rust Capability Mixins |
|---|---|---|
| Dispatch | Runtime (method table lookup) | Compile-time (monomorphised) |
| Safe composition | MRO linearisation hides conflicts | Compiler rejects ambiguity |
| Conditional methods | respond_to? at runtime | where bounds at compile time |
| Overhead | Method dispatch + GC | Zero-cost (inlined) |
| Testability | Stub/mock via metaprogramming | Generic over mock types |
| Adding new buses | include at runtime | Add ingredient trait, recompile |
| Runtime flexibility | extend, prepend, open classes | None (fully static) |

When to Use Capability Mixins

| Scenario | Use Mixins? |
|---|---|
| Multiple diagnostics share bus-reading logic | ✅ |
| Test harness needs different bus subsets | ✅ (partial ingredient structs) |
| Methods only valid for certain bus capabilities (DMA, IRQ) | ✅ (conditional where bounds) |
| You need runtime module loading (plugins) | ❌ (use dyn Trait or enum dispatch) |
| Single struct with one bus — no sharing needed | ❌ (keep it simple) |
| Cross-crate ingredients with coherence issues | ⚠️ (use newtype wrappers) |

Key Takeaways — Capability Mixins

  1. Ingredient trait = associated type + accessor method (e.g., HasSpi)
  2. Mixin trait = supertrait bounds on ingredients + default method bodies
  3. Blanket impl = impl<T: HasX + HasY> Mixin for T {} — auto-injects methods
  4. Conditional methods = where Self::Spi: DmaCapable on individual defaults
  5. Partial platforms = test structs that only impl the needed ingredients
  6. No runtime cost — the compiler generates specialised code for each platform type

Typed Commands — GADT-Style Return Type Safety

In Haskell, Generalised Algebraic Data Types (GADTs) let each constructor of a data type refine the type parameter — so Expr Int and Expr Bool are enforced by the type checker. Rust has no direct GADT syntax, but traits with associated types achieve the same guarantee: the command type determines the response type, and mixing them up is a compile error.

This pattern is particularly powerful for hardware diagnostics, where IPMI commands, register reads, and sensor queries each return different physical quantities that should never be confused.

The Problem: The Untyped Vec<u8> Swamp

Most C/C++ IPMI stacks — and naïve Rust ports — use raw bytes everywhere:

#![allow(unused)]
fn main() {
use std::io;

struct BmcConnectionUntyped { timeout_secs: u32 }

impl BmcConnectionUntyped {
    fn raw_command(&self, net_fn: u8, cmd: u8, data: &[u8]) -> io::Result<Vec<u8>> {
        // ... shells out to ipmitool ...
        Ok(vec![0x00, 0x19, 0x00]) // stub
    }
}

fn diagnose_thermal_untyped(bmc: &BmcConnectionUntyped) -> io::Result<()> {
    // Read CPU temperature — sensor ID 0x20
    let raw = bmc.raw_command(0x04, 0x2D, &[0x20])?;
    let cpu_temp = raw[0] as f64;  // 🤞 hope byte 0 is the reading

    // Read fan speed — sensor ID 0x30
    let raw = bmc.raw_command(0x04, 0x2D, &[0x30])?;
    let fan_rpm = raw[0] as u32;  // 🐛 BUG: fan speed is 2 bytes LE

    // Read inlet voltage — sensor ID 0x40
    let raw = bmc.raw_command(0x04, 0x2D, &[0x40])?;
    let voltage = raw[0] as f64;  // 🐛 BUG: need to divide by 1000

    // 🐛 Comparing °C to RPM — compiles, but nonsensical
    if cpu_temp > fan_rpm as f64 {
        println!("uh oh");
    }

    // 🐛 Passing Volts as temperature — compiles fine
    log_temp_untyped(voltage);
    log_volts_untyped(cpu_temp);

    Ok(())
}

fn log_temp_untyped(t: f64)  { println!("Temp: {t}°C"); }
fn log_volts_untyped(v: f64) { println!("Voltage: {v}V"); }
}

Every reading is f64 — the compiler has no idea that one is a temperature, another is RPM, another is voltage. Four distinct bugs compile without warning:

| # | Bug | Consequence | Discovered |
|---|---|---|---|
| 1 | Fan RPM parsed as 1 byte instead of 2 | Reads 25 RPM instead of 6400 | Production, 3 AM fan-failure flood |
| 2 | Voltage not divided by 1000 | 12000V instead of 12.0V | Threshold check flags every PSU |
| 3 | Comparing °C to RPM | Meaningless boolean | Possibly never |
| 4 | Voltage passed to log_temp_untyped() | Silent data corruption in logs | 6 months later, reading history |

The Solution: Typed Commands via Associated Types

Step 1 — Domain newtypes

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Celsius(f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Rpm(u32);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Volts(f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Watts(f64);
}

Step 2 — The command trait (the GADT equivalent)

The associated type Response is the key — it binds each command to its return type:

#![allow(unused)]
fn main() {
trait IpmiCmd {
    /// The GADT "index" — determines what execute() returns.
    type Response;

    fn net_fn(&self) -> u8;
    fn cmd_byte(&self) -> u8;
    fn payload(&self) -> Vec<u8>;

    /// Parsing is encapsulated HERE — each command knows its own byte layout.
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}
}

Step 3 — One struct per command, parsing written once

#![allow(unused)]
fn main() {
struct ReadTemp { sensor_id: u8 }
impl IpmiCmd for ReadTemp {
    type Response = Celsius;  // ← "this command returns a temperature"
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.sensor_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Celsius> {
        // Signed byte per IPMI SDR — written once, tested once
        Ok(Celsius(raw[0] as i8 as f64))
    }
}

struct ReadFanSpeed { fan_id: u8 }
impl IpmiCmd for ReadFanSpeed {
    type Response = Rpm;     // ← "this command returns RPM"
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.fan_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Rpm> {
        // 2-byte LE — the correct layout, encoded once
        Ok(Rpm(u16::from_le_bytes([raw[0], raw[1]]) as u32))
    }
}

struct ReadVoltage { rail: u8 }
impl IpmiCmd for ReadVoltage {
    type Response = Volts;   // ← "this command returns voltage"
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.rail] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Volts> {
        // Millivolts → Volts, always correct
        Ok(Volts(u16::from_le_bytes([raw[0], raw[1]]) as f64 / 1000.0))
    }
}

struct ReadFru { fru_id: u8 }
impl IpmiCmd for ReadFru {
    type Response = String;
    fn net_fn(&self) -> u8 { 0x0A }
    fn cmd_byte(&self) -> u8 { 0x11 }
    fn payload(&self) -> Vec<u8> { vec![self.fru_id, 0x00, 0x00, 0xFF] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<String> {
        Ok(String::from_utf8_lossy(raw).to_string())
    }
}
}

Step 4 — The executor (zero dyn, monomorphised)

#![allow(unused)]
fn main() {
struct BmcConnection { timeout_secs: u32 }

impl BmcConnection {
    /// Generic over any command — compiler generates one version per command type.
    fn execute<C: IpmiCmd>(&self, cmd: &C) -> io::Result<C::Response> {
        let raw = self.raw_send(cmd.net_fn(), cmd.cmd_byte(), &cmd.payload())?;
        cmd.parse_response(&raw)
    }

    fn raw_send(&self, _nf: u8, _cmd: u8, _data: &[u8]) -> io::Result<Vec<u8>> {
        Ok(vec![0x19, 0x00]) // stub — real impl calls ipmitool
    }
}
}

Step 5 — Caller code: all four bugs become compile errors

#![allow(unused)]
fn main() {
fn diagnose_thermal(bmc: &BmcConnection) -> io::Result<()> {
    let cpu_temp: Celsius = bmc.execute(&ReadTemp { sensor_id: 0x20 })?;
    let fan_rpm:  Rpm     = bmc.execute(&ReadFanSpeed { fan_id: 0x30 })?;
    let voltage:  Volts   = bmc.execute(&ReadVoltage { rail: 0x40 })?;

    // Bug #1 — IMPOSSIBLE: parsing lives in ReadFanSpeed::parse_response
    // Bug #2 — IMPOSSIBLE: scaling lives in ReadVoltage::parse_response

    // Bug #3 — COMPILE ERROR:
    // if cpu_temp > fan_rpm { }
    //    ^^^^^^^^   ^^^^^^^
    //    Celsius    Rpm      → "mismatched types" ❌

    // Bug #4 — COMPILE ERROR:
    // log_temperature(voltage);
    //                 ^^^^^^^  Volts, expected Celsius ❌

    // Only correct comparisons compile:
    if cpu_temp > Celsius(85.0) {
        println!("CPU overheating: {:?}", cpu_temp);
    }
    if fan_rpm < Rpm(4000) {
        println!("Fan too slow: {:?}", fan_rpm);
    }

    Ok(())
}

fn log_temperature(t: Celsius) { println!("Temp: {:?}", t); }
fn log_voltage(v: Volts)       { println!("Voltage: {:?}", v); }
}

Macro DSL for Diagnostic Scripts

For large diagnostic routines that run many commands in sequence, a macro gives concise declarative syntax while preserving full type safety:

#![allow(unused)]
fn main() {
/// Execute a series of typed IPMI commands, returning a tuple of results.
/// Each element of the tuple has the command's own Response type.
macro_rules! diag_script {
    ($bmc:expr; $($cmd:expr),+ $(,)?) => {{
        ( $( $bmc.execute(&$cmd)?, )+ )
    }};
}

fn full_pre_flight(bmc: &BmcConnection) -> io::Result<()> {
    // Expands to: (Celsius, Rpm, Volts, String) — every type tracked
    let (temp, rpm, volts, board_pn) = diag_script!(bmc;
        ReadTemp     { sensor_id: 0x20 },
        ReadFanSpeed { fan_id:    0x30 },
        ReadVoltage  { rail:      0x40 },
        ReadFru      { fru_id:    0x00 },
    );

    println!("Board: {:?}", board_pn);
    println!("CPU: {:?}, Fan: {:?}, 12V: {:?}", temp, rpm, volts);

    // Type-safe threshold checks:
    assert!(temp  < Celsius(95.0), "CPU too hot");
    assert!(rpm   > Rpm(3000),     "Fan too slow");
    assert!(volts > Volts(11.4),   "12V rail sagging");

    Ok(())
}
}

The macro is just syntactic sugar — the tuple type (Celsius, Rpm, Volts, String) is fully inferred by the compiler. Swap two commands and the destructuring breaks at compile time, not at runtime.

Enum Dispatch for Heterogeneous Command Lists

When you need a Vec of mixed commands (e.g., a configurable script loaded from JSON), use enum dispatch to stay dyn-free:

#![allow(unused)]
fn main() {
enum AnyReading {
    Temp(Celsius),
    Rpm(Rpm),
    Volt(Volts),
    Text(String),
}

enum AnyCmd {
    Temp(ReadTemp),
    Fan(ReadFanSpeed),
    Voltage(ReadVoltage),
    Fru(ReadFru),
}

impl AnyCmd {
    fn execute(&self, bmc: &BmcConnection) -> io::Result<AnyReading> {
        match self {
            AnyCmd::Temp(c)    => Ok(AnyReading::Temp(bmc.execute(c)?)),
            AnyCmd::Fan(c)     => Ok(AnyReading::Rpm(bmc.execute(c)?)),
            AnyCmd::Voltage(c) => Ok(AnyReading::Volt(bmc.execute(c)?)),
            AnyCmd::Fru(c)     => Ok(AnyReading::Text(bmc.execute(c)?)),
        }
    }
}

/// Dynamic diagnostic script — commands loaded at runtime
fn run_script(bmc: &BmcConnection, script: &[AnyCmd]) -> io::Result<Vec<AnyReading>> {
    script.iter().map(|cmd| cmd.execute(bmc)).collect()
}
}

You lose per-element type tracking (everything is AnyReading), but you gain runtime flexibility — and the parsing is still encapsulated in each IpmiCmd impl.

Testing Typed Commands

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    struct StubBmc {
        responses: std::collections::HashMap<u8, Vec<u8>>,
    }

    impl StubBmc {
        fn execute<C: IpmiCmd>(&self, cmd: &C) -> io::Result<C::Response> {
            let key = cmd.payload()[0]; // sensor ID as key
            let raw = self.responses.get(&key)
                .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no stub"))?;
            cmd.parse_response(raw)
        }
    }

    #[test]
    fn read_temp_parses_signed_byte() {
        let bmc = StubBmc {
            responses: [( 0x20, vec![0xE7] )].into() // -25 as i8 = 0xE7
        };
        let temp = bmc.execute(&ReadTemp { sensor_id: 0x20 }).unwrap();
        assert_eq!(temp, Celsius(-25.0));
    }

    #[test]
    fn read_fan_parses_two_byte_le() {
        let bmc = StubBmc {
            responses: [( 0x30, vec![0x00, 0x19] )].into() // 0x1900 = 6400
        };
        let rpm = bmc.execute(&ReadFanSpeed { fan_id: 0x30 }).unwrap();
        assert_eq!(rpm, Rpm(6400));
    }

    #[test]
    fn read_voltage_scales_millivolts() {
        let bmc = StubBmc {
            responses: [( 0x40, vec![0xE8, 0x2E] )].into() // 0x2EE8 = 12008 mV
        };
        let v = bmc.execute(&ReadVoltage { rail: 0x40 }).unwrap();
        assert!((v.0 - 12.008).abs() < 0.001);
    }
}
}

Each command’s parsing is tested independently. If ReadFanSpeed changes from 2-byte LE to 4-byte BE in a new IPMI spec revision, you update one parse_response and the test catches regressions.

How This Maps to Haskell GADTs

Haskell GADT                         Rust Equivalent
────────────────                     ───────────────────────
data Cmd a where                     trait IpmiCmd {
  ReadTemp :: SensorId -> Cmd Temp       type Response;
  ReadFan  :: FanId    -> Cmd Rpm        ...
                                     }

eval :: Cmd a -> IO a                fn execute<C: IpmiCmd>(&self, cmd: &C)
                                         -> io::Result<C::Response>

Type refinement in case branches     Monomorphisation: compiler generates
                                     execute::<ReadTemp>() → returns Celsius
                                     execute::<ReadFanSpeed>() → returns Rpm

Both guarantee: the command determines the return type. Rust achieves it through generic monomorphisation instead of type-level case analysis — same safety, zero runtime cost.

Before vs After Summary

| Dimension | Untyped (`Vec<u8>`) | Typed Commands |
|-----------|---------------------|----------------|
| Lines per sensor | ~3 (duplicated at every call site) | ~15 (written and tested once) |
| Parsing errors possible | At every call site | In one `parse_response` impl |
| Unit confusion bugs | Unlimited | Zero (compile error) |
| Adding a new sensor | Touch N files, copy-paste parsing | Add 1 struct + 1 impl |
| Runtime cost | Identical | Identical (monomorphised) |
| IDE autocomplete | `f64` everywhere | `Celsius`, `Rpm`, `Volts` — self-documenting |
| Code review burden | Must verify every raw byte parse | Verify one `parse_response` per sensor |
| Macro DSL | N/A | `diag_script!(bmc; ReadTemp{..}, ReadFan{..})` → `(Celsius, Rpm)` |
| Dynamic scripts | Manual dispatch | `AnyCmd` enum — still dyn-free |

When to Use Typed Commands

| Scenario | Recommendation |
|----------|----------------|
| IPMI sensor reads with distinct physical units | ✅ Typed commands |
| Register map with different-width fields | ✅ Typed commands |
| Network protocol messages (request → response) | ✅ Typed commands |
| Single command type with one return format | ❌ Overkill — just return the type directly |
| Prototyping / exploring an unknown device | ❌ Raw bytes first, type later |
| Plugin system where commands aren’t known at compile time | ⚠️ Use `AnyCmd` enum dispatch |

Key Takeaways — Traits

  • Associated types = one impl per type; generic parameters = many impls per type
  • GATs unlock lending iterators and async-in-traits patterns
  • Use enum dispatch for closed sets (fast); dyn Trait for open sets (flexible)
  • Any + TypeId is the escape hatch when compile-time types are unknown

See also: Ch 1 — Generics for monomorphization and when generics cause code bloat. Ch 3 — Newtype & Type-State for using traits with the config trait pattern.


Exercise: Repository with Associated Types ★★★ (~40 min)

Design a Repository trait with associated Error, Id, and Item types. Implement it for an in-memory store and demonstrate compile-time type safety.

🔑 Solution
use std::collections::HashMap;

trait Repository {
    type Item;
    type Id;
    type Error;

    fn get(&self, id: &Self::Id) -> Result<Option<&Self::Item>, Self::Error>;
    fn insert(&mut self, item: Self::Item) -> Result<Self::Id, Self::Error>;
    fn delete(&mut self, id: &Self::Id) -> Result<bool, Self::Error>;
}

#[derive(Debug, Clone)]
struct User {
    name: String,
    email: String,
}

struct InMemoryUserRepo {
    data: HashMap<u64, User>,
    next_id: u64,
}

impl InMemoryUserRepo {
    fn new() -> Self {
        InMemoryUserRepo { data: HashMap::new(), next_id: 1 }
    }
}

impl Repository for InMemoryUserRepo {
    type Item = User;
    type Id = u64;
    type Error = std::convert::Infallible;

    fn get(&self, id: &u64) -> Result<Option<&User>, Self::Error> {
        Ok(self.data.get(id))
    }

    fn insert(&mut self, item: User) -> Result<u64, Self::Error> {
        let id = self.next_id;
        self.next_id += 1;
        self.data.insert(id, item);
        Ok(id)
    }

    fn delete(&mut self, id: &u64) -> Result<bool, Self::Error> {
        Ok(self.data.remove(id).is_some())
    }
}

fn create_and_fetch<R: Repository>(repo: &mut R, item: R::Item) -> Result<(), R::Error>
where
    R::Item: std::fmt::Debug,
    R::Id: std::fmt::Debug,
{
    let id = repo.insert(item)?;
    println!("Inserted with id: {id:?}");
    let retrieved = repo.get(&id)?;
    println!("Retrieved: {retrieved:?}");
    Ok(())
}

fn main() {
    let mut repo = InMemoryUserRepo::new();
    create_and_fetch(&mut repo, User {
        name: "Alice".into(),
        email: "alice@example.com".into(),
    }).unwrap();
}

3. The Newtype and Type-State Patterns 🟡

What you’ll learn:

  • The newtype pattern for zero-cost compile-time type safety
  • Type-state pattern: making illegal state transitions unrepresentable
  • Builder pattern with type states for compile-time–enforced construction
  • Config trait pattern for taming generic parameter explosion

Newtype: Zero-Cost Type Safety

The newtype pattern wraps an existing type in a single-field tuple struct to create a distinct type with zero runtime overhead:

#![allow(unused)]
fn main() {
// Without newtypes — easy to mix up:
fn create_user(name: String, email: String, age: u32, employee_id: u32) { }
// create_user(name, email, age, id);  — but what if we swap age and id?
// create_user(name, email, id, age);  — COMPILES FINE, BUG

// With newtypes — the compiler catches mistakes:
struct UserName(String);
struct Email(String);
struct Age(u32);
struct EmployeeId(u32);

fn create_user(name: UserName, email: Email, age: Age, id: EmployeeId) { }
// create_user(name, email, EmployeeId(42), Age(30));
// ❌ Compile error: expected Age, got EmployeeId
}

impl Deref for Newtypes — Power and Pitfalls

Implementing Deref on a newtype lets it auto-coerce to the inner type’s reference, giving you all of the inner type’s methods “for free”:

#![allow(unused)]
fn main() {
use std::ops::Deref;

struct Email(String);

impl Email {
    fn new(raw: &str) -> Result<Self, &'static str> {
        if raw.contains('@') {
            Ok(Email(raw.to_string()))
        } else {
            Err("invalid email: missing @")
        }
    }
}

impl Deref for Email {
    type Target = str;
    fn deref(&self) -> &str { &self.0 }
}

// Now Email auto-derefs to &str:
let email = Email::new("user@example.com").unwrap();
println!("Length: {}", email.len()); // Uses str::len via Deref
}

This is convenient — but it effectively punches a hole through your newtype’s abstraction boundary because every method on the target type becomes callable on your wrapper.

When Deref IS appropriate

| Scenario | Example | Why it’s fine |
|----------|---------|---------------|
| Smart-pointer wrappers | `Box<T>`, `Arc<T>`, `MutexGuard<T>` | The wrapper’s whole purpose is to behave like `T` |
| Transparent “thin” wrappers | `String` → `str`, `PathBuf` → `Path`, `Vec<T>` → `[T]` | The wrapper IS-A superset of the target |
| Your newtype genuinely IS the inner type | `struct Hostname(String)` where you always want full string ops | Restricting the API would add no value |

When Deref is an anti-pattern

| Scenario | Problem |
|----------|---------|
| Domain types with invariants | `Email` derefs to `&str`, so callers can call `.split_at()`, `.trim()`, etc. — none of which preserve the “must contain @” invariant. If someone stores the trimmed `&str` and reconstructs, the invariant is lost. |
| Types where you want a restricted API | `struct Password(String)` with `Deref<Target = str>` leaks `.as_bytes()`, `.chars()`, `Debug` output — exactly what you’re trying to hide. |
| Fake inheritance | Using `Deref` to make `ManagerWidget` auto-deref to `Widget` simulates OOP inheritance. This is explicitly discouraged — see the Rust API Guidelines (C-DEREF). |

Rule of thumb: If your newtype exists to add type safety or restrict the API, don’t implement Deref. If it exists to add capabilities while keeping the inner type’s full surface (like a smart pointer), Deref is the right choice.

DerefMut — doubles the risk

If you also implement DerefMut, callers can mutate the inner value directly, bypassing any validation in your constructors:

#![allow(unused)]
fn main() {
use std::ops::{Deref, DerefMut};

struct PortNumber(u16);

impl Deref for PortNumber {
    type Target = u16;
    fn deref(&self) -> &u16 { &self.0 }
}

impl DerefMut for PortNumber {
    fn deref_mut(&mut self) -> &mut u16 { &mut self.0 }
}

let mut port = PortNumber(443);
*port = 0; // Bypasses any validation — now an invalid port
}

Only implement DerefMut when the inner type has no invariants to protect.

Prefer explicit delegation instead

When you want only some of the inner type’s methods, delegate explicitly:

#![allow(unused)]
fn main() {
struct Email(String);

impl Email {
    fn new(raw: &str) -> Result<Self, &'static str> {
        if raw.contains('@') { Ok(Email(raw.to_string())) }
        else { Err("missing @") }
    }

    // Expose only what makes sense:
    pub fn as_str(&self) -> &str { &self.0 }
    pub fn len(&self) -> usize { self.0.len() }
    pub fn domain(&self) -> &str {
        self.0.split('@').nth(1).unwrap_or("")
    }
    // .split_at(), .trim(), .replace() — NOT exposed
}
}

Clippy and the ecosystem

  • Deref coercion can make method resolution surprising: a call like is_empty() may silently resolve to the inner type’s method instead of one you intended to define (or shadow) on the wrapper, and adding a wrapper method later changes behavior without a warning.
  • The Rust API Guidelines (C-DEREF) state: “only smart pointers should implement Deref.” Treat this as a strong default; deviate only with clear justification.
  • If you need trait compatibility (e.g., passing Email to functions expecting &str), consider implementing AsRef<str> and Borrow<str> instead — they’re explicit conversions without auto-coercion surprises.

Decision matrix

Do you want ALL methods of the inner type to be callable?
  ├─ YES → Does your type enforce invariants or restrict the API?
  │    ├─ NO  → impl Deref ✅  (smart-pointer / transparent wrapper)
  │    └─ YES → Don't impl Deref ❌ (invariant leaks)
  └─ NO  → Don't impl Deref ❌  (use AsRef / explicit delegation)

Type-State: Compile-Time Protocol Enforcement

The type-state pattern uses the type system to enforce that operations happen in the correct order. Invalid states become unrepresentable.

stateDiagram-v2
    [*] --> Disconnected: new()
    Disconnected --> Connected: connect()
    Connected --> Authenticated: authenticate()
    Authenticated --> Authenticated: request()
    Authenticated --> [*]: drop

    Disconnected --> Disconnected: ❌ request() won't compile
    Connected --> Connected: ❌ request() won't compile

Each transition consumes self and returns a new type — the compiler enforces valid ordering.

// Problem: A network connection that must be:
// 1. Created
// 2. Connected
// 3. Authenticated
// 4. Then used for requests
// Calling request() before authenticate() should be a COMPILE error.

// --- Type-state markers (zero-sized types) ---
struct Disconnected;
struct Connected;
struct Authenticated;

// --- Connection parameterized by state ---
struct Connection<State> {
    address: String,
    _state: std::marker::PhantomData<State>,
}

// Only Disconnected connections can connect:
impl Connection<Disconnected> {
    fn new(address: &str) -> Self {
        Connection {
            address: address.to_string(),
            _state: std::marker::PhantomData,
        }
    }

    fn connect(self) -> Connection<Connected> {
        println!("Connecting to {}...", self.address);
        Connection {
            address: self.address,
            _state: std::marker::PhantomData,
        }
    }
}

// Only Connected connections can authenticate:
impl Connection<Connected> {
    fn authenticate(self, _token: &str) -> Connection<Authenticated> {
        println!("Authenticating...");
        Connection {
            address: self.address,
            _state: std::marker::PhantomData,
        }
    }
}

// Only Authenticated connections can make requests:
impl Connection<Authenticated> {
    fn request(&self, path: &str) -> String {
        format!("GET {} from {}", path, self.address)
    }
}

fn main() {
    let conn = Connection::new("api.example.com");
    // conn.request("/data"); // ❌ Compile error: no method `request` on Connection<Disconnected>

    let conn = conn.connect();
    // conn.request("/data"); // ❌ Compile error: no method `request` on Connection<Connected>

    let conn = conn.authenticate("secret-token");
    let response = conn.request("/data"); // ✅ Only works after authentication
    println!("{response}");
}

Key insight: Each state transition consumes self and returns a new type. You can’t use the old state after transitioning — the compiler enforces it. Zero runtime cost — PhantomData is zero-sized, states are erased at compile time.

Comparison with C++/C#: In C++ or C#, you’d enforce this with runtime checks (if (!authenticated) throw ...). The Rust type-state pattern moves these checks to compile time — invalid states are literally unrepresentable in the type system.

Builder Pattern with Type States

A practical application — a builder that enforces required fields:

use std::marker::PhantomData;

// Marker types for required fields
struct NeedsName;
struct NeedsPort;
struct Ready;

struct ServerConfig<State> {
    name: Option<String>,
    port: Option<u16>,
    max_connections: usize, // Optional, has default
    _state: PhantomData<State>,
}

impl ServerConfig<NeedsName> {
    fn new() -> Self {
        ServerConfig {
            name: None,
            port: None,
            max_connections: 100,
            _state: PhantomData,
        }
    }

    fn name(self, name: &str) -> ServerConfig<NeedsPort> {
        ServerConfig {
            name: Some(name.to_string()),
            port: self.port,
            max_connections: self.max_connections,
            _state: PhantomData,
        }
    }
}

impl ServerConfig<NeedsPort> {
    fn port(self, port: u16) -> ServerConfig<Ready> {
        ServerConfig {
            name: self.name,
            port: Some(port),
            max_connections: self.max_connections,
            _state: PhantomData,
        }
    }
}

impl ServerConfig<Ready> {
    fn max_connections(mut self, n: usize) -> Self {
        self.max_connections = n;
        self
    }

    fn build(self) -> Server {
        Server {
            name: self.name.unwrap(),
            port: self.port.unwrap(),
            max_connections: self.max_connections,
        }
    }
}

struct Server {
    name: String,
    port: u16,
    max_connections: usize,
}

fn main() {
    // Must provide name, then port, then can build:
    let server = ServerConfig::new()
        .name("my-server")
        .port(8080)
        .max_connections(500)
        .build();

    // ServerConfig::new().port(8080); // ❌ Compile error: no method `port` on NeedsName
    // ServerConfig::new().name("x").build(); // ❌ Compile error: no method `build` on NeedsPort
}

Case Study: Type-Safe Connection Pool

Real-world systems need connection pools where connections move through well-defined states. Here’s how the typestate pattern enforces correctness in a production pool:

stateDiagram-v2
    [*] --> Idle: pool.acquire()
    Idle --> Active: conn.begin_transaction()
    Active --> Active: conn.execute(query)
    Active --> Idle: conn.commit() / conn.rollback()
    Idle --> [*]: pool.release(conn)

    Active --> [*]: ❌ cannot release mid-transaction

use std::marker::PhantomData;

// States
struct Idle;
struct InTransaction;

struct PooledConnection<State> {
    id: u32,
    _state: PhantomData<State>,
}

struct Pool {
    next_id: u32,
}

impl Pool {
    fn new() -> Self { Pool { next_id: 0 } }

    fn acquire(&mut self) -> PooledConnection<Idle> {
        self.next_id += 1;
        println!("[pool] Acquired connection #{}", self.next_id);
        PooledConnection { id: self.next_id, _state: PhantomData }
    }

    // Only idle connections can be released — prevents mid-transaction leaks
    fn release(&self, conn: PooledConnection<Idle>) {
        println!("[pool] Released connection #{}", conn.id);
    }
}

impl PooledConnection<Idle> {
    fn begin_transaction(self) -> PooledConnection<InTransaction> {
        println!("[conn #{}] BEGIN", self.id);
        PooledConnection { id: self.id, _state: PhantomData }
    }
}

impl PooledConnection<InTransaction> {
    fn execute(&self, query: &str) {
        println!("[conn #{}] EXEC: {}", self.id, query);
    }

    fn commit(self) -> PooledConnection<Idle> {
        println!("[conn #{}] COMMIT", self.id);
        PooledConnection { id: self.id, _state: PhantomData }
    }

    fn rollback(self) -> PooledConnection<Idle> {
        println!("[conn #{}] ROLLBACK", self.id);
        PooledConnection { id: self.id, _state: PhantomData }
    }
}

fn main() {
    let mut pool = Pool::new();

    let conn = pool.acquire();
    let conn = conn.begin_transaction();
    conn.execute("INSERT INTO users VALUES ('Alice')");
    conn.execute("INSERT INTO orders VALUES (1, 42)");
    let conn = conn.commit(); // Back to Idle
    pool.release(conn);       // ✅ Only works on Idle connections

    // pool.release(conn_active); // ❌ Compile error: can't release InTransaction
}

Why this matters in production: A connection leaked mid-transaction holds database locks indefinitely. The typestate pattern makes this impossible — you literally cannot return a connection to the pool until the transaction is committed or rolled back.


Config Trait Pattern — Taming Generic Parameter Explosion

The Problem

As a struct takes on more responsibilities, each backed by a trait-constrained generic, the type signature grows unwieldy:

#![allow(unused)]
fn main() {
trait SpiBus   { fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> Result<(), BusError>; }
trait ComPort  { fn com_send(&self, data: &[u8]) -> Result<usize, BusError>; }
trait I3cBus   { fn i3c_read(&self, addr: u8, buf: &mut [u8]) -> Result<(), BusError>; }
trait SmBus    { fn smbus_read_byte(&self, addr: u8, cmd: u8) -> Result<u8, BusError>; }
trait GpioBus  { fn gpio_set(&self, pin: u32, high: bool); }

// ❌ Every new bus trait adds another generic parameter
struct DiagController<S: SpiBus, C: ComPort, I: I3cBus, M: SmBus, G: GpioBus> {
    spi: S,
    com: C,
    i3c: I,
    smbus: M,
    gpio: G,
}
// impl blocks, function signatures, and callers all repeat the full list.
// Adding a 6th bus means editing every mention of DiagController<S, C, I, M, G>.
}

This is often called “generic parameter explosion.” It compounds across impl blocks, function parameters, and downstream consumers — each of which must repeat the full parameter list.

The Solution: A Config Trait

Bundle all associated types into a single trait. The struct then has one generic parameter regardless of how many component types it contains:

#![allow(unused)]
fn main() {
#[derive(Debug)]
enum BusError {
    Timeout,
    NakReceived,
    HardwareFault(String),
}

// --- Bus traits (unchanged) ---
trait SpiBus {
    fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> Result<(), BusError>;
    fn spi_write(&self, data: &[u8]) -> Result<(), BusError>;
}

trait ComPort {
    fn com_send(&self, data: &[u8]) -> Result<usize, BusError>;
    fn com_recv(&self, buf: &mut [u8], timeout_ms: u32) -> Result<usize, BusError>;
}

trait I3cBus {
    fn i3c_read(&self, addr: u8, buf: &mut [u8]) -> Result<(), BusError>;
    fn i3c_write(&self, addr: u8, data: &[u8]) -> Result<(), BusError>;
}

// --- The Config trait: one associated type per component ---
trait BoardConfig {
    type Spi: SpiBus;
    type Com: ComPort;
    type I3c: I3cBus;
}

// --- DiagController has exactly ONE generic parameter ---
struct DiagController<Cfg: BoardConfig> {
    spi: Cfg::Spi,
    com: Cfg::Com,
    i3c: Cfg::I3c,
}
}

DiagController<Cfg> will never gain another generic parameter. Adding a 4th bus means adding one associated type to BoardConfig and one field to DiagController — no downstream signature changes.

Implementing the Controller

#![allow(unused)]
fn main() {
impl<Cfg: BoardConfig> DiagController<Cfg> {
    fn new(spi: Cfg::Spi, com: Cfg::Com, i3c: Cfg::I3c) -> Self {
        DiagController { spi, com, i3c }
    }

    fn read_flash_id(&self) -> Result<u32, BusError> {
        let cmd = [0x9F]; // JEDEC Read ID
        let mut id = [0u8; 4];
        self.spi.spi_transfer(&cmd, &mut id)?;
        Ok(u32::from_be_bytes(id))
    }

    fn send_bmc_command(&self, cmd: &[u8]) -> Result<Vec<u8>, BusError> {
        self.com.com_send(cmd)?;
        let mut resp = vec![0u8; 256];
        let n = self.com.com_recv(&mut resp, 1000)?;
        resp.truncate(n);
        Ok(resp)
    }

    fn read_sensor_temp(&self, sensor_addr: u8) -> Result<i16, BusError> {
        let mut buf = [0u8; 2];
        self.i3c.i3c_read(sensor_addr, &mut buf)?;
        Ok(i16::from_be_bytes(buf))
    }

    fn run_full_diag(&self) -> Result<DiagReport, BusError> {
        let flash_id = self.read_flash_id()?;
        let bmc_resp = self.send_bmc_command(b"VERSION\n")?;
        let cpu_temp = self.read_sensor_temp(0x48)?;
        let gpu_temp = self.read_sensor_temp(0x49)?;

        Ok(DiagReport {
            flash_id,
            bmc_version: String::from_utf8_lossy(&bmc_resp).to_string(),
            cpu_temp_c: cpu_temp,
            gpu_temp_c: gpu_temp,
        })
    }
}

#[derive(Debug)]
struct DiagReport {
    flash_id: u32,
    bmc_version: String,
    cpu_temp_c: i16,
    gpu_temp_c: i16,
}
}

Production Wiring

One impl BoardConfig selects the concrete hardware drivers:

struct PlatformSpi  { dev: String, speed_hz: u32 }
struct UartCom      { dev: String, baud: u32 }
struct LinuxI3c     { dev: String }

impl SpiBus for PlatformSpi {
    fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> Result<(), BusError> {
        // ioctl(SPI_IOC_MESSAGE) in production
        rx[0..4].copy_from_slice(&[0xEF, 0x40, 0x18, 0x00]);
        Ok(())
    }
    fn spi_write(&self, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

impl ComPort for UartCom {
    fn com_send(&self, _data: &[u8]) -> Result<usize, BusError> { Ok(0) }
    fn com_recv(&self, buf: &mut [u8], _timeout: u32) -> Result<usize, BusError> {
        let resp = b"BMC v2.4.1\n";
        buf[..resp.len()].copy_from_slice(resp);
        Ok(resp.len())
    }
}

impl I3cBus for LinuxI3c {
    fn i3c_read(&self, _addr: u8, buf: &mut [u8]) -> Result<(), BusError> {
        buf[0] = 0x00; buf[1] = 0x2D; // 45°C
        Ok(())
    }
    fn i3c_write(&self, _addr: u8, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

// ✅ One struct, one impl — all concrete types resolved here
struct ProductionBoard;
impl BoardConfig for ProductionBoard {
    type Spi = PlatformSpi;
    type Com = UartCom;
    type I3c = LinuxI3c;
}

fn main() {
    let ctrl = DiagController::<ProductionBoard>::new(
        PlatformSpi { dev: "/dev/spidev0.0".into(), speed_hz: 10_000_000 },
        UartCom     { dev: "/dev/ttyS0".into(),     baud: 115200 },
        LinuxI3c    { dev: "/dev/i3c-0".into() },
    );
    let report = ctrl.run_full_diag().unwrap();
    println!("{report:#?}");
}

Test Wiring with Mocks

Swap the entire hardware layer by defining a different BoardConfig:

#![allow(unused)]
fn main() {
struct MockSpi  { flash_id: [u8; 4] }
struct MockCom  { response: Vec<u8> }
struct MockI3c  { temps: std::collections::HashMap<u8, i16> }

impl SpiBus for MockSpi {
    fn spi_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> Result<(), BusError> {
        rx[..4].copy_from_slice(&self.flash_id);
        Ok(())
    }
    fn spi_write(&self, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

impl ComPort for MockCom {
    fn com_send(&self, _data: &[u8]) -> Result<usize, BusError> { Ok(0) }
    fn com_recv(&self, buf: &mut [u8], _timeout: u32) -> Result<usize, BusError> {
        let n = self.response.len().min(buf.len());
        buf[..n].copy_from_slice(&self.response[..n]);
        Ok(n)
    }
}

impl I3cBus for MockI3c {
    fn i3c_read(&self, addr: u8, buf: &mut [u8]) -> Result<(), BusError> {
        let temp = self.temps.get(&addr).copied().unwrap_or(0);
        buf[..2].copy_from_slice(&temp.to_be_bytes());
        Ok(())
    }
    fn i3c_write(&self, _addr: u8, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

struct TestBoard;
impl BoardConfig for TestBoard {
    type Spi = MockSpi;
    type Com = MockCom;
    type I3c = MockI3c;
}

#[cfg(test)]
mod tests {
    use super::*;

    fn make_test_controller() -> DiagController<TestBoard> {
        let mut temps = std::collections::HashMap::new();
        temps.insert(0x48, 45i16);
        temps.insert(0x49, 72i16);

        DiagController::<TestBoard>::new(
            MockSpi  { flash_id: [0xEF, 0x40, 0x18, 0x00] },
            MockCom  { response: b"BMC v2.4.1\n".to_vec() },
            MockI3c  { temps },
        )
    }

    #[test]
    fn test_flash_id() {
        let ctrl = make_test_controller();
        assert_eq!(ctrl.read_flash_id().unwrap(), 0xEF401800);
    }

    #[test]
    fn test_sensor_temps() {
        let ctrl = make_test_controller();
        assert_eq!(ctrl.read_sensor_temp(0x48).unwrap(), 45);
        assert_eq!(ctrl.read_sensor_temp(0x49).unwrap(), 72);
    }

    #[test]
    fn test_full_diag() {
        let ctrl = make_test_controller();
        let report = ctrl.run_full_diag().unwrap();
        assert_eq!(report.flash_id, 0xEF401800);
        assert_eq!(report.cpu_temp_c, 45);
        assert_eq!(report.gpu_temp_c, 72);
        assert!(report.bmc_version.contains("2.4.1"));
    }
}
}

Adding a New Bus Later

When you need a 4th bus, the changes are localized to the config trait, the controller struct, and each config impl — no downstream signature changes. The generic parameter count stays at one:

#![allow(unused)]
fn main() {
trait SmBus {
    fn smbus_read_byte(&self, addr: u8, cmd: u8) -> Result<u8, BusError>;
}

// 1. Add one associated type:
trait BoardConfig {
    type Spi: SpiBus;
    type Com: ComPort;
    type I3c: I3cBus;
    type Smb: SmBus;     // ← new
}

// 2. Add one field:
struct DiagController<Cfg: BoardConfig> {
    spi: Cfg::Spi,
    com: Cfg::Com,
    i3c: Cfg::I3c,
    smb: Cfg::Smb,       // ← new
}

// 3. Provide the concrete type in each config impl:
impl BoardConfig for ProductionBoard {
    type Spi = PlatformSpi;
    type Com = UartCom;
    type I3c = LinuxI3c;
    type Smb = LinuxSmbus; // ← new
}
}

When to Use This Pattern

| Situation | Use Config Trait? | Alternative |
|---|---|---|
| 3+ trait-constrained generics on a struct | ✅ Yes | |
| Need to swap entire hardware/platform layer | ✅ Yes | |
| Only 1-2 generics | ❌ Overkill | Direct generics |
| Need runtime polymorphism | ❌ | `dyn Trait` objects |
| Open-ended plugin system | ❌ | Type-map / `Any` |
| Component traits form a natural group (board, platform) | ✅ Yes | |

Key Properties

  • One generic parameter forever — DiagController<Cfg> never gains more <A, B, C, ...>
  • Fully static dispatch — no vtables, no dyn, no heap allocation for trait objects
  • Clean test swapping — define TestBoard with mock impls, zero conditional compilation
  • Compile-time safety — forget an associated type → compile error, not runtime crash
  • Battle-tested — this is the pattern used by Substrate/Polkadot’s frame system to manage 20+ associated types through a single Config trait

Key Takeaways — Newtype & Type-State

  • Newtypes give compile-time type safety at zero runtime cost
  • Type-state makes illegal state transitions a compile error, not a runtime bug
  • Config traits tame generic parameter explosion in large systems

See also: Ch 4 — PhantomData for the zero-sized markers that power type-state. Ch 2 — Traits In Depth for associated types used in the config trait pattern.


Case Study: Dual-Axis Typestate — Vendor × Protocol State

The patterns above handle one axis at a time: typestate enforces protocol order, and trait abstraction handles multiple vendors. Real systems often need both simultaneously: a wrapper Handle<Vendor, State> where available methods depend on which vendor is plugged in and which state the handle is in.

This section shows the dual-axis conditional impl pattern — where impl blocks are gated on both a vendor trait bound and a state marker trait.

The Two-Dimensional Problem

Consider a debug probe interface (JTAG/SWD). Multiple vendors make probes, and every probe must be unlocked before registers become accessible. Some vendors additionally support direct memory reads — but only after an extended unlock that configures the memory access port:

graph LR
    subgraph "All vendors"
        L["🔒 Locked"] -- "unlock()" --> U["🔓 Unlocked"]
    end
    subgraph "Memory-capable vendors only"
        U -- "extended_unlock()" --> E["🔓🧠 ExtendedUnlocked"]
    end

    U -. "read_reg() / write_reg()" .-> U
    E -. "read_reg() / write_reg()" .-> E
    E -. "read_memory() / write_memory()" .-> E

    style L fill:#fee,stroke:#c33
    style U fill:#efe,stroke:#3a3
    style E fill:#eef,stroke:#33c

The capability matrix — which methods exist for which (vendor, state) combination — is two-dimensional:

block-beta
    columns 4
    space header1["Locked"] header2["Unlocked"] header3["ExtendedUnlocked"]
    basic["Basic Vendor"]:1 b1["unlock()"] b2["read_reg()\nwrite_reg()"] b3["— unreachable —"]
    memory["Memory Vendor"]:1 m1["unlock()"] m2["read_reg()\nwrite_reg()\nextended_unlock()"] m3["read_reg()\nwrite_reg()\nread_memory()\nwrite_memory()"]

    style b1 fill:#ffd,stroke:#aa0
    style b2 fill:#efe,stroke:#3a3
    style b3 fill:#eee,stroke:#999,stroke-dasharray: 5 5
    style m1 fill:#ffd,stroke:#aa0
    style m2 fill:#efe,stroke:#3a3
    style m3 fill:#eef,stroke:#33c

The challenge: express this matrix entirely at compile time, with static dispatch, so that calling extended_unlock() on a basic probe or read_memory() on an unlocked-but-not-extended handle is a compile error.

The Solution: Jtag<V, S> with Marker Traits

Step 1 — State tokens and capability markers:

use std::marker::PhantomData;

// Zero-sized state tokens — no runtime cost
struct Locked;
struct Unlocked;
struct ExtendedUnlocked;

// Marker traits express which capabilities each state has
trait HasRegAccess {}
impl HasRegAccess for Unlocked {}
impl HasRegAccess for ExtendedUnlocked {}

trait HasMemAccess {}
impl HasMemAccess for ExtendedUnlocked {}

Why marker traits, not just concrete states? Writing impl<V, S: HasRegAccess> Jtag<V, S> means read_reg() works in any state with register access — today that’s Unlocked and ExtendedUnlocked, but if you add DebugHalted tomorrow, you just add one line: impl HasRegAccess for DebugHalted {}. Every register function works with it automatically — zero code changes.

Step 2 — Vendor traits (raw operations):

// Every probe vendor implements these
trait JtagVendor {
    fn raw_unlock(&mut self);
    fn raw_read_reg(&self, addr: u32) -> u32;
    fn raw_write_reg(&mut self, addr: u32, val: u32);
}

// Vendors with memory access also implement this super-trait
trait JtagMemoryVendor: JtagVendor {
    fn raw_extended_unlock(&mut self);
    fn raw_read_memory(&self, addr: u64, buf: &mut [u8]);
    fn raw_write_memory(&mut self, addr: u64, data: &[u8]);
}

Step 3 — The wrapper with conditional impl blocks:

struct Jtag<V, S = Locked> {
    vendor: V,
    _state: PhantomData<S>,
}

// Construction — always starts Locked
impl<V: JtagVendor> Jtag<V, Locked> {
    fn new(vendor: V) -> Self {
        Jtag { vendor, _state: PhantomData }
    }

    fn unlock(mut self) -> Jtag<V, Unlocked> {
        self.vendor.raw_unlock();
        Jtag { vendor: self.vendor, _state: PhantomData }
    }
}

// Register I/O — any vendor, any state with HasRegAccess
impl<V: JtagVendor, S: HasRegAccess> Jtag<V, S> {
    fn read_reg(&self, addr: u32) -> u32 {
        self.vendor.raw_read_reg(addr)
    }
    fn write_reg(&mut self, addr: u32, val: u32) {
        self.vendor.raw_write_reg(addr, val);
    }
}

// Extended unlock — only memory-capable vendors, only from Unlocked
impl<V: JtagMemoryVendor> Jtag<V, Unlocked> {
    fn extended_unlock(mut self) -> Jtag<V, ExtendedUnlocked> {
        self.vendor.raw_extended_unlock();
        Jtag { vendor: self.vendor, _state: PhantomData }
    }
}

// Memory I/O — only memory-capable vendors, only ExtendedUnlocked
impl<V: JtagMemoryVendor, S: HasMemAccess> Jtag<V, S> {
    fn read_memory(&self, addr: u64, buf: &mut [u8]) {
        self.vendor.raw_read_memory(addr, buf);
    }
    fn write_memory(&mut self, addr: u64, data: &[u8]) {
        self.vendor.raw_write_memory(addr, data);
    }
}

Each impl block encodes one cell (or row) of the capability matrix. The compiler enforces the matrix — no runtime checks anywhere.

Vendor Implementations

Adding a vendor means implementing raw methods on one struct — no per-state struct duplication, no delegation boilerplate:

// Vendor A: basic probe — register access only
struct BasicProbe { port: u16 }

impl JtagVendor for BasicProbe {
    fn raw_unlock(&mut self)                    { /* TAP reset sequence */ }
    fn raw_read_reg(&self, addr: u32) -> u32    { /* DR scan */  0 }
    fn raw_write_reg(&mut self, addr: u32, val: u32) { /* DR scan */ }
}
// BasicProbe does NOT impl JtagMemoryVendor.
// extended_unlock() will not compile on Jtag<BasicProbe, _>.

// Vendor B: full-featured probe — registers + memory
struct DapProbe { serial: String }

impl JtagVendor for DapProbe {
    fn raw_unlock(&mut self)                    { /* SWD switch, read DPIDR */ }
    fn raw_read_reg(&self, addr: u32) -> u32    { /* AP register read */ 0 }
    fn raw_write_reg(&mut self, addr: u32, val: u32) { /* AP register write */ }
}

impl JtagMemoryVendor for DapProbe {
    fn raw_extended_unlock(&mut self)           { /* select MEM-AP, power up */ }
    fn raw_read_memory(&self, addr: u64, buf: &mut [u8])  { /* MEM-AP read */ }
    fn raw_write_memory(&mut self, addr: u64, data: &[u8]) { /* MEM-AP write */ }
}

What the Compiler Prevents

| Attempt | Error | Why |
|---|---|---|
| `Jtag<_, Locked>::read_reg()` | no method `read_reg` | `Locked` doesn't impl `HasRegAccess` |
| `Jtag<BasicProbe, _>::extended_unlock()` | no method `extended_unlock` | `BasicProbe` doesn't impl `JtagMemoryVendor` |
| `Jtag<_, Unlocked>::read_memory()` | no method `read_memory` | `Unlocked` doesn't impl `HasMemAccess` |
| Calling `unlock()` twice | value used after move | `unlock()` consumes `self` |

All four errors are caught at compile time. No panics, no Option, no runtime state enum.
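To make the happy path just as concrete, here is a condensed, self-contained walk through the matrix. The write methods are omitted for brevity, and FakeProbe is a stub memory-capable vendor invented for this demo (its register and memory values are arbitrary):

```rust
use std::marker::PhantomData;

struct Locked;
struct Unlocked;
struct ExtendedUnlocked;

trait HasRegAccess {}
impl HasRegAccess for Unlocked {}
impl HasRegAccess for ExtendedUnlocked {}

trait HasMemAccess {}
impl HasMemAccess for ExtendedUnlocked {}

trait JtagVendor {
    fn raw_unlock(&mut self);
    fn raw_read_reg(&self, addr: u32) -> u32;
}

trait JtagMemoryVendor: JtagVendor {
    fn raw_extended_unlock(&mut self);
    fn raw_read_memory(&self, addr: u64, buf: &mut [u8]);
}

struct Jtag<V, S = Locked> { vendor: V, _state: PhantomData<S> }

impl<V: JtagVendor> Jtag<V, Locked> {
    fn new(vendor: V) -> Self { Jtag { vendor, _state: PhantomData } }
    fn unlock(mut self) -> Jtag<V, Unlocked> {
        self.vendor.raw_unlock();
        Jtag { vendor: self.vendor, _state: PhantomData }
    }
}

impl<V: JtagVendor, S: HasRegAccess> Jtag<V, S> {
    fn read_reg(&self, addr: u32) -> u32 { self.vendor.raw_read_reg(addr) }
}

impl<V: JtagMemoryVendor> Jtag<V, Unlocked> {
    fn extended_unlock(mut self) -> Jtag<V, ExtendedUnlocked> {
        self.vendor.raw_extended_unlock();
        Jtag { vendor: self.vendor, _state: PhantomData }
    }
}

impl<V: JtagMemoryVendor, S: HasMemAccess> Jtag<V, S> {
    fn read_memory(&self, addr: u64, buf: &mut [u8]) {
        self.vendor.raw_read_memory(addr, buf);
    }
}

// Stub vendor: returns canned data so the flow is observable
struct FakeProbe;
impl JtagVendor for FakeProbe {
    fn raw_unlock(&mut self) {}
    fn raw_read_reg(&self, _addr: u32) -> u32 { 0x4BA0_0477 } // fake IDCODE
}
impl JtagMemoryVendor for FakeProbe {
    fn raw_extended_unlock(&mut self) {}
    fn raw_read_memory(&self, _addr: u64, buf: &mut [u8]) { buf.fill(0xAB) }
}

fn main() {
    let probe = Jtag::new(FakeProbe);       // Jtag<FakeProbe, Locked>
    // probe.read_reg(0);                   // ❌ Locked lacks HasRegAccess
    let probe = probe.unlock();             // Locked → Unlocked
    assert_eq!(probe.read_reg(0x00), 0x4BA0_0477);
    let probe = probe.extended_unlock();    // Unlocked → ExtendedUnlocked
    let mut buf = [0u8; 4];
    probe.read_memory(0x2000_0000, &mut buf);
    assert_eq!(buf, [0xAB; 4]);
}
```

Each state transition consumes the old handle by value, so the forbidden calls in the comments fail at compile time exactly as the table above describes.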

Writing Generic Functions

Functions bind only the axes they care about:

/// Works with ANY vendor, ANY state that grants register access.
fn read_idcode<V: JtagVendor, S: HasRegAccess>(jtag: &Jtag<V, S>) -> u32 {
    jtag.read_reg(0x00)
}

/// Only compiles for memory-capable vendors in ExtendedUnlocked state.
fn dump_firmware<V: JtagMemoryVendor, S: HasMemAccess>(jtag: &Jtag<V, S>) {
    let mut buf = [0u8; 256];
    jtag.read_memory(0x0800_0000, &mut buf);
}

read_idcode doesn’t care whether you’re in Unlocked or ExtendedUnlocked — it only requires HasRegAccess. This is where marker traits pay off over hardcoding specific states in signatures.

Same Pattern, Different Domain: Storage Backends

The dual-axis technique isn’t hardware-specific. Here’s the same structure for a storage layer where some backends support transactions:

// States
struct Closed;
struct Open;
struct InTransaction;

trait HasReadWrite {}
impl HasReadWrite for Open {}
impl HasReadWrite for InTransaction {}

// Vendor traits
trait StorageBackend {
    fn raw_open(&mut self);
    fn raw_read(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn raw_write(&mut self, key: &[u8], value: &[u8]);
}

trait TransactionalBackend: StorageBackend {
    fn raw_begin(&mut self);
    fn raw_commit(&mut self);
    fn raw_rollback(&mut self);
}

// Wrapper
struct Store<B, S = Closed> { backend: B, _s: PhantomData<S> }

impl<B: StorageBackend> Store<B, Closed> {
    fn open(mut self) -> Store<B, Open> {
        self.backend.raw_open();
        Store { backend: self.backend, _s: PhantomData }
    }
}
impl<B: StorageBackend, S: HasReadWrite> Store<B, S> {
    fn read(&self, key: &[u8]) -> Option<Vec<u8>>  { self.backend.raw_read(key) }
    fn write(&mut self, key: &[u8], val: &[u8])    { self.backend.raw_write(key, val) }
}
impl<B: TransactionalBackend> Store<B, Open> {
    fn begin(mut self) -> Store<B, InTransaction> {
        self.backend.raw_begin();
        Store { backend: self.backend, _s: PhantomData }
    }
}
impl<B: TransactionalBackend> Store<B, InTransaction> {
    fn commit(mut self) -> Store<B, Open> {
        self.backend.raw_commit();
        Store { backend: self.backend, _s: PhantomData }
    }
    fn rollback(mut self) -> Store<B, Open> {
        self.backend.raw_rollback();
        Store { backend: self.backend, _s: PhantomData }
    }
}

A flat-file backend implements StorageBackend only — begin() won’t compile. A database backend adds TransactionalBackend — the full Open → InTransaction → Open cycle becomes available.
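To see the storage variant end to end, here is a self-contained sketch with a hypothetical in-memory backend (MemDb, with snapshot-based rollback, invented purely for this demo):

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

struct Closed;
struct Open;
struct InTransaction;

trait HasReadWrite {}
impl HasReadWrite for Open {}
impl HasReadWrite for InTransaction {}

trait StorageBackend {
    fn raw_open(&mut self);
    fn raw_read(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn raw_write(&mut self, key: &[u8], value: &[u8]);
}

trait TransactionalBackend: StorageBackend {
    fn raw_begin(&mut self);
    fn raw_commit(&mut self);
    fn raw_rollback(&mut self);
}

struct Store<B, S = Closed> { backend: B, _s: PhantomData<S> }

impl<B: StorageBackend> Store<B, Closed> {
    fn new(backend: B) -> Self { Store { backend, _s: PhantomData } }
    fn open(mut self) -> Store<B, Open> {
        self.backend.raw_open();
        Store { backend: self.backend, _s: PhantomData }
    }
}
impl<B: StorageBackend, S: HasReadWrite> Store<B, S> {
    fn read(&self, key: &[u8]) -> Option<Vec<u8>> { self.backend.raw_read(key) }
    fn write(&mut self, key: &[u8], val: &[u8]) { self.backend.raw_write(key, val) }
}
impl<B: TransactionalBackend> Store<B, Open> {
    fn begin(mut self) -> Store<B, InTransaction> {
        self.backend.raw_begin();
        Store { backend: self.backend, _s: PhantomData }
    }
}
impl<B: TransactionalBackend> Store<B, InTransaction> {
    fn rollback(mut self) -> Store<B, Open> {
        self.backend.raw_rollback();
        Store { backend: self.backend, _s: PhantomData }
    }
}

// Hypothetical backend: rollback restores a full snapshot of the map
#[derive(Default)]
struct MemDb {
    data: HashMap<Vec<u8>, Vec<u8>>,
    snapshot: Option<HashMap<Vec<u8>, Vec<u8>>>,
}
impl StorageBackend for MemDb {
    fn raw_open(&mut self) {}
    fn raw_read(&self, key: &[u8]) -> Option<Vec<u8>> { self.data.get(key).cloned() }
    fn raw_write(&mut self, key: &[u8], value: &[u8]) {
        self.data.insert(key.to_vec(), value.to_vec());
    }
}
impl TransactionalBackend for MemDb {
    fn raw_begin(&mut self) { self.snapshot = Some(self.data.clone()); }
    fn raw_commit(&mut self) { self.snapshot = None; }
    fn raw_rollback(&mut self) {
        if let Some(s) = self.snapshot.take() { self.data = s; }
    }
}

fn main() {
    let store = Store::new(MemDb::default()).open();
    let mut txn = store.begin();          // only compiles: MemDb is transactional
    txn.write(b"k", b"v1");
    let store = txn.rollback();           // InTransaction → Open
    assert_eq!(store.read(b"k"), None);   // the write was rolled back
}
```

Replace MemDb with a struct that implements only StorageBackend and the `store.begin()` line stops compiling — the same no-method error as the JTAG table above.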

When to Reach for This Pattern

| Signal | Why dual-axis fits |
|---|---|
| Two independent axes: "who provides it" and "what state is it in" | The impl block matrix directly encodes both |
| Some providers have strictly more capabilities than others | Super-trait (`MemoryVendor: Vendor`) + conditional impl |
| Misusing state or capability is a safety/correctness bug | Compile-time prevention > runtime checks |
| You want static dispatch (no vtables) | `PhantomData` + generics = zero-cost |

| Signal | Consider something simpler |
|---|---|
| Only one axis varies (state OR vendor, not both) | Single-axis typestate or plain trait objects |
| Three or more independent axes | Config Trait Pattern (above) bundles axes into associated types |
| Runtime polymorphism is acceptable | enum state + `dyn` dispatch is simpler |

When two axes become three or more: If you find yourself writing Handle<V, S, D, T> — vendor, state, debug level, transport — the generic parameter list is telling you something. Consider collapsing the vendor axis into an associated-type config trait (the Config Trait Pattern from earlier in this chapter), keeping only the state axis as a generic parameter: Handle<Cfg, S>. The config trait bundles type Vendor, type Transport, etc. into one parameter, and the state axis retains its compile-time transition guarantees. This is a natural evolution, not a rewrite — you lift vendor-related types into Cfg and leave the typestate machinery untouched.
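A minimal sketch of that evolution, with all names (ProbeConfig, UsbDapConfig, and the vendor/transport traits) invented for illustration — the point is only the shape Handle<Cfg, S>:

```rust
use std::marker::PhantomData;

// The former vendor and transport axes, now plain component traits
trait Vendor { fn name(&self) -> &'static str; }
trait Transport { fn mtu(&self) -> usize; }

// One config trait bundles both axes into a single parameter
trait ProbeConfig {
    type Vendor: Vendor;
    type Transport: Transport;
}

// The state axis survives as the only other generic parameter
struct Locked;
struct Unlocked;

struct Handle<Cfg: ProbeConfig, S = Locked> {
    vendor: Cfg::Vendor,
    transport: Cfg::Transport,
    _state: PhantomData<S>,
}

impl<Cfg: ProbeConfig> Handle<Cfg, Locked> {
    fn new(vendor: Cfg::Vendor, transport: Cfg::Transport) -> Self {
        Handle { vendor, transport, _state: PhantomData }
    }
    // Typestate transitions keep their compile-time guarantees:
    fn unlock(self) -> Handle<Cfg, Unlocked> {
        Handle { vendor: self.vendor, transport: self.transport, _state: PhantomData }
    }
}

impl<Cfg: ProbeConfig> Handle<Cfg, Unlocked> {
    fn describe(&self) -> String {
        format!("{} over mtu {}", self.vendor.name(), self.transport.mtu())
    }
}

// One concrete configuration
struct UsbTransport;
impl Transport for UsbTransport { fn mtu(&self) -> usize { 512 } }
struct DapVendor;
impl Vendor for DapVendor { fn name(&self) -> &'static str { "dap" } }

struct UsbDapConfig;
impl ProbeConfig for UsbDapConfig {
    type Vendor = DapVendor;
    type Transport = UsbTransport;
}

fn main() {
    let h = Handle::<UsbDapConfig>::new(DapVendor, UsbTransport);
    let h = h.unlock();
    assert_eq!(h.describe(), "dap over mtu 512");
    println!("{}", h.describe());
}
```

Adding a third component (say, a debug-level policy) now adds an associated type to ProbeConfig rather than a generic parameter to Handle.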

Key Takeaway: The dual-axis pattern is the intersection of typestate and trait-based abstraction. Each impl block maps to one cell of the (vendor × state) matrix. The compiler enforces the entire matrix — no runtime state checks, no impossible-state panics, no cost.


Exercise: Type-Safe State Machine ★★ (~30 min)

Build a traffic light state machine using the type-state pattern. The light must transition Red → Green → Yellow → Red and no other order should be possible.

🔑 Solution
use std::marker::PhantomData;

struct Red;
struct Green;
struct Yellow;

struct TrafficLight<State> {
    _state: PhantomData<State>,
}

impl TrafficLight<Red> {
    fn new() -> Self {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }

    fn go(self) -> TrafficLight<Green> {
        println!("🟢 Green — GO");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn caution(self) -> TrafficLight<Yellow> {
        println!("🟡 Yellow — CAUTION");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn stop(self) -> TrafficLight<Red> {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }
}

fn main() {
    let light = TrafficLight::new(); // Red
    let light = light.go();          // Green
    let light = light.caution();     // Yellow
    let _light = light.stop();       // Red

    // light.caution(); // ❌ Compile error: no method `caution` on Red
    // TrafficLight::new().stop(); // ❌ Compile error: no method `stop` on Red
}

Key takeaway: Invalid transitions are compile errors, not runtime panics.


4. PhantomData — Types That Carry No Data 🔴

What you’ll learn:

  • Why PhantomData<T> exists and the three problems it solves
  • Lifetime branding for compile-time scope enforcement
  • The unit-of-measure pattern for dimension-safe arithmetic
  • Variance (covariant, contravariant, invariant) and how PhantomData controls it

What PhantomData Solves

PhantomData<T> is a zero-sized type that tells the compiler “this struct is logically associated with T, even though it doesn’t contain a T.” It affects variance, drop checking, and auto-trait inference — without using any memory.

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// Without PhantomData, this struct is rejected outright:
//
//     struct Slice<'a, T> {
//         ptr: *const T,
//         len: usize,
//     }
//     // error[E0392]: parameter `'a` is never used
//
// The raw pointer field uses T, but nothing uses 'a — and the
// compiler refuses to guess how the struct relates to that lifetime.

// With PhantomData:
struct Slice<'a, T> {
    ptr: *const T,
    len: usize,
    _marker: PhantomData<&'a T>,
    // Now the compiler knows:
    // 1. This struct borrows data with lifetime 'a
    // 2. It's covariant over 'a (lifetimes can shrink)
    // 3. Drop check considers T
}
}

The three jobs of PhantomData:

| Job | Example | What It Does |
|---|---|---|
| Lifetime binding | `PhantomData<&'a T>` | Struct is treated as borrowing `'a` |
| Ownership simulation | `PhantomData<T>` | Drop check assumes struct owns a `T` |
| Variance control | `PhantomData<fn(T)>` | Makes struct contravariant over `T` |

Lifetime Branding

Use PhantomData to prevent mixing values from different “sessions” or “contexts”:

use std::marker::PhantomData;

/// A handle that's valid only within a specific arena's lifetime
struct ArenaHandle<'arena> {
    index: usize,
    _brand: PhantomData<&'arena ()>,
}

struct Arena {
    data: Vec<String>,
}

impl Arena {
    fn new() -> Self {
        Arena { data: Vec::new() }
    }

    /// Allocate a string and return a branded handle
    fn alloc<'a>(&'a mut self, value: String) -> ArenaHandle<'a> {
        let index = self.data.len();
        self.data.push(value);
        ArenaHandle { index, _brand: PhantomData }
    }

    /// Look up by handle — only accepts handles from THIS arena
    fn get<'a>(&'a self, handle: ArenaHandle<'a>) -> &'a str {
        &self.data[handle.index]
    }
}

fn main() {
    let mut arena1 = Arena::new();
    let handle1 = arena1.alloc("hello".to_string());

    // Can't use handle1 with a different arena — lifetimes won't match
    // let mut arena2 = Arena::new();
    // arena2.get(handle1); // ❌ Lifetime mismatch

    println!("{}", arena1.get(handle1)); // ✅
}

Unit-of-Measure Pattern

Prevent mixing incompatible units at compile time, with zero runtime cost:

use std::marker::PhantomData;
use std::ops::{Add, Mul};

// Unit marker types (zero-sized)
struct Meters;
struct Seconds;
struct MetersPerSecond;

#[derive(Debug, Clone, Copy)]
struct Quantity<Unit> {
    value: f64,
    _unit: PhantomData<Unit>,
}

impl<U> Quantity<U> {
    fn new(value: f64) -> Self {
        Quantity { value, _unit: PhantomData }
    }
}

// Can only add same units:
impl<U> Add for Quantity<U> {
    type Output = Quantity<U>;
    fn add(self, rhs: Self) -> Self::Output {
        Quantity::new(self.value + rhs.value)
    }
}

// Meters / Seconds = MetersPerSecond (cross-unit Div impl)
impl std::ops::Div<Quantity<Seconds>> for Quantity<Meters> {
    type Output = Quantity<MetersPerSecond>;
    fn div(self, rhs: Quantity<Seconds>) -> Quantity<MetersPerSecond> {
        Quantity::new(self.value / rhs.value)
    }
}

fn main() {
    let dist = Quantity::<Meters>::new(100.0);
    let time = Quantity::<Seconds>::new(9.58);
    let speed = dist / time; // Quantity<MetersPerSecond>
    println!("Speed: {:.2} m/s", speed.value); // 10.44 m/s

    // let nonsense = dist + time; // ❌ Compile error: can't add Meters + Seconds
}

This is pure type-system magic — PhantomData<Meters> is zero-sized, so Quantity<Meters> has the same layout as f64. No wrapper overhead at runtime, but full unit safety at compile time.
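The size claim is directly checkable. A minimal sketch (Quantity is reduced to the two relevant fields):

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Meters;

struct Quantity<Unit> {
    value: f64,
    _unit: PhantomData<Unit>,
}

fn main() {
    // The marker contributes zero bytes, so the typed wrapper
    // occupies exactly as much memory as the bare f64 it wraps.
    assert_eq!(size_of::<PhantomData<Meters>>(), 0);
    assert_eq!(size_of::<Quantity<Meters>>(), size_of::<f64>());
    println!("Quantity<Meters> is {} bytes", size_of::<Quantity<Meters>>());
}
```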

PhantomData and Drop Check

When the compiler checks whether a struct’s destructor might access expired data, it uses PhantomData to decide:

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// PhantomData<T> — compiler assumes we MIGHT drop a T
// This means T must outlive our struct
struct OwningSemantic<T> {
    ptr: *const T,
    _marker: PhantomData<T>,  // "I logically own a T"
}

// PhantomData<*const T> — compiler assumes we DON'T own T
// More permissive — T doesn't need to outlive us
struct NonOwningSemantic<T> {
    ptr: *const T,
    _marker: PhantomData<*const T>,  // "I just point to T"
}
}

Practical rule: When wrapping raw pointers, choose PhantomData carefully:

  • Writing a container that owns its data? → PhantomData<T>
  • Writing a view/reference type? → PhantomData<&'a T> or PhantomData<*const T>
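As an illustration of the owning case, here is a minimal Box-like wrapper — a sketch, not production code; it simply round-trips the allocation through `Box` so ownership is well-defined:

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

struct MyBox<T> {
    ptr: NonNull<T>,
    _owns: PhantomData<T>, // drop check: dropping MyBox<T> may drop a T
}

impl<T> MyBox<T> {
    fn new(value: T) -> Self {
        // Box::into_raw never returns null, so unwrap is safe here
        let ptr = NonNull::new(Box::into_raw(Box::new(value))).unwrap();
        MyBox { ptr, _owns: PhantomData }
    }

    fn get(&self) -> &T {
        // Safety: ptr came from Box::into_raw and is freed only in Drop
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        // Reconstitute the Box so its destructor frees the allocation
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())); }
    }
}

fn main() {
    let b = MyBox::new(String::from("hello"));
    assert_eq!(b.get().as_str(), "hello");
    println!("{}", b.get());
}
```

Without the `PhantomData<T>` field the struct would still compile (T is used by `NonNull<T>`), but the marker documents and enforces the owning relationship for drop-check purposes.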

Variance — Why PhantomData’s Type Parameter Matters

Variance determines whether a generic type can be substituted with a sub- or super-type (in Rust, “subtype” means “has a longer lifetime”). Getting variance wrong causes either rejected-good-code or unsound-accepted-code.

graph LR
    subgraph Covariant
        direction TB
        A1["&'long T"] -->|"can become"| A2["&'short T"]
    end

    subgraph Contravariant
        direction TB
        B1["fn(&'short T)"] -->|"can become"| B2["fn(&'long T)"]
    end

    subgraph Invariant
        direction TB
        C1["&'a mut T"] ---|"NO substitution"| C2["&'b mut T"]
    end

    style A1 fill:#d4efdf,stroke:#27ae60,color:#000
    style A2 fill:#d4efdf,stroke:#27ae60,color:#000
    style B1 fill:#e8daef,stroke:#8e44ad,color:#000
    style B2 fill:#e8daef,stroke:#8e44ad,color:#000
    style C1 fill:#fadbd8,stroke:#e74c3c,color:#000
    style C2 fill:#fadbd8,stroke:#e74c3c,color:#000

The Three Variances

| Variance | Meaning | "Can I substitute…" | Rust example |
|---|---|---|---|
| Covariant | Subtype flows through | `'long` where `'short` expected ✅ | `&'a T`, `Vec<T>`, `Box<T>` |
| Contravariant | Subtype flows against | `'short` where `'long` expected ✅ | `fn(T)` (in parameter position) |
| Invariant | No substitution allowed | Neither direction ❌ | `&mut T`, `Cell<T>`, `UnsafeCell<T>` |

Why &'a T is Covariant Over 'a

fn print_str(s: &str) {
    println!("{s}");
}

fn main() {
    let owned = String::from("hello");
    // owned lives for the entire function ('long)
    // print_str expects &'_ str ('short — just for the call)
    print_str(&owned); // ✅ Covariance: 'long → 'short is safe
    // A longer-lived reference can always be used where a shorter one is needed.
}

Why &mut T is Invariant Over T

#![allow(unused)]
fn main() {
// If &mut T were covariant over T, the compiler would accept a
// &mut &'static str wherever a &mut &'a str is expected. Then:
fn evil<'a>(slot: &mut &'a str, short_lived: &'a str) {
    *slot = short_lived; // perfectly legal for any &'a str slot
}
// A caller could pass a &mut &'static str as `slot` and a reference
// to a local String as `short_lived` — leaving the 'static slot
// holding a dangling reference once the local is dropped.
// Invariance prevents this: &mut &'static str cannot be substituted
// for &mut &'a str, so the compiler rejects the call instead.
}

How PhantomData Controls Variance

PhantomData<X> gives your struct the same variance as X:

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// Covariant over 'a — a Ref<'long> can be used as Ref<'short>
struct Ref<'a, T> {
    ptr: *const T,
    _marker: PhantomData<&'a T>,  // Covariant over 'a, covariant over T
}

// Invariant over T — prevents unsound lifetime shortening of T
struct MutRef<'a, T> {
    ptr: *mut T,
    _marker: PhantomData<&'a mut T>,  // Covariant over 'a, INVARIANT over T
}

// Contravariant over T — useful for callback containers
struct CallbackSlot<T> {
    _marker: PhantomData<fn(T)>,  // Contravariant over T
}
}

PhantomData variance cheat sheet:

| PhantomData type | Variance over T | Variance over 'a | Use when |
|---|---|---|---|
| `PhantomData<T>` | Covariant | n/a | You logically own a `T` |
| `PhantomData<&'a T>` | Covariant | Covariant | You borrow a `T` with lifetime `'a` |
| `PhantomData<&'a mut T>` | Invariant | Covariant | You mutably borrow `T` |
| `PhantomData<*const T>` | Covariant | n/a | Non-owning pointer to `T` |
| `PhantomData<*mut T>` | Invariant | n/a | Non-owning mutable pointer |
| `PhantomData<fn(T)>` | Contravariant | n/a | `T` appears in argument position |
| `PhantomData<fn() -> T>` | Covariant | n/a | `T` appears in return position |
| `PhantomData<fn(T) -> T>` | Invariant | n/a | `T` in both positions combines to invariant |

Worked Example: Why This Matters in Practice

use std::marker::PhantomData;

// A token that brands values with a session lifetime.
// MUST be covariant over 'a — otherwise callers can't shorten
// the lifetime when passing to functions that need a shorter borrow.
struct SessionToken<'a> {
    id: u64,
    _brand: PhantomData<&'a ()>,  // ✅ Covariant — callers can shorten 'a
    // _brand: PhantomData<fn(&'a ())>,  // ❌ Contravariant — breaks ergonomics
    // _brand: PhantomData<&'a mut ()>,  // Also covariant over 'a (mut makes it invariant only over the pointee, which is fixed as ())
}

fn use_token(token: &SessionToken<'_>) {
    println!("Using token {}", token.id);
}

fn main() {
    let token = SessionToken { id: 42, _brand: PhantomData };
    use_token(&token); // ✅ Works because SessionToken is covariant over 'a
}

Decision rule: Start with PhantomData<&'a T> (covariant). Switch to PhantomData<&'a mut T> (invariant) only if your abstraction hands out mutable access to T. Use PhantomData<fn(T)> (contravariant) almost never — it’s only correct for callback-storage scenarios.

Key Takeaways — PhantomData

  • PhantomData<T> carries type/lifetime information without runtime cost
  • Use it for lifetime branding, variance control, and unit-of-measure patterns
  • Drop check: PhantomData<T> tells the compiler your type logically owns a T

See also: Ch 3 — Newtype & Type-State for type-state patterns that use PhantomData. Ch 11 — Unsafe Rust for how PhantomData interacts with raw pointers.


Exercise: Unit-of-Measure with PhantomData ★★ (~30 min)

Extend the unit-of-measure pattern to support:

  • Meters, Seconds, Kilograms
  • Addition of same units
  • Multiplication: Meters * Meters = SquareMeters
  • Division: Meters / Seconds = MetersPerSecond
🔑 Solution
use std::marker::PhantomData;
use std::ops::{Add, Mul, Div};

#[derive(Clone, Copy)]
struct Meters;
#[derive(Clone, Copy)]
struct Seconds;
#[derive(Clone, Copy)]
struct Kilograms;
#[derive(Clone, Copy)]
struct SquareMeters;
#[derive(Clone, Copy)]
struct MetersPerSecond;

#[derive(Debug, Clone, Copy)]
struct Qty<U> {
    value: f64,
    _unit: PhantomData<U>,
}

impl<U> Qty<U> {
    fn new(v: f64) -> Self { Qty { value: v, _unit: PhantomData } }
}

impl<U> Add for Qty<U> {
    type Output = Qty<U>;
    fn add(self, rhs: Self) -> Self::Output { Qty::new(self.value + rhs.value) }
}

impl Mul<Qty<Meters>> for Qty<Meters> {
    type Output = Qty<SquareMeters>;
    fn mul(self, rhs: Qty<Meters>) -> Qty<SquareMeters> {
        Qty::new(self.value * rhs.value)
    }
}

impl Div<Qty<Seconds>> for Qty<Meters> {
    type Output = Qty<MetersPerSecond>;
    fn div(self, rhs: Qty<Seconds>) -> Qty<MetersPerSecond> {
        Qty::new(self.value / rhs.value)
    }
}

fn main() {
    let width = Qty::<Meters>::new(5.0);
    let height = Qty::<Meters>::new(3.0);
    let area = width * height; // Qty<SquareMeters>
    println!("Area: {:.1} m²", area.value);

    let dist = Qty::<Meters>::new(100.0);
    let time = Qty::<Seconds>::new(9.58);
    let speed = dist / time;
    println!("Speed: {:.2} m/s", speed.value);

    let sum = width + height; // Same unit ✅
    println!("Sum: {:.1} m", sum.value);

    // let bad = width + time; // ❌ Compile error: can't add Meters + Seconds
}

5. Channels and Message Passing 🟢

What you’ll learn:

  • std::sync::mpsc basics and when to upgrade to crossbeam-channel
  • Channel selection with select! for multi-source message handling
  • Bounded vs unbounded channels and backpressure strategies
  • The actor pattern for encapsulating concurrent state

std::sync::mpsc — The Standard Channel

Rust’s standard library provides a multi-producer, single-consumer channel:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    // Create a channel: tx (transmitter) and rx (receiver)
    let (tx, rx) = mpsc::channel();

    // Spawn a producer thread
    let tx1 = tx.clone(); // Clone for multiple producers
    thread::spawn(move || {
        for i in 0..5 {
            tx1.send(format!("producer-1: msg {i}")).unwrap();
            thread::sleep(Duration::from_millis(100));
        }
    });

    // Second producer
    thread::spawn(move || {
        for i in 0..5 {
            tx.send(format!("producer-2: msg {i}")).unwrap();
            thread::sleep(Duration::from_millis(150));
        }
    });

    // Consumer: receive all messages
    for msg in rx {
        // rx iterator ends when ALL senders are dropped
        println!("Received: {msg}");
    }
    println!("All producers done.");
}

Note: .unwrap() on .send() is used for brevity. It panics if the receiver has been dropped. Production code should handle SendError gracefully.

Key properties:

  • Unbounded by default (can fill memory if consumer is slow)
  • mpsc::sync_channel(N) creates a bounded channel with backpressure
  • rx.recv() blocks the current thread until a message arrives
  • rx.try_recv() returns immediately with Err(TryRecvError::Empty) if nothing is ready
  • The channel closes when all Senders are dropped
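The try_recv and channel-close behaviors in that list can be exercised directly:

```rust
use std::sync::mpsc;
use std::sync::mpsc::TryRecvError;

fn main() {
    let (tx, rx) = mpsc::channel::<i32>();

    // Nothing sent yet — try_recv returns immediately instead of blocking
    assert_eq!(rx.try_recv(), Err(TryRecvError::Empty));

    tx.send(7).unwrap();
    assert_eq!(rx.try_recv(), Ok(7));

    // Dropping the last Sender closes the channel
    drop(tx);
    assert_eq!(rx.try_recv(), Err(TryRecvError::Disconnected));
    println!("non-blocking receive semantics verified");
}
```

Empty and Disconnected are distinct: the first means "nothing yet, try again later," the second means "no message will ever arrive" — a polling loop should treat them differently.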
#![allow(unused)]
fn main() {
// Bounded channel with backpressure:
let (tx, rx) = mpsc::sync_channel(10); // Buffer of 10 messages

thread::spawn(move || {
    for i in 0..1000 {
        tx.send(i).unwrap(); // BLOCKS if buffer is full — natural backpressure
    }
});
}

Note: .unwrap() is used for brevity. In production, handle SendError (receiver dropped) instead of panicking.

crossbeam-channel — The Production Workhorse

crossbeam-channel is the de facto standard for production channel usage. It supports multiple consumers (MPMC) and select!, neither of which std::sync::mpsc offers, and it historically outperformed the standard channel (whose implementation was itself rebuilt on crossbeam's algorithm in Rust 1.67):

// Cargo.toml:
//   [dependencies]
//   crossbeam-channel = "0.5"
use crossbeam_channel::{bounded, unbounded, select, Sender, Receiver};
use std::thread;
use std::time::Duration;

fn main() {
    // Bounded MPMC channel
    let (tx, rx) = bounded::<String>(100);

    // Multiple producers
    for id in 0..4 {
        let tx = tx.clone();
        thread::spawn(move || {
            for i in 0..10 {
                tx.send(format!("worker-{id}: item-{i}")).unwrap();
            }
        });
    }
    drop(tx); // Drop the original sender so the channel can close

    // Multiple consumers (not possible with std::sync::mpsc!)
    let rx2 = rx.clone();
    let consumer1 = thread::spawn(move || {
        while let Ok(msg) = rx.recv() {
            println!("[consumer-1] {msg}");
        }
    });
    let consumer2 = thread::spawn(move || {
        while let Ok(msg) = rx2.recv() {
            println!("[consumer-2] {msg}");
        }
    });

    consumer1.join().unwrap();
    consumer2.join().unwrap();
}

Channel Selection (select!)

Listen on multiple channels simultaneously — like select in Go:

use crossbeam_channel::{bounded, tick, after, select};
use std::time::Duration;

fn main() {
    let (work_tx, work_rx) = bounded::<String>(10);
    let ticker = tick(Duration::from_secs(1));        // Periodic tick
    let deadline = after(Duration::from_secs(10));     // One-shot timeout

    // Producer
    let tx = work_tx.clone();
    std::thread::spawn(move || {
        for i in 0..100 {
            tx.send(format!("job-{i}")).unwrap();
            std::thread::sleep(Duration::from_millis(500));
        }
    });
    drop(work_tx);

    loop {
        select! {
            recv(work_rx) -> msg => {
                match msg {
                    Ok(job) => println!("Processing: {job}"),
                    Err(_) => {
                        println!("Work channel closed");
                        break;
                    }
                }
            },
            recv(ticker) -> _ => {
                println!("Tick — heartbeat");
            },
            recv(deadline) -> _ => {
                println!("Deadline reached — shutting down");
                break;
            },
        }
    }
}

Go comparison: This is exactly like Go’s select statement over channels. crossbeam’s select! macro randomizes order to prevent starvation, just like Go.

Bounded vs Unbounded and Backpressure

| Type | Behavior When Full | Memory | Use Case |
|---|---|---|---|
| Unbounded | Never blocks (grows heap) | Unbounded ⚠️ | Rare — only when producer is slower than consumer |
| Bounded | `send()` blocks until space | Fixed | Production default — prevents OOM |
| Rendezvous (`bounded(0)`) | `send()` blocks until receiver is ready | None | Synchronization / handoff |
#![allow(unused)]
fn main() {
// Rendezvous channel — zero capacity, direct handoff
let (tx, rx) = crossbeam_channel::bounded(0);
// tx.send(x) blocks until rx.recv() is called, and vice versa.
// This synchronizes the two threads precisely.
}

Rule: Always use bounded channels in production unless you can prove the producer will never outpace the consumer.
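Backpressure can be observed without blocking a thread by using the non-blocking send on std's bounded channel:

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

fn main() {
    let (tx, rx) = sync_channel::<i32>(2); // buffer of 2

    // Two sends fill the buffer without blocking:
    tx.try_send(1).unwrap();
    tx.try_send(2).unwrap();

    // Buffer full — try_send reports Full instead of blocking,
    // handing the message back to the caller:
    assert!(matches!(tx.try_send(3), Err(TrySendError::Full(3))));

    // Consuming one message frees a slot:
    assert_eq!(rx.recv().unwrap(), 1);
    tx.try_send(3).unwrap(); // now succeeds
    println!("backpressure observed");
}
```

A blocking `send()` on the same full channel would simply park the producer until the consumer catches up — that parking is the backpressure the table above describes.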

Actor Pattern with Channels

The actor pattern uses channels to serialize access to mutable state — no mutexes needed:

use std::sync::mpsc;
use std::thread;

// Messages the actor can receive
enum CounterMsg {
    Increment,
    Decrement,
    Get(mpsc::Sender<i64>), // Reply channel
}

struct CounterActor {
    count: i64,
    rx: mpsc::Receiver<CounterMsg>,
}

impl CounterActor {
    fn new(rx: mpsc::Receiver<CounterMsg>) -> Self {
        CounterActor { count: 0, rx }
    }

    fn run(mut self) {
        while let Ok(msg) = self.rx.recv() {
            match msg {
                CounterMsg::Increment => self.count += 1,
                CounterMsg::Decrement => self.count -= 1,
                CounterMsg::Get(reply) => {
                    let _ = reply.send(self.count);
                }
            }
        }
    }
}

// Actor handle — cheap to clone, Send + Sync
#[derive(Clone)]
struct Counter {
    tx: mpsc::Sender<CounterMsg>,
}

impl Counter {
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || CounterActor::new(rx).run());
        Counter { tx }
    }

    fn increment(&self) { let _ = self.tx.send(CounterMsg::Increment); }
    fn decrement(&self) { let _ = self.tx.send(CounterMsg::Decrement); }

    fn get(&self) -> i64 {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.tx.send(CounterMsg::Get(reply_tx)).unwrap();
        reply_rx.recv().unwrap()
    }
}

fn main() {
    let counter = Counter::spawn();

    // Multiple threads can safely use the counter — no mutex!
    let handles: Vec<_> = (0..10).map(|_| {
        let counter = counter.clone();
        thread::spawn(move || {
            for _ in 0..1000 {
                counter.increment();
            }
        })
    }).collect();

    for h in handles { h.join().unwrap(); }
    println!("Final count: {}", counter.get()); // 10000
}

When to use actors vs mutexes: Actors are great when the state has complex invariants, operations take a long time, or you want to serialize access without thinking about lock ordering. Mutexes are simpler for short critical sections.

Key Takeaways — Channels

  • crossbeam-channel is the production workhorse — faster and more feature-rich than std::sync::mpsc
  • select! replaces complex multi-source polling with declarative channel selection
  • Bounded channels provide natural backpressure; unbounded channels risk OOM

See also: Ch 6 — Concurrency for threads, Mutex, and shared state. Ch 15 — Async for async channels (tokio::sync::mpsc).


Exercise: Channel-Based Worker Pool ★★★ (~45 min)

Build a worker pool using channels where:

  • A dispatcher sends Job structs through a channel
  • N workers consume jobs and send results back
  • Use std::sync::mpsc with Arc<Mutex<Receiver>> as a shared job queue that workers compete to pull from
🔑 Solution
use std::sync::mpsc;
use std::thread;

struct Job {
    id: u64,
    data: String,
}

struct JobResult {
    job_id: u64,
    output: String,
    worker_id: usize,
}

fn worker_pool(jobs: Vec<Job>, num_workers: usize) -> Vec<JobResult> {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let (result_tx, result_rx) = mpsc::channel::<JobResult>();

    let job_rx = std::sync::Arc::new(std::sync::Mutex::new(job_rx));

    let mut handles = Vec::new();
    for worker_id in 0..num_workers {
        let job_rx = job_rx.clone();
        let result_tx = result_tx.clone();
        handles.push(thread::spawn(move || {
            loop {
                let job = {
                    let rx = job_rx.lock().unwrap();
                    rx.recv()
                };
                match job {
                    Ok(job) => {
                        let output = format!("processed '{}' by worker {worker_id}", job.data);
                        result_tx.send(JobResult {
                            job_id: job.id, output, worker_id,
                        }).unwrap();
                    }
                    Err(_) => break,
                }
            }
        }));
    }
    drop(result_tx);

    let num_jobs = jobs.len();
    for job in jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx);

    let results: Vec<_> = result_rx.into_iter().collect();
    assert_eq!(results.len(), num_jobs);

    for h in handles { h.join().unwrap(); }
    results
}

fn main() {
    let jobs: Vec<Job> = (0..20).map(|i| Job {
        id: i, data: format!("task-{i}"),
    }).collect();

    let results = worker_pool(jobs, 4);
    for r in &results {
        println!("[worker {}] job {}: {}", r.worker_id, r.job_id, r.output);
    }
}

6. Concurrency vs Parallelism vs Threads 🟡

What you’ll learn:

  • The precise distinction between concurrency and parallelism
  • OS threads, scoped threads, and rayon for data parallelism
  • Shared state primitives: Arc, Mutex, RwLock, Atomics, Condvar
  • Lazy initialization with OnceLock/LazyLock and lock-free patterns

Terminology: Concurrency ≠ Parallelism

These terms are often confused. Here is the precise distinction:

| | Concurrency | Parallelism |
|---|-------------|-------------|
| Definition | Managing multiple tasks that can make progress | Executing multiple tasks simultaneously |
| Hardware requirement | One core is enough | Requires multiple cores |
| Analogy | One cook, multiple dishes (switching between them) | Multiple cooks, each working on a dish |
| Rust tools | `async`/`await`, channels, `select!` | rayon, `thread::spawn`, `par_iter()` |

Concurrency (single core):           Parallelism (multi-core):
                                      
Task A: ██░░██░░██                   Task A: ██████████
Task B: ░░██░░██░░                   Task B: ██████████
─────────────────→ time              ─────────────────→ time
(interleaved on one core)           (simultaneous on two cores)

std::thread — OS Threads

Rust threads map 1:1 to OS threads. Each gets its own stack (2 MiB by default for spawned threads, configurable via thread::Builder::stack_size):

use std::thread;
use std::time::Duration;

fn main() {
    // Spawn a thread — takes a closure
    let handle = thread::spawn(|| {
        for i in 0..5 {
            println!("spawned thread: {i}");
            thread::sleep(Duration::from_millis(100));
        }
        42 // Return value
    });

    // Do work on the main thread simultaneously
    for i in 0..3 {
        println!("main thread: {i}");
        thread::sleep(Duration::from_millis(150));
    }

    // Wait for the thread to finish and get its return value
    let result = handle.join().unwrap(); // unwrap panics if thread panicked
    println!("Thread returned: {result}");
}

Thread::spawn type requirements:

#![allow(unused)]
fn main() {
// The closure must be:
// 1. Send — can be transferred to another thread
// 2. 'static — can't borrow from the calling scope
// 3. FnOnce — takes ownership of captured variables

let data = vec![1, 2, 3];

// ❌ Borrows data — not 'static
// thread::spawn(|| println!("{data:?}"));

// ✅ Move ownership into the thread
thread::spawn(move || println!("{data:?}"));
// data is no longer accessible here
}

Scoped Threads (std::thread::scope)

Since Rust 1.63, scoped threads solve the 'static requirement — threads can borrow from the parent scope:

use std::thread;

fn main() {
    let mut data = vec![1, 2, 3, 4, 5];

    thread::scope(|s| {
        // Thread 1: borrow shared reference
        s.spawn(|| {
            let sum: i32 = data.iter().sum();
            println!("Sum: {sum}");
        });

        // Thread 2: also borrow shared reference (multiple readers OK)
        s.spawn(|| {
            let max = data.iter().max().unwrap();
            println!("Max: {max}");
        });

        // ❌ Can't mutably borrow while shared borrows exist:
        // s.spawn(|| data.push(6));
    });
    // ALL scoped threads joined here — guaranteed before scope returns

    // Now safe to mutate — all threads have finished
    data.push(6);
    println!("Updated: {data:?}");
}

This is huge: Before scoped threads, you had to Arc::clone() everything to share with threads. Now you can borrow directly, and the compiler proves all threads finish before the data goes out of scope.

rayon — Data Parallelism

rayon provides parallel iterators that distribute work across a thread pool automatically:

// Cargo.toml: rayon = "1"
use rayon::prelude::*;

fn main() {
    let data: Vec<u64> = (0..1_000_000).collect();

    // Sequential:
    let sum_seq: u64 = data.iter().map(|x| x * x).sum();

    // Parallel — just change .iter() to .par_iter():
    let sum_par: u64 = data.par_iter().map(|x| x * x).sum();

    assert_eq!(sum_seq, sum_par);

    // Parallel sort:
    let mut numbers = vec![5, 2, 8, 1, 9, 3];
    numbers.par_sort();

    // Parallel processing with map/filter/collect:
    let results: Vec<_> = data
        .par_iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| expensive_computation(x))
        .collect();
}

fn expensive_computation(x: u64) -> u64 {
    // Simulate CPU-heavy work
    (0..1000).fold(x, |acc, _| acc.wrapping_mul(7).wrapping_add(13))
}

When to use rayon vs threads:

| Use | When |
|-----|------|
| `rayon::par_iter()` | Processing collections in parallel (map, filter, reduce) |
| `thread::spawn` | Long-running background tasks, I/O workers |
| `thread::scope` | Short-lived parallel tasks that borrow local data |
| `async` + tokio | I/O-bound concurrency (networking, file I/O) |

Shared State: Arc, Mutex, RwLock, Atomics

When threads need shared mutable state, Rust provides safe abstractions:

Note: .unwrap() on .lock(), .read(), and .write() is used for brevity throughout these examples. These calls fail only if another thread panicked while holding the lock (“poisoning”). Production code should decide whether to recover from poisoned locks or propagate the error.

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex, RwLock};
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// --- Arc<Mutex<T>>: Shared + Exclusive access ---
fn mutex_example() {
    let counter = Arc::new(Mutex::new(0u64));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                let mut guard = counter.lock().unwrap();
                *guard += 1;
            } // Guard dropped → lock released
        }));
    }

    for h in handles { h.join().unwrap(); }
    println!("Counter: {}", counter.lock().unwrap()); // 10000
}

// --- Arc<RwLock<T>>: Multiple readers OR one writer ---
fn rwlock_example() {
    let config = Arc::new(RwLock::new(String::from("initial")));

    // Many readers — don't block each other
    let readers: Vec<_> = (0..5).map(|id| {
        let config = Arc::clone(&config);
        thread::spawn(move || {
            let guard = config.read().unwrap();
            println!("Reader {id}: {guard}");
        })
    }).collect();

    // Writer — blocks and waits for all readers to finish
    {
        let mut guard = config.write().unwrap();
        *guard = "updated".to_string();
    }

    for r in readers { r.join().unwrap(); }
}

// --- Atomics: Lock-free for simple values ---
fn atomic_example() {
    let counter = Arc::new(AtomicU64::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                counter.fetch_add(1, Ordering::Relaxed);
                // No lock, no mutex — hardware atomic instruction
            }
        }));
    }

    for h in handles { h.join().unwrap(); }
    println!("Atomic counter: {}", counter.load(Ordering::Relaxed)); // 10000
}
}

Quick Comparison

| Primitive | Use Case | Cost | Contention |
|-----------|----------|------|------------|
| `Mutex<T>` | Short critical sections | Lock + unlock | Threads wait in line |
| `RwLock<T>` | Read-heavy, rare writes | Reader-writer lock | Readers concurrent, writer exclusive |
| `AtomicU64` etc. | Counters, flags | Hardware CAS | Lock-free — no waiting |
| Channels | Message passing | Queue ops | Producer/consumer decouple |

Condition Variables (Condvar)

A Condvar lets a thread wait until another thread signals that a condition is true, without busy-looping. It is always paired with a Mutex:

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex, Condvar};
use std::thread;

let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = Arc::clone(&pair);

// Spawned thread: wait until ready == true
let handle = thread::spawn(move || {
    let (lock, cvar) = &*pair2;
    let mut ready = lock.lock().unwrap();
    while !*ready {
        ready = cvar.wait(ready).unwrap(); // atomically unlocks + sleeps
    }
    println!("Worker: condition met, proceeding");
});

// Main thread: set ready = true, then signal
{
    let (lock, cvar) = &*pair;
    let mut ready = lock.lock().unwrap();
    *ready = true;
    cvar.notify_one(); // wake one waiting thread (use notify_all for many)
}
handle.join().unwrap();
}

Pattern: Always re-check the condition in a while loop after wait() returns — spurious wakeups are allowed by the OS.

Lazy Initialization: OnceLock and LazyLock

Before Rust 1.80, initializing a global static that requires runtime computation (e.g., parsing a config, compiling a regex) needed the lazy_static! macro or the once_cell crate. The standard library now provides two types that cover these use cases natively:

#![allow(unused)]
fn main() {
use std::sync::{OnceLock, LazyLock};
use std::collections::HashMap;

// OnceLock — initialize on first use via `get_or_init`.
// Useful when the init value depends on runtime arguments.
static CONFIG: OnceLock<HashMap<String, String>> = OnceLock::new();

fn get_config() -> &'static HashMap<String, String> {
    CONFIG.get_or_init(|| {
        // Expensive: read & parse config file — happens exactly once.
        let mut m = HashMap::new();
        m.insert("log_level".into(), "info".into());
        m
    })
}

// LazyLock — initialize on first access, closure provided at definition site.
// Equivalent to lazy_static! but without a macro.
// Cargo.toml: regex = "1"
static REGEX: LazyLock<regex::Regex> = LazyLock::new(|| {
    regex::Regex::new(r"^[a-zA-Z0-9_]+$").unwrap()
});

fn is_valid_identifier(s: &str) -> bool {
    REGEX.is_match(s) // First call compiles the regex; subsequent calls reuse it.
}
}

| Type | Stabilized | Init Timing | Use When |
|------|------------|-------------|----------|
| `OnceLock<T>` | Rust 1.70 | Call-site (`get_or_init`) | Init depends on runtime args |
| `LazyLock<T>` | Rust 1.80 | Definition-site (closure) | Init is self-contained |
| `lazy_static!` | external crate | Definition-site (macro) | Pre-1.80 codebases (migrate away) |
| `const fn` + `static` | always | Compile-time | Value is computable at compile time |

Migration tip: Replace lazy_static! { static ref X: T = expr; } with static X: LazyLock<T> = LazyLock::new(|| expr); — same semantics, no macro, no external dependency.

Lock-Free Patterns

For high-performance code, avoid locks entirely:

#![allow(unused)]
fn main() {
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

// Pattern 1: Spin lock (educational — prefer std::sync::Mutex)
// ⚠️ WARNING: This is a teaching example only. Real spinlocks need:
//   - A RAII guard (so a panic while holding doesn't deadlock forever)
//   - Fairness guarantees (this starves under contention)
//   - Backoff strategies (exponential backoff, yield to OS)
// Use std::sync::Mutex or parking_lot::Mutex in production.
struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    fn new() -> Self { SpinLock { locked: AtomicBool::new(false) } }

    fn lock(&self) {
        while self.locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop(); // CPU hint: we're spinning
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

// Pattern 2: Lock-free SPSC (single producer, single consumer)
// Use crossbeam::queue::ArrayQueue or similar in production
// roll-your-own only for learning.

// Pattern 3: Sequence counter for wait-free reads
// ⚠️ Best for single-machine-word types (u64, f64); wider T may tear on read.
struct SeqLock<T: Copy> {
    seq: AtomicUsize,
    data: std::cell::UnsafeCell<T>,
}

unsafe impl<T: Copy + Send> Sync for SeqLock<T> {}

impl<T: Copy> SeqLock<T> {
    fn new(val: T) -> Self {
        SeqLock {
            seq: AtomicUsize::new(0),
            data: std::cell::UnsafeCell::new(val),
        }
    }

    fn read(&self) -> T {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            if s1 & 1 != 0 { continue; } // Writer in progress, retry

            // SAFETY: We use ptr::read_volatile to prevent the compiler from
            // reordering or caching the read. The SeqLock protocol (checking
            // s1 == s2 after reading) ensures we retry if a writer was active.
            // This mirrors the C SeqLock pattern where the data read must use
            // volatile/relaxed semantics to avoid tearing under concurrency.
            let value = unsafe { core::ptr::read_volatile(self.data.get() as *const T) };

            // Acquire fence: ensures the data read above is ordered before
            // we re-check the sequence counter.
            std::sync::atomic::fence(Ordering::Acquire);
            let s2 = self.seq.load(Ordering::Relaxed);

            if s1 == s2 { return value; } // No writer intervened
            // else retry
        }
    }

    /// # Safety contract
    /// Only ONE thread may call `write()` at a time. If multiple writers
    /// are needed, wrap the `write()` call in an external `Mutex`.
    fn write(&self, val: T) {
        // Increment to odd (signals write in progress).
        // AcqRel: the Acquire side prevents the subsequent data write
        // from being reordered before this increment (readers must see
        // odd before they could observe a partial write). The Release
        // side is technically unnecessary for a single writer but
        // harmless and consistent.
        self.seq.fetch_add(1, Ordering::AcqRel);
        // SAFETY: Single-writer invariant upheld by caller (see doc above).
        // UnsafeCell allows interior mutation; seq counter protects readers.
        unsafe { *self.data.get() = val; }
        // Increment to even (signals write complete).
        // Release: ensure the data write is visible before readers see the even seq.
        self.seq.fetch_add(1, Ordering::Release);
    }
}
}

⚠️ Rust memory model caveat: The non-atomic write through UnsafeCell in write() concurrent with the non-atomic ptr::read_volatile in read() is technically a data race under the Rust abstract machine — even though the SeqLock protocol ensures readers always retry on stale data. This mirrors the C kernel SeqLock pattern and is sound in practice on all modern hardware for types T that fit in a single machine word (e.g., u64). For wider types, consider using AtomicU64 for the data field or wrapping access in a Mutex. See the Rust unsafe code guidelines for the evolving story on UnsafeCell concurrency.

Practical advice: Lock-free code is hard to get right. Use Mutex or RwLock unless profiling shows lock contention is your bottleneck. When you do need lock-free, reach for proven crates (crossbeam, arc-swap, dashmap) rather than rolling your own.

Key Takeaways — Concurrency

  • Scoped threads (thread::scope) let you borrow stack data without Arc
  • rayon::par_iter() parallelizes iterators with one method call
  • Use OnceLock/LazyLock instead of lazy_static!; use Mutex before reaching for atomics
  • Lock-free code is hard — prefer proven crates over hand-rolled implementations

See also: Ch 5 — Channels for message-passing concurrency. Ch 8 — Smart Pointers for Arc/Rc details.

flowchart TD
    A["Need shared<br>mutable state?"] -->|Yes| B{"How much<br>contention?"}
    A -->|No| C["Use channels<br>(Ch 5)"]

    B -->|"Read-heavy"| D["RwLock"]
    B -->|"Short critical<br>section"| E["Mutex"]
    B -->|"Simple counter<br>or flag"| F["Atomics"]
    B -->|"Complex state"| G["Actor + channels"]

    H["Need parallelism?"] -->|"Collection<br>processing"| I["rayon::par_iter"]
    H -->|"Background task"| J["thread::spawn"]
    H -->|"Borrow local data"| K["thread::scope"]

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#fef9e7,stroke:#f1c40f,color:#000
    style C fill:#d4efdf,stroke:#27ae60,color:#000
    style D fill:#fdebd0,stroke:#e67e22,color:#000
    style E fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#fdebd0,stroke:#e67e22,color:#000
    style G fill:#fdebd0,stroke:#e67e22,color:#000
    style H fill:#e8f4f8,stroke:#2980b9,color:#000
    style I fill:#d4efdf,stroke:#27ae60,color:#000
    style J fill:#d4efdf,stroke:#27ae60,color:#000
    style K fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Parallel Map with Scoped Threads ★★ (~25 min)

Write a function parallel_map<T, R>(data: &[T], f: fn(&T) -> R, num_threads: usize) -> Vec<R> that splits data into num_threads chunks and processes each in a scoped thread. Do not use rayon — use std::thread::scope.

🔑 Solution
fn parallel_map<T: Sync, R: Send>(data: &[T], f: fn(&T) -> R, num_threads: usize) -> Vec<R> {
    // .max(1) guards against empty input: chunks(0) would panic.
    let chunk_size = data.len().div_ceil(num_threads).max(1);
    let mut results = Vec::with_capacity(data.len());

    std::thread::scope(|s| {
        let mut handles = Vec::new();
        for chunk in data.chunks(chunk_size) {
            handles.push(s.spawn(move || {
                chunk.iter().map(f).collect::<Vec<_>>()
            }));
        }
        for h in handles {
            results.extend(h.join().unwrap());
        }
    });

    results
}

fn main() {
    let data: Vec<u64> = (1..=20).collect();
    let squares = parallel_map(&data, |x| x * x, 4);
    assert_eq!(squares, (1..=20).map(|x: u64| x * x).collect::<Vec<_>>());
    println!("Parallel squares: {squares:?}");
}

7. Closures and Higher-Order Functions 🟢

What you’ll learn:

  • The three closure traits (Fn, FnMut, FnOnce) and how capture works
  • Passing closures as parameters and returning them from functions
  • Combinator chains and iterator adapters for functional-style programming
  • Designing your own higher-order APIs with the right trait bounds

Fn, FnMut, FnOnce — The Closure Traits

Every closure in Rust implements one or more of three traits, based on how it captures variables:

#![allow(unused)]
fn main() {
// FnOnce — consumes captured values (can only be called once)
let name = String::from("Alice");
let greet = move || {
    println!("Hello, {name}!"); // Takes ownership of `name`
    drop(name); // name is consumed
};
greet(); // ✅ First call
// greet(); // ❌ Can't call again — `name` was consumed

// FnMut — mutably borrows captured values (can be called many times)
let mut count = 0;
let mut increment = || {
    count += 1; // Mutably borrows `count`
};
increment(); // count == 1
increment(); // count == 2

// Fn — immutably borrows captured values (can be called many times, concurrently)
let prefix = "Result";
let display = |x: i32| {
    println!("{prefix}: {x}"); // Immutably borrows `prefix`
};
display(1);
display(2);
}

The hierarchy: Fn : FnMut : FnOnce — each is a subtrait of the next:

FnOnce  ← everything can be called at least once
 ↑
FnMut   ← can be called repeatedly (may mutate state)
 ↑
Fn      ← can be called repeatedly and concurrently (no mutation)

If a closure implements Fn, it also implements FnMut and FnOnce.

Closures as Parameters and Return Values

// --- Parameters ---

// Static dispatch (monomorphized — fastest)
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

// Also written with impl Trait:
fn apply_twice_v2(f: impl Fn(i32) -> i32, x: i32) -> i32 {
    f(f(x))
}

// Dynamic dispatch (trait object — flexible, slight overhead)
fn apply_dyn(f: &dyn Fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

// --- Return Values ---

// A closure's type is anonymous, so it can't be written in a return type.
// Box it behind dyn Fn for dynamic dispatch:
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

// With impl Trait (simpler, monomorphized, but can't be dynamic):
fn make_adder_v2(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

fn main() {
    let double = |x: i32| x * 2;
    println!("{}", apply_twice(double, 3)); // 12

    let add5 = make_adder(5);
    println!("{}", add5(10)); // 15
}

Combinator Chains and Iterator Adapters

Higher-order functions shine with iterators — this is idiomatic Rust:

#![allow(unused)]
fn main() {
// C-style loop (imperative):
let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let mut result = Vec::new();
for x in &data {
    if x % 2 == 0 {
        result.push(x * x);
    }
}

// Idiomatic Rust (functional combinator chain):
let result: Vec<i32> = data.iter()
    .filter(|&&x| x % 2 == 0)
    .map(|&x| x * x)
    .collect();

// Same performance — iterators are lazy and optimized by LLVM
assert_eq!(result, vec![4, 16, 36, 64, 100]);
}

Common combinators cheat sheet:

| Combinator | What It Does | Example |
|------------|--------------|---------|
| `.map(f)` | Transform each element | `.map(\|x\| x * 2)` |
| `.filter(p)` | Keep elements where predicate is true | `.filter(\|&x\| x > 0)` |
| `.filter_map(f)` | Map + filter in one step (closure returns `Option`) | `.filter_map(\|s\| s.parse().ok())` |
| `.flat_map(f)` | Map then flatten nested iterators | `.flat_map(\|s\| s.chars())` |
| `.fold(init, f)` | Reduce to single value (like `Aggregate` in C#) | `.fold(0, \|acc, x\| acc + x)` |
| `.any(p)` / `.all(p)` | Short-circuit boolean check | `.any(\|x\| x < 0)` |
| `.enumerate()` | Add index | `.enumerate().map(\|(i, x)\| ...)` |
| `.zip(other)` | Pair with another iterator | `.zip(labels.iter())` |
| `.take(n)` / `.skip(n)` | Take/skip the first N elements | `.take(10)` |
| `.chain(other)` | Concatenate two iterators | `.chain(extra.iter())` |
| `.peekable()` | Look ahead without consuming | `.peek()` |
| `.collect()` | Gather into a collection | `.collect::<Vec<_>>()` |

Implementing Your Own Higher-Order APIs

Design APIs that accept closures for customization:

#![allow(unused)]
fn main() {
/// Retry an operation with a configurable strategy
fn retry<T, E, F, S>(
    mut operation: F,
    mut should_retry: S,
    max_attempts: usize,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
    S: FnMut(&E, usize) -> bool, // (error, attempt) → try again?
{
    for attempt in 1..=max_attempts {
        match operation() {
            Ok(val) => return Ok(val),
            Err(e) if attempt < max_attempts && should_retry(&e, attempt) => {
                continue;
            }
            Err(e) => return Err(e),
        }
    }
    unreachable!("retry called with max_attempts == 0")
}

// Usage — caller controls retry logic:
}
#![allow(unused)]
fn main() {
fn connect_to_database() -> Result<(), String> { Ok(()) }
fn http_get(_url: &str) -> Result<String, String> { Ok(String::new()) }
trait TransientError { fn is_transient(&self) -> bool; }
impl TransientError for String { fn is_transient(&self) -> bool { true } }
let url = "http://example.com";
let result = retry(
    || connect_to_database(),
    |err, attempt| {
        eprintln!("Attempt {attempt} failed: {err}");
        true // Always retry
    },
    3,
);

// Usage — retry only specific errors:
let result = retry(
    || http_get(url),
    |err, _| err.is_transient(), // Only retry transient errors
    5,
);
}

The with Pattern — Bracketed Resource Access

Sometimes you need to guarantee that a resource is in a specific state for the duration of an operation, and restored afterward, no matter how the caller's code exits (early return, ?). Instead of exposing the resource directly and hoping callers remember to set up and tear down, lend it through a closure. (If the teardown must also survive a panic inside the closure, perform it in a Drop guard; a plain call after the closure runs only on normal exits.)

set up → call closure with resource → tear down

The caller never touches setup or teardown. They can’t forget, can’t get it wrong, and can’t hold the resource beyond the closure’s scope.

Example: GPIO Pin Direction

A GPIO controller manages pins that support bidirectional I/O. Some callers need the pin configured as input, others as output. Rather than exposing raw pin access and trusting callers to set direction correctly, the controller provides with_pin_input and with_pin_output:

/// GPIO pin direction — not public, callers never set this directly.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction { In, Out }

/// A GPIO pin handle lent to the closure. Cannot be stored or cloned —
/// it exists only for the duration of the callback.
pub struct GpioPin<'a> {
    pin_number: u8,
    _controller: &'a GpioController,
}

impl GpioPin<'_> {
    pub fn read(&self) -> bool {
        // Read pin level from hardware register
        println!("  reading pin {}", self.pin_number);
        true // stub
    }

    pub fn write(&self, high: bool) {
        // Drive pin level via hardware register
        println!("  writing pin {} = {high}", self.pin_number);
    }
}

pub struct GpioController {
    current_direction: std::cell::Cell<Option<Direction>>,
}

impl GpioController {
    pub fn new() -> Self {
        GpioController {
            current_direction: std::cell::Cell::new(None),
        }
    }

    /// Configure pin as input, run the closure, restore state.
    /// The caller receives a `GpioPin` that lives only for the callback.
    pub fn with_pin_input<R>(
        &self,
        pin: u8,
        mut f: impl FnMut(&GpioPin<'_>) -> R,
    ) -> R {
        let prev = self.current_direction.get();
        self.set_direction(pin, Direction::In);
        let handle = GpioPin { pin_number: pin, _controller: self };
        let result = f(&handle);
        // Restore previous direction (or leave as-is — policy choice)
        if let Some(dir) = prev {
            self.set_direction(pin, dir);
        }
        result
    }

    /// Configure pin as output, run the closure, restore state.
    pub fn with_pin_output<R>(
        &self,
        pin: u8,
        mut f: impl FnMut(&GpioPin<'_>) -> R,
    ) -> R {
        let prev = self.current_direction.get();
        self.set_direction(pin, Direction::Out);
        let handle = GpioPin { pin_number: pin, _controller: self };
        let result = f(&handle);
        if let Some(dir) = prev {
            self.set_direction(pin, dir);
        }
        result
    }

    fn set_direction(&self, pin: u8, dir: Direction) {
        println!("  [hw] pin {pin} → {dir:?}");
        self.current_direction.set(Some(dir));
    }
}

fn main() {
    let gpio = GpioController::new();

    // Caller 1: needs input — doesn't know or care how direction is managed
    let level = gpio.with_pin_input(4, |pin| {
        pin.read()
    });
    println!("Pin 4 level: {level}");

    // Caller 2: needs output — same API shape, different guarantee
    gpio.with_pin_output(4, |pin| {
        pin.write(true);
        // do more work...
        pin.write(false);
    });

    // Can't use the pin handle outside the closure:
    // let escaped_pin = gpio.with_pin_input(4, |pin| pin);
    // ❌ ERROR: borrowed value does not live long enough
}

What the with pattern guarantees:

  • Direction is always set before the caller’s code runs
  • Direction is always restored after, even if the closure returns early
  • The GpioPin handle cannot escape the closure — the borrow checker enforces this via the lifetime tied to the controller reference
  • Callers never import Direction, never call set_direction — the API is impossible to misuse

Where This Pattern Appears

The with pattern shows up throughout Rust’s standard library and ecosystem:

| API | Setup | Callback | Teardown |
|-----|-------|----------|----------|
| `std::thread::scope` | Create scope | `\|s\| { s.spawn(...) }` | Join all threads |
| `Mutex::lock` | Acquire lock | Use `MutexGuard` (RAII, not closure, but same idea) | Release on drop |
| `tempfile::tempdir` | Create temp directory | Use path | Delete on drop |
| `std::io::BufWriter::new` | Buffer writes | Write operations | Flush on drop |
| GPIO `with_pin_*` (above) | Set direction | Use pin handle | Restore direction |

The closure-based variant is strongest when:

  • Setup and teardown are paired and forgetting either is a bug
  • The resource shouldn’t outlive the operation — the borrow checker enforces this naturally
  • Multiple configurations exist (with_pin_input vs with_pin_output) — each with_* method encapsulates a different setup without exposing the configuration to the caller

with vs RAII (Drop): Both guarantee cleanup. Use RAII / Drop when the caller needs to hold the resource across multiple statements and function calls. Use with when the operation is bracketed — one setup, one block of work, one teardown — and you don’t want the caller to be able to break the bracket.

FnMut vs Fn in API design: Accept the weakest bound your call pattern allows, because weaker bounds admit more caller closures. FnOnce is the most permissive (every closure satisfies it) but limits you to a single call; FnMut is the right default when you call the closure repeatedly; require Fn only when you need shared or concurrent calls (e.g., from multiple threads).

Key Takeaways — Closures

  • Fn borrows, FnMut borrows mutably, FnOnce consumes — accept the weakest bound your API needs
  • impl Fn in parameters, Box<dyn Fn> for storage, impl Fn in return (or Box<dyn Fn> if dynamic)
  • Combinator chains (map, filter, and_then) compose cleanly and inline to tight loops
  • The with pattern (bracketed access via closure) guarantees setup/teardown and prevents resource escape — use it when the caller shouldn’t manage configuration lifecycle

See also: Ch 2 — Traits In Depth for how Fn/FnMut/FnOnce relate to trait objects. Ch 8 — Functional vs. Imperative for when to choose combinators over loops. Ch 15 — API Design for ergonomic parameter patterns.

graph TD
    FnOnce["FnOnce<br>(can call once)"]
    FnMut["FnMut<br>(can call many times,<br>may mutate captures)"]
    Fn["Fn<br>(can call many times,<br>immutable captures)"]

    Fn -->|"implements"| FnMut
    FnMut -->|"implements"| FnOnce

    style Fn fill:#d4efdf,stroke:#27ae60,color:#000
    style FnMut fill:#fef9e7,stroke:#f1c40f,color:#000
    style FnOnce fill:#fadbd8,stroke:#e74c3c,color:#000

Every Fn is also FnMut, and every FnMut is also FnOnce — so accepting a weaker bound admits more closures. FnMut is a sensible default when your API calls the closure more than once.


Exercise: Higher-Order Combinator Pipeline ★★ (~25 min)

Create a Pipeline struct that chains transformations. It should support .pipe(f) to add a transformation and .execute(input) to run the full chain.

🔑 Solution
struct Pipeline<T> {
    transforms: Vec<Box<dyn Fn(T) -> T>>,
}

impl<T: 'static> Pipeline<T> {
    fn new() -> Self {
        Pipeline { transforms: Vec::new() }
    }

    fn pipe(mut self, f: impl Fn(T) -> T + 'static) -> Self {
        self.transforms.push(Box::new(f));
        self
    }

    fn execute(self, input: T) -> T {
        self.transforms.into_iter().fold(input, |val, f| f(val))
    }
}

fn main() {
    let result = Pipeline::new()
        .pipe(|s: String| s.trim().to_string())
        .pipe(|s| s.to_uppercase())
        .pipe(|s| format!(">>> {s} <<<"))
        .execute("  hello world  ".to_string());

    println!("{result}"); // >>> HELLO WORLD <<<

    let result = Pipeline::new()
        .pipe(|x: i32| x * 2)
        .pipe(|x| x + 10)
        .pipe(|x| x * x)
        .execute(5);

    println!("{result}"); // (5*2 + 10)^2 = 400
}

Chapter 8 — Functional vs. Imperative: When Elegance Wins (and When It Doesn’t)

Difficulty: 🟡 Intermediate | Time: 2–3 hours | Prerequisites: Ch 7 — Closures

Rust gives you genuine parity between functional and imperative styles. Unlike Haskell (functional by fiat) or C (imperative by default), Rust lets you choose — and the right choice depends on what you’re expressing. This chapter builds the judgment to pick well.

The core principle: Functional style shines when you’re transforming data through a pipeline. Imperative style shines when you’re managing state transitions with side effects. Most real code has both, and the skill is knowing where the boundary falls.


8.1 The Combinator You Didn’t Know You Wanted

Many Rust developers write this:

#![allow(unused)]
fn main() {
let value = if let Some(x) = maybe_config() {
    x
} else {
    default_config()
};
process(value);
}

When they could write this:

#![allow(unused)]
fn main() {
process(maybe_config().unwrap_or_else(default_config));
}

Or this common pattern:

#![allow(unused)]
fn main() {
let display_name = if let Some(name) = user.nickname() {
    name.to_uppercase()
} else {
    "ANONYMOUS".to_string()
};
}

Which is:

#![allow(unused)]
fn main() {
let display_name = user.nickname()
    .map(|n| n.to_uppercase())
    .unwrap_or_else(|| "ANONYMOUS".to_string());
}

The functional version isn’t just shorter — it tells you what is happening (transform, then default) without making you trace control flow. The if let version makes you read the branches to figure out that both paths end up in the same place.

The Option combinator family

Here’s the mental model: Option<T> is a one-element-or-empty collection. Every combinator on Option has an analogy to a collection operation.

| You write… | Instead of… | What it communicates |
|---|---|---|
| `opt.unwrap_or(default)` | `if let Some(x) = opt { x } else { default }` | “Use this value or fall back” |
| `opt.unwrap_or_else(\|\| expensive())` | `if let Some(x) = opt { x } else { expensive() }` | Same, but the default is computed lazily |
| `opt.map(f)` | `match opt { Some(x) => Some(f(x)), None => None }` | “Transform the inside, propagate absence” |
| `opt.and_then(f)` | `match opt { Some(x) => f(x), None => None }` | “Chain fallible operations” (flat map) |
| `opt.filter(\|x\| pred(x))` | `match opt { Some(x) if pred(&x) => Some(x), _ => None }` | “Keep only if it passes” |
| `opt.zip(other)` | `if let (Some(a), Some(b)) = (opt, other) { Some((a, b)) } else { None }` | “Both or neither” |
| `opt.or(fallback)` | `if opt.is_some() { opt } else { fallback }` | “First available” |
| `opt.or_else(\|\| try_another())` | `if opt.is_some() { opt } else { try_another() }` | “Try alternatives in order” |
| `opt.map_or(default, f)` | `if let Some(x) = opt { f(x) } else { default }` | “Transform or default” — one-liner |
| `opt.map_or_else(default_fn, f)` | `if let Some(x) = opt { f(x) } else { default_fn() }` | Same, both sides are closures |
| `opt?` | `match opt { Some(x) => x, None => return None }` | “Propagate absence upward” |

The Result combinator family

The same pattern applies to Result<T, E>:

| You write… | Instead of… | What it communicates |
|---|---|---|
| `res.map(f)` | `match res { Ok(x) => Ok(f(x)), Err(e) => Err(e) }` | Transform the success path |
| `res.map_err(f)` | `match res { Ok(x) => Ok(x), Err(e) => Err(f(e)) }` | Transform the error |
| `res.and_then(f)` | `match res { Ok(x) => f(x), Err(e) => Err(e) }` | Chain fallible operations |
| `res.unwrap_or_else(\|e\| default(e))` | `match res { Ok(x) => x, Err(e) => default(e) }` | Recover from error |
| `res.ok()` | `match res { Ok(x) => Some(x), Err(_) => None }` | “I don’t care about the error” |
| `res?` | `match res { Ok(x) => x, Err(e) => return Err(e.into()) }` | Propagate errors upward |
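A small end-to-end chain combining rows from the table above (the `Error` enum and the 100 cutoff are illustrative):

```rust
#[derive(Debug, PartialEq)]
enum Error { BadNumber, TooBig }

fn parse_small(s: &str) -> Result<u8, Error> {
    s.parse::<u8>()
        .map_err(|_| Error::BadNumber)  // transform the parse error into our type
        .and_then(|n| if n <= 100 { Ok(n) } else { Err(Error::TooBig) }) // chain a check
}

fn main() {
    assert_eq!(parse_small("42"), Ok(42));
    assert_eq!(parse_small("x"), Err(Error::BadNumber));
    assert_eq!(parse_small("200"), Err(Error::TooBig));
}
```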

When if let IS better

The combinators lose when:

  • You need multiple statements in the Some branch. A map closure with 5 lines is worse than an if let with 5 lines.
  • The control flow is the point. if let Some(connection) = pool.try_get() { /* use it */ } else { /* log, retry, alert */ } — the two branches are genuinely different code paths, not a transform-or-default.
  • Side effects dominate. If both branches do I/O with different error handling, the combinator version obscures the important differences.

Rule of thumb: If the else branch produces the same type as the Some branch and the bodies are short expressions, use a combinator. If the branches do fundamentally different things, use if let or match.


8.2 Bool Combinators: .then() and .then_some()

Another pattern that’s more common than it should be:

#![allow(unused)]
fn main() {
let label = if is_admin {
    Some("ADMIN")
} else {
    None
};
}

Rust 1.62+ gives you:

#![allow(unused)]
fn main() {
let label = is_admin.then_some("ADMIN");
}

Or with a computed value:

#![allow(unused)]
fn main() {
let permissions = is_admin.then(|| compute_admin_permissions());
}

This is especially powerful in chains:

#![allow(unused)]
fn main() {
// Imperative
let mut tags = Vec::new();
if user.is_admin { tags.push("admin"); }
if user.is_verified { tags.push("verified"); }
if user.score > 100 { tags.push("power-user"); }

// Functional
let tags: Vec<&str> = [
    user.is_admin.then_some("admin"),
    user.is_verified.then_some("verified"),
    (user.score > 100).then_some("power-user"),
]
.into_iter()
.flatten()
.collect();
}

The functional version makes the pattern explicit: “build a list from conditional elements.” The imperative version makes you read each if to confirm they all do the same thing (push a tag).


8.3 Iterator Chains vs. Loops: The Decision Framework

Ch 7 showed the mechanics. This section builds the judgment.

When iterators win

Data pipelines — transforming a collection through a series of steps:

#![allow(unused)]
fn main() {
// Imperative: 8 lines, 2 mutable variables
let mut results = Vec::new();
for item in inventory {
    if item.category == Category::Server {
        if let Some(temp) = item.last_temperature() {
            if temp > 80.0 {
                results.push((item.id, temp));
            }
        }
    }
}

// Functional: 6 lines, 0 mutable variables, one pipeline
let results: Vec<_> = inventory.iter()
    .filter(|item| item.category == Category::Server)
    .filter_map(|item| item.last_temperature().map(|t| (item.id, t)))
    .filter(|(_, temp)| *temp > 80.0)
    .collect();
}

The functional version wins because:

  • Each filter is independently readable
  • No mut — the data flows in one direction
  • You can add/remove/reorder pipeline stages without restructuring
  • LLVM inlines iterator adapters to the same machine code as the loop

Aggregation — computing a single value from a collection:

#![allow(unused)]
fn main() {
// Imperative
let mut total_power = 0.0;
let mut count = 0;
for server in fleet {
    total_power += server.power_draw();
    count += 1;
}
let avg = total_power / count as f64;

// Functional
let (total_power, count) = fleet.iter()
    .map(|s| s.power_draw())
    .fold((0.0, 0usize), |(sum, n), p| (sum + p, n + 1));
let avg = total_power / count as f64;
}

Or even simpler if you just need the sum:

#![allow(unused)]
fn main() {
let total: f64 = fleet.iter().map(|s| s.power_draw()).sum();
}

When loops win

Early exit with complex state:

#![allow(unused)]
fn main() {
// This is clear and direct
let mut best_candidate = None;
for server in fleet {
    let score = evaluate(server);
    if score > threshold {
        if server.is_available() {
            best_candidate = Some(server);
            break; // Found one — stop immediately
        }
    }
}

// The functional version is strained
let best_candidate = fleet.iter()
    .filter(|s| evaluate(s) > threshold)
    .find(|s| s.is_available());
}

Wait — that functional version is actually pretty clean. Let’s try a case where it genuinely loses:

Building multiple outputs simultaneously:

#![allow(unused)]
fn main() {
// Imperative: clear, each branch does something different
let mut warnings = Vec::new();
let mut errors = Vec::new();
let mut stats = Stats::default();

for event in log_stream {
    match event.severity {
        Severity::Warn => {
            warnings.push(event.clone());
            stats.warn_count += 1;
        }
        Severity::Error => {
            errors.push(event.clone());
            stats.error_count += 1;
            if event.is_critical() {
                alert_oncall(&event);
            }
        }
        _ => stats.other_count += 1,
    }
}

// Functional version: forced, awkward, nobody wants to read this
let (warnings, errors, stats) = log_stream.iter().fold(
    (Vec::new(), Vec::new(), Stats::default()),
    |(mut w, mut e, mut s), event| {
        match event.severity {
            Severity::Warn => { w.push(event.clone()); s.warn_count += 1; }
            Severity::Error => {
                e.push(event.clone()); s.error_count += 1;
                if event.is_critical() { alert_oncall(event); }
            }
            _ => s.other_count += 1,
        }
        (w, e, s)
    },
);
}

The fold version is longer, harder to read, and has mutation anyway (the mut destructured accumulators). The loop wins because:

  • Multiple outputs being built in parallel
  • Side effects (alerting) mixed into the logic
  • Branch bodies are statements, not expressions

State machines with I/O:

#![allow(unused)]
fn main() {
// A parser that reads tokens — the loop IS the algorithm
let mut state = ParseState::Start;
loop {
    let token = lexer.next_token()?;
    state = match state {
        ParseState::Start => match token {
            Token::Keyword(k) => ParseState::GotKeyword(k),
            Token::Eof => break,
            _ => return Err(ParseError::UnexpectedToken(token)),
        },
        ParseState::GotKeyword(k) => match token {
            Token::Ident(name) => ParseState::GotName(k, name),
            _ => return Err(ParseError::ExpectedIdentifier),
        },
        // ...more states
    };
}
}

No functional equivalent is cleaner. The loop with match state is the natural expression of a state machine.

The decision flowchart

flowchart TB
    START{What are you doing?}

    START -->|"Transforming a collection\ninto another collection"| PIPE[Use iterator chain]
    START -->|"Computing a single value\nfrom a collection"| AGG{How complex?}
    START -->|"Multiple outputs from\none pass"| LOOP[Use a for loop]
    START -->|"State machine with\nI/O or side effects"| LOOP
    START -->|"One Option/Result\ntransform + default"| COMB[Use combinators]

    AGG -->|"Sum, count, min, max"| BUILTIN["Use .sum(), .count(),\n.min(), .max()"]
    AGG -->|"Custom accumulation"| FOLD{Accumulator has mutation\nor side effects?}
    FOLD -->|"No"| FOLDF["Use .fold()"]
    FOLD -->|"Yes"| LOOP

    style PIPE fill:#d4efdf,stroke:#27ae60,color:#000
    style COMB fill:#d4efdf,stroke:#27ae60,color:#000
    style BUILTIN fill:#d4efdf,stroke:#27ae60,color:#000
    style FOLDF fill:#d4efdf,stroke:#27ae60,color:#000
    style LOOP fill:#fef9e7,stroke:#f1c40f,color:#000

Scoped mutability: mutate in a block, bind immutably

Rust blocks are expressions. This lets you confine mutation to a construction phase and bind the result immutably:

#![allow(unused)]
fn main() {
use rand::random;

let samples = {
    let mut buf = Vec::with_capacity(10);
    while buf.len() < 10 {
        let reading: f64 = random();
        buf.push(reading);
        if random::<u8>() % 3 == 0 { break; } // randomly stop early
    }
    buf
};
// samples is immutable — contains between 1 and 10 elements
}

The inner buf is mutable only inside the block. Once the block yields, the outer binding samples is immutable and the compiler will reject any later samples.push(...).

Why not an iterator chain? You might try:

#![allow(unused)]
fn main() {
let samples: Vec<f64> = std::iter::from_fn(|| Some(random()))
    .take(10)
    .take_while(|_| random::<u8>() % 3 != 0)
    .collect();
}

But the semantics differ: take_while checks the predicate before yielding, so it can produce zero elements and always drops the element present when the stop condition fires — unlike the imperative version, which pushes first and then breaks, guaranteeing at least one element and keeping the boundary element. You can work around this with scan, but the imperative version is clearer.
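For completeness, here is a sketch of the scan workaround — an “inclusive take_while” that keeps the boundary element, mirroring the push-then-break loop (`take_until_inclusive` is a hypothetical helper name):

```rust
fn take_until_inclusive<I, F>(iter: I, mut stop: F) -> impl Iterator<Item = I::Item>
where
    I: Iterator,
    F: FnMut(&I::Item) -> bool,
{
    // scan carries a "done" flag: the item that trips the flag is still
    // yielded; the iterator ends on the next pull.
    iter.scan(false, move |done, item| {
        if *done {
            None
        } else {
            *done = stop(&item); // remember, but still yield this item
            Some(item)
        }
    })
}

fn main() {
    let readings = [3, 7, 12, 5, 9];
    // Stop once a reading exceeds 10 — keeping that reading:
    let samples: Vec<i32> =
        take_until_inclusive(readings.into_iter(), |&r| r > 10).collect();
    assert_eq!(samples, vec![3, 7, 12]); // boundary element 12 is included
}
```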

When scoped mutability genuinely wins:

| Scenario | Why iterators struggle |
|---|---|
| Sort-then-freeze (`sort_unstable()` + `dedup()`) | Both return `()` — no chainable output (itertools offers `.sorted().dedup()`) |
| Stateful termination (stop on a condition unrelated to the data) | `take_while` drops the boundary element |
| Multi-step struct population (field-by-field from different sources) | No natural single pipeline |

Honest calibration: For most collection-building tasks, iterator chains or itertools are preferred. Reach for scoped mutability when the construction logic has branching, early exit, or in-place mutation that doesn’t map to a single pipeline. The pattern’s real value is teaching that mutation scope can be smaller than variable lifetime — a Rust fundamental that surprises developers coming from C++, C#, and Python.


8.4 The ? Operator: Where Functional Meets Imperative

The ? operator is Rust’s most elegant synthesis of both styles. It’s essentially .and_then() combined with early return:

#![allow(unused)]
fn main() {
// This chain of and_then...
fn load_config_chained() -> Result<Config, Error> {
    read_file("config.toml")
        .and_then(|contents| parse_toml(&contents))
        .and_then(|table| validate_config(table))
        .and_then(|valid| Config::from_validated(valid))
}

// ...is exactly equivalent to this
fn load_config() -> Result<Config, Error> {
    let contents = read_file("config.toml")?;
    let table = parse_toml(&contents)?;
    let valid = validate_config(table)?;
    Config::from_validated(valid)
}
}

Both are functional in spirit (they propagate errors automatically) but the ? version gives you named intermediate variables, which matter when:

  • You need to use contents again later
  • You want to add .context("while parsing config")? per step
  • You’re debugging and want to inspect intermediate values

The anti-pattern: long .and_then() chains when ? is available. If every closure in the chain is |x| next_step(x), you’ve reinvented ? without the readability.

When .and_then() IS better than ?:

#![allow(unused)]
fn main() {
// Transforming inside an Option, without early return
let port: Option<u16> = config.get("port")
    .and_then(|v| v.parse::<u16>().ok())
    .filter(|&p| p > 0); // 0 is invalid; parse::<u16> already caps at 65535
}

You can’t use ? here because there’s no enclosing function to return from — you’re building an Option, not propagating it.


8.5 Collection Building: collect() vs. Push Loops

collect() is more powerful than most developers realize:

Collecting into a Result

#![allow(unused)]
fn main() {
// Imperative: parse a list, fail on first error
let mut numbers = Vec::new();
for s in input_strings {
    let n: i64 = s.parse().map_err(|_| Error::BadInput(s.clone()))?;
    numbers.push(n);
}

// Functional: collect into Result<Vec<_>, _>
let numbers: Vec<i64> = input_strings.iter()
    .map(|s| s.parse::<i64>().map_err(|_| Error::BadInput(s.clone())))
    .collect::<Result<_, _>>()?;
}

The collect::<Result<Vec<_>, _>>() trick works because Result implements FromIterator. It short-circuits on the first Err, just like the loop with ?.
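Option implements FromIterator the same way, so the identical trick works when absence (rather than an error) should abort the collection:

```rust
fn main() {
    // All parses succeed — you get Some of the whole Vec:
    let all = vec!["1", "2", "3"];
    let parsed: Option<Vec<i32>> = all.iter().map(|s| s.parse().ok()).collect();
    assert_eq!(parsed, Some(vec![1, 2, 3]));

    // One None short-circuits the entire collection:
    let mixed = vec!["1", "oops", "3"];
    let parsed: Option<Vec<i32>> = mixed.iter().map(|s| s.parse().ok()).collect();
    assert_eq!(parsed, None);
}
```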

Collecting into a HashMap

#![allow(unused)]
fn main() {
// Imperative
let mut index = HashMap::new();
for server in fleet {
    index.insert(server.id.clone(), server);
}

// Functional
let index: HashMap<_, _> = fleet.into_iter()
    .map(|s| (s.id.clone(), s))
    .collect();
}

Collecting into a String

#![allow(unused)]
fn main() {
// Imperative
let mut csv = String::new();
for (i, field) in fields.iter().enumerate() {
    if i > 0 { csv.push(','); }
    csv.push_str(field);
}

// Functional
let csv = fields.join(",");

// Or for more complex formatting:
let csv: String = fields.iter()
    .map(|f| format!("\"{f}\""))
    .collect::<Vec<_>>()
    .join(",");
}

When the loop version wins

collect() allocates a new collection. If you’re modifying in place, the loop is both clearer and more efficient:

#![allow(unused)]
fn main() {
// In-place update — no functional equivalent that's better
for server in &mut fleet {
    if server.needs_refresh() {
        server.refresh_telemetry()?;
    }
}
}

The functional version would require .iter_mut().for_each(|s| { ... }), which is just a loop with extra syntax — and for_each can’t propagate errors with ?; for the fallible case you’d need try_for_each.
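If you do want a combinator for fallible in-place updates, std’s try_for_each stops at the first Err, like ? in a loop (the Sensor type here is illustrative):

```rust
struct Sensor { value: i32 }

impl Sensor {
    fn recalibrate(&mut self) -> Result<(), String> {
        if self.value < 0 {
            return Err("negative reading".into());
        }
        self.value += 1;
        Ok(())
    }
}

fn main() {
    let mut sensors = vec![Sensor { value: 1 }, Sensor { value: 2 }];
    // Visits each sensor, short-circuiting on the first Err:
    let outcome: Result<(), String> =
        sensors.iter_mut().try_for_each(|s| s.recalibrate());
    assert!(outcome.is_ok());
    assert_eq!(sensors[0].value, 2);
}
```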


8.6 Pattern Matching as Function Dispatch

Rust’s match is a functional construct that most developers use imperatively. Here’s the functional lens:

Match as a lookup table

#![allow(unused)]
fn main() {
// Imperative thinking: "check each case"
fn status_message(code: StatusCode) -> &'static str {
    if code == StatusCode::OK { "Success" }
    else if code == StatusCode::NOT_FOUND { "Not found" }
    else if code == StatusCode::INTERNAL { "Server error" }
    else { "Unknown" }
}

// Functional thinking: "map from domain to range"
fn status_message_mapped(code: StatusCode) -> &'static str {
    match code {
        StatusCode::OK => "Success",
        StatusCode::NOT_FOUND => "Not found",
        StatusCode::INTERNAL => "Server error",
        _ => "Unknown",
    }
}
}

The match version isn’t just style — the compiler verifies exhaustiveness. Add a new variant, and every match that doesn’t handle it becomes a compile error. The if/else chain silently falls through to the default.

Match + destructuring as a pipeline

#![allow(unused)]
fn main() {
// Parsing a command — each arm extracts and transforms
fn execute(cmd: Command) -> Result<Response, Error> {
    match cmd {
        Command::Get { key } => db.get(&key).map(Response::Value),
        Command::Set { key, value } => db.set(key, value).map(|_| Response::Ok),
        Command::Delete { key } => db.delete(&key).map(|_| Response::Ok),
        Command::Batch(cmds) => cmds.into_iter()
            .map(execute)
            .collect::<Result<Vec<_>, _>>()
            .map(Response::Batch),
    }
}
}

Each arm is an expression that returns the same type. This is pattern matching as function dispatch — the match arms are essentially a function table indexed by the enum variant.


8.7 Chaining Methods on Custom Types

The functional style extends beyond standard library types. Builder patterns and fluent APIs are functional programming in disguise:

#![allow(unused)]
fn main() {
// This is a combinator chain over your own type
let query = QueryBuilder::new("servers")
    .filter("status", Eq, "active")
    .filter("rack", In, &["A1", "A2", "B1"])
    .order_by("temperature", Desc)
    .limit(50)
    .build();
}

The key insight: if your type has methods that take self and return Self (or a transformed type), you’ve built a combinator. The same functional/imperative judgment applies:

#![allow(unused)]
fn main() {
// Good: chainable because each step is a simple transform
let config = Config::default()
    .with_timeout(Duration::from_secs(30))
    .with_retries(3)
    .with_tls(true);

// Bad: chainable but the chain is doing too many unrelated things
let result = processor
    .load_data(path)?       // I/O
    .validate()             // Pure
    .transform(rule_set)    // Pure
    .save_to_disk(output)?  // I/O
    .notify_downstream()?;  // Side effect

// Better: separate the pure pipeline from the I/O bookends
let data = load_data(path)?;
let processed = data.validate().transform(rule_set);
save_to_disk(output, &processed)?;
notify_downstream()?;
}

The chain fails when it mixes pure transforms with I/O. The reader can’t tell which calls might fail, which have side effects, and where the actual data transformations happen.


8.8 Performance: They’re the Same

A common misconception: “functional style is slower because of all the closures and allocations.”

In Rust, iterator chains compile to the same machine code as hand-written loops. LLVM inlines the closure calls, eliminates the iterator adapter structs, and often produces identical assembly. This is called zero-cost abstraction and it’s not aspirational — it’s measured.

#![allow(unused)]
fn main() {
// These produce identical assembly on release builds:

// Functional
let sum: i64 = (0..1000).filter(|n| n % 2 == 0).map(|n| n * n).sum();

// Imperative
let mut sum: i64 = 0;
for n in 0..1000 {
    if n % 2 == 0 {
        sum += n * n;
    }
}
}

The one exception: .collect() allocates. If you’re chaining .map().collect().iter().map().collect() with intermediate collections, you’re paying for allocations the loop version avoids. The fix: eliminate intermediate collects by chaining adapters directly, or use a loop if you need the intermediate collections for other reasons.
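A small demonstration of the difference — same result, but the chained version collects once instead of materializing an intermediate Vec:

```rust
fn main() {
    let data = vec![1, 2, 3, 4, 5];

    // Two allocations: the intermediate Vec and the final one.
    let doubled: Vec<i32> = data.iter().map(|n| n * 2).collect();
    let kept: Vec<i32> = doubled.into_iter().filter(|n| n % 3 != 0).collect();

    // One allocation: chain the adapters, collect once at the end.
    let kept_direct: Vec<i32> = data.iter()
        .map(|n| n * 2)
        .filter(|n| n % 3 != 0)
        .collect();

    assert_eq!(kept, kept_direct);
    assert_eq!(kept_direct, vec![2, 4, 8, 10]); // 6 was filtered out
}
```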


8.9 The Taste Test: A Catalog of Transformations

Here’s a reference table for the most common “I wrote 6 lines but there’s a one-liner” patterns:

| Imperative pattern | Functional equivalent | When to prefer functional |
|---|---|---|
| `if let Some(x) = opt { f(x) } else { default }` | `opt.map_or(default, f)` | Short expressions on both sides |
| `if let Some(x) = opt { Some(g(x)) } else { None }` | `opt.map(g)` | Always — this is what `map` is for |
| `if condition { Some(x) } else { None }` | `condition.then_some(x)` | Always |
| `if condition { Some(compute()) } else { None }` | `condition.then(compute)` | Always |
| `match opt { Some(x) if pred(x) => Some(x), _ => None }` | `opt.filter(pred)` | Always |
| `for x in iter { if pred(x) { result.push(f(x)); } }` | `iter.filter(pred).map(f).collect()` | When the pipeline is readable in one screen |
| `if a.is_some() && b.is_some() { Some((a?, b?)) }` | `a.zip(b)` | Always — `.zip()` is exactly this |
| `match (a, b) { (Some(x), Some(y)) => x + y, _ => 0 }` | `a.zip(b).map(\|(x, y)\| x + y).unwrap_or(0)` | Judgment call — depends on complexity |
| `iter.map(f).collect::<Vec<_>>()[0]` | `iter.map(f).next().unwrap()` | Don’t allocate a `Vec` for one element |
| `let mut v = vec; v.sort(); v` | `{ let mut v = vec; v.sort(); v }` | Rust doesn’t have a `.sorted()` in std (use itertools) |

8.10 The Anti-Patterns

Over-functionalizing: the 5-deep chain nobody can read

#![allow(unused)]
fn main() {
// This is not elegant. This is a puzzle.
use itertools::Itertools; // for .sorted()
let result = data.iter()
    .filter_map(|x| x.metadata.as_ref())
    .flat_map(|m| m.tags.iter())
    .filter(|t| t.starts_with("env:"))
    .map(|t| t.strip_prefix("env:").unwrap())
    .filter(|env| allowed_envs.contains(env))
    .map(|env| env.to_uppercase())
    .collect::<HashSet<_>>()
    .into_iter()
    .sorted()
    .collect::<Vec<_>>();
}

When a chain exceeds ~4 adapters, break it up with named intermediate variables or extract a helper:

#![allow(unused)]
fn main() {
use itertools::Itertools; // for .sorted()

let env_tags = data.iter()
    .filter_map(|x| x.metadata.as_ref())
    .flat_map(|m| m.tags.iter());

let allowed: Vec<_> = env_tags
    .filter_map(|t| t.strip_prefix("env:"))
    .filter(|env| allowed_envs.contains(env))
    .map(|env| env.to_uppercase())
    .sorted()
    .collect();
}

Under-functionalizing: the C-style loop that Rust has a word for

#![allow(unused)]
fn main() {
// This is just .any()
let mut found = false;
for item in &list {
    if item.is_expired() {
        found = true;
        break;
    }
}

// Write this instead
let found = list.iter().any(|item| item.is_expired());
}
#![allow(unused)]
fn main() {
// This is just .find()
let mut target = None;
for server in &fleet {
    if server.id == target_id {
        target = Some(server);
        break;
    }
}

// Write this instead
let target = fleet.iter().find(|s| s.id == target_id);
}
#![allow(unused)]
fn main() {
// This is just .all()
let mut all_healthy = true;
for server in &fleet {
    if !server.is_healthy() {
        all_healthy = false;
        break;
    }
}

// Write this instead
let all_healthy = fleet.iter().all(|s| s.is_healthy());
}

The standard library has these for a reason. Learn the vocabulary and the patterns become obvious.
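A few more vocabulary words in action — .position() for “index of the first match,” and the f64 wrinkle with .min_by_key():

```rust
fn main() {
    let temps = [71.0_f64, 68.5, 92.3, 75.0];

    // .position(): the index of the first match, instead of an indexed loop.
    let first_hot = temps.iter().position(|&t| t > 80.0);
    assert_eq!(first_hot, Some(2));

    // .min_by_key() needs Ord, which f64 lacks — the common workaround is
    // .min_by() with total_cmp:
    let coolest = temps.iter().copied().min_by(|a, b| a.total_cmp(b));
    assert_eq!(coolest, Some(68.5));

    // .count() replaces a manual counter:
    let hot_count = temps.iter().filter(|&&t| t > 80.0).count();
    assert_eq!(hot_count, 1);
}
```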


Key Takeaways

  • Option and Result are one-element collections. Their combinators (.map(), .and_then(), .unwrap_or_else(), .filter(), .zip()) replace most if let / match boilerplate.
  • Use bool::then_some() — it replaces if cond { Some(x) } else { None } in every case.
  • Iterator chains win for data pipelines — filter/map/collect with zero mutable state. They compile to the same machine code as loops.
  • Loops win for multi-output state machines — when you’re building multiple collections, doing I/O in branches, or managing a state transition.
  • The ? operator is the best of both worlds — functional error propagation with imperative readability.
  • Break chains at ~4 adapters — use named intermediates for readability. Over-functionalizing is as bad as under-functionalizing.
  • Learn the standard-library vocabulary — .any(), .all(), .find(), .position(), .sum(), .min_by_key() — each one replaces a multi-line loop with a single intent-revealing call

See also: Ch 7 for closure mechanics and the Fn trait hierarchy. Ch 10 for error combinator patterns. Ch 15 for fluent API design.


Exercise: Refactoring Imperative to Functional ★★ (~30 min)

Refactor the following function from imperative to functional style. Then identify one place where the functional version is worse and explain why.

#![allow(unused)]
fn main() {
fn summarize_fleet(fleet: &[Server]) -> FleetSummary {
    let mut healthy = Vec::new();
    let mut degraded = Vec::new();
    let mut failed = Vec::new();
    let mut total_power = 0.0;
    let mut max_temp = f64::NEG_INFINITY;

    for server in fleet {
        match server.health_status() {
            Health::Healthy => healthy.push(server.id.clone()),
            Health::Degraded(reason) => degraded.push((server.id.clone(), reason)),
            Health::Failed(err) => failed.push((server.id.clone(), err)),
        }
        total_power += server.power_draw();
        if server.max_temperature() > max_temp {
            max_temp = server.max_temperature();
        }
    }

    FleetSummary {
        healthy,
        degraded,
        failed,
        avg_power: total_power / fleet.len() as f64,
        max_temp,
    }
}
}
🔑 Solution

The total_power and max_temp are clean functional rewrites:

#![allow(unused)]
fn main() {
fn summarize_fleet(fleet: &[Server]) -> FleetSummary {
    let avg_power: f64 = fleet.iter().map(|s| s.power_draw()).sum::<f64>()
        / fleet.len() as f64;

    let max_temp = fleet.iter()
        .map(|s| s.max_temperature())
        .fold(f64::NEG_INFINITY, f64::max);

    // But the three-way partition is BETTER as a loop.
    // Functional version would require three separate passes
    // or an awkward fold with three mutable accumulators.
    let mut healthy = Vec::new();
    let mut degraded = Vec::new();
    let mut failed = Vec::new();

    for server in fleet {
        match server.health_status() {
            Health::Healthy => healthy.push(server.id.clone()),
            Health::Degraded(reason) => degraded.push((server.id.clone(), reason)),
            Health::Failed(err) => failed.push((server.id.clone(), err)),
        }
    }

    FleetSummary { healthy, degraded, failed, avg_power, max_temp }
}
}

Why the loop is better for the three-way partition: A functional version would either require three .filter().collect() passes (3x iteration), or a .fold() with three mut Vec accumulators inside a tuple — which is just the loop rewritten with worse syntax. The imperative single-pass loop is clearer, more efficient, and easier to extend.
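Worth knowing: for a two-way split, std does have a single-pass combinator — Iterator::partition. It’s only the three-or-more-way case that forces the loop:

```rust
fn main() {
    let readings = vec![45, 82, 71, 95, 60];

    // One pass, two output Vecs: elements that satisfy the predicate
    // go left, the rest go right.
    let (hot, ok): (Vec<i32>, Vec<i32>) =
        readings.into_iter().partition(|&t| t > 80);

    assert_eq!(hot, vec![82, 95]);
    assert_eq!(ok, vec![45, 71, 60]);
}
```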


9. Smart Pointers and Interior Mutability 🟡

What you’ll learn:

  • Box, Rc, Arc for heap allocation and shared ownership
  • Weak references for breaking Rc/Arc reference cycles
  • Cell, RefCell, and Cow for interior mutability patterns
  • Pin for self-referential types and ManuallyDrop for lifecycle control

Box, Rc, Arc — Heap Allocation and Sharing

#![allow(unused)]
fn main() {
// --- Box<T>: Single owner, heap allocation ---
// Use when: recursive types, large values, trait objects
let boxed: Box<i32> = Box::new(42);
println!("{}", *boxed); // Deref to i32

// Recursive type requires Box (otherwise infinite size):
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

// Trait object (dynamic dispatch):
let writer: Box<dyn std::io::Write> = Box::new(std::io::stdout());

// --- Rc<T>: Multiple owners, single-threaded ---
// Use when: shared ownership within one thread (no Send/Sync)
use std::rc::Rc;

let a = Rc::new(vec![1, 2, 3]);
let b = Rc::clone(&a); // Increments reference count (NOT deep clone)
let c = Rc::clone(&a);
println!("Ref count: {}", Rc::strong_count(&a)); // 3

// All three point to the same Vec. When the last Rc is dropped,
// the Vec is deallocated.

// --- Arc<T>: Multiple owners, thread-safe ---
// Use when: shared ownership across threads
use std::sync::Arc;

let shared = Arc::new(String::from("shared data"));
let handles: Vec<_> = (0..5).map(|_| {
    let shared = Arc::clone(&shared);
    std::thread::spawn(move || println!("{shared}"))
}).collect();
for h in handles { h.join().unwrap(); }
}

Weak References — Breaking Reference Cycles

Rc and Arc use reference counting, which cannot free cycles (A → B → A). Weak<T> is a non-owning handle that does not increment the strong count:

#![allow(unused)]
fn main() {
use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,   // does NOT keep parent alive
    children: RefCell<Vec<Rc<Node>>>,
}

let parent = Rc::new(Node {
    value: 0, parent: RefCell::new(Weak::new()), children: RefCell::new(vec![]),
});
let child = Rc::new(Node {
    value: 1, parent: RefCell::new(Rc::downgrade(&parent)), children: RefCell::new(vec![]),
});
parent.children.borrow_mut().push(Rc::clone(&child));

// Access parent from child — returns Option<Rc<Node>>:
if let Some(p) = child.parent.borrow().upgrade() {
    println!("Child's parent value: {}", p.value); // 0
}
// When `parent` is dropped, strong_count → 0, memory is freed.
// `child.parent.borrow().upgrade()` would then return `None`.
}

Rule of thumb: Use Rc/Arc for ownership edges, Weak for back-references and caches. For thread-safe code, use Arc<T> with sync::Weak<T>.
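A sketch of the cache case (the key names are illustrative): a Weak-valued map hands out hits while someone else holds the value, but never keeps a value alive on its own.

```rust
use std::collections::HashMap;
use std::rc::{Rc, Weak};

fn main() {
    // Cache entries are Weak: they don't count as ownership.
    let mut cache: HashMap<&str, Weak<String>> = HashMap::new();

    let config = Rc::new(String::from("loaded config"));
    cache.insert("config", Rc::downgrade(&config));

    // While a strong Rc exists elsewhere, the cache hit succeeds:
    assert!(cache["config"].upgrade().is_some());

    drop(config); // last strong reference gone
    // The cache did not keep the value alive — the entry is now stale
    // and can be repopulated:
    assert!(cache["config"].upgrade().is_none());
}
```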

Cell and RefCell — Interior Mutability

Sometimes you need to mutate data behind a shared (&) reference. Rust provides interior mutability with runtime borrow checking:

#![allow(unused)]
fn main() {
use std::cell::{Cell, RefCell};

// --- Cell<T>: Copy-based interior mutability ---
// Only for Copy types (or types you swap in/out)
struct Counter {
    count: Cell<u32>,
}

impl Counter {
    fn new() -> Self { Counter { count: Cell::new(0) } }

    fn increment(&self) { // &self, not &mut self!
        self.count.set(self.count.get() + 1);
    }

    fn value(&self) -> u32 { self.count.get() }
}

// --- RefCell<T>: Runtime borrow checking ---
// Panics if you violate borrow rules at runtime
struct Cache {
    data: RefCell<Vec<String>>,
}

impl Cache {
    fn new() -> Self { Cache { data: RefCell::new(Vec::new()) } }

    fn add(&self, item: String) { // &self — looks immutable from outside
        self.data.borrow_mut().push(item); // Runtime-checked &mut
    }

    fn get_all(&self) -> Vec<String> {
        self.data.borrow().clone() // Runtime-checked &
    }

    fn bad_example(&self) {
        let _guard1 = self.data.borrow();
        // let _guard2 = self.data.borrow_mut();
        // ❌ PANICS at runtime — can't have &mut while & exists
    }
}
}

Cell vs RefCell: Cell never panics — .get() requires T: Copy, while .set()/.replace()/.take() move values in and out and work for any T. RefCell hands out references to any type but panics on a double-mutable-borrow at runtime. Neither is Sync — for multithreaded use, see Mutex/RwLock.
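The take()/replace() family is what makes Cell usable beyond Copy types — a quick sketch:

```rust
use std::cell::Cell;

fn main() {
    // String is not Copy, so .get() is unavailable — but moves still work:
    let slot: Cell<String> = Cell::new(String::from("hello"));

    // take() moves the value out, leaving Default::default() behind:
    assert_eq!(slot.take(), "hello");
    assert_eq!(slot.take(), ""); // the empty default left behind

    // replace() swaps in a new value and returns the old one:
    slot.set(String::from("a"));
    assert_eq!(slot.replace(String::from("b")), "a");
    assert_eq!(slot.into_inner(), "b");
}
```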

Cow — Clone on Write

Cow (Clone on Write) holds either a borrowed or owned value. It clones only when mutation is needed:

use std::borrow::Cow;

// Avoids allocating when no modification is needed:
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        // Only allocate if tabs need replacing
        Cow::Owned(input.replace('\t', "    "))
    } else {
        // No allocation — just return a reference
        Cow::Borrowed(input)
    }
}

fn main() {
    let clean = "no tabs here";
    let dirty = "tabs\there";

    let r1 = normalize(clean); // Cow::Borrowed — zero allocation
    let r2 = normalize(dirty); // Cow::Owned — allocated new String

    println!("{r1}");
    println!("{r2}");
}

// Also useful for function parameters that MIGHT need ownership:
fn process(data: Cow<'_, [u8]>) {
    // Can read data without copying
    println!("Length: {}", data.len());
    // If we need to mutate, Cow auto-clones:
    let mut owned = data.into_owned(); // Clone only if Borrowed
    owned.push(0xFF);
}
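The "auto-clone on mutation" behavior is exposed through to_mut(), which upgrades a Borrowed into an Owned on first mutable access:

```rust
use std::borrow::Cow;

fn main() {
    let data: &[u8] = &[1, 2, 3];
    let mut cow: Cow<'_, [u8]> = Cow::Borrowed(data);
    assert!(matches!(cow, Cow::Borrowed(_)));

    // First mutable access clones the slice into an owned Vec<u8>:
    cow.to_mut().push(4);
    assert!(matches!(cow, Cow::Owned(_)));
    assert_eq!(&*cow, &[1, 2, 3, 4]);

    // Further mutations reuse the same owned buffer — no re-clone.
    cow.to_mut().push(5);
    assert_eq!(cow.len(), 5);
}
```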

Cow<'_, [u8]> for Binary Data

Cow is especially useful for byte-oriented APIs where data may or may not need transformation (checksum insertion, padding, escaping). This avoids allocating a Vec<u8> on the common fast path:

#![allow(unused)]
fn main() {
use std::borrow::Cow;

/// Pads a frame to a minimum length, borrowing when no padding is needed.
fn pad_frame(frame: &[u8], min_len: usize) -> Cow<'_, [u8]> {
    if frame.len() >= min_len {
        Cow::Borrowed(frame)  // Already long enough — zero allocation
    } else {
        let mut padded = frame.to_vec();
        padded.resize(min_len, 0x00);
        Cow::Owned(padded)    // Allocate only when padding is required
    }
}

let short = pad_frame(&[0xDE, 0xAD], 8);    // Owned — padded to 8 bytes
let long  = pad_frame(&[0; 64], 8);          // Borrowed — already ≥ 8
}

Tip: Combine Cow<[u8]> with bytes::Bytes (Ch10) when you need reference-counted sharing of potentially-transformed buffers.

When to Use Which Pointer

| Pointer | Owner Count | Thread-Safe | Mutability | Use When |
|---|---|---|---|---|
| `Box<T>` | 1 | ✅ (if `T: Send`) | Via `&mut` | Heap allocation, trait objects, recursive types |
| `Rc<T>` | N | ❌ | None (wrap in `Cell`/`RefCell`) | Shared ownership, single thread, graphs/trees |
| `Arc<T>` | N | ✅ | None (wrap in `Mutex`/`RwLock`) | Shared ownership across threads |
| `Cell<T>` | 1 | ❌ | `.get()` / `.set()` | Interior mutability for `Copy` types |
| `RefCell<T>` | 1 | ❌ | `.borrow()` / `.borrow_mut()` | Interior mutability for any type, single thread |
| `Cow<'_, T>` | 0 or 1 | ✅ (if `T: Send`) | Clone on write | Avoid allocation when data is often unchanged |

Pin and Self-Referential Types

Pin<P> prevents a value from being moved in memory. This is essential for self-referential types — structs that contain a pointer to their own data — and for Futures, which may hold references across .await points.

use std::pin::Pin;
use std::marker::PhantomPinned;

// A self-referential struct (simplified):
struct SelfRef {
    data: String,
    ptr: *const String, // Points to `data` above
    _pin: PhantomPinned, // Opts out of Unpin — can't be moved
}

impl SelfRef {
    fn new(s: &str) -> Pin<Box<Self>> {
        let val = SelfRef {
            data: s.to_string(),
            ptr: std::ptr::null(),
            _pin: PhantomPinned,
        };
        let mut boxed = Box::pin(val);

        // SAFETY: we don't move the data after setting the pointer
        let self_ptr: *const String = &boxed.data;
        unsafe {
            let mut_ref = Pin::as_mut(&mut boxed);
            Pin::get_unchecked_mut(mut_ref).ptr = self_ptr;
        }
        boxed
    }

    fn data(&self) -> &str {
        &self.data
    }

    fn ptr_data(&self) -> &str {
        // SAFETY: ptr was set to point to self.data while pinned
        unsafe { &*self.ptr }
    }
}

fn main() {
    let pinned = SelfRef::new("hello");
    assert_eq!(pinned.data(), pinned.ptr_data()); // Both "hello"
    // std::mem::swap would invalidate ptr — but Pin prevents it
}

Key concepts:

| Concept | Meaning |
|---|---|
| `Unpin` (auto-trait) | "Moving this type is safe." Most types are `Unpin` by default. |
| `!Unpin` / `PhantomPinned` | "I have internal pointers — don't move me." |
| `Pin<&mut T>` | A mutable reference that guarantees `T` won't move |
| `Pin<Box<T>>` | An owned, heap-pinned value |
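Because most types are Unpin, pinning them grants nothing extra — Pin::new and Pin::into_inner are safe exactly when T: Unpin:

```rust
use std::pin::Pin;

fn main() {
    // String is Unpin: Pin::new is safe, and you can get the value back out.
    let mut s = String::from("movable");
    let pinned: Pin<&mut String> = Pin::new(&mut s);

    // into_inner is only available safely because String: Unpin:
    let inner: &mut String = Pin::into_inner(pinned);
    inner.push_str(" data");
    assert_eq!(s, "movable data");
}
```

For a `!Unpin` type (one with `PhantomPinned`), both calls would require `unsafe` — that restriction is the whole point of `Pin`.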

Why this matters for async: Every async fn desugars to a Future that may hold references across .await points — making it self-referential. The async runtime uses Pin<&mut Future> to guarantee the future isn’t moved once polled.

#![allow(unused)]
fn main() {
// When you write:
async fn fetch(url: &str) -> String {
    let response = http_get(url).await; // reference held across await
    response.text().await
}

// The compiler generates a state machine struct that is !Unpin,
// and the runtime pins it before calling Future::poll().
}

When to care about Pin: (1) Implementing Future manually, (2) writing async runtimes or combinators, (3) any struct with self-referential pointers. For normal application code, async/await handles pinning transparently. See the companion Async Rust Training for deeper coverage.

Crate alternatives: For self-referential structs without manual Pin, consider ouroboros or self_cell — they generate safe wrappers with correct pinning and drop semantics.

Pin Projections — Structural Pinning

When you have a Pin<&mut MyStruct>, you often need to access individual fields. Pin projection is the pattern for safely going from Pin<&mut Struct> to Pin<&mut Field> (for pinned fields) or &mut Field (for unpinned fields).

The Problem: Field Access on Pinned Types

#![allow(unused)]
fn main() {
use std::pin::Pin;
use std::marker::PhantomPinned;

struct MyFuture {
    data: String,              // Regular field — safe to move
    state: InternalState,      // Self-referential — must stay pinned
    _pin: PhantomPinned,
}

enum InternalState {
    Waiting { ptr: *const String }, // Points to `data` — self-referential
    Done,
}

// Given `Pin<&mut MyFuture>`, how do you access `data` and `state`?
// You CAN'T just do `pinned.data` — the compiler won't let you
// get a &mut to a field of a pinned value without unsafe.
}

Manual Pin Projection (unsafe)

#![allow(unused)]
fn main() {
impl MyFuture {
    // Project to `data` — this field is structurally unpinned (safe to move)
    fn data(self: Pin<&mut Self>) -> &mut String {
        // SAFETY: `data` is not structurally pinned. Moving `data` alone
        // doesn't move the whole struct, so Pin's guarantee is preserved.
        unsafe { &mut self.get_unchecked_mut().data }
    }

    // Project to `state` — this field IS structurally pinned
    fn state(self: Pin<&mut Self>) -> Pin<&mut InternalState> {
        // SAFETY: `state` is structurally pinned — we maintain the
        // pin invariant by returning Pin<&mut InternalState>.
        unsafe { Pin::new_unchecked(&mut self.get_unchecked_mut().state) }
    }
}
}

Structural pinning rules — a field is “structurally pinned” if:

  1. Moving/swapping that field alone could invalidate a self-reference
  2. The struct’s Drop impl must not move the field
  3. The struct must be !Unpin (enforced by PhantomPinned or a !Unpin field)

pin-project — Safe Pin Projections (Zero Unsafe)

The pin-project crate generates provably correct projections at compile time, eliminating the need for manual unsafe:

#![allow(unused)]
fn main() {
use pin_project::pin_project;
use std::pin::Pin;
use std::future::Future;
use std::task::{Context, Poll};

#[pin_project]                   // <-- Generates projection methods
struct TimedFuture<F: Future> {
    #[pin]                       // <-- Structurally pinned (it's a Future)
    inner: F,
    started_at: std::time::Instant, // NOT pinned — plain data
}

impl<F: Future> Future for TimedFuture<F> {
    type Output = (F::Output, std::time::Duration);

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.project();  // Safe! Generated by pin_project
        //   this.inner   : Pin<&mut F>              — pinned field
        //   this.started_at : &mut std::time::Instant — unpinned field

        match this.inner.poll(cx) {
            Poll::Ready(output) => {
                let elapsed = this.started_at.elapsed();
                Poll::Ready((output, elapsed))
            }
            Poll::Pending => Poll::Pending,
        }
    }
}
}

pin-project vs Manual Projection

| Aspect | Manual (unsafe) | pin-project |
|---|---|---|
| Safety | You prove invariants | Compiler-verified |
| Boilerplate | Low (but error-prone) | Zero — derive macro |
| Drop interaction | Must not move pinned fields | Enforced: `#[pinned_drop]` |
| Compile-time cost | None | Proc-macro expansion |
| Use case | Primitives, `no_std` | Application / library code |

#[pinned_drop] — Drop for Pinned Types

When a type has #[pin] fields, pin-project requires #[pinned_drop] instead of a regular Drop impl to prevent accidentally moving pinned fields:

#![allow(unused)]
fn main() {
use pin_project::{pin_project, pinned_drop};
use std::pin::Pin;

#[pin_project(PinnedDrop)]
struct Connection<F> {
    #[pin]
    future: F,
    buffer: Vec<u8>,  // Not pinned — can be moved in drop
}

#[pinned_drop]
impl<F> PinnedDrop for Connection<F> {
    fn drop(self: Pin<&mut Self>) {
        let this = self.project();
        // `this.future` is Pin<&mut F> — can't be moved, only dropped in place
        // `this.buffer` is &mut Vec<u8> — can be drained, cleared, etc.
        this.buffer.clear();
        println!("Connection dropped, buffer cleared");
    }
}
}

When Pin Projections Matter in Practice

Note: The diagram below uses Mermaid syntax. It renders on GitHub and in tools that support Mermaid (mdBook with mermaid plugin, VS Code with Mermaid extension). In plain Markdown viewers, you’ll see the raw source.

graph TD
    A["Do you implement Future manually?"] -->|Yes| B["Does the future hold references<br/>across .await points?"]
    A -->|No| C["async/await handles Pin for you<br/>✅ No projections needed"]
    B -->|Yes| D["Use #[pin_project] on your<br/>future struct"]
    B -->|No| E["Your future is Unpin<br/>✅ No projections needed"]
    D --> F["Mark futures/streams as #[pin]<br/>Leave data fields unpinned"]
    
    style C fill:#91e5a3,color:#000
    style E fill:#91e5a3,color:#000
    style D fill:#ffa07a,color:#000
    style F fill:#ffa07a,color:#000

Rule of thumb: If you’re wrapping another Future or Stream, use pin-project. If you’re writing application code with async/await, you’ll never need pin projections directly. See the companion Async Rust Training for async combinator patterns that use pin projections.

Drop Ordering and ManuallyDrop

Rust’s drop order is deterministic but has rules worth knowing:

Drop Order Rules

struct Label(&'static str);

impl Drop for Label {
    fn drop(&mut self) { println!("Dropping {}", self.0); }
}

fn main() {
    let a = Label("first");   // Declared first
    let b = Label("second");  // Declared second
    let c = Label("third");   // Declared third
}
// Output:
//   Dropping third    ← locals drop in REVERSE declaration order
//   Dropping second
//   Dropping first

The three rules:

| What | Drop Order | Rationale |
|---|---|---|
| Local variables | Reverse declaration order | Later variables might reference earlier ones |
| Struct fields | Declaration order (top to bottom) | Matches construction order (stable since Rust 1.0, guaranteed by RFC 1857) |
| Tuple elements | Declaration order (left to right) | `(a, b, c)` → drop `a`, then `b`, then `c` |
#![allow(unused)]
fn main() {
struct Server {
    listener: Label,  // Dropped 1st
    handler: Label,   // Dropped 2nd
    logger: Label,    // Dropped 3rd
}
// Fields drop top-to-bottom (declaration order).
// This matters when fields reference each other or hold resources.
}

Practical impact: If your struct has a JoinHandle and a Sender, field order determines which drops first. If the thread reads from the channel, drop the Sender first (close the channel) so the thread exits, then join the handle. Put Sender above JoinHandle in the struct.
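One caveat: dropping a JoinHandle detaches the thread rather than joining it, so field order alone won't wait for shutdown. A sketch of the full pattern (Worker is a hypothetical type; Option fields let Drop take ownership so it can close the channel before joining):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical Worker illustrating explicit shutdown ordering.
struct Worker {
    tx: Option<mpsc::Sender<u32>>,          // close this first
    handle: Option<thread::JoinHandle<()>>, // then join this
}

impl Worker {
    fn new() -> Self {
        let (tx, rx) = mpsc::channel();
        let handle = thread::spawn(move || {
            // recv() returns Err once every Sender is dropped.
            while let Ok(n) = rx.recv() {
                println!("worker got {n}");
            }
            println!("channel closed — worker exiting");
        });
        Worker { tx: Some(tx), handle: Some(handle) }
    }

    fn send(&self, n: u32) {
        self.tx.as_ref().unwrap().send(n).unwrap();
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        drop(self.tx.take()); // 1. close the channel → thread's loop ends
        if let Some(h) = self.handle.take() {
            h.join().unwrap(); // 2. now join — no deadlock, no detached thread
        }
    }
}

fn main() {
    let w = Worker::new();
    w.send(1);
    w.send(2);
} // Worker::drop closes the channel, then waits for the thread
```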

ManuallyDrop<T> — Suppressing Automatic Drop

ManuallyDrop<T> wraps a value and prevents its destructor from running automatically. You take responsibility for dropping it (or intentionally leaking it):

#![allow(unused)]
fn main() {
use std::mem::ManuallyDrop;

// Use case 1: Prevent double-free in unsafe code
struct TwoPhaseBuffer {
    // We need to drop the Vec ourselves to control timing
    data: ManuallyDrop<Vec<u8>>,
    committed: bool,
}

impl TwoPhaseBuffer {
    fn new(capacity: usize) -> Self {
        TwoPhaseBuffer {
            data: ManuallyDrop::new(Vec::with_capacity(capacity)),
            committed: false,
        }
    }

    fn write(&mut self, bytes: &[u8]) {
        self.data.extend_from_slice(bytes);
    }

    fn commit(&mut self) {
        self.committed = true;
        println!("Committed {} bytes", self.data.len());
    }
}

impl Drop for TwoPhaseBuffer {
    fn drop(&mut self) {
        if !self.committed {
            println!("Rolling back — dropping uncommitted data");
        }
        // SAFETY: data is always valid here; we only drop it once.
        unsafe { ManuallyDrop::drop(&mut self.data); }
    }
}
}
#![allow(unused)]
fn main() {
// Use case 2: Intentional leak (e.g., global singletons)
fn leaked_string() -> &'static str {
    // Box::leak() is the idiomatic way to create a &'static reference:
    let s = String::from("lives forever");
    Box::leak(s.into_boxed_str())
    // ⚠️ This is a controlled memory leak. The String's heap allocation
    // is never freed. Only use for long-lived singletons.
}

// ManuallyDrop alternative (requires unsafe):
// ⚠️ Prefer Box::leak() above — this is shown only to illustrate
// ManuallyDrop semantics (suppressing Drop while the heap data survives).
fn leaked_string_manual() -> &'static str {
    use std::mem::ManuallyDrop;
    let md = ManuallyDrop::new(String::from("lives forever"));
    // SAFETY: ManuallyDrop prevents deallocation; the heap data lives
    // forever, so a 'static reference is valid.
    unsafe { &*(md.as_str() as *const str) }
}
}
#![allow(unused)]
fn main() {
// Use case 3: Union fields (only one variant is valid at a time)
use std::mem::ManuallyDrop;

union IntOrString {
    i: u64,
    s: ManuallyDrop<String>,
    // String has a Drop impl, so it MUST be wrapped in ManuallyDrop
    // inside a union — the compiler can't know which field is active.
}

// No automatic Drop — the code that constructs IntOrString must also
// handle cleanup. If the String variant is active, call:
//   unsafe { ManuallyDrop::drop(&mut value.s); }
// Without that call, the String is simply leaked (no UB, just a leak).
}

ManuallyDrop vs mem::forget:

| | `ManuallyDrop<T>` | `mem::forget(value)` |
|---|---|---|
| When | Wrap at construction | Consume later |
| Access inner | `&*md` / `&mut *md` | Value is gone |
| Drop later | `ManuallyDrop::drop(&mut md)` | Not possible |
| Use case | Fine-grained lifecycle control | Fire-and-forget leak |

Rule: Use ManuallyDrop in unsafe abstractions where you need to control exactly when a destructor runs. In safe application code, you almost never need it — Rust’s automatic drop ordering handles things correctly.
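One safe escape hatch worth knowing: ManuallyDrop::into_inner returns ownership without unsafe, whereas mem::forget is one-way:

```rust
use std::mem::ManuallyDrop;

fn main() {
    let md = ManuallyDrop::new(vec![1, 2, 3]);
    assert_eq!(md.len(), 3); // Deref to Vec still works

    // Safely reclaim the value — it will drop normally from here on:
    let v: Vec<i32> = ManuallyDrop::into_inner(md);
    assert_eq!(v, vec![1, 2, 3]);

    // mem::forget, by contrast, consumes the value and never runs its
    // destructor — the heap allocation is leaked with no way back:
    std::mem::forget(vec![4, 5, 6]);
}
```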

Key Takeaways — Smart Pointers

  • Box for single ownership on heap; Rc/Arc for shared ownership (single-/multi-threaded)
  • Cell/RefCell provide interior mutability; RefCell panics on violations at runtime
  • Cow avoids allocation on the common path; Pin prevents moves for self-referential types
  • Drop order: fields drop in declaration order (RFC 1857); locals drop in reverse declaration order

See also: Ch 6 — Concurrency for Arc + Mutex patterns. Ch 4 — PhantomData for PhantomData used with smart pointers.

graph TD
    Box["Box&lt;T&gt;<br>Single owner, heap"] --> Heap["Heap allocation"]
    Rc["Rc&lt;T&gt;<br>Shared, single-thread"] --> Heap
    Arc["Arc&lt;T&gt;<br>Shared, multi-thread"] --> Heap

    Rc --> Weak1["Weak&lt;T&gt;<br>Non-owning"]
    Arc --> Weak2["Weak&lt;T&gt;<br>Non-owning"]

    Cell["Cell&lt;T&gt;<br>Copy interior mut"] --> Stack["Stack / interior"]
    RefCell["RefCell&lt;T&gt;<br>Runtime borrow check"] --> Stack
    Cow["Cow&lt;T&gt;<br>Clone on write"] --> Stack

    style Box fill:#d4efdf,stroke:#27ae60,color:#000
    style Rc fill:#e8f4f8,stroke:#2980b9,color:#000
    style Arc fill:#e8f4f8,stroke:#2980b9,color:#000
    style Weak1 fill:#fef9e7,stroke:#f1c40f,color:#000
    style Weak2 fill:#fef9e7,stroke:#f1c40f,color:#000
    style Cell fill:#fdebd0,stroke:#e67e22,color:#000
    style RefCell fill:#fdebd0,stroke:#e67e22,color:#000
    style Cow fill:#fdebd0,stroke:#e67e22,color:#000
    style Heap fill:#f5f5f5,stroke:#999,color:#000
    style Stack fill:#f5f5f5,stroke:#999,color:#000

Exercise: Reference-Counted Graph ★★ (~30 min)

Build a directed graph using Rc<RefCell<Node>> where each node has a name and a list of children. Create a cycle (A → B → C → A) using Weak to break the back-edge. Verify no memory leak with Rc::strong_count.

🔑 Solution
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: String,
    children: Vec<Rc<RefCell<Node>>>,
    back_ref: Option<Weak<RefCell<Node>>>,
}

impl Node {
    fn new(name: &str) -> Rc<RefCell<Self>> {
        Rc::new(RefCell::new(Node {
            name: name.to_string(),
            children: Vec::new(),
            back_ref: None,
        }))
    }
}

impl Drop for Node {
    fn drop(&mut self) {
        println!("Dropping {}", self.name);
    }
}

fn main() {
    let a = Node::new("A");
    let b = Node::new("B");
    let c = Node::new("C");

    // A → B → C, with C back-referencing A via Weak
    a.borrow_mut().children.push(Rc::clone(&b));
    b.borrow_mut().children.push(Rc::clone(&c));
    c.borrow_mut().back_ref = Some(Rc::downgrade(&a)); // Weak ref!

    println!("A strong count: {}", Rc::strong_count(&a)); // 1 (only `a` binding)
    println!("B strong count: {}", Rc::strong_count(&b)); // 2 (b + A's child)
    println!("C strong count: {}", Rc::strong_count(&c)); // 2 (c + B's child)

    // Upgrade the weak ref to prove it works:
    let c_ref = c.borrow();
    if let Some(back) = &c_ref.back_ref {
        if let Some(a_ref) = back.upgrade() {
            println!("C points back to: {}", a_ref.borrow().name);
        }
    }
    // When a, b, c go out of scope, all Nodes drop (no cycle leak!)
}

9. Error Handling Patterns 🟢

What you’ll learn:

  • When to use thiserror (libraries) vs anyhow (applications)
  • Error conversion chains with #[from] and .context() wrappers
  • How the ? operator desugars and works in main()
  • When to panic vs return errors, and catch_unwind for FFI boundaries

thiserror vs anyhow — Library vs Application

Rust error handling centers on the Result<T, E> type. Two crates dominate:

// --- thiserror: For LIBRARIES ---
// Generates Display, Error, and From impls via derive macros
use thiserror::Error;

#[derive(Error, Debug)]
pub enum DatabaseError {
    #[error("connection failed: {0}")]
    ConnectionFailed(String),

    #[error("query error: {source}")]
    QueryError {
        #[source]
        source: sqlx::Error,
    },

    #[error("record not found: table={table} id={id}")]
    NotFound { table: String, id: u64 },

    #[error(transparent)] // Delegate Display to the inner error
    Io(#[from] std::io::Error), // Auto-generates From<io::Error>
}

// --- anyhow: For APPLICATIONS ---
// Dynamic error type — great for top-level code where you just want errors to propagate
use anyhow::{Context, Result, bail, ensure};

fn read_config(path: &str) -> Result<Config> {
    let content = std::fs::read_to_string(path)
        .with_context(|| format!("failed to read config from {path}"))?;

    let config: Config = serde_json::from_str(&content)
        .context("failed to parse config JSON")?;

    ensure!(config.port > 0, "port must be positive, got {}", config.port);

    Ok(config)
}

fn main() -> Result<()> {
    let config = read_config("server.json")?;

    if config.name.is_empty() {
        bail!("server name cannot be empty"); // Return Err immediately
    }

    Ok(())
}

When to use which:

| | thiserror | anyhow |
|---|---|---|
| Use in | Libraries, shared crates | Applications, binaries |
| Error types | Concrete enums — callers can match | `anyhow::Error` — opaque |
| Effort | Define your error enum | Just use `Result<T>` |
| Downcasting | Not needed — pattern match | `error.downcast_ref::<MyError>()` |
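anyhow's downcasting mirrors what std already offers on Box&lt;dyn Error&gt; — a std-only sketch (NotFound is a hypothetical error type):

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct NotFound { id: u64 }

impl fmt::Display for NotFound {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "record {} not found", self.id)
    }
}
impl Error for NotFound {}

// Returns an opaque boxed error, like an application-level API would:
fn lookup(id: u64) -> Result<String, Box<dyn Error>> {
    Err(Box::new(NotFound { id }))
}

fn main() {
    let err = lookup(7).unwrap_err();
    // The opaque error can be downcast back to the concrete type,
    // analogous to anyhow's error.downcast_ref::<MyError>():
    match err.downcast_ref::<NotFound>() {
        Some(nf) => assert_eq!(nf.id, 7),
        None => panic!("expected NotFound"),
    }
}
```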

Error Conversion Chains (#[from])

use thiserror::Error;

#[derive(Error, Debug)]
enum AppError {
    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),

    #[error("JSON error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("HTTP error: {0}")]
    Http(#[from] reqwest::Error),
}

// Now ? automatically converts:
fn fetch_and_parse(url: &str) -> Result<Config, AppError> {
    let body = reqwest::blocking::get(url)?.text()?;  // reqwest::Error → AppError::Http
    let config: Config = serde_json::from_str(&body)?; // serde_json::Error → AppError::Json
    Ok(config)
}

Context and Error Wrapping

Add human-readable context to errors without losing the original:

use anyhow::{Context, Result};

fn process_file(path: &str) -> Result<Data> {
    let content = std::fs::read_to_string(path)
        .with_context(|| format!("failed to read {path}"))?;

    let data = parse_content(&content)
        .with_context(|| format!("failed to parse {path}"))?;

    validate(&data)
        .context("validation failed")?;

    Ok(data)
}

// Error output:
// Error: validation failed
//
// Caused by:
//    0: failed to parse config.json
//    1: expected ',' at line 5 column 12

The ? Operator in Depth

? is syntactic sugar for a match + From conversion + early return:

#![allow(unused)]
fn main() {
// This:
let value = operation()?;

// Desugars to:
let value = match operation() {
    Ok(v) => v,
    Err(e) => return Err(From::from(e)),
    //                  ^^^^^^^^^^^^^^
    //                  Automatic conversion via From trait
};
}
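That From::from call is the hook custom error enums plug into — a std-only sketch with a hypothetical ConfigError:

```rust
use std::num::ParseIntError;

#[derive(Debug)]
enum ConfigError {
    BadPort(ParseIntError),
}

// This impl is what lets `?` convert a ParseIntError automatically:
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadPort(e)
    }
}

fn parse_port(s: &str) -> Result<u16, ConfigError> {
    let port: u16 = s.parse()?; // ParseIntError → ConfigError via From
    Ok(port)
}

fn main() {
    assert_eq!(parse_port("8080").unwrap(), 8080);
    assert!(matches!(parse_port("oops"), Err(ConfigError::BadPort(_))));
}
```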

? also works with Option (in functions returning Option):

#![allow(unused)]
fn main() {
fn find_user_email(users: &[User], name: &str) -> Option<String> {
    let user = users.iter().find(|u| u.name == name)?; // Returns None if not found
    let email = user.email.as_ref()?; // Returns None if email is None
    Some(email.to_uppercase())
}
}

Panics, catch_unwind, and When to Abort

#![allow(unused)]
fn main() {
// Panics: for BUGS, not expected errors
fn get_element(data: &[i32], index: usize) -> &i32 {
    // If this panics, it's a programming error (bug).
    // Don't "handle" it — fix the caller.
    &data[index]
}

// catch_unwind: for boundaries (FFI, thread pools)
use std::panic;

let result = panic::catch_unwind(|| {
    // Run potentially panicking code safely
    risky_operation()
});

match result {
    Ok(value) => println!("Success: {value:?}"),
    Err(_) => eprintln!("Operation panicked — continuing safely"),
}

// When to use which:
// - Result<T, E> → expected failures (file not found, network timeout)
// - panic!()     → programming bugs (index out of bounds, invariant violated)
// - process::abort() → unrecoverable state (security violation, corrupt data)
}
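A self-contained version you can run (the panic hook is silenced so the demo's expected panic doesn't spam stderr):

```rust
use std::panic;

fn main() {
    // Suppress the default panic message for this demo:
    panic::set_hook(Box::new(|_| {}));

    // A panic inside the closure becomes an Err at the boundary:
    let caught = panic::catch_unwind(|| {
        let v = vec![1, 2, 3];
        v[99] // out-of-bounds index — a bug, and it panics
    });
    assert!(caught.is_err());

    // Non-panicking code passes its value through as Ok:
    let ok = panic::catch_unwind(|| 2 + 2);
    assert_eq!(ok.unwrap(), 4);

    let _ = panic::take_hook(); // restore the default hook
}
```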

C++ comparison: Result<T, E> replaces exceptions for expected errors. panic!() is like assert() or std::terminate() — it’s for bugs, not control flow. Rust’s ? operator makes error propagation as ergonomic as exceptions without the unpredictable control flow.

Key Takeaways — Error Handling

  • Libraries: thiserror for structured error enums; applications: anyhow for ergonomic propagation
  • #[from] auto-generates From impls; .context() adds human-readable wrappers
  • ? desugars to From::from() + early return; works in main() returning Result

See also: Ch 14 — API Design for “parse, don’t validate” patterns. Ch 10 — Serialization for serde error handling.

flowchart LR
    A["std::io::Error"] -->|"#[from]"| B["AppError::Io"]
    C["serde_json::Error"] -->|"#[from]"| D["AppError::Json"]
    E["Custom validation"] -->|"manual"| F["AppError::Validation"]

    B --> G["? operator"]
    D --> G
    F --> G
    G --> H["Result&lt;T, AppError&gt;"]

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style C fill:#e8f4f8,stroke:#2980b9,color:#000
    style E fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#fdebd0,stroke:#e67e22,color:#000
    style D fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#fdebd0,stroke:#e67e22,color:#000
    style G fill:#fef9e7,stroke:#f1c40f,color:#000
    style H fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Error Hierarchy with thiserror ★★ (~30 min)

Design an error type hierarchy for a file-processing application that can fail during I/O, parsing (JSON and CSV), and validation. Use thiserror and demonstrate ? propagation.

🔑 Solution
use thiserror::Error;

#[derive(Error, Debug)]
pub enum AppError {
    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),

    #[error("JSON parse error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("CSV error at line {line}: {message}")]
    Csv { line: usize, message: String },

    #[error("validation error: {field} — {reason}")]
    Validation { field: String, reason: String },
}

fn read_file(path: &str) -> Result<String, AppError> {
    Ok(std::fs::read_to_string(path)?) // io::Error → AppError::Io via #[from]
}

fn parse_json(content: &str) -> Result<serde_json::Value, AppError> {
    Ok(serde_json::from_str(content)?) // serde_json::Error → AppError::Json
}

fn validate_name(value: &serde_json::Value) -> Result<String, AppError> {
    let name = value.get("name")
        .and_then(|v| v.as_str())
        .ok_or_else(|| AppError::Validation {
            field: "name".into(),
            reason: "must be a non-null string".into(),
        })?;

    if name.is_empty() {
        return Err(AppError::Validation {
            field: "name".into(),
            reason: "must not be empty".into(),
        });
    }

    Ok(name.to_string())
}

fn process_file(path: &str) -> Result<String, AppError> {
    let content = read_file(path)?;
    let json = parse_json(&content)?;
    let name = validate_name(&json)?;
    Ok(name)
}

fn main() {
    match process_file("config.json") {
        Ok(name) => println!("Name: {name}"),
        Err(e) => eprintln!("Error: {e}"),
    }
}

10. Serialization, Zero-Copy, and Binary Data 🟡

What you’ll learn:

  • serde fundamentals: derive macros, attributes, and enum representations
  • Zero-copy deserialization for high-performance read-heavy workloads
  • The serde format ecosystem (JSON, TOML, bincode, MessagePack)
  • Binary data handling with repr(C), zerocopy, and bytes::Bytes

serde Fundamentals

serde (SERialize/DEserialize) is the universal serialization framework for Rust. It separates data model (your structs) from format (JSON, TOML, binary):

use serde::{Serialize, Deserialize};

#[derive(Debug, Serialize, Deserialize)]
struct ServerConfig {
    name: String,
    port: u16,
    #[serde(default)]                    // Use Default::default() if missing
    max_connections: usize,
    #[serde(skip_serializing_if = "Option::is_none")]
    tls_cert_path: Option<String>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Deserialize from JSON:
    let json_input = r#"{
        "name": "hw-diag",
        "port": 8080
    }"#;
    let config: ServerConfig = serde_json::from_str(json_input)?;
    println!("{config:?}");
    // ServerConfig { name: "hw-diag", port: 8080, max_connections: 0, tls_cert_path: None }

    // Serialize to JSON:
    let output = serde_json::to_string_pretty(&config)?;
    println!("{output}");

    // Same struct, different format — no code changes:
    let toml_input = r#"
        name = "hw-diag"
        port = 8080
    "#;
    let config: ServerConfig = toml::from_str(toml_input)?;
    println!("{config:?}");

    Ok(())
}

Key insight: Your struct derives Serialize and Deserialize once. Then it works with every serde-compatible format — JSON, TOML, YAML, bincode, MessagePack, CBOR, postcard, and dozens more.

Common serde Attributes

serde provides fine-grained control over serialization through field and container attributes:

use serde::{Serialize, Deserialize};

// --- Container attributes (on the struct/enum) ---
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]       // JSON convention: field_name → fieldName
#[serde(deny_unknown_fields)]            // Reject extra keys — strict parsing
struct DiagResult {
    test_name: String,                   // Serialized as "testName"
    pass_count: u32,                     // Serialized as "passCount"
    fail_count: u32,                     // Serialized as "failCount"
}

// --- Field attributes ---
#[derive(Serialize, Deserialize)]
struct Sensor {
    #[serde(rename = "sensor_id")]       // Override field name for serialization
    id: u64,

    #[serde(default)]                    // Use Default if missing from input
    enabled: bool,

    #[serde(default = "default_threshold")]
    threshold: f64,

    #[serde(skip)]                       // Never serialize or deserialize
    cached_value: Option<f64>,

    #[serde(skip_serializing_if = "Vec::is_empty")]
    tags: Vec<String>,

    #[serde(flatten)]                    // Inline nested struct fields
    metadata: Metadata,

    #[serde(with = "hex_bytes")]         // Custom ser/de module
    raw_data: Vec<u8>,
}

fn default_threshold() -> f64 { 1.0 }

#[derive(Serialize, Deserialize)]
struct Metadata {
    vendor: String,
    model: String,
}
// With #[serde(flatten)], the JSON looks like:
// { "sensor_id": 1, "vendor": "Intel", "model": "X200", ... }
// NOT: { "sensor_id": 1, "metadata": { "vendor": "Intel", ... } }

Most-used attributes cheat sheet:

| Attribute | Level | Effect |
|---|---|---|
| `rename_all = "camelCase"` | Container | Rename all fields to camelCase/snake_case/SCREAMING_SNAKE_CASE |
| `deny_unknown_fields` | Container | Error on unexpected keys (strict mode) |
| `default` | Field | Use `Default::default()` when field missing |
| `rename = "..."` | Field | Custom serialized name |
| `skip` | Field | Exclude from ser/de entirely |
| `skip_serializing_if = "fn"` | Field | Conditionally exclude (e.g., `Option::is_none`) |
| `flatten` | Field | Inline a nested struct's fields |
| `with = "module"` | Field | Use custom serialize/deserialize functions |
| `alias = "..."` | Field | Accept alternative names during deserialization |
| `deserialize_with = "fn"` | Field | Custom deserialize function only |
| `untagged` | Enum | Try each variant in order (no discriminant in output) |

Enum Representations

serde provides four representations for enums in formats like JSON:

use serde::{Serialize, Deserialize};

// 1. Externally tagged (DEFAULT):
#[derive(Serialize, Deserialize)]
enum Command {
    Reboot,
    RunDiag { test_name: String, timeout_secs: u64 },
    SetFanSpeed(u8),
}
// "Reboot"                                          → Command::Reboot
// {"RunDiag": {"test_name": "gpu", "timeout_secs": 60}}  → Command::RunDiag { ... }

// 2. Internally tagged — #[serde(tag = "type")]:
#[derive(Serialize, Deserialize)]
#[serde(tag = "type")]
enum Event {
    Start { timestamp: u64 },
    Error { code: i32, message: String },
    End   { timestamp: u64, success: bool },
}
// {"type": "Start", "timestamp": 1706000000}
// {"type": "Error", "code": 42, "message": "timeout"}

// 3. Adjacently tagged — #[serde(tag = "t", content = "c")]:
#[derive(Serialize, Deserialize)]
#[serde(tag = "t", content = "c")]
enum Payload {
    Text(String),
    Binary(Vec<u8>),
}
// {"t": "Text", "c": "hello"}
// {"t": "Binary", "c": [0, 1, 2]}

// 4. Untagged — #[serde(untagged)]:
#[derive(Serialize, Deserialize)]
#[serde(untagged)]
enum StringOrNumber {
    Str(String),
    Num(f64),
}
// "hello" → StringOrNumber::Str("hello")
// 42.0    → StringOrNumber::Num(42.0)
// ⚠️ Tried IN ORDER — first matching variant wins

Which representation to choose: Use internally tagged (tag = "type") for most JSON APIs — it’s the most readable and matches conventions in Go, Python, and TypeScript. Use untagged only for “union” types where the shape alone disambiguates.

Zero-Copy Deserialization

serde can deserialize without allocating new strings — borrowing directly from the input buffer. This is the key to high-performance parsing:

use serde::Deserialize;

// --- Owned (allocating) ---
// Each String field copies bytes from the input into new heap allocations.
#[derive(Deserialize)]
struct OwnedRecord {
    name: String,           // Allocates a new String
    value: String,          // Allocates another String
}

// --- Zero-copy (borrowing) ---
// &str fields borrow directly from the input — zero allocation.
// Note: serde_json can only borrow when the string contains no escape
// sequences; escaped input makes deserialization into &str fail.
// Use Cow<str> (see the tip below) to handle both cases.
#[derive(Deserialize)]
struct BorrowedRecord<'a> {
    name: &'a str,          // Points into the input buffer
    value: &'a str,         // Points into the input buffer
}

fn main() {
    let input = r#"{"name": "cpu_temp", "value": "72.5"}"#;

    // Owned: allocates two String objects
    let owned: OwnedRecord = serde_json::from_str(input).unwrap();

    // Zero-copy: `name` and `value` point into `input` — no allocation
    let borrowed: BorrowedRecord = serde_json::from_str(input).unwrap();

    // The output is lifetime-bound: borrowed can't outlive input
    println!("{}: {}", borrowed.name, borrowed.value);
}

Understanding the lifetime:

// Deserialize<'de> — the struct can borrow from data with lifetime 'de:
//   struct BorrowedRecord<'a> where 'a == 'de
//   Only works when the input buffer lives long enough

// DeserializeOwned — the struct owns all its data, no borrowing:
//   trait DeserializeOwned: for<'de> Deserialize<'de> {}
//   Works with any input lifetime (the struct is independent)

use serde::de::DeserializeOwned;

// This function requires owned types — input can be temporary
fn parse_owned<T: DeserializeOwned>(input: &str) -> T {
    serde_json::from_str(input).unwrap()
}

// This function allows borrowing — more efficient but restricts lifetimes
fn parse_borrowed<'a, T: Deserialize<'a>>(input: &'a str) -> T {
    serde_json::from_str(input).unwrap()
}

When to use zero-copy:

  • Parsing large files where you only need a few fields
  • High-throughput pipelines (network packets, log lines)
  • When the input buffer already lives long enough (e.g., memory-mapped file)

When NOT to use zero-copy:

  • Input is ephemeral (network read buffer that’s reused)
  • You need to store the result beyond the input’s lifetime
  • Fields need transformation (escapes, normalization)

Practical tip: Cow<'a, str> gives you the best of both — borrow when possible, allocate when necessary (e.g., when JSON escape sequences need unescaping). serde supports Cow natively.
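The borrow-when-possible strategy behind Cow<'a, str> can be seen with std alone. A minimal sketch, where the unescape function and its escape rules are illustrative (not serde's actual implementation):

```rust
use std::borrow::Cow;

// Illustrative unescape: borrows when the input needs no rewriting,
// allocates only when it does. This is the same strategy serde applies
// when deserializing into Cow<'a, str>.
fn unescape(input: &str) -> Cow<'_, str> {
    if !input.contains('\\') {
        return Cow::Borrowed(input); // zero-copy fast path
    }
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                Some(other) => out.push(other),
                None => {}
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}

fn main() {
    // No escapes: borrows directly from the input, zero allocation
    assert!(matches!(unescape("cpu_temp"), Cow::Borrowed(_)));

    // Escapes present: allocates and rewrites
    let owned = unescape("line1\\nline2");
    assert!(matches!(owned, Cow::Owned(_)));
    assert_eq!(owned, "line1\nline2");
}
```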

The Format Ecosystem

| Format | Crate | Human-Readable | Size | Speed | Use Case |
|---|---|---|---|---|---|
| JSON | serde_json | Yes | Large | Good | Config files, REST APIs, logging |
| TOML | toml | Yes | Medium | Good | Config files (Cargo.toml style) |
| YAML | serde_yaml | Yes | Medium | Good | Config files (complex nesting) |
| bincode | bincode | No | Small | Fast | IPC, caches, Rust-to-Rust |
| postcard | postcard | No | Tiny | Very fast | Embedded systems, no_std |
| MessagePack | rmp-serde | No | Small | Fast | Cross-language binary protocol |
| CBOR | ciborium | No | Small | Fast | IoT, constrained environments |
#![allow(unused)]
fn main() {
// Same struct, many formats — serde's power:

#[derive(serde::Serialize, serde::Deserialize, Debug)]
struct DiagConfig {
    name: String,
    tests: Vec<String>,
    timeout_secs: u64,
}

let config = DiagConfig {
    name: "accel_diag".into(),
    tests: vec!["memory".into(), "compute".into()],
    timeout_secs: 300,
};

// JSON:   {"name":"accel_diag","tests":["memory","compute"],"timeout_secs":300}
let json = serde_json::to_string(&config).unwrap();       // 67 bytes

// bincode: compact binary — ~40 bytes, no field names
let bin = bincode::serialize(&config).unwrap();            // Much smaller

// postcard: even smaller, varint encoding — great for embedded
// let post = postcard::to_allocvec(&config).unwrap();
}

Choose your format:

  • Config files humans edit → TOML or JSON
  • Rust-to-Rust IPC/caching → bincode (fast, compact, not cross-language)
  • Cross-language binary → MessagePack or CBOR
  • Embedded / no_std → postcard

Binary Data and repr(C)

For hardware diagnostics, parsing binary protocol data is common. Rust provides tools for safe, zero-copy binary data handling:

#![allow(unused)]
fn main() {
// --- #[repr(C)]: Predictable memory layout ---
// Ensures fields are laid out in declaration order with C padding rules.
// Essential for matching hardware register layouts and protocol headers.

#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct IpmiHeader {
    rs_addr: u8,
    net_fn_lun: u8,
    checksum: u8,
    rq_addr: u8,
    rq_seq_lun: u8,
    cmd: u8,
}

// --- Safe binary parsing with manual deserialization ---
impl IpmiHeader {
    fn from_bytes(data: &[u8]) -> Option<Self> {
        if data.len() < std::mem::size_of::<Self>() {
            return None;
        }
        Some(IpmiHeader {
            rs_addr:     data[0],
            net_fn_lun:  data[1],
            checksum:    data[2],
            rq_addr:     data[3],
            rq_seq_lun:  data[4],
            cmd:         data[5],
        })
    }

    fn net_fn(&self) -> u8 { self.net_fn_lun >> 2 }
    fn lun(&self)    -> u8 { self.net_fn_lun & 0x03 }
}

// --- Endianness-aware parsing ---
fn read_u16_le(data: &[u8], offset: usize) -> u16 {
    u16::from_le_bytes([data[offset], data[offset + 1]])
}

fn read_u32_be(data: &[u8], offset: usize) -> u32 {
    u32::from_be_bytes([
        data[offset], data[offset + 1],
        data[offset + 2], data[offset + 3],
    ])
}

// --- #[repr(C, packed)]: Remove padding (alignment = 1) ---
#[repr(C, packed)]
#[derive(Debug, Clone, Copy)]
struct PcieCapabilityHeader {
    cap_id: u8,        // Capability ID
    next_cap: u8,      // Pointer to next capability
    cap_reg: u16,      // Capability-specific register
}
// ⚠️ Packed structs: taking &field creates an unaligned reference — UB.
// Always copy fields out: let id = header.cap_id;  // OK (Copy)
// Never do: let r = &header.cap_reg;               // UB if unaligned
}
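The padding difference between repr(C) and repr(C, packed) can be verified directly with std::mem::size_of and align_of. A small std-only example (the WithPadding/Packed struct names are illustrative):

```rust
use std::mem::{align_of, size_of};

// Illustrative structs: same fields, different layout rules.
#[repr(C)]
#[allow(dead_code)]
struct WithPadding {
    tag: u8,    // offset 0, then 3 bytes of padding
    value: u32, // offset 4 (aligned to 4)
}

#[repr(C, packed)]
#[allow(dead_code)]
struct Packed {
    tag: u8,    // offset 0
    value: u32, // offset 1 (unaligned!)
}

fn main() {
    assert_eq!(size_of::<WithPadding>(), 8); // padding inflates the size
    assert_eq!(align_of::<WithPadding>(), 4);

    assert_eq!(size_of::<Packed>(), 5);      // exactly the wire size
    assert_eq!(align_of::<Packed>(), 1);     // but fields may be unaligned
}
```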

zerocopy and bytemuck — Safe Transmutation

Instead of unsafe transmute, use crates that verify layout safety at compile time:

#![allow(unused)]
fn main() {
// --- zerocopy: Compile-time checked zero-copy conversions ---
// Cargo.toml: zerocopy = { version = "0.8", features = ["derive"] }

use zerocopy::{FromBytes, IntoBytes, KnownLayout, Immutable};

#[derive(FromBytes, IntoBytes, KnownLayout, Immutable, Debug)]
#[repr(C)]
struct SensorReading {
    sensor_id: u16,
    flags: u8,
    _reserved: u8,
    value: u32,     // Fixed-point: actual = value / 1000.0
}

fn parse_sensor(raw: &[u8]) -> Option<&SensorReading> {
    // Safe zero-copy: the derive verifies the type's layout at compile
    // time; the size and alignment of `raw` are checked at run time.
    SensorReading::ref_from_bytes(raw).ok()
    // Returns &SensorReading pointing INTO raw — no copy, no allocation
}

// --- bytemuck: Simple, battle-tested ---
// Cargo.toml: bytemuck = { version = "1", features = ["derive"] }

use bytemuck::{Pod, Zeroable};

#[derive(Pod, Zeroable, Clone, Copy, Debug)]
#[repr(C)]
struct GpuRegister {
    address: u32,
    value: u32,
}

fn cast_registers(data: &[u8]) -> &[GpuRegister] {
    // Safe cast: Pod guarantees all bit patterns are valid.
    // Panics if data's length or alignment doesn't fit GpuRegister;
    // use bytemuck::try_cast_slice to get a Result instead.
    bytemuck::cast_slice(data)
}
}

When to use which:

| Approach | Safety | Overhead | Use When |
|---|---|---|---|
| Manual field-by-field parsing | ✅ Safe | Copy fields | Small structs, complex layouts |
| zerocopy | ✅ Safe | Zero-copy | Large buffers, many reads, compile-time layout checks |
| bytemuck | ✅ Safe | Zero-copy | Simple Pod types, casting slices |
| unsafe { transmute() } | ❌ Unsafe | Zero-copy | Last resort — avoid in application code |

bytes::Bytes — Reference-Counted Buffers

The bytes crate (used by tokio, hyper, tonic) provides zero-copy byte buffers with reference counting. Think of Bytes as an Arc<[u8]> that also supports O(1) sub-slicing: clones and slices share one allocation instead of copying:

use bytes::{Bytes, BytesMut, Buf, BufMut};

fn main() {
    // --- BytesMut: mutable buffer for building data ---
    let mut buf = BytesMut::with_capacity(1024);
    buf.put_u8(0x01);                    // Write a byte
    buf.put_u16(0x1234);                 // Write u16 (big-endian)
    buf.put_slice(b"hello");             // Write raw bytes
    buf.put(&b"world"[..]);              // Write from slice

    // Freeze into immutable Bytes (zero cost):
    let data: Bytes = buf.freeze();

    // --- Bytes: immutable, reference-counted, cloneable ---
    let data2 = data.clone();            // Cheap: increments refcount, NOT deep copy
    let slice = data.slice(3..8);        // Zero-copy sub-slice (shares buffer)

    // Read from Bytes using the Buf trait:
    let mut reader = &data[..];
    let byte = reader.get_u8();          // 0x01
    let short = reader.get_u16();        // 0x1234

    // Split without copying:
    let mut original = Bytes::from_static(b"HEADER\x00PAYLOAD");
    let header = original.split_to(6);   // header = "HEADER", original = "\x00PAYLOAD"

    println!("header: {:?}", &header[..]);
    println!("payload: {:?}", &original[1..]);
}

bytes vs Vec<u8>:

| Feature | Vec<u8> | Bytes |
|---|---|---|
| Clone cost | O(n) deep copy | O(1) refcount increment |
| Sub-slicing | Borrows with a lifetime | Owned, refcount-tracked |
| Cheap cross-thread sharing | Needs Arc<Vec<u8>> or a deep copy | Built in (O(1) clone, Send + Sync) |
| Mutability | Direct &mut | Convert to BytesMut first |
| Ecosystem | Standard library | tokio, hyper, tonic, axum |

When to use bytes: Network protocols, packet parsing, any scenario where you receive a buffer and need to split it into parts that are processed by different components or threads. The zero-copy splitting is the killer feature.
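The ownership model behind Bytes can be sketched with std alone: a refcounted buffer plus a range. This toy SharedBytes type is illustrative only (the real crate is far more sophisticated), but it shows why clones and sub-slices cost O(1):

```rust
use std::ops::Range;
use std::sync::Arc;

// Minimal sketch of the idea behind bytes::Bytes: a shared, refcounted
// buffer plus a view range. Cloning bumps a refcount; slicing narrows
// the range. Neither copies the underlying bytes.
#[derive(Clone)]
struct SharedBytes {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedBytes {
    fn new(data: Vec<u8>) -> Self {
        let len = data.len();
        SharedBytes { buf: data.into(), range: 0..len }
    }

    fn slice(&self, range: Range<usize>) -> Self {
        // Zero-copy: bump the refcount and narrow the view
        let start = self.range.start + range.start;
        let end = self.range.start + range.end;
        assert!(end <= self.range.end, "slice out of bounds");
        SharedBytes { buf: Arc::clone(&self.buf), range: start..end }
    }

    fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }
}

fn main() {
    let packet = SharedBytes::new(b"HEADERPAYLOAD".to_vec());
    let header = packet.slice(0..6);
    let payload = packet.slice(6..13);

    assert_eq!(header.as_slice(), b"HEADER");
    assert_eq!(payload.as_slice(), b"PAYLOAD");
    // All three views share ONE allocation:
    assert_eq!(Arc::strong_count(&packet.buf), 3);
}
```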

Key Takeaways — Serialization & Binary Data

  • serde’s derive macros handle 90% of cases; use attributes (rename, skip, default) for the rest
  • Zero-copy deserialization (&'a str in structs) avoids allocation for read-heavy workloads
  • repr(C) + zerocopy/bytemuck for hardware register layouts; bytes::Bytes for reference-counted buffers

See also: Ch 9 — Error Handling for combining serde errors with thiserror. Ch 11 — Unsafe for repr(C) and FFI data layouts.

flowchart LR
    subgraph Input
        JSON["JSON"]
        TOML["TOML"]
        Bin["bincode"]
        MsgP["MessagePack"]
    end

    subgraph serde["serde data model"]
        Ser["Serialize"]
        De["Deserialize"]
    end

    subgraph Output
        Struct["Rust struct"]
        Enum["Rust enum"]
    end

    JSON --> De
    TOML --> De
    Bin --> De
    MsgP --> De
    De --> Struct
    De --> Enum
    Struct --> Ser
    Enum --> Ser
    Ser --> JSON
    Ser --> Bin

    style JSON fill:#e8f4f8,stroke:#2980b9,color:#000
    style TOML fill:#e8f4f8,stroke:#2980b9,color:#000
    style Bin fill:#e8f4f8,stroke:#2980b9,color:#000
    style MsgP fill:#e8f4f8,stroke:#2980b9,color:#000
    style Ser fill:#fef9e7,stroke:#f1c40f,color:#000
    style De fill:#fef9e7,stroke:#f1c40f,color:#000
    style Struct fill:#d4efdf,stroke:#27ae60,color:#000
    style Enum fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Custom serde Deserialization ★★★ (~45 min)

Design a HumanDuration wrapper that deserializes from human-readable strings like "30s", "5m", "2h" using a custom serde deserializer. It should also serialize back to the same format.

🔑 Solution
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
struct HumanDuration(std::time::Duration);

impl HumanDuration {
    fn from_str(s: &str) -> Result<Self, String> {
        let s = s.trim();
        if s.is_empty() { return Err("empty duration string".into()); }

        let (num_str, suffix) = s.split_at(
            s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len())
        );
        let value: u64 = num_str.parse()
            .map_err(|_| format!("invalid number: {num_str}"))?;

        let duration = match suffix {
            "s" | "sec"  => std::time::Duration::from_secs(value),
            "m" | "min"  => std::time::Duration::from_secs(value * 60),
            "h" | "hr"   => std::time::Duration::from_secs(value * 3600),
            "ms"         => std::time::Duration::from_millis(value),
            other        => return Err(format!("unknown suffix: {other}")),
        };
        Ok(HumanDuration(duration))
    }
}

impl fmt::Display for HumanDuration {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let secs = self.0.as_secs();
        if secs == 0 {
            write!(f, "{}ms", self.0.as_millis())
        } else if secs % 3600 == 0 {
            write!(f, "{}h", secs / 3600)
        } else if secs % 60 == 0 {
            write!(f, "{}m", secs / 60)
        } else {
            write!(f, "{}s", secs)
        }
    }
}

impl Serialize for HumanDuration {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        serializer.serialize_str(&self.to_string())
    }
}

impl<'de> Deserialize<'de> for HumanDuration {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        let s = String::deserialize(deserializer)?;
        HumanDuration::from_str(&s).map_err(serde::de::Error::custom)
    }
}

#[derive(Debug, Deserialize, Serialize)]
struct Config {
    timeout: HumanDuration,
    retry_interval: HumanDuration,
}

fn main() {
    let json = r#"{ "timeout": "30s", "retry_interval": "5m" }"#;
    let config: Config = serde_json::from_str(json).unwrap();

    assert_eq!(config.timeout.0, std::time::Duration::from_secs(30));
    assert_eq!(config.retry_interval.0, std::time::Duration::from_secs(300));

    let serialized = serde_json::to_string(&config).unwrap();
    assert!(serialized.contains("30s"));
    println!("Config: {serialized}");
}

11. Unsafe Rust — Controlled Danger 🔴

What you’ll learn:

  • The five unsafe superpowers and when each is needed
  • Writing sound abstractions: safe API, unsafe internals
  • FFI patterns for calling C from Rust (and back)
  • Common UB pitfalls and arena/slab allocator patterns

The Five Unsafe Superpowers

unsafe unlocks five operations that the compiler can’t verify:

#![allow(unused)]
fn main() {
// SAFETY: each operation is explained inline below.
unsafe {
    // 1. Dereference a raw pointer
    let ptr: *const i32 = &42;
    let value = *ptr; // Could be a dangling/null pointer

    // 2. Call an unsafe function
    let layout = std::alloc::Layout::new::<u64>();
    let mem = std::alloc::alloc(layout);

    // 3. Access a mutable static variable
    static mut COUNTER: u32 = 0;
    COUNTER += 1; // Data race if multiple threads access

    // 4. Implement an unsafe trait
    // unsafe impl Send for MyType {}

    // 5. Access fields of a union
    // union IntOrFloat { i: i32, f: f32 }
    // let u = IntOrFloat { i: 42 };
    // let f = u.f; // Reinterpret bits — could be garbage
}
}

Key principle: unsafe doesn’t turn off the borrow checker or type system. It only unlocks these five specific capabilities. All other Rust rules still apply.

Writing Sound Abstractions

The purpose of unsafe is to build safe abstractions around unsafe operations:

#![allow(unused)]
fn main() {
/// A fixed-capacity stack-allocated buffer.
/// All public methods are safe — the unsafe is encapsulated.
pub struct StackBuf<T, const N: usize> {
    data: [std::mem::MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> StackBuf<T, N> {
    pub fn new() -> Self {
        StackBuf {
            // Each element is individually MaybeUninit — no unsafe needed.
            // `const { ... }` blocks (Rust 1.79+) let us repeat a non-Copy
            // const expression N times.
            data: [const { std::mem::MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len >= N {
            return Err(value); // Buffer full — return value to caller
        }
        // SAFETY: len < N, so data[len] is within bounds.
        // We write a valid T into the MaybeUninit slot.
        self.data[self.len] = std::mem::MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.len {
            // SAFETY: index < len, and data[0..len] are all initialized.
            Some(unsafe { self.data[index].assume_init_ref() })
        } else {
            None
        }
    }
}

impl<T, const N: usize> Drop for StackBuf<T, N> {
    fn drop(&mut self) {
        // SAFETY: data[0..len] are initialized — drop them properly.
        for i in 0..self.len {
            unsafe { self.data[i].assume_init_drop(); }
        }
    }
}
}

The three rules of sound unsafe code:

  1. Document invariants — every // SAFETY: comment explains why the operation is valid
  2. Encapsulate — the unsafe is inside a safe API; users can’t trigger UB
  3. Minimize — only the smallest possible block is unsafe
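The three rules in miniature: a sketch where the invariant check lives in safe code and the unsafe block is as small as possible (the middle helper is illustrative):

```rust
/// Returns the middle element of a slice, or None if it's empty.
/// Document — Encapsulate — Minimize, all in one small function.
fn middle<T>(items: &[T]) -> Option<&T> {
    if items.is_empty() {
        return None; // Invariant enforced here, in safe code
    }
    let idx = items.len() / 2;
    // SAFETY: items is non-empty, so idx = len / 2 < len is in bounds.
    Some(unsafe { items.get_unchecked(idx) })
}

fn main() {
    assert_eq!(middle(&[1, 2, 3]), Some(&2));
    assert_eq!(middle::<i32>(&[]), None);
}
```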

FFI Patterns: Calling C from Rust

#![allow(unused)]
fn main() {
// Declare the C function signature:
extern "C" {
    fn strlen(s: *const std::ffi::c_char) -> usize;
    fn printf(format: *const std::ffi::c_char, ...) -> std::ffi::c_int;
}

// Safe wrapper:
fn safe_strlen(s: &str) -> usize {
    let c_string = std::ffi::CString::new(s).expect("string contains null byte");
    // SAFETY: c_string is a valid null-terminated string, alive for the call.
    unsafe { strlen(c_string.as_ptr()) }
}

// Calling Rust from C (export a function):
#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
    a + b
}
}

Common FFI types:

| Rust | C | Notes |
|---|---|---|
| i32 / u32 | int32_t / uint32_t | Fixed-width, safe |
| *const T / *mut T | const T* / T* | Raw pointers |
| std::ffi::CStr | const char* (borrowed) | Null-terminated, borrowed |
| std::ffi::CString | char* (owned) | Null-terminated, owned |
| std::ffi::c_void | void | Opaque pointer target |
| Option<fn(...)> | Nullable function pointer | None = NULL |
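A small std-only round trip showing the CString/CStr pairing from the table. Here c_side_length stands in for a C function that receives a const char* (it is illustrative, not a real FFI call):

```rust
use std::ffi::{c_char, CStr, CString};

// Stand-in for a C function receiving const char* (illustrative).
fn c_side_length(ptr: *const c_char) -> usize {
    // SAFETY: caller guarantees ptr is a valid, null-terminated string
    // that stays alive for the duration of this call.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

fn main() {
    // CString owns a null-terminated copy; pass .as_ptr() across FFI
    let owned = CString::new("diag").expect("no interior null bytes");
    assert_eq!(c_side_length(owned.as_ptr()), 4);

    // Interior null bytes are rejected up front, not silently truncated
    assert!(CString::new("bad\0str").is_err());
}
```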

Common UB Pitfalls

| Pitfall | Example | Why It's UB |
|---|---|---|
| Null dereference | *std::ptr::null::<i32>() | Dereferencing null is always UB |
| Dangling pointer | Dereference after drop() | Memory may be reused |
| Data race | Two threads write to static mut | Unsynchronized concurrent writes |
| Wrong assume_init | MaybeUninit::<String>::uninit().assume_init() | Reading uninitialized memory. The safe way to build an array of MaybeUninit is [const { MaybeUninit::uninit() }; N] (Rust 1.79+) — no assume_init needed (see StackBuf::new() above) |
| Aliasing violation | Creating two &mut to same data | Violates Rust's aliasing model |
| Invalid enum value | std::mem::transmute::<u8, bool>(2) | bool can only be 0 or 1 |

When to use unsafe in production:

  • FFI boundaries (calling C/C++ code)
  • Performance-critical inner loops (avoid bounds checks)
  • Building primitives (Vec, HashMap — these use unsafe internally)
  • Never in application logic if you can avoid it

Custom Allocators — Arena and Slab Patterns

In C, you’d write custom malloc() replacements for specific allocation patterns — arena allocators that free everything at once, slab allocators for fixed-size objects, or pool allocators for high-throughput systems. Rust provides the same power through the GlobalAlloc trait and allocator crates, with the added benefit of lifetime-scoped arenas that prevent use-after-free at compile time.

Arena Allocators — Bulk Allocation, Bulk Free

An arena allocates by bumping a pointer forward. Individual items can’t be freed — the entire arena is freed at once. This is perfect for request-scoped or frame-scoped allocations:

#![allow(unused)]
fn main() {
use bumpalo::Bump;

fn process_sensor_frame(raw_data: &[u8]) {
    // Create an arena for this frame's allocations
    let arena = Bump::new();

    // Allocate objects in the arena — ~2ns each (just a pointer bump)
    let header = arena.alloc(parse_header(raw_data));
    let readings: &mut [f32] = arena.alloc_slice_fill_default(header.sensor_count);

    for (i, chunk) in raw_data[header.payload_offset..].chunks_exact(4).enumerate() {
        if i < readings.len() {
            // chunks_exact(4) guarantees 4-byte chunks, so try_into can't fail
            readings[i] = f32::from_le_bytes(chunk.try_into().unwrap());
        }
    }

    // Use readings...
    let avg = readings.iter().sum::<f32>() / readings.len() as f32;
    println!("Frame avg: {avg:.2}");

    // `arena` drops here — ALL allocations freed at once in O(1)
    // No per-object destructor overhead, no fragmentation
}
fn parse_header(_: &[u8]) -> Header { Header { sensor_count: 4, payload_offset: 8 } }
struct Header { sensor_count: usize, payload_offset: usize }
}

Arena vs standard allocator:

| Aspect | Vec::new() / Box::new() | Bump arena |
|---|---|---|
| Alloc speed | ~25ns (malloc) | ~2ns (pointer bump) |
| Free speed | Per-object destructor | O(1) bulk free |
| Fragmentation | Yes (long-lived processes) | None within arena |
| Lifetime safety | Heap — freed on Drop | Arena references — compile-time scoped |
| Use case | General purpose | Request/frame/batch processing |

typed-arena — Type-Safe Arena

When all arena objects are the same type, typed-arena provides a simpler API that returns references with the arena’s lifetime:

#![allow(unused)]
fn main() {
use typed_arena::Arena;

struct AstNode<'a> {
    value: i32,
    children: Vec<&'a AstNode<'a>>,
}

fn build_tree() {
    let arena: Arena<AstNode<'_>> = Arena::new();

    // Allocate nodes — returns &AstNode tied to arena's lifetime
    let root = arena.alloc(AstNode { value: 1, children: vec![] });
    let left = arena.alloc(AstNode { value: 2, children: vec![] });
    let right = arena.alloc(AstNode { value: 3, children: vec![] });

    // Build the tree — all references valid as long as `arena` lives
    // (Mutable access requires interior mutability for truly mutable trees)

    println!("Root: {}, Left: {}, Right: {}", root.value, left.value, right.value);

    // `arena` drops here — all nodes freed at once
}
}

Slab Allocators — Fixed-Size Object Pools

A slab allocator pre-allocates a pool of fixed-size slots. Objects are allocated and returned individually, but all slots are the same size — eliminating fragmentation and enabling O(1) alloc/free:

#![allow(unused)]
fn main() {
use slab::Slab;

struct Connection {
    id: u64,
    buffer: [u8; 1024],
    active: bool,
}

fn connection_pool_example() {
    // Pre-allocate a slab for connections
    let mut connections: Slab<Connection> = Slab::with_capacity(256);

    // Insert returns a key (usize index) — O(1)
    let key1 = connections.insert(Connection {
        id: 1001,
        buffer: [0; 1024],
        active: true,
    });

    let key2 = connections.insert(Connection {
        id: 1002,
        buffer: [0; 1024],
        active: true,
    });

    // Access by key — O(1)
    if let Some(conn) = connections.get_mut(key1) {
        conn.buffer[0..5].copy_from_slice(b"hello");
    }

    // Remove returns the value — O(1), slot is reused for next insert
    let removed = connections.remove(key2);
    assert_eq!(removed.id, 1002);

    // Next insert reuses the freed slot — no fragmentation
    let key3 = connections.insert(Connection {
        id: 1003,
        buffer: [0; 1024],
        active: true,
    });
    assert_eq!(key3, key2); // Same slot reused!
}
}
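The O(1) slot reuse relies on a free list threaded through the vacant slots. A minimal std-only sketch of the idea (MiniSlab is illustrative, not the slab crate's actual implementation):

```rust
// Each slot is either occupied or a link in the free list.
enum Slot<T> {
    Occupied(T),
    Free { next: Option<usize> },
}

struct MiniSlab<T> {
    slots: Vec<Slot<T>>,
    free_head: Option<usize>, // index of the first vacant slot
}

impl<T> MiniSlab<T> {
    fn new() -> Self {
        MiniSlab { slots: Vec::new(), free_head: None }
    }

    fn insert(&mut self, value: T) -> usize {
        match self.free_head {
            Some(idx) => {
                // Reuse a freed slot: pop it off the free list — O(1)
                if let Slot::Free { next } = &self.slots[idx] {
                    self.free_head = *next;
                }
                self.slots[idx] = Slot::Occupied(value);
                idx
            }
            None => {
                self.slots.push(Slot::Occupied(value));
                self.slots.len() - 1
            }
        }
    }

    fn remove(&mut self, key: usize) -> Option<T> {
        // Tentatively mark the slot free, linking it into the free list
        let old = std::mem::replace(
            &mut self.slots[key],
            Slot::Free { next: self.free_head },
        );
        match old {
            Slot::Occupied(v) => {
                self.free_head = Some(key);
                Some(v)
            }
            free_slot => {
                self.slots[key] = free_slot; // was already free — restore
                None
            }
        }
    }
}

fn main() {
    let mut slab = MiniSlab::new();
    let a = slab.insert("conn-a");
    let b = slab.insert("conn-b");
    assert_eq!(slab.remove(b), Some("conn-b"));

    let c = slab.insert("conn-c");
    assert_eq!(c, b); // freed slot reused, just like slab::Slab
    assert_eq!((a, c), (0, 1));
}
```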

Implementing a Minimal Arena (for no_std)

For bare-metal environments where you can’t pull in bumpalo, here’s a minimal arena built on unsafe:

#![allow(unused)]
#![cfg_attr(not(test), no_std)]

fn main() {
use core::alloc::Layout;
use core::cell::{Cell, UnsafeCell};

/// A simple bump allocator backed by a fixed-size byte array.
/// Not thread-safe — use per-core or with a lock for multi-threaded contexts.
///
/// **Important**: Like `bumpalo`, this arena does NOT call destructors on
/// allocated items when the arena is dropped. Types with `Drop` impls will
/// leak their resources (file handles, sockets, etc.). Only allocate types
/// without meaningful `Drop` impls, or manually drop them before the arena.
pub struct FixedArena<const N: usize> {
    // UnsafeCell is REQUIRED here: we mutate `buf` through `&self`.
    // Without UnsafeCell, casting &self.buf to *mut u8 would be UB
    // (violates Rust's aliasing model — shared ref implies immutable).
    buf: UnsafeCell<[u8; N]>,
    offset: Cell<usize>, // Interior mutability for &self allocation
}

impl<const N: usize> FixedArena<N> {
    pub const fn new() -> Self {
        FixedArena {
            buf: UnsafeCell::new([0; N]),
            offset: Cell::new(0),
        }
    }

    /// Allocate a `T` in the arena. Returns `None` if out of space.
    pub fn alloc<T>(&self, value: T) -> Option<&mut T> {
        let layout = Layout::new::<T>();
        let current = self.offset.get();

        // Align up
        let aligned = (current + layout.align() - 1) & !(layout.align() - 1);
        let new_offset = aligned + layout.size();

        if new_offset > N {
            return None; // Arena full
        }

        self.offset.set(new_offset);

        // SAFETY:
        // - `aligned` is within `buf` bounds (checked above)
        // - Alignment is correct (aligned to T's requirement)
        // - No aliasing: each alloc returns a unique, non-overlapping region
        // - UnsafeCell grants permission to mutate through &self
        // - The arena outlives the returned reference (caller must ensure)
        let ptr = unsafe {
            let base = (self.buf.get() as *mut u8).add(aligned);
            let typed = base as *mut T;
            typed.write(value);
            &mut *typed
        };

        Some(ptr)
    }

    /// Reset the arena — invalidates all previous allocations.
    ///
    /// # Safety
    /// Caller must ensure no references to arena-allocated data exist.
    pub unsafe fn reset(&self) {
        self.offset.set(0);
    }

    pub fn used(&self) -> usize {
        self.offset.get()
    }

    pub fn remaining(&self) -> usize {
        N - self.offset.get()
    }
}
}
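The align-up bit trick used in FixedArena::alloc works for any power-of-two alignment. Shown standalone (assuming align is a power of two, which Layout guarantees):

```rust
/// Round `offset` up to the next multiple of `align`.
/// Requires `align` to be a power of two (Layout::align always is).
fn align_up(offset: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    // Adding align-1 pushes past the next boundary (unless already on one),
    // then masking off the low bits snaps back down to the boundary.
    (offset + align - 1) & !(align - 1)
}

fn main() {
    assert_eq!(align_up(0, 8), 0);   // already aligned
    assert_eq!(align_up(1, 8), 8);   // rounds up
    assert_eq!(align_up(13, 4), 16);
    assert_eq!(align_up(16, 4), 16); // multiples stay put
}
```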

Choosing an Allocator Strategy

Note: The diagram below uses Mermaid syntax. It renders on GitHub and in tools that support Mermaid (mdBook with mermaid plugin, VS Code with Mermaid extension). In plain Markdown viewers, you’ll see the raw source.

graph TD
    A["What's your allocation pattern?"] --> B{All same type?}
    A --> I{"Environment?"}
    B -->|Yes| C{Need individual free?}
    B -->|No| D{Need individual free?}
    C -->|Yes| E["<b>Slab</b><br/>slab crate<br/>O(1) alloc + free<br/>Index-based access"]
    C -->|No| F["<b>typed-arena</b><br/>Bulk alloc, bulk free<br/>Lifetime-scoped refs"]
    D -->|Yes| G["<b>Standard allocator</b><br/>Box, Vec, etc.<br/>General-purpose malloc"]
    D -->|No| H["<b>Bump arena</b><br/>bumpalo crate<br/>~2ns alloc, O(1) bulk free"]
    
    I -->|no_std| J["FixedArena (custom)<br/>or embedded-alloc"]
    I -->|std| K["bumpalo / typed-arena / slab"]
    
    style E fill:#91e5a3,color:#000
    style F fill:#91e5a3,color:#000
    style G fill:#89CFF0,color:#000
    style H fill:#91e5a3,color:#000
    style J fill:#ffa07a,color:#000
    style K fill:#91e5a3,color:#000

| C Pattern | Rust Equivalent | Key Advantage |
|---|---|---|
| Custom malloc() pool | #[global_allocator] impl | Type-safe, debuggable |
| obstack (GNU) | bumpalo::Bump | Lifetime-scoped, no use-after-free |
| Kernel slab (kmem_cache) | slab::Slab<T> | Type-safe, index-based |
| Stack-allocated temp buffer | FixedArena<N> (above) | No heap, const constructible |
| alloca() | [T; N] or SmallVec | Compile-time sized, no UB |

Cross-reference: For bare-metal allocator setup (#[global_allocator] with embedded-alloc), see the Rust Training for C Programmers, Chapter 15.1 “Global Allocator Setup” which covers the embedded-specific bootstrapping.

Key Takeaways — Unsafe Rust

  • Document invariants (SAFETY: comments), encapsulate behind safe APIs, minimize unsafe scope
  • [const { MaybeUninit::uninit() }; N] (Rust 1.79+) replaces the old assume_init anti-pattern
  • FFI requires extern "C", #[repr(C)], and careful null/lifetime handling
  • Arena and slab allocators trade general-purpose flexibility for allocation speed

See also: Ch 4 — PhantomData for variance and drop-check interactions with unsafe code. Ch 8 — Smart Pointers for Pin and self-referential types.


Exercise: Safe Wrapper around Unsafe ★★★ (~45 min)

Write a FixedVec<T, const N: usize> — a fixed-capacity, stack-allocated vector. Requirements:

  • push(&mut self, value: T) -> Result<(), T> returns Err(value) when full
  • pop(&mut self) -> Option<T> returns and removes the last element
  • as_slice(&self) -> &[T] borrows initialized elements
  • All public methods must be safe; all unsafe must be encapsulated with SAFETY: comments
  • Drop must clean up initialized elements
🔑 Solution
use std::mem::MaybeUninit;

pub struct FixedVec<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec {
            data: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len >= N { return Err(value); }
        // SAFETY: len < N, so data[len] is within bounds.
        self.data[self.len] = MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 { return None; }
        self.len -= 1;
        // SAFETY: data[len] was initialized (len was > 0 before decrement).
        Some(unsafe { self.data[self.len].assume_init_read() })
    }

    pub fn as_slice(&self) -> &[T] {
        // SAFETY: data[0..len] are all initialized, and MaybeUninit<T>
        // has the same layout as T.
        unsafe { std::slice::from_raw_parts(self.data.as_ptr() as *const T, self.len) }
    }

    pub fn len(&self) -> usize { self.len }
    pub fn is_empty(&self) -> bool { self.len == 0 }
}

impl<T, const N: usize> Drop for FixedVec<T, N> {
    fn drop(&mut self) {
        // SAFETY: data[0..len] are initialized — drop each one.
        for i in 0..self.len {
            unsafe { self.data[i].assume_init_drop(); }
        }
    }
}

fn main() {
    let mut v = FixedVec::<String, 4>::new();
    v.push("hello".into()).unwrap();
    v.push("world".into()).unwrap();
    assert_eq!(v.as_slice(), &["hello", "world"]);
    assert_eq!(v.pop(), Some("world".into()));
    assert_eq!(v.len(), 1);
}

12. Macros — Code That Writes Code 🟡

What you’ll learn:

  • Declarative macros (macro_rules!) with pattern matching and repetition
  • When macros are the right tool vs generics/traits
  • Procedural macros: derive, attribute, and function-like
  • Writing a custom derive macro with syn and quote

Declarative Macros (macro_rules!)

Macros match patterns on syntax and expand to code at compile time:

#![allow(unused)]
fn main() {
// A simple macro that creates a HashMap
macro_rules! hashmap {
    // Match: key => value pairs separated by commas
    ( $( $key:expr => $value:expr ),* $(,)? ) => {
        {
            let mut map = std::collections::HashMap::new();
            $( map.insert($key, $value); )*
            map
        }
    };
}

let scores = hashmap! {
    "Alice" => 95,
    "Bob" => 87,
    "Carol" => 92,
};
// Expands to:
// let mut map = HashMap::new();
// map.insert("Alice", 95);
// map.insert("Bob", 87);
// map.insert("Carol", 92);
// map
}

Macro fragment types:

| Fragment | Matches | Example |
|---|---|---|
| $x:expr | Any expression | 42, a + b, foo() |
| $x:ty | A type | i32, Vec<String> |
| $x:ident | An identifier | my_var, Config |
| $x:pat | A pattern | Some(x), _ |
| $x:stmt | A statement | let x = 5; |
| $x:tt | A single token tree | Anything (most flexible) |
| $x:literal | A literal value | 42, "hello", true |

Repetition: $( ... ),* means “zero or more, comma-separated”

#![allow(unused)]
fn main() {
// Generate test functions automatically
macro_rules! test_cases {
    ( $( $name:ident: $input:expr => $expected:expr ),* $(,)? ) => {
        $(
            #[test]
            fn $name() {
                assert_eq!(process($input), $expected);
            }
        )*
    };
}

test_cases! {
    test_empty: "" => "",
    test_hello: "hello" => "HELLO",
    test_trim: "  spaces  " => "SPACES",
}
// Generates three separate #[test] functions
}

When (Not) to Use Macros

Use macros when:

  • Reducing boilerplate that traits/generics can’t handle (variadic arguments, DRY test generation)
  • Creating DSLs (html!, sql!, vec!)
  • Conditional code generation (cfg!, compile_error!)

Don’t use macros when:

  • A function or generic would work (macros are harder to debug, autocomplete doesn’t help)
  • You need type checking inside the macro (macros operate on tokens, not types)
  • The pattern is used once or twice (not worth the abstraction cost)
#![allow(unused)]
fn main() {
// ❌ Unnecessary macro — a function works fine:
macro_rules! double {
    ($x:expr) => { $x * 2 };
}

// ✅ Just use a function:
fn double(x: i32) -> i32 { x * 2 }

// ✅ Good macro use — variadic, can't be a function:
macro_rules! println {
    ($($arg:tt)*) => { /* format string + args */ };
}
}

Procedural Macros Overview

Procedural macros are Rust functions that transform token streams. They require a separate crate with proc-macro = true:

#![allow(unused)]
fn main() {
// Three types of proc macros:

// 1. Derive macros — #[derive(MyTrait)]
// Generate trait implementations from struct definitions
#[derive(Debug, Clone, Serialize, Deserialize)]
struct Config {
    name: String,
    port: u16,
}

// 2. Attribute macros — #[my_attribute]
// Transform the annotated item
#[route(GET, "/api/users")]
async fn list_users() -> Json<Vec<User>> { /* ... */ }

// 3. Function-like macros — my_macro!(...)
// Custom syntax
let query = sql!(SELECT * FROM users WHERE id = ?);
}

Derive Macros in Practice

The most common proc macro type. Here’s how #[derive(Debug)] works conceptually:

#![allow(unused)]
fn main() {
// Input (your struct):
#[derive(Debug)]
struct Point {
    x: f64,
    y: f64,
}

// The derive macro generates:
impl std::fmt::Debug for Point {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Point")
            .field("x", &self.x)
            .field("y", &self.y)
            .finish()
    }
}
}

Commonly used derive macros:

| Derive | Crate | What It Generates |
|---|---|---|
| Debug | std | fmt::Debug impl (debug printing) |
| Clone, Copy | std | Value duplication |
| PartialEq, Eq | std | Equality comparison |
| Hash | std | Hashing for HashMap keys |
| Serialize, Deserialize | serde | JSON/YAML/etc. encoding |
| Error | thiserror | std::error::Error + Display |
| Parser | clap | CLI argument parsing |
| Builder | derive_builder | Builder pattern |

Practical advice: Use derive macros liberally — they eliminate error-prone boilerplate. Writing your own proc macros is an advanced topic; use existing ones (serde, thiserror, clap) before building custom ones.

Macro Hygiene and $crate

Hygiene means that identifiers created inside a macro don’t collide with identifiers in the caller’s scope. Rust’s macro_rules! is partially hygienic: local variables and loop labels introduced by the macro are hygienic, but items (functions, types, statics) and identifiers the caller passes in are not:

macro_rules! make_var {
    () => {
        let x = 42; // This 'x' is in the MACRO's scope
    };
}

fn main() {
    let x = 10;
    make_var!();   // Creates a different 'x' (hygienic)
    println!("{x}"); // Prints 10, not 42 — macro's x doesn't leak
}
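The flip side of hygiene can be shown with a small sketch (the macro name make_named is hypothetical): an identifier the caller passes in via $name:ident is resolved in the caller’s scope, so a macro can deliberately introduce a binding the call site can see:

```rust
// Caller-supplied identifiers cross the hygiene boundary:
macro_rules! make_named {
    ($name:ident) => {
        let $name = 42; // `$name` came from the call site, so it's visible there
    };
}

fn main() {
    make_named!(answer);
    assert_eq!(answer, 42); // works: `answer` was named by the caller
}
```

This is how macros like lazy_static! and thread_local! introduce names you choose at the call site.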

$crate: When writing macros in a library, use $crate to refer to your own crate — it resolves correctly regardless of how users import your crate:

#![allow(unused)]
fn main() {
// In my_diagnostics crate:

pub fn log_result(msg: &str) {
    println!("[diag] {msg}");
}

#[macro_export]
macro_rules! diag_log {
    ($($arg:tt)*) => {
        // ✅ $crate always resolves to my_diagnostics, even if the user
        // renamed the crate in their Cargo.toml
        $crate::log_result(&format!($($arg)*))
    };
}

// ❌ Without $crate:
// my_diagnostics::log_result(...)  ← breaks if user writes:
//   [dependencies]
//   diag = { package = "my_diagnostics", version = "1" }
}

Rule: Always use $crate:: in #[macro_export] macros. Never use your crate’s name directly.

Recursive Macros and tt Munching

Recursive macros process input one token at a time — a technique called tt munching (token-tree munching):

// Count the number of expressions passed to the macro
macro_rules! count {
    // Base case: no tokens left
    () => { 0usize };
    // Recursive case: consume one expression, count the rest
    ($head:expr $(, $tail:expr)* $(,)?) => {
        1usize + count!($($tail),*)
    };
}

fn main() {
    let n = count!("a", "b", "c", "d");
    assert_eq!(n, 4);

    // Works at compile time too:
    const N: usize = count!(1, 2, 3);
    assert_eq!(N, 3);
}
#![allow(unused)]
fn main() {
// Build a nested-pair structure (a cons list of tuples) from a list of expressions:
macro_rules! tuple_from {
    // Base: single element
    ($single:expr $(,)?) => { ($single,) };
    // Recursive: first element + rest
    ($head:expr, $($tail:expr),+ $(,)?) => {
        ($head, tuple_from!($($tail),+))
    };
}

let t = tuple_from!(1, "hello", 3.14, true);
// Expands to: (1, ("hello", (3.14, (true,))))
}

Fragment specifier subtleties:

| Fragment | Gotcha |
|---|---|
| $x:expr | Greedily parses — 1 + 2 is ONE expression, not three tokens |
| $x:ty | Greedily parses — Vec<String> is one type; can’t be followed by + or < |
| $x:tt | Matches exactly ONE token tree — most flexible, least checked |
| $x:ident | Only plain identifiers — not paths like std::io |
| $x:pat | In Rust 2021, matches A \| B patterns; use $x:pat_param for single patterns |

When to use tt: When you need to forward tokens to another macro without the parser constraining them. $($args:tt)* is the “accept everything” pattern (used by println!, format!, vec!).
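A minimal sketch of that forwarding pattern (the macro name with_prefix is hypothetical): $($args:tt)* accepts any token soup and hands it, unparsed, to format_args! — the same way println! and format! accept their arguments:

```rust
// Forward arbitrary format tokens to another macro without re-parsing them:
macro_rules! with_prefix {
    ($($args:tt)*) => {
        format!("[app] {}", format_args!($($args)*))
    };
}

fn main() {
    let line = with_prefix!("x = {}, y = {}", 1, 2);
    assert_eq!(line, "[app] x = 1, y = 2");
}
```

Because the tokens are never matched as :expr or :literal, the inner macro is free to interpret format strings, named arguments, and trailing commas exactly as it would at a direct call site.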

Writing a Derive Macro with syn and quote

Derive macros live in a separate crate (proc-macro = true) and transform a token stream using syn (parse Rust) and quote (generate Rust):

my_derive/Cargo.toml:

[lib]
proc-macro = true

[dependencies]
syn = { version = "2", features = ["full"] }
quote = "1"
proc-macro2 = "1"

#![allow(unused)]
fn main() {
// my_derive/src/lib.rs
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};

/// Derive macro that generates a `describe()` method
/// returning the struct name and field names.
#[proc_macro_derive(Describe)]
pub fn derive_describe(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;
    let name_str = name.to_string();

    // Extract field names (only for structs with named fields)
    let fields = match &input.data {
        syn::Data::Struct(data) => {
            data.fields.iter()
                .filter_map(|f| f.ident.as_ref())
                .map(|id| id.to_string())
                .collect::<Vec<_>>()
        }
        _ => vec![],
    };

    let field_list = fields.join(", ");

    let expanded = quote! {
        impl #name {
            pub fn describe() -> String {
                format!("{} {{ {} }}", #name_str, #field_list)
            }
        }
    };

    TokenStream::from(expanded)
}
}
// In the application crate:
use my_derive::Describe;

#[derive(Describe)]
struct SensorReading {
    sensor_id: u16,
    value: f64,
    timestamp: u64,
}

fn main() {
    println!("{}", SensorReading::describe());
    // "SensorReading { sensor_id, value, timestamp }"
}

The workflow: TokenStream (raw tokens) → syn::parse (AST) → inspect/transform → quote! (generate tokens) → TokenStream (back to compiler).

| Crate | Role | Key types |
|---|---|---|
| proc-macro | Compiler interface | TokenStream |
| syn | Parse Rust source into AST | DeriveInput, ItemFn, Type |
| quote | Generate Rust tokens from templates | quote!{}, #variable interpolation |
| proc-macro2 | Bridge between syn/quote and proc-macro | TokenStream, Span |

Practical tip: Start by studying the source of a simple derive macro like thiserror or derive_more before writing your own. The cargo expand command (via cargo-expand) shows what any macro expands to — invaluable for debugging.

Key Takeaways — Macros

  • macro_rules! for simple code generation; proc macros (syn + quote) for complex derives
  • Prefer generics/traits over macros when possible — macros are harder to debug and maintain
  • $crate ensures hygiene; tt munching enables recursive pattern matching

See also: Ch 2 — Traits for when traits/generics beat macros. Ch 13 — Testing for testing macro-generated code.

flowchart LR
    A["Source code"] --> B["macro_rules!<br>pattern matching"]
    A --> C["#[derive(MyMacro)]<br>proc macro"]

    B --> D["Token expansion"]
    C --> E["syn: parse AST"]
    E --> F["Transform"]
    F --> G["quote!: generate tokens"]
    G --> D

    D --> H["Compiled code"]

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#d4efdf,stroke:#27ae60,color:#000
    style C fill:#fdebd0,stroke:#e67e22,color:#000
    style D fill:#fef9e7,stroke:#f1c40f,color:#000
    style E fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#fdebd0,stroke:#e67e22,color:#000
    style G fill:#fdebd0,stroke:#e67e22,color:#000
    style H fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Declarative Macro — map! ★ (~15 min)

Write a map! macro that creates a HashMap from key-value pairs:

let m = map! {
    "host" => "localhost",
    "port" => "8080",
};
assert_eq!(m.get("host"), Some(&"localhost"));

Requirements: support trailing comma and empty invocation map!{}.

🔑 Solution
macro_rules! map {
    () => { std::collections::HashMap::new() };
    ( $( $key:expr => $val:expr ),+ $(,)? ) => {{
        let mut m = std::collections::HashMap::new();
        $( m.insert($key, $val); )+
        m
    }};
}

fn main() {
    let config = map! {
        "host" => "localhost",
        "port" => "8080",
        "timeout" => "30",
    };
    assert_eq!(config.len(), 3);
    assert_eq!(config["host"], "localhost");

    let empty: std::collections::HashMap<String, String> = map!();
    assert!(empty.is_empty());

    let scores = map! { 1 => 100, 2 => 200 };
    assert_eq!(scores[&1], 100);
}

13. Testing and Benchmarking Patterns 🟢

What you’ll learn:

  • Rust’s three test tiers: unit, integration, and doc tests
  • Property-based testing with proptest for discovering edge cases
  • Benchmarking with criterion for reliable performance measurement
  • Mocking strategies without heavyweight frameworks

Unit Tests, Integration Tests, Doc Tests

Rust has three testing tiers built into the language:

#![allow(unused)]
fn main() {
// --- Unit tests: in the same file as the code ---
pub fn factorial(n: u64) -> u64 {
    (1..=n).product()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_factorial_zero() {
        // (1..=0).product() returns 1 — the multiplication identity for empty ranges
        assert_eq!(factorial(0), 1);
    }

    #[test]
    fn test_factorial_five() {
        assert_eq!(factorial(5), 120);
    }

    #[test]
    #[cfg(debug_assertions)] // overflow checks are only enabled in debug mode
    #[should_panic(expected = "overflow")]
    fn test_factorial_overflow() {
        // ⚠️ This test only passes in debug mode (overflow checks enabled).
        // In release mode (`cargo test --release`), u64 arithmetic wraps
        // silently and no panic occurs. Use `checked_mul` or the
        // `overflow-checks = true` profile setting for release-mode safety.
        factorial(100); // Should panic on overflow
    }

    #[test]
    fn test_with_result() -> Result<(), Box<dyn std::error::Error>> {
        // Tests can return Result — ? works inside!
        let value: u64 = "42".parse()?;
        assert_eq!(value, 42);
        Ok(())
    }
}
}
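The overflow caveat above can be removed entirely with checked arithmetic. A sketch (the name checked_factorial is ours, not from the text): checked_mul returns None on overflow in both debug and release builds, so the behavior no longer depends on the compilation profile:

```rust
// Overflow-safe factorial: same result in debug and release builds.
fn checked_factorial(n: u64) -> Option<u64> {
    // try_fold short-circuits to None the moment checked_mul overflows
    (1..=n).try_fold(1u64, |acc, x| acc.checked_mul(x))
}

fn main() {
    assert_eq!(checked_factorial(5), Some(120));
    assert_eq!(checked_factorial(0), Some(1)); // empty product
    assert_eq!(checked_factorial(100), None);  // would overflow u64
}
```

With this signature, the test becomes a plain assertion on None rather than a profile-dependent #[should_panic].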
#![allow(unused)]
fn main() {
// --- Integration tests: in tests/ directory ---
// tests/integration_test.rs
// These test your crate's PUBLIC API only

use my_crate::factorial;

#[test]
fn test_factorial_from_outside() {
    assert_eq!(factorial(10), 3_628_800);
}
}
#![allow(unused)]
fn main() {
// --- Doc tests: in documentation comments ---
/// Computes the factorial of `n`.
///
/// # Examples
///
/// ```
/// use my_crate::factorial;
/// assert_eq!(factorial(5), 120);
/// ```
///
/// # Panics
///
/// Panics if the result overflows `u64`.
///
/// ```should_panic
/// my_crate::factorial(100);
/// ```
pub fn factorial(n: u64) -> u64 {
    (1..=n).product()
}
// Doc tests are compiled and run by `cargo test` — they keep examples honest.
}

Test Fixtures and Setup

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    // Shared setup — create a helper function
    fn setup_database() -> TestDb {
        let db = TestDb::new_in_memory();
        db.run_migrations();
        db.seed_test_data();
        db
    }

    #[test]
    fn test_user_creation() {
        let db = setup_database();
        let user = db.create_user("Alice", "alice@test.com").unwrap();
        assert_eq!(user.name, "Alice");
    }

    #[test]
    fn test_user_deletion() {
        let db = setup_database();
        db.create_user("Bob", "bob@test.com").unwrap();
        assert!(db.delete_user("Bob").is_ok());
        assert!(db.get_user("Bob").is_none());
    }

    // Cleanup with Drop (RAII):
    struct TempDir {
        path: std::path::PathBuf,
    }

    impl TempDir {
        fn new() -> Self {
            // Cargo.toml: rand = "0.8"
            let path = std::env::temp_dir().join(format!("test_{}", rand::random::<u32>()));
            std::fs::create_dir_all(&path).unwrap();
            TempDir { path }
        }
    }

    impl Drop for TempDir {
        fn drop(&mut self) {
            let _ = std::fs::remove_dir_all(&self.path);
        }
    }

    #[test]
    fn test_file_operations() {
        let dir = TempDir::new(); // Created
        std::fs::write(dir.path.join("test.txt"), "hello").unwrap();
        assert!(dir.path.join("test.txt").exists());
    } // dir dropped here → temp directory cleaned up
}
}

Property-Based Testing (proptest)

Instead of testing specific values, test properties that should always hold:

#![allow(unused)]
fn main() {
// Cargo.toml: proptest = "1"
use proptest::prelude::*;

fn reverse(v: &[i32]) -> Vec<i32> {
    v.iter().rev().cloned().collect()
}

proptest! {
    #[test]
    fn test_reverse_twice_is_identity(v in prop::collection::vec(any::<i32>(), 0..100)) {
        // Property: reversing twice gives back the original
        prop_assert_eq!(reverse(&reverse(&v)), v);
    }

    #[test]
    fn test_reverse_preserves_length(v in prop::collection::vec(any::<i32>(), 0..100)) {
        prop_assert_eq!(reverse(&v).len(), v.len());
    }

    #[test]
    fn test_sort_is_idempotent(mut v in prop::collection::vec(any::<i32>(), 0..100)) {
        v.sort();
        let sorted_once = v.clone();
        v.sort();
        prop_assert_eq!(v, sorted_once); // Sorting twice = sorting once
    }

    #[test]
    fn test_parse_roundtrip(x in any::<f64>().prop_filter("finite", |x| x.is_finite())) {
        // Property: formatting then parsing gives back the same value
        let s = format!("{x}");
        let parsed: f64 = s.parse().unwrap();
        prop_assert_eq!(x, parsed); // Display prints the shortest string that round-trips exactly
    }
}
}

When to use proptest: When you’re testing a function with a large input space and want confidence it works for edge cases you didn’t think of. proptest generates hundreds of random inputs and shrinks failures to the minimal reproducing case.

Benchmarking with criterion

#![allow(unused)]
fn main() {
// Cargo.toml:
// [dev-dependencies]
// criterion = { version = "0.5", features = ["html_reports"] }
//
// [[bench]]
// name = "my_benchmarks"
// harness = false

// benches/my_benchmarks.rs
use criterion::{criterion_group, criterion_main, Criterion, black_box};

fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => n,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn bench_fibonacci(c: &mut Criterion) {
    c.bench_function("fibonacci 20", |b| {
        b.iter(|| fibonacci(black_box(20)))
    });

    // Compare different implementations:
    let mut group = c.benchmark_group("fibonacci_compare");
    for size in [10, 15, 20, 25] {
        group.bench_with_input(
            criterion::BenchmarkId::from_parameter(size),
            &size,
            |b, &size| b.iter(|| fibonacci(black_box(size))),
        );
    }
    group.finish();
}

criterion_group!(benches, bench_fibonacci);
criterion_main!(benches);

// Run: cargo bench
// Produces HTML reports in target/criterion/
}

Mocking Strategies without Frameworks

Rust’s trait system provides natural dependency injection — no mocking framework required:

#![allow(unused)]
fn main() {
// Define behavior as a trait
trait Clock {
    fn now(&self) -> std::time::Instant;
}

trait HttpClient {
    fn get(&self, url: &str) -> Result<String, String>;
}

// Production implementations
struct RealClock;
impl Clock for RealClock {
    fn now(&self) -> std::time::Instant { std::time::Instant::now() }
}

// Service depends on abstractions
struct CacheService<C: Clock, H: HttpClient> {
    clock: C,
    client: H,
    ttl: std::time::Duration,
}

impl<C: Clock, H: HttpClient> CacheService<C, H> {
    fn fetch(&self, url: &str) -> Result<String, String> {
        // Uses self.clock and self.client — injectable
        self.client.get(url)
    }
}

// Test with mock implementations — no framework needed!
#[cfg(test)]
mod tests {
    use super::*;

    struct MockClock {
        fixed_time: std::time::Instant,
    }
    impl Clock for MockClock {
        fn now(&self) -> std::time::Instant { self.fixed_time }
    }

    struct MockHttpClient {
        response: String,
    }
    impl HttpClient for MockHttpClient {
        fn get(&self, _url: &str) -> Result<String, String> {
            Ok(self.response.clone())
        }
    }

    #[test]
    fn test_cache_service() {
        let service = CacheService {
            clock: MockClock { fixed_time: std::time::Instant::now() },
            client: MockHttpClient { response: "cached data".into() },
            ttl: std::time::Duration::from_secs(300),
        };

        assert_eq!(service.fetch("http://example.com").unwrap(), "cached data");
    }
}
}

Test philosophy: Prefer real dependencies in integration tests, trait-based mocks in unit tests. Avoid mocking frameworks unless your dependency graph is complex — Rust’s trait generics handle most cases naturally.

Key Takeaways — Testing

  • Doc tests (///) double as documentation and regression tests — they’re compiled and run
  • proptest generates random inputs to find edge cases you’d never write manually
  • criterion provides statistically rigorous benchmarks with HTML reports
  • Mock via trait generics + test doubles, not mock frameworks

See also: Ch 12 — Macros for testing macro-generated code. Ch 14 — API Design for how module layout affects test organization.


Exercise: Property-Based Testing with proptest ★★ (~25 min)

Write a SortedVec<T: Ord> wrapper that maintains a sorted invariant. Use proptest to verify that:

  1. After any sequence of insertions, the internal vec is always sorted
  2. contains() agrees with the stdlib Vec::contains()
  3. The length equals the number of insertions
🔑 Solution
#[derive(Debug)]
struct SortedVec<T: Ord> {
    inner: Vec<T>,
}

impl<T: Ord> SortedVec<T> {
    fn new() -> Self { SortedVec { inner: Vec::new() } }

    fn insert(&mut self, value: T) {
        let pos = self.inner.binary_search(&value).unwrap_or_else(|p| p);
        self.inner.insert(pos, value);
    }

    fn contains(&self, value: &T) -> bool {
        self.inner.binary_search(value).is_ok()
    }

    fn len(&self) -> usize { self.inner.len() }
    fn as_slice(&self) -> &[T] { &self.inner }
}

#[cfg(test)]
mod tests {
    use super::*;
    use proptest::prelude::*;

    proptest! {
        #[test]
        fn always_sorted(values in proptest::collection::vec(-1000i32..1000, 0..100)) {
            let mut sv = SortedVec::new();
            for v in &values {
                sv.insert(*v);
            }
            for w in sv.as_slice().windows(2) {
                prop_assert!(w[0] <= w[1]);
            }
            prop_assert_eq!(sv.len(), values.len());
        }

        #[test]
        fn contains_matches_stdlib(values in proptest::collection::vec(0i32..50, 1..30)) {
            let mut sv = SortedVec::new();
            for v in &values {
                sv.insert(*v);
            }
            for v in &values {
                prop_assert!(sv.contains(v));
            }
            prop_assert!(!sv.contains(&9999));
        }
    }
}

14. Crate Architecture and API Design 🟡

What you’ll learn:

  • Module layout conventions and re-export strategies
  • The public API design checklist for polished crates
  • Ergonomic parameter patterns: impl Into, AsRef, Cow
  • “Parse, don’t validate” with TryFrom and validated types
  • Feature flags, conditional compilation, and workspace organization

Module Layout Conventions

my_crate/
├── Cargo.toml
├── src/
│   ├── lib.rs          # Crate root — re-exports and public API
│   ├── config.rs       # Feature module
│   ├── parser/         # Complex module with sub-modules
│   │   ├── mod.rs      # or parser.rs at parent level (Rust 2018+)
│   │   ├── lexer.rs
│   │   └── ast.rs
│   ├── error.rs        # Error types
│   └── utils.rs        # Internal helpers (pub(crate))
├── tests/
│   └── integration.rs  # Integration tests
├── benches/
│   └── perf.rs         # Benchmarks
└── examples/
    └── basic.rs        # cargo run --example basic
#![allow(unused)]
fn main() {
// lib.rs — curate your public API with re-exports:
mod config;
mod error;
mod parser;
mod utils;

// Re-export what users need:
pub use config::Config;
pub use error::Error;
pub use parser::Parser;

// Public types are at the crate root — users write:
// use my_crate::Config;
// NOT: use my_crate::config::Config;
}

Visibility modifiers:

| Modifier | Visible To |
|---|---|
| pub | Everyone |
| pub(crate) | This crate only |
| pub(super) | Parent module |
| pub(in path) | Specific ancestor module |
| (none) | Current module and its children |

Public API Design Checklist

  1. Accept references, return owned — fn process(input: &str) -> String
  2. Use impl Trait for parameters — fn read(r: impl Read) instead of fn read<R: Read>(r: R) for cleaner signatures
  3. Return Result, not panic! — let callers decide how to handle errors
  4. Implement standard traits — Debug, Display, Clone, Default, From/Into
  5. Make invalid states unrepresentable — use type states and newtypes
  6. Follow the builder pattern for complex configuration — with type-state if fields are required
  7. Seal traits you don’t want users to implement — pub trait Sealed: private::Sealed {}
  8. Mark types and functions #[must_use] — prevents silent discard of important Results, guards, or values. Apply to any type where ignoring the return value is almost certainly a bug:
    #![allow(unused)]
    fn main() {
    #[must_use = "dropping the guard immediately releases the lock"]
    pub struct LockGuard<'a, T> { /* ... */ }
    
    #[must_use]
    pub fn validate(input: &str) -> Result<ValidInput, ValidationError> { /* ... */ }
    }
#![allow(unused)]
fn main() {
// Sealed trait pattern — users can use but not implement:
mod private {
    pub trait Sealed {}
}

pub trait DatabaseDriver: private::Sealed {
    fn connect(&self, url: &str) -> Connection;
}

// Only types in THIS crate can implement Sealed → only we can implement DatabaseDriver
pub struct PostgresDriver;
impl private::Sealed for PostgresDriver {}
impl DatabaseDriver for PostgresDriver {
    fn connect(&self, url: &str) -> Connection { /* ... */ }
}
}

#[non_exhaustive] — mark public enums and structs so that adding variants or fields is not a breaking change. Downstream crates must use a wildcard arm (_ =>) in match statements, and cannot construct the type with struct literal syntax:

#![allow(unused)]
fn main() {
#[non_exhaustive]
pub enum DiagError {
    Timeout,
    HardwareFault,
    // Adding a new variant in a future release is NOT a semver break.
}
}
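A sketch of what matching looks like on the consumer side (the describe function is ours; the enum mirrors DiagError from the snippet above). In the defining crate the wildcard arm is optional, but a downstream crate is required to write one:

```rust
#[non_exhaustive]
#[derive(Debug)]
pub enum DiagError {
    Timeout,
    HardwareFault,
}

// Inside the defining crate the enum is still exhaustive, so we silence the
// "unreachable pattern" lint; downstream crates NEED the `_` arm to compile.
#[allow(unreachable_patterns)]
fn describe(e: &DiagError) -> &'static str {
    match e {
        DiagError::Timeout => "timed out",
        DiagError::HardwareFault => "hardware fault",
        _ => "unknown diagnostic error", // future variants land here
    }
}

fn main() {
    assert_eq!(describe(&DiagError::Timeout), "timed out");
    assert_eq!(describe(&DiagError::HardwareFault), "hardware fault");
}
```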

Ergonomic Parameter Patterns — impl Into, AsRef, Cow

One of Rust’s most impactful API patterns is accepting the most general type in function parameters, so callers don’t need repetitive .to_string(), &*s, or .as_ref() at every call site. This is the Rust-specific version of “be liberal in what you accept.”

impl Into<T> — Accept Anything Convertible

#![allow(unused)]
fn main() {
// ❌ Friction: callers must convert manually
fn connect(host: String, port: u16) -> Connection {
    // ...
}
connect("localhost".to_string(), 5432);  // Annoying .to_string()
connect(hostname.clone(), 5432);          // Unnecessary clone if we already have String

// ✅ Ergonomic: accept anything that converts to String
fn connect(host: impl Into<String>, port: u16) -> Connection {
    let host = host.into();  // Convert once, inside the function
    // ...
}
connect("localhost", 5432);     // &str — zero friction
connect(hostname, 5432);        // String — moved, no clone
connect(boxed_str, 5432);       // Box<str> — std provides From<Box<str>> for String
}

This works because Rust’s From/Into trait pair provides blanket conversions. When you accept impl Into<T>, you’re saying: “give me anything that knows how to become a T.”

AsRef<T> — Borrow as a Reference

AsRef<T> is the borrowing counterpart to Into<T>. Use it when you only need to read the data, not take ownership:

#![allow(unused)]
fn main() {
use std::path::{Path, PathBuf};

// ❌ Forces callers to convert to &Path
fn file_exists(path: &Path) -> bool {
    path.exists()
}
file_exists(Path::new("/tmp/test.txt"));  // Awkward

// ✅ Accept anything that can behave as a &Path
fn file_exists(path: impl AsRef<Path>) -> bool {
    path.as_ref().exists()
}
file_exists("/tmp/test.txt");                    // &str ✅
file_exists(String::from("/tmp/test.txt"));      // String ✅
file_exists(Path::new("/tmp/test.txt"));         // &Path ✅
file_exists(PathBuf::from("/tmp/test.txt"));     // PathBuf ✅

// Same pattern for string-like parameters:
fn log_message(msg: impl AsRef<str>) {
    println!("[LOG] {}", msg.as_ref());
}
log_message("hello");                    // &str ✅
log_message(String::from("hello"));      // String ✅
}

Cow<T> — Clone on Write

Cow<'a, T> (Clone on Write) delays allocation until mutation is needed. It holds either a borrowed &T or an owned T::Owned. This is perfect when most calls don’t need to modify the data:

#![allow(unused)]
fn main() {
use std::borrow::Cow;

/// Normalizes a diagnostic message — only allocates if changes are needed.
fn normalize_message(msg: &str) -> Cow<'_, str> {
    if msg.contains('\t') || msg.contains('\r') {
        // Must allocate — we need to modify the content
        Cow::Owned(msg.replace('\t', "    ").replace('\r', ""))
    } else {
        // No allocation — just borrow the original
        Cow::Borrowed(msg)
    }
}

// Most messages pass through without allocation:
let clean = normalize_message("All tests passed");          // Borrowed — free
let fixed = normalize_message("Error:\tfailed\r\n");        // Owned — allocated

// Cow<str> implements Deref<Target=str>, so it works like &str:
println!("{}", clean);
println!("{}", fixed.to_uppercase());
}

Quick Reference: Which to Use

Do you need ownership of the data inside the function?
├── YES → impl Into<T>
│         "Give me anything that can become a T"
└── NO  → Do you only need to read it?
     ├── YES → impl AsRef<T> or &T
     │         "Give me anything I can borrow as a &T"
     └── MAYBE (might need to modify sometimes?)
          └── Cow<'_, T>
              "Borrow if possible, clone only when you must"
| Pattern | Ownership | Allocation | When to use |
|---|---|---|---|
| &str | Borrowed | Never | Simple string params |
| impl AsRef<str> | Borrowed | Never | Accept String, &str, etc. — read only |
| impl Into<String> | Owned | On conversion | Accept &str, String — will store/own |
| Cow<'_, str> | Either | Only if modified | Processing that usually doesn’t modify |
| &[u8] / impl AsRef<[u8]> | Borrowed | Never | Byte-oriented APIs |

Borrow<T> vs AsRef<T>: Both provide &T, but Borrow<T> additionally guarantees that Eq, Ord, and Hash are consistent between the original and borrowed form. This is why HashMap<String, V>::get() accepts &Q where String: Borrow<Q> — not AsRef. Use Borrow when the borrowed form is used as a lookup key; use AsRef for general “give me a reference” parameters.
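The lookup-key point can be sketched with a generic wrapper around HashMap::get (the lookup function is ours, written against the actual std signature): because String: Borrow<str>, a HashMap<String, V> can be queried with a plain &str — same Hash and Eq guaranteed, and no temporary String allocated at the call site:

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::Hash;

// Mirrors the bounds on HashMap::get: the map owns K, callers query with &Q.
fn lookup<'m, K, Q, V>(map: &'m HashMap<K, V>, key: &Q) -> Option<&'m V>
where
    K: Borrow<Q> + Hash + Eq,
    Q: Hash + Eq + ?Sized,
{
    map.get(key)
}

fn main() {
    let mut m: HashMap<String, u32> = HashMap::new();
    m.insert("alpha".to_string(), 1);
    assert_eq!(lookup(&m, "alpha"), Some(&1)); // &str key against a String map
    assert_eq!(lookup(&m, "beta"), None);
}
```

AsRef would not be enough here: nothing about AsRef promises that the borrowed form hashes and compares the same way as the owned key.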

Composing Conversions in APIs

#![allow(unused)]
fn main() {
/// A well-designed diagnostic API using ergonomic parameters:
pub struct DiagRunner {
    name: String,
    config_path: PathBuf,
    results: HashMap<String, TestResult>,
}

impl DiagRunner {
    /// Accept any string-like type for name, any path-like type for config.
    pub fn new(
        name: impl Into<String>,
        config_path: impl Into<PathBuf>,
    ) -> Self {
        DiagRunner {
            name: name.into(),
            config_path: config_path.into(),
        }
    }

    /// Accept any AsRef<str> for read-only lookup.
    pub fn get_result(&self, test_name: impl AsRef<str>) -> Option<&TestResult> {
        self.results.get(test_name.as_ref())
    }
}

// All of these work with zero caller friction:
let runner = DiagRunner::new("GPU Diag", "/etc/diag_tool/config.json");
let runner = DiagRunner::new(format!("Diag-{}", node_id), config_path);
let runner = DiagRunner::new(name_string, path_buf);
}

Case Study: Designing a Public Crate API — Before & After

A real-world example of evolving a stringly-typed internal API into an ergonomic, type-safe public API. Consider a configuration parser crate:

Before (stringly-typed, easy to misuse):

#![allow(unused)]
fn main() {
// ❌ All parameters are strings — no compile-time validation
pub fn parse_config(path: &str, format: &str, strict: bool) -> Result<Config, String> {
    // What formats are valid? "json"? "JSON"? "Json"?
    // Is path a file path or URL?
    // What does "strict" even mean?
    todo!()
}
}

After (type-safe, self-documenting):

#![allow(unused)]
fn main() {
use std::path::Path;

/// Supported configuration formats.
#[derive(Debug, Clone, Copy)]
#[non_exhaustive]  // Adding formats won't break downstream
pub enum Format {
    Json,
    Toml,
    Yaml,
}

/// Controls parsing strictness.
#[derive(Debug, Clone, Copy, Default)]
pub enum Strictness {
    /// Reject unknown fields (default for libraries)
    #[default]
    Strict,
    /// Ignore unknown fields (useful for forward-compatible configs)
    Lenient,
}

pub fn parse_config(
    path: &Path,          // Type-enforced: must be a filesystem path
    format: Format,       // Enum: impossible to pass invalid format
    strictness: Strictness,  // Named alternatives, not a bare bool
) -> Result<Config, ConfigError> {
    todo!()
}
}

What improved:

| Aspect | Before | After |
|---|---|---|
| Format validation | Runtime string comparison | Compile-time enum |
| Path type | Raw &str (could be anything) | &Path (filesystem-specific) |
| Strictness | Mystery bool | Self-documenting enum |
| Error type | String (opaque) | ConfigError (structured) |
| Extensibility | Breaking changes | #[non_exhaustive] |

Rule of thumb: If you find yourself writing a match on string values, consider replacing the parameter with an enum. If a parameter is a boolean that isn’t obvious from context, use a two-variant enum instead.


Parse Don’t Validate — TryFrom and Validated Types

“Parse, don’t validate” is a principle that says: don’t check data and then pass around the raw unchecked form — instead, parse it into a type that can only exist if the data is valid. Rust’s TryFrom trait is the standard tool for this.

The Problem: Validation Without Enforcement

#![allow(unused)]
fn main() {
// ❌ Validate-then-use: nothing prevents using an invalid value after the check
fn process_port(port: u16) {
    if port == 0 {                         // (a u16 already caps at 65535, so only zero is invalid)
        panic!("Invalid port");           // We checked, but...
    }
    start_server(port);                    // What if someone calls start_server(0) directly?
}

// ❌ Stringly-typed: an email is just a String — any garbage gets through
fn send_email(to: String, body: String) {
    // Is `to` actually a valid email? We don't know.
    // Someone could pass "not-an-email" and we only find out at the SMTP server.
}
}

The Solution: Parse Into Validated Newtypes with TryFrom

use std::convert::TryFrom;
use std::fmt;

/// A validated TCP port number (1–65535).
/// If you have a `Port`, it is guaranteed valid.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Port(u16);

impl TryFrom<u16> for Port {
    type Error = PortError;

    fn try_from(value: u16) -> Result<Self, Self::Error> {
        if value == 0 {
            Err(PortError::Zero)
        } else {
            Ok(Port(value))
        }
    }
}

impl Port {
    pub fn get(&self) -> u16 { self.0 }
}

#[derive(Debug)]
pub enum PortError {
    Zero,
    InvalidFormat,
}

impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PortError::Zero => write!(f, "port must be non-zero"),
            PortError::InvalidFormat => write!(f, "invalid port format"),
        }
    }
}

impl std::error::Error for PortError {}

// Now the type system enforces validity:
fn start_server(port: Port) {
    // No validation needed — Port can only be constructed via TryFrom,
    // which already verified it's valid.
    println!("Listening on port {}", port.get());
}

// Usage:
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let port = Port::try_from(8080)?;   // ✅ Validated once at the boundary
    start_server(port);                  // No re-validation anywhere downstream

    let bad = Port::try_from(0);         // ❌ Err(PortError::Zero)
    Ok(())
}

Real-World Example: Validated IPMI Address

#![allow(unused)]
fn main() {
/// A validated IPMI slave address (0x20–0xFE, even only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IpmiAddr(u8);

#[derive(Debug)]
pub enum IpmiAddrError {
    Odd(u8),
    OutOfRange(u8),
}

impl fmt::Display for IpmiAddrError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IpmiAddrError::Odd(v) => write!(f, "IPMI address 0x{v:02X} must be even"),
            IpmiAddrError::OutOfRange(v) => {
                write!(f, "IPMI address 0x{v:02X} out of range (0x20..=0xFE)")
            }
        }
    }
}

impl TryFrom<u8> for IpmiAddr {
    type Error = IpmiAddrError;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        if value % 2 != 0 {
            Err(IpmiAddrError::Odd(value))
        } else if value < 0x20 || value > 0xFE {
            Err(IpmiAddrError::OutOfRange(value))
        } else {
            Ok(IpmiAddr(value))
        }
    }
}

impl IpmiAddr {
    pub fn get(&self) -> u8 { self.0 }
}

// Downstream code never needs to re-check:
fn send_ipmi_command(addr: IpmiAddr, cmd: u8, data: &[u8]) -> Result<Vec<u8>, IpmiError> {
    // addr.get() is guaranteed to be a valid, even IPMI address
    raw_ipmi_send(addr.get(), cmd, data)
}
}

Parsing Strings with FromStr

For types that are commonly parsed from text (CLI args, config files), implement FromStr:

#![allow(unused)]
fn main() {
use std::str::FromStr;

impl FromStr for Port {
    type Err = PortError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let n: u16 = s.parse().map_err(|_| PortError::InvalidFormat)?;
        Port::try_from(n)
    }
}

// Now works with .parse():
let port: Port = "8080".parse().unwrap();   // Validates in one step

// And with clap CLI parsing:
// #[derive(Parser)]
// struct Args {
//     #[arg(short, long)]
//     port: Port,   // clap calls FromStr automatically
// }
}

TryFrom Chain for Complex Validation

#![allow(unused)]
fn main() {
// Stub types for this example — in production these would be in
// separate modules with their own TryFrom implementations.
struct Hostname(String);
impl TryFrom<String> for Hostname {
    type Error = String;
    fn try_from(s: String) -> Result<Self, String> { Ok(Hostname(s)) }
}
struct Timeout(u64);
impl TryFrom<u64> for Timeout {
    type Error = String;
    fn try_from(ms: u64) -> Result<Self, String> {
        if ms == 0 { Err("timeout must be > 0".into()) } else { Ok(Timeout(ms)) }
    }
}
struct RawConfig { host: String, port: u16, timeout_ms: u64 }
#[derive(Debug)]
enum ConfigError {
    InvalidHost(String),
    InvalidPort(PortError),
    InvalidTimeout(String),
}
impl From<std::io::Error> for ConfigError {
    fn from(e: std::io::Error) -> Self { ConfigError::InvalidHost(e.to_string()) }
}
impl From<serde_json::Error> for ConfigError {
    fn from(e: serde_json::Error) -> Self { ConfigError::InvalidHost(e.to_string()) }
}
/// A validated configuration that can only exist if all fields are valid.
pub struct ValidConfig {
    pub host: Hostname,
    pub port: Port,
    pub timeout_ms: Timeout,
}

impl TryFrom<RawConfig> for ValidConfig {
    type Error = ConfigError;

    fn try_from(raw: RawConfig) -> Result<Self, Self::Error> {
        Ok(ValidConfig {
            host: Hostname::try_from(raw.host)
                .map_err(ConfigError::InvalidHost)?,
            port: Port::try_from(raw.port)
                .map_err(ConfigError::InvalidPort)?,
            timeout_ms: Timeout::try_from(raw.timeout_ms)
                .map_err(ConfigError::InvalidTimeout)?,
        })
    }
}

// Parse once at the boundary, use the validated type everywhere:
fn load_config(path: &str) -> Result<ValidConfig, ConfigError> {
    let raw: RawConfig = serde_json::from_str(&std::fs::read_to_string(path)?)?;
    ValidConfig::try_from(raw)  // All validation happens here
}
}

Summary: Validate vs Parse

| Approach | Data checked? | Compiler enforces validity? | Re-validation needed? |
|---|---|---|---|
| Runtime checks (if/assert) | Yes | No | Every function boundary |
| Validated newtype + TryFrom | Yes | Yes | Never — type is proof |

The rule: parse at the boundary, use validated types everywhere inside. Raw strings, integers, and byte slices enter your system, get parsed into validated types via TryFrom/FromStr, and from that point forward the type system guarantees they’re valid.

Feature Flags and Conditional Compilation

Cargo.toml

[features]
default = ["json"]          # Enabled by default
json = ["dep:serde_json"]   # Enables JSON support
xml = ["dep:quick-xml"]     # Enables XML support
full = ["json", "xml"]      # Meta-feature: enables all

[dependencies]
serde = "1"
serde_json = { version = "1", optional = true }
quick-xml = { version = "0.31", optional = true }

#![allow(unused)]
fn main() {
// Conditional compilation based on features:
#[cfg(feature = "json")]
pub fn to_json<T: serde::Serialize>(value: &T) -> String {
    serde_json::to_string(value).unwrap()
}

#[cfg(feature = "xml")]
pub fn to_xml<T: serde::Serialize>(value: &T) -> String {
    quick_xml::se::to_string(value).unwrap()
}

// Compile error if a required feature isn't enabled:
#[cfg(not(any(feature = "json", feature = "xml")))]
compile_error!("At least one format feature (json, xml) must be enabled");
}

Best practices:

  • Keep default features minimal — users can opt in
  • Use dep: syntax (Rust 1.60+) for optional dependencies to avoid creating implicit features
  • Document features in your README and crate-level docs
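
The dep: bullet can be illustrated with a minimal Cargo.toml sketch (the fastlog crate name is hypothetical):

```toml
[dependencies]
fastlog = { version = "1", optional = true }  # hypothetical optional dependency

[features]
# logging = ["fastlog"]     # old style: ALSO creates an implicit public
#                           # feature named "fastlog"
logging = ["dep:fastlog"]   # dep: style: only "logging" is public;
                            # "fastlog" stays an internal detail
```

With dep:, renaming or dropping the optional dependency later is not a breaking change, because downstream crates never saw a feature named after it.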

Workspace Organization

For large projects, use a Cargo workspace to share dependencies and build artifacts:

Root Cargo.toml

[workspace]
members = [
    "core",     # Shared types and traits
    "parser",   # Parsing library
    "server",   # Binary — the main application
    "client",   # Client library
    "cli",      # CLI binary
]

Shared dependency versions:

[workspace.dependencies]
serde = { version = "1", features = ["derive"] }
tokio = { version = "1", features = ["full"] }
tracing = "0.1"

In each member’s Cargo.toml:

[dependencies]
serde = { workspace = true }   # Inherits version and features from [workspace.dependencies]

Benefits:
  • Single Cargo.lock — all crates use the same dependency versions
  • cargo test --workspace runs all tests
  • Shared build cache — compiling one crate benefits all
  • Clean dependency boundaries between components

.cargo/config.toml: Project-Level Configuration

The .cargo/config.toml file (at the workspace root or in $HOME/.cargo/) customizes Cargo behavior without modifying Cargo.toml:

.cargo/config.toml

# Default target for this workspace
[build]
target = "x86_64-unknown-linux-gnu"

# Custom runner — e.g., run via QEMU for cross-compiled binaries
[target.aarch64-unknown-linux-gnu]
runner = "qemu-aarch64-static"
linker = "aarch64-linux-gnu-gcc"

# Cargo aliases — custom shortcut commands
[alias]
xt = "test --workspace --release"         # cargo xt = run all tests in release
ci = "clippy --workspace -- -D warnings"  # cargo ci = lint with errors on warnings
cov = "llvm-cov --workspace"              # cargo cov = coverage (requires cargo-llvm-cov)

# Environment variables for build scripts
[env]
IPMI_LIB_PATH = "/usr/lib/bmc"

# Use a custom registry (for internal packages)
[registries.internal]
index = "https://gitlab.internal/crates/index"

Common configuration patterns:

| Setting | Purpose | Example |
|---|---|---|
| [build] target | Default compilation target | x86_64-unknown-linux-musl for static builds |
| [target.X] runner | How to run the binary | "qemu-aarch64-static" for cross-compiled |
| [target.X] linker | Which linker to use | "aarch64-linux-gnu-gcc" |
| [alias] | Custom cargo subcommands | xt = "test --workspace" |
| [env] | Build-time environment variables | Library paths, feature toggles |
| [net] offline | Prevent network access | true for air-gapped builds |

Compile-Time Environment Variables: env!() and option_env!()

Rust can embed environment variables into the binary at compile time — useful for version strings, build metadata, and configuration:

#![allow(unused)]
fn main() {
// env!() — panics at compile time if the variable is missing
const VERSION: &str = env!("CARGO_PKG_VERSION"); // "0.1.0" from Cargo.toml
const PKG_NAME: &str = env!("CARGO_PKG_NAME");   // Crate name from Cargo.toml

// option_env!() — returns Option<&str>, doesn't panic if missing
const BUILD_SHA: Option<&str> = option_env!("GIT_SHA");
const BUILD_TIME: Option<&str> = option_env!("BUILD_TIMESTAMP");

fn print_version() {
    println!("{PKG_NAME} v{VERSION}");
    if let Some(sha) = BUILD_SHA {
        println!("  commit: {sha}");
    }
    if let Some(time) = BUILD_TIME {
        println!("  built:  {time}");
    }
}
}

Cargo automatically sets many useful environment variables:

| Variable | Value | Use case |
|---|---|---|
| CARGO_PKG_VERSION | "1.2.3" | Version reporting |
| CARGO_PKG_NAME | "diag_tool" | Binary identification |
| CARGO_PKG_AUTHORS | From Cargo.toml | About/help text |
| CARGO_MANIFEST_DIR | Absolute path to Cargo.toml | Locating test data files |
| OUT_DIR | Build output directory | build.rs code generation target |
| TARGET | Target triple | Platform-specific logic in build.rs |

You can set custom env vars from build.rs:

// build.rs
fn main() {
    println!("cargo::rustc-env=GIT_SHA={}", git_sha());
    println!("cargo::rustc-env=BUILD_TIMESTAMP={}", timestamp());
}

cfg_attr: Conditional Attributes

cfg_attr applies an attribute only when a condition is true. This is more targeted than #[cfg()], which includes/excludes entire items:

#![allow(unused)]
fn main() {
// Derive Serialize only when the "serde" feature is enabled:
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone)]
pub struct DiagResult {
    pub fc: u32,
    pub passed: bool,
    pub message: String,
}
// Without "serde" feature: no serde dependency needed at all
// With "serde" feature: DiagResult is serializable

// Conditional attribute for testing:
#[cfg_attr(test, derive(PartialEq))]  // Only derive PartialEq in test builds
pub struct LargeStruct { /* ... */ }

// Platform-specific symbol names on an FFI declaration
// (link_name applies to items inside an extern block):
extern "C" {
    #[cfg_attr(target_os = "linux", link_name = "ioctl")]
    #[cfg_attr(target_os = "freebsd", link_name = "__ioctl")]
    fn platform_ioctl(fd: i32, request: u64) -> i32;
}
}

| Pattern | What it does |
|---|---|
| #[cfg(feature = "x")] | Include/exclude the entire item |
| #[cfg_attr(feature = "x", derive(Foo))] | Add derive(Foo) only when feature "x" is on |
| #[cfg_attr(test, allow(unused))] | Suppress warnings only in test builds |
| #[cfg_attr(doc, doc = "...")] | Documentation visible only in cargo doc |

cargo deny and cargo audit: Supply-Chain Security

# Install security audit tools
cargo install cargo-deny
cargo install cargo-audit

# Check for known vulnerabilities in dependencies
cargo audit

# Comprehensive checks: licenses, bans, advisories, sources
cargo deny check

Configure cargo deny with a deny.toml at the workspace root:

deny.toml

[advisories]
vulnerability = "deny"      # Fail on known vulnerabilities
unmaintained = "warn"       # Warn on unmaintained crates

[licenses]
allow = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"]
deny = ["GPL-3.0"]          # Reject copyleft licenses

[bans]
multiple-versions = "warn"  # Warn if multiple versions of same crate
deny = [
    { name = "openssl" },   # Force use of rustls instead
]

[sources]
allow-git = []              # No git dependencies in production
| Tool | Purpose | When to run |
|---|---|---|
| cargo audit | Check for known CVEs in dependencies | CI pipeline, pre-release |
| cargo deny check | Licenses, bans, advisories, sources | CI pipeline |
| cargo deny check licenses | License compliance only | Before open-sourcing |
| cargo deny check bans | Prevent specific crates | Enforce architecture decisions |

Doc Tests: Tests Inside Documentation

Rust doc comments (///) can contain code blocks that are compiled and run as tests:

#![allow(unused)]
fn main() {
/// Parses a diagnostic fault code from a string.
///
/// # Examples
///
/// ```
/// use my_crate::parse_fc;
///
/// let fc = parse_fc("FC:12345").unwrap();
/// assert_eq!(fc, 12345);
/// ```
///
/// Invalid input returns an error:
///
/// ```
/// use my_crate::parse_fc;
///
/// assert!(parse_fc("not-a-fc").is_err());
/// ```
pub fn parse_fc(input: &str) -> Result<u32, ParseError> {
    input.strip_prefix("FC:")
        .ok_or(ParseError::MissingPrefix)?
        .parse()
        .map_err(ParseError::InvalidNumber)
}
}
cargo test --doc  # Run only doc tests
cargo test        # Runs unit + integration + doc tests
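
Doc tests can also hide setup lines: any line starting with # inside the fence is compiled and executed but omitted from the rendered documentation. A small sketch (the doc_demo crate and tag function are illustrative):

```rust
/// Adds the diagnostic prefix to a message.
///
/// ```
/// # // Hidden line: runs in the doc test but is not shown in rendered docs
/// # let msg = String::from("fan failure");
/// let tagged = doc_demo::tag(&msg);
/// assert!(tagged.starts_with("[DIAG]"));
/// ```
pub fn tag(msg: &str) -> String {
    format!("[DIAG] {msg}")
}

fn main() {
    assert_eq!(tag("fan failure"), "[DIAG] fan failure");
}
```

This keeps the rendered example focused on the one line readers care about while still compiling and running the full setup.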

Module-level documentation uses //! at the top of a file:

#![allow(unused)]
fn main() {
//! # Diagnostic Framework
//!
//! This crate provides the core diagnostic execution engine.
//! It supports running diagnostic tests, collecting results,
//! and reporting to the BMC via IPMI.
//!
//! ## Quick Start
//!
//! ```no_run
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! use diag_framework::Framework;
//!
//! let mut fw = Framework::new("config.json")?;
//! fw.run_all_tests()?;
//! # Ok(())
//! # }
//! ```
}

Benchmarking with Criterion

Full coverage: See the Benchmarking with criterion section in Chapter 13 (Testing and Benchmarking Patterns) for complete criterion setup, API examples, and a comparison table vs cargo bench. Below is a quick-reference for architecture-specific usage.

When benchmarking your crate’s public API, place benchmarks in benches/ and keep them focused on the hot path — typically parsers, serializers, or validation boundaries:

cargo bench                  # Run all benchmarks
cargo bench -- parse_config  # Run specific benchmark
# Results in target/criterion/ with HTML reports

Key Takeaways — Architecture & API Design

  • Accept the most general type (impl Into, impl AsRef, Cow); return the most specific
  • Parse Don’t Validate: use TryFrom to create types that are valid by construction
  • #[non_exhaustive] on public enums prevents breaking changes when adding variants
  • #[must_use] catches silent discards of important values
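
The #[must_use] takeaway can be sketched in a few lines (the ValidationReport type is illustrative):

```rust
/// A hypothetical validation result that should not be silently discarded.
#[must_use = "a ValidationReport may contain errors that must be handled"]
struct ValidationReport {
    errors: Vec<String>,
}

fn validate(input: &str) -> ValidationReport {
    let mut errors = Vec::new();
    if input.is_empty() {
        errors.push("input must not be empty".to_string());
    }
    ValidationReport { errors }
}

fn main() {
    // validate("");               // ⚠️ warning: unused ValidationReport that must be used
    let report = validate("");     // ✅ binding (and inspecting) the result silences it
    assert_eq!(report.errors.len(), 1);
}
```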

See also: Ch 9 — Error Handling for error type design in public APIs. Ch 13 — Testing for testing your crate’s public API.


Exercise: Crate API Refactoring ★★ (~30 min)

Refactor the following “stringly-typed” API into one that uses TryFrom, newtypes, and builder pattern:

// BEFORE: Easy to misuse
fn create_server(host: &str, port: &str, max_conn: &str) -> Server { ... }

Design a ServerConfig with validated types Host, Port (1–65535), and MaxConnections (1–10000) that reject invalid values at parse time.

🔑 Solution
#[derive(Debug, Clone)]
struct Host(String);

impl TryFrom<&str> for Host {
    type Error = String;
    fn try_from(s: &str) -> Result<Self, String> {
        if s.is_empty() { return Err("host cannot be empty".into()); }
        if s.contains(' ') { return Err("host cannot contain spaces".into()); }
        Ok(Host(s.to_string()))
    }
}

#[derive(Debug, Clone, Copy)]
struct Port(u16);

impl TryFrom<u16> for Port {
    type Error = String;
    fn try_from(p: u16) -> Result<Self, String> {
        if p == 0 { return Err("port must be >= 1".into()); }
        Ok(Port(p))
    }
}

#[derive(Debug, Clone, Copy)]
struct MaxConnections(u32);

impl TryFrom<u32> for MaxConnections {
    type Error = String;
    fn try_from(n: u32) -> Result<Self, String> {
        if n == 0 || n > 10_000 {
            return Err(format!("max_connections must be 1–10000, got {n}"));
        }
        Ok(MaxConnections(n))
    }
}

#[derive(Debug)]
struct ServerConfig {
    host: Host,
    port: Port,
    max_connections: MaxConnections,
}

impl ServerConfig {
    fn new(host: Host, port: Port, max_connections: MaxConnections) -> Self {
        ServerConfig { host, port, max_connections }
    }
}

fn main() {
    let config = ServerConfig::new(
        Host::try_from("localhost").unwrap(),
        Port::try_from(8080).unwrap(),
        MaxConnections::try_from(100).unwrap(),
    );
    println!("{config:?}");

    // Invalid values caught at parse time:
    assert!(Host::try_from("").is_err());
    assert!(Port::try_from(0).is_err());
    assert!(MaxConnections::try_from(99999).is_err());
}

15. Async/Await Essentials 🔴

What you’ll learn:

  • How Rust’s Future trait differs from Go’s goroutines and Python’s asyncio
  • Tokio quick-start: spawning tasks, join!, and runtime configuration
  • Common async pitfalls and how to fix them
  • When to offload blocking work with spawn_blocking

Futures, Runtimes, and async fn

Rust’s async model is fundamentally different from Go’s goroutines or Python’s asyncio. Understanding three concepts is enough to get started:

  1. A Future is a lazy state machine — calling async fn doesn’t execute anything; it returns a Future that must be polled.
  2. You need a runtime to poll futures — tokio, async-std, or smol. The standard library defines Future but provides no runtime.
  3. async fn is sugar — the compiler transforms it into a state machine that implements Future.
#![allow(unused)]
fn main() {
// A Future is just a trait:
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

// async fn desugars to:
// fn fetch_data(url: &str) -> impl Future<Output = Result<Vec<u8>, Error>>
async fn fetch_data(url: &str) -> Result<Vec<u8>, reqwest::Error> {
    let response = reqwest::get(url).await?;  // .await yields until ready
    let bytes = response.bytes().await?;
    Ok(bytes.to_vec())
}
}

Tokio Quick Start

Cargo.toml

[dependencies]
tokio = { version = "1", features = ["full"] }

use tokio::time::{sleep, Duration};
use tokio::task;

#[tokio::main]
async fn main() {
    // Spawn concurrent tasks (like lightweight threads):
    let handle_a = task::spawn(async {
        sleep(Duration::from_millis(100)).await;
        "task A done"
    });

    let handle_b = task::spawn(async {
        sleep(Duration::from_millis(50)).await;
        "task B done"
    });

    // .await both — they run concurrently, not sequentially:
    let (a, b) = tokio::join!(handle_a, handle_b);
    println!("{}, {}", a.unwrap(), b.unwrap());
}

Async Common Pitfalls

| Pitfall | Why it happens | Fix |
|---|---|---|
| Blocking in async | std::thread::sleep or CPU work blocks the executor | Use tokio::task::spawn_blocking or rayon |
| Send bound errors | Future held across .await contains !Send type (e.g., Rc, MutexGuard) | Restructure to drop non-Send values before .await |
| Future not polled | Calling async fn without .await or spawning — nothing happens | Always .await or tokio::spawn the returned future |
| Holding MutexGuard across .await | std::sync::MutexGuard is !Send; async tasks may resume on a different thread | Use tokio::sync::Mutex or drop the guard before .await |
| Accidental sequential execution | let a = foo().await; let b = bar().await; runs sequentially | Use tokio::join! or tokio::spawn for concurrency |

#![allow(unused)]
fn main() {
// ❌ Blocking the async executor:
async fn bad() {
    std::thread::sleep(std::time::Duration::from_secs(5)); // Blocks entire thread!
}

// ✅ Offload blocking work:
async fn good() {
    tokio::task::spawn_blocking(|| {
        std::thread::sleep(std::time::Duration::from_secs(5)); // Runs on blocking pool
    }).await.unwrap();
}
}

Comprehensive async coverage: For Stream, select!, cancellation safety, structured concurrency, and tower middleware, see our dedicated Async Rust Training guide. This section covers just enough to read and write basic async code.

Spawning and Structured Concurrency

Tokio’s spawn creates a new asynchronous task — similar to thread::spawn but much lighter:

use tokio::task;
use tokio::time::{sleep, Duration};

#[tokio::main]
async fn main() {
    // Spawn three concurrent tasks
    let h1 = task::spawn(async {
        sleep(Duration::from_millis(200)).await;
        "fetched user profile"
    });

    let h2 = task::spawn(async {
        sleep(Duration::from_millis(100)).await;
        "fetched order history"
    });

    let h3 = task::spawn(async {
        sleep(Duration::from_millis(150)).await;
        "fetched recommendations"
    });

    // Wait for all three concurrently (not sequentially!)
    let (r1, r2, r3) = tokio::join!(h1, h2, h3);
    println!("{}", r1.unwrap());
    println!("{}", r2.unwrap());
    println!("{}", r3.unwrap());
}

join! vs try_join! vs select!:

| Macro | Behavior | Use when |
|---|---|---|
| join! | Waits for ALL futures | All tasks must complete |
| try_join! | Waits for all, short-circuits on first Err | Tasks return Result |
| select! | Returns when FIRST future completes | Timeouts, cancellation |

use tokio::time::{timeout, Duration};

async fn fetch_with_timeout() -> Result<String, Box<dyn std::error::Error>> {
    let result = timeout(Duration::from_secs(5), async {
        // Simulate slow network call
        tokio::time::sleep(Duration::from_millis(100)).await;
        Ok::<_, Box<dyn std::error::Error>>("data".to_string())
    }).await??; // First ? unwraps Elapsed, second ? unwraps inner Result

    Ok(result)
}

Send Bounds and Why Futures Must Be Send

When you tokio::spawn a future, it may resume on a different OS thread. This means the future must be Send. Common pitfalls:

use std::rc::Rc;

async fn not_send() {
    let rc = Rc::new(42); // Rc is !Send
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    println!("{}", rc); // rc is held across .await — future is !Send
}

// Fix 1: Drop before .await
async fn fixed_drop() {
    let data = {
        let rc = Rc::new(42);
        *rc // Copy the value out
    }; // rc dropped here
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    println!("{}", data); // Just an i32, which is Send
}

// Fix 2: Use Arc instead of Rc
async fn fixed_arc() {
    let arc = std::sync::Arc::new(42); // Arc is Send
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    println!("{}", arc); // ✅ Future is Send
}

See also: Ch 5 — Channels for synchronous channels. Ch 6 — Concurrency for OS threads vs async tasks.

Key Takeaways — Async

  • async fn returns a lazy Future — nothing runs until you .await or spawn it
  • Use tokio::task::spawn_blocking for CPU-heavy or blocking work inside async contexts
  • Don’t hold std::sync::MutexGuard across .await — use tokio::sync::Mutex instead
  • Futures must be Send when spawned — drop !Send types before .await points

Exercise: Concurrent Fetcher with Timeout ★★ (~25 min)

Write an async function fetch_all that spawns three tokio::spawn tasks, each simulating a network call with tokio::time::sleep. Join all three with tokio::try_join! wrapped in tokio::time::timeout(Duration::from_secs(5), ...). Return Result<Vec<String>, ...> or an error if any task fails or the deadline expires.

🔑 Solution
use tokio::time::{sleep, timeout, Duration};

async fn fake_fetch(name: &'static str, delay_ms: u64) -> Result<String, String> {
    sleep(Duration::from_millis(delay_ms)).await;
    Ok(format!("{name}: OK"))
}

async fn fetch_all() -> Result<Vec<String>, Box<dyn std::error::Error>> {
    let deadline = Duration::from_secs(5);

    let (a, b, c) = timeout(deadline, async {
        let h1 = tokio::spawn(fake_fetch("svc-a", 100));
        let h2 = tokio::spawn(fake_fetch("svc-b", 200));
        let h3 = tokio::spawn(fake_fetch("svc-c", 150));
        tokio::try_join!(h1, h2, h3)
    })
    .await??;

    Ok(vec![a?, b?, c?])
}

#[tokio::main]
async fn main() {
    let results = fetch_all().await.unwrap();
    for r in &results {
        println!("{r}");
    }
}

17. Exercises

Exercise 1: Type-Safe State Machine ★★ (~30 min)

Build a traffic light state machine using the type-state pattern. The light must transition Red → Green → Yellow → Red and no other order should be possible.

🔑 Solution
use std::marker::PhantomData;

struct Red;
struct Green;
struct Yellow;

struct TrafficLight<State> {
    _state: PhantomData<State>,
}

impl TrafficLight<Red> {
    fn new() -> Self {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }

    fn go(self) -> TrafficLight<Green> {
        println!("🟢 Green — GO");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn caution(self) -> TrafficLight<Yellow> {
        println!("🟡 Yellow — CAUTION");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn stop(self) -> TrafficLight<Red> {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }
}

fn main() {
    let light = TrafficLight::new(); // Red
    let light = light.go();          // Green
    let light = light.caution();     // Yellow
    let light = light.stop();        // Red

    // light.caution(); // ❌ Compile error: no method `caution` on Red
    // TrafficLight::new().stop(); // ❌ Compile error: no method `stop` on Red
}

Key takeaway: Invalid transitions are compile errors, not runtime panics.


Exercise 2: Unit-of-Measure with PhantomData ★★ (~30 min)

Extend the unit-of-measure pattern from Ch4 to support:

  • Meters, Seconds, Kilograms
  • Addition of same units
  • Multiplication: Meters * Meters = SquareMeters
  • Division: Meters / Seconds = MetersPerSecond
🔑 Solution
use std::marker::PhantomData;
use std::ops::{Add, Mul, Div};

#[derive(Clone, Copy)]
struct Meters;
#[derive(Clone, Copy)]
struct Seconds;
#[derive(Clone, Copy)]
struct Kilograms;
#[derive(Clone, Copy)]
struct SquareMeters;
#[derive(Clone, Copy)]
struct MetersPerSecond;

#[derive(Debug, Clone, Copy)]
struct Qty<U> {
    value: f64,
    _unit: PhantomData<U>,
}

impl<U> Qty<U> {
    fn new(v: f64) -> Self { Qty { value: v, _unit: PhantomData } }
}

impl<U> Add for Qty<U> {
    type Output = Qty<U>;
    fn add(self, rhs: Self) -> Self::Output { Qty::new(self.value + rhs.value) }
}

impl Mul<Qty<Meters>> for Qty<Meters> {
    type Output = Qty<SquareMeters>;
    fn mul(self, rhs: Qty<Meters>) -> Qty<SquareMeters> {
        Qty::new(self.value * rhs.value)
    }
}

impl Div<Qty<Seconds>> for Qty<Meters> {
    type Output = Qty<MetersPerSecond>;
    fn div(self, rhs: Qty<Seconds>) -> Qty<MetersPerSecond> {
        Qty::new(self.value / rhs.value)
    }
}

fn main() {
    let width = Qty::<Meters>::new(5.0);
    let height = Qty::<Meters>::new(3.0);
    let area = width * height; // Qty<SquareMeters>
    println!("Area: {:.1} m²", area.value);

    let dist = Qty::<Meters>::new(100.0);
    let time = Qty::<Seconds>::new(9.58);
    let speed = dist / time;
    println!("Speed: {:.2} m/s", speed.value);

    let sum = width + height; // Same unit ✅
    println!("Sum: {:.1} m", sum.value);

    // let bad = width + time; // ❌ Compile error: can't add Meters + Seconds
}

Exercise 3: Channel-Based Worker Pool ★★★ (~45 min)

Build a worker pool using channels where:

  • A dispatcher sends Job structs through a channel
  • N workers consume jobs and send results back
  • Use crossbeam-channel (or std::sync::mpsc if crossbeam is unavailable)
🔑 Solution
use std::sync::mpsc;
use std::thread;

struct Job {
    id: u64,
    data: String,
}

struct JobResult {
    job_id: u64,
    output: String,
    worker_id: usize,
}

fn worker_pool(jobs: Vec<Job>, num_workers: usize) -> Vec<JobResult> {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let (result_tx, result_rx) = mpsc::channel::<JobResult>();

    // Wrap receiver in Arc<Mutex> for sharing among workers
    let job_rx = std::sync::Arc::new(std::sync::Mutex::new(job_rx));

    // Spawn workers
    let mut handles = Vec::new();
    for worker_id in 0..num_workers {
        let job_rx = job_rx.clone();
        let result_tx = result_tx.clone();
        handles.push(thread::spawn(move || {
            loop {
                // Hold the lock only while waiting to receive one job;
                // the guard drops at the end of this block, before processing
                let job = {
                    let rx = job_rx.lock().unwrap();
                    rx.recv() // Blocks until a job arrives or the channel closes
                };
                match job {
                    Ok(job) => {
                        let output = format!("processed '{}' by worker {worker_id}", job.data);
                        result_tx.send(JobResult {
                            job_id: job.id,
                            output,
                            worker_id,
                        }).unwrap();
                    }
                    Err(_) => break, // Channel closed — exit
                }
            }
        }));
    }
    drop(result_tx); // Drop our copy so result channel closes when workers finish

    // Dispatch jobs
    let num_jobs = jobs.len();
    for job in jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx); // Close the job channel — workers will exit after draining

    // Collect results
    let mut results = Vec::new();
    for result in result_rx {
        results.push(result);
    }
    assert_eq!(results.len(), num_jobs);

    for h in handles { h.join().unwrap(); }
    results
}

fn main() {
    let jobs: Vec<Job> = (0..20).map(|i| Job {
        id: i,
        data: format!("task-{i}"),
    }).collect();

    let results = worker_pool(jobs, 4);
    for r in &results {
        println!("[worker {}] job {}: {}", r.worker_id, r.job_id, r.output);
    }
}

Exercise 4: Higher-Order Combinator Pipeline ★★ (~25 min)

Create a Pipeline struct that chains transformations. It should support .pipe(f) to add a transformation and .execute(input) to run the full chain.

🔑 Solution
struct Pipeline<T> {
    transforms: Vec<Box<dyn Fn(T) -> T>>,
}

impl<T: 'static> Pipeline<T> {
    fn new() -> Self {
        Pipeline { transforms: Vec::new() }
    }

    fn pipe(mut self, f: impl Fn(T) -> T + 'static) -> Self {
        self.transforms.push(Box::new(f));
        self
    }

    fn execute(self, input: T) -> T {
        self.transforms.into_iter().fold(input, |val, f| f(val))
    }
}

fn main() {
    let result = Pipeline::new()
        .pipe(|s: String| s.trim().to_string())
        .pipe(|s| s.to_uppercase())
        .pipe(|s| format!(">>> {s} <<<"))
        .execute("  hello world  ".to_string());

    println!("{result}"); // >>> HELLO WORLD <<<

    // Numeric pipeline:
    let result = Pipeline::new()
        .pipe(|x: i32| x * 2)
        .pipe(|x| x + 10)
        .pipe(|x| x * x)
        .execute(5);

    println!("{result}"); // (5*2 + 10)^2 = 400
}

Bonus: A pipeline that changes type between stages needs a different design — each .pipe() must return a Pipeline with a different output type, which requires more advanced generic plumbing.
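For the curious, here is one minimal sketch of such a type-changing pipeline. The TypedPipeline name and design are illustrative, not from this chapter: each .pipe() wraps the existing chain in a new closure, so the output type may differ at every stage.

```rust
// Each `pipe` composes the previous chain with a new closure, so the
// pipeline is parameterized by both its input type A and output type B.
struct TypedPipeline<A, B> {
    run: Box<dyn FnOnce(A) -> B>,
}

impl<A: 'static> TypedPipeline<A, A> {
    fn new() -> Self {
        // Start with the identity function: output type == input type.
        TypedPipeline { run: Box::new(|x| x) }
    }
}

impl<A: 'static, B: 'static> TypedPipeline<A, B> {
    fn pipe<C: 'static>(self, f: impl FnOnce(B) -> C + 'static) -> TypedPipeline<A, C> {
        // The new stage runs after the whole existing chain.
        TypedPipeline { run: Box::new(move |x| f((self.run)(x))) }
    }

    fn execute(self, input: A) -> B {
        (self.run)(input)
    }
}

fn main() {
    let len = TypedPipeline::new()
        .pipe(|s: String| s.trim().to_string())
        .pipe(|s| s.len()) // String -> usize: the type changes mid-chain
        .execute("  hello  ".to_string());
    assert_eq!(len, 5);
    println!("{len}");
}
```

The trade-off versus the Vec-based Pipeline above: composition happens in the type system rather than at runtime, so stages can change types, but you can no longer store the stages in a homogeneous collection.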


Exercise 5: Error Hierarchy with thiserror ★★ (~30 min)

Design an error type hierarchy for a file-processing application that can fail during I/O, parsing (JSON and CSV), and validation. Use thiserror and demonstrate ? propagation.

🔑 Solution
use thiserror::Error;

#[derive(Error, Debug)]
pub enum AppError {
    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),

    #[error("JSON parse error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("CSV error at line {line}: {message}")]
    Csv { line: usize, message: String },

    #[error("validation error: {field} — {reason}")]
    Validation { field: String, reason: String },
}

fn read_file(path: &str) -> Result<String, AppError> {
    Ok(std::fs::read_to_string(path)?) // io::Error → AppError::Io via #[from]
}

fn parse_json(content: &str) -> Result<serde_json::Value, AppError> {
    Ok(serde_json::from_str(content)?) // serde_json::Error → AppError::Json
}

fn validate_name(value: &serde_json::Value) -> Result<String, AppError> {
    let name = value.get("name")
        .and_then(|v| v.as_str())
        .ok_or_else(|| AppError::Validation {
            field: "name".into(),
            reason: "must be a non-null string".into(),
        })?;

    if name.is_empty() {
        return Err(AppError::Validation {
            field: "name".into(),
            reason: "must not be empty".into(),
        });
    }

    Ok(name.to_string())
}

fn process_file(path: &str) -> Result<String, AppError> {
    let content = read_file(path)?;
    let json = parse_json(&content)?;
    let name = validate_name(&json)?;
    Ok(name)
}

fn main() {
    match process_file("config.json") {
        Ok(name) => println!("Name: {name}"),
        Err(e) => eprintln!("Error: {e}"),
    }
}

Exercise 6: Generic Trait with Associated Types ★★★ (~40 min)

Design a Repository<T> trait with associated Error and Id types. Implement it for an in-memory store and demonstrate compile-time type safety.

🔑 Solution
use std::collections::HashMap;

trait Repository {
    type Item;
    type Id;
    type Error;

    fn get(&self, id: &Self::Id) -> Result<Option<&Self::Item>, Self::Error>;
    fn insert(&mut self, item: Self::Item) -> Result<Self::Id, Self::Error>;
    fn delete(&mut self, id: &Self::Id) -> Result<bool, Self::Error>;
}

#[derive(Debug, Clone)]
struct User {
    name: String,
    email: String,
}

struct InMemoryUserRepo {
    data: HashMap<u64, User>,
    next_id: u64,
}

impl InMemoryUserRepo {
    fn new() -> Self {
        InMemoryUserRepo { data: HashMap::new(), next_id: 1 }
    }
}

// Error type is Infallible — in-memory ops never fail
impl Repository for InMemoryUserRepo {
    type Item = User;
    type Id = u64;
    type Error = std::convert::Infallible;

    fn get(&self, id: &u64) -> Result<Option<&User>, Self::Error> {
        Ok(self.data.get(id))
    }

    fn insert(&mut self, item: User) -> Result<u64, Self::Error> {
        let id = self.next_id;
        self.next_id += 1;
        self.data.insert(id, item);
        Ok(id)
    }

    fn delete(&mut self, id: &u64) -> Result<bool, Self::Error> {
        Ok(self.data.remove(id).is_some())
    }
}

// Generic function works with ANY repository:
fn create_and_fetch<R: Repository>(repo: &mut R, item: R::Item) -> Result<(), R::Error>
where
    R::Item: std::fmt::Debug,
    R::Id: std::fmt::Debug,
{
    let id = repo.insert(item)?;
    println!("Inserted with id: {id:?}");
    let retrieved = repo.get(&id)?;
    println!("Retrieved: {retrieved:?}");
    Ok(())
}

fn main() {
    let mut repo = InMemoryUserRepo::new();
    create_and_fetch(&mut repo, User {
        name: "Alice".into(),
        email: "alice@example.com".into(),
    }).unwrap();
}

Exercise 7: Safe Wrapper around Unsafe (Ch11) ★★★ (~45 min)

Write a FixedVec<T, const N: usize> — a fixed-capacity, stack-allocated vector. Requirements:

  • push(&mut self, value: T) -> Result<(), T> returns Err(value) when full
  • pop(&mut self) -> Option<T> returns and removes the last element
  • as_slice(&self) -> &[T] borrows initialized elements
  • All public methods must be safe; all unsafe must be encapsulated with SAFETY: comments
  • Drop must clean up initialized elements

Hint: Use MaybeUninit<T> and [const { MaybeUninit::uninit() }; N].

🔑 Solution
use std::mem::MaybeUninit;

pub struct FixedVec<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec {
            data: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len >= N { return Err(value); }
        // len < N was just checked, so the index is in bounds (no unsafe needed here).
        self.data[self.len] = MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 { return None; }
        self.len -= 1;
        // SAFETY: data[len] was initialized (len was > 0 before decrement).
        Some(unsafe { self.data[self.len].assume_init_read() })
    }

    pub fn as_slice(&self) -> &[T] {
        // SAFETY: data[0..len] are all initialized, and MaybeUninit<T>
        // has the same layout as T.
        unsafe { std::slice::from_raw_parts(self.data.as_ptr() as *const T, self.len) }
    }

    pub fn len(&self) -> usize { self.len }
    pub fn is_empty(&self) -> bool { self.len == 0 }
}

impl<T, const N: usize> Drop for FixedVec<T, N> {
    fn drop(&mut self) {
        // SAFETY: data[0..len] are initialized — drop each one.
        for i in 0..self.len {
            unsafe { self.data[i].assume_init_drop(); }
        }
    }
}

fn main() {
    let mut v = FixedVec::<String, 4>::new();
    v.push("hello".into()).unwrap();
    v.push("world".into()).unwrap();
    assert_eq!(v.as_slice(), &["hello", "world"]);
    assert_eq!(v.pop(), Some("world".into()));
    assert_eq!(v.len(), 1);
    // Drop cleans up remaining "hello"
}

Exercise 8: Declarative Macro — map! (Ch12) ★ (~15 min)

Write a map! macro that creates a HashMap from key-value pairs, similar to vec![]:

#![allow(unused)]
fn main() {
let m = map! {
    "host" => "localhost",
    "port" => "8080",
};
assert_eq!(m.get("host"), Some(&"localhost"));
assert_eq!(m.len(), 2);
}

Requirements:

  • Support trailing comma
  • Support empty invocation map!{}
  • Work with any key/value types a HashMap can hold (keys need Eq + Hash)
🔑 Solution
macro_rules! map {
    // Empty case
    () => {
        std::collections::HashMap::new()
    };
    // One or more key => value pairs (trailing comma optional)
    ( $( $key:expr => $val:expr ),+ $(,)? ) => {{
        let mut m = std::collections::HashMap::new();
        $( m.insert($key, $val); )+
        m
    }};
}

fn main() {
    // Basic usage:
    let config = map! {
        "host" => "localhost",
        "port" => "8080",
        "timeout" => "30",
    };
    assert_eq!(config.len(), 3);
    assert_eq!(config["host"], "localhost");

    // Empty map:
    let empty: std::collections::HashMap<String, String> = map!();
    assert!(empty.is_empty());

    // Different types:
    let scores = map! {
        1 => 100,
        2 => 200,
    };
    assert_eq!(scores[&1], 100);
}

Exercise 9: Custom serde Deserialization (Ch10) ★★★ (~45 min)

Design a Duration wrapper that deserializes from human-readable strings like "30s", "5m", "2h" using a custom serde deserializer. The struct should also serialize back to the same format.

🔑 Solution
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
struct HumanDuration(std::time::Duration);

impl HumanDuration {
    fn from_str(s: &str) -> Result<Self, String> {
        let s = s.trim();
        if s.is_empty() { return Err("empty duration string".into()); }

        let (num_str, suffix) = s.split_at(
            s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len())
        );
        let value: u64 = num_str.parse()
            .map_err(|_| format!("invalid number: {num_str}"))?;

        let duration = match suffix {
            "s" | "sec"  => std::time::Duration::from_secs(value),
            "m" | "min"  => std::time::Duration::from_secs(value * 60),
            "h" | "hr"   => std::time::Duration::from_secs(value * 3600),
            "ms"         => std::time::Duration::from_millis(value),
            other        => return Err(format!("unknown suffix: {other}")),
        };
        Ok(HumanDuration(duration))
    }
}

impl fmt::Display for HumanDuration {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let secs = self.0.as_secs();
        if secs == 0 {
            write!(f, "{}ms", self.0.as_millis())
        } else if secs % 3600 == 0 {
            write!(f, "{}h", secs / 3600)
        } else if secs % 60 == 0 {
            write!(f, "{}m", secs / 60)
        } else {
            write!(f, "{}s", secs)
        }
    }
}

impl Serialize for HumanDuration {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        serializer.serialize_str(&self.to_string())
    }
}

impl<'de> Deserialize<'de> for HumanDuration {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        let s = String::deserialize(deserializer)?;
        HumanDuration::from_str(&s).map_err(serde::de::Error::custom)
    }
}

#[derive(Debug, Deserialize, Serialize)]
struct Config {
    timeout: HumanDuration,
    retry_interval: HumanDuration,
}

fn main() {
    let json = r#"{ "timeout": "30s", "retry_interval": "5m" }"#;
    let config: Config = serde_json::from_str(json).unwrap();

    assert_eq!(config.timeout.0, std::time::Duration::from_secs(30));
    assert_eq!(config.retry_interval.0, std::time::Duration::from_secs(300));

    // Round-trips correctly:
    let serialized = serde_json::to_string(&config).unwrap();
    assert!(serialized.contains("30s"));
    assert!(serialized.contains("5m"));
    println!("Config: {serialized}");
}

Exercise 10: Concurrent Fetcher with Timeout ★★ (~25 min)

Write an async function fetch_all that spawns three tokio::spawn tasks, each simulating a network call with tokio::time::sleep. Join all three with tokio::try_join! wrapped in tokio::time::timeout(Duration::from_secs(5), ...). Return Result<Vec<String>, ...> or an error if any task fails or the deadline expires.

Learning goals: tokio::spawn, try_join!, timeout, error propagation across task boundaries.

💡 Hint

Each spawned task returns Result<String, _>. try_join! unwraps all three. Wrap the whole try_join! in timeout() — the Elapsed error means you hit the deadline.

🔑 Solution
use tokio::time::{sleep, timeout, Duration};

async fn fake_fetch(name: &'static str, delay_ms: u64) -> Result<String, String> {
    sleep(Duration::from_millis(delay_ms)).await;
    Ok(format!("{name}: OK"))
}

async fn fetch_all() -> Result<Vec<String>, Box<dyn std::error::Error>> {
    let deadline = Duration::from_secs(5);

    let (a, b, c) = timeout(deadline, async {
        let h1 = tokio::spawn(fake_fetch("svc-a", 100));
        let h2 = tokio::spawn(fake_fetch("svc-b", 200));
        let h3 = tokio::spawn(fake_fetch("svc-c", 150));
        tokio::try_join!(h1, h2, h3)
    })
    .await??; // first ? = timeout, second ? = join

    Ok(vec![a?, b?, c?]) // unwrap inner Results
}

#[tokio::main]
async fn main() {
    let results = fetch_all().await.unwrap();
    for r in &results {
        println!("{r}");
    }
}

Exercise 11: Async Channel Pipeline ★★★ (~40 min)

Build a producer → transformer → consumer pipeline using tokio::sync::mpsc:

  1. Producer: sends integers 1..=20 into channel A (capacity 4).
  2. Transformer: reads from channel A, squares each value, sends into channel B.
  3. Consumer: reads from channel B, collects into a Vec<u64>, returns it.

All three stages run as concurrent tokio::spawn tasks. Use bounded channels to demonstrate back-pressure. Assert the final vec equals [1, 4, 9, ..., 400].

Learning goals: mpsc::channel, bounded back-pressure, tokio::spawn with move closures, graceful shutdown via channel close.

🔑 Solution
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx_a, mut rx_a) = mpsc::channel::<u64>(4); // bounded — back-pressure
    let (tx_b, mut rx_b) = mpsc::channel::<u64>(4);

    // Producer
    let producer = tokio::spawn(async move {
        for i in 1..=20u64 {
            tx_a.send(i).await.unwrap();
        }
        // tx_a dropped here → channel A closes
    });

    // Transformer
    let transformer = tokio::spawn(async move {
        while let Some(val) = rx_a.recv().await {
            tx_b.send(val * val).await.unwrap();
        }
        // tx_b dropped here → channel B closes
    });

    // Consumer
    let consumer = tokio::spawn(async move {
        let mut results = Vec::new();
        while let Some(val) = rx_b.recv().await {
            results.push(val);
        }
        results
    });

    producer.await.unwrap();
    transformer.await.unwrap();
    let results = consumer.await.unwrap();

    let expected: Vec<u64> = (1..=20).map(|x: u64| x * x).collect();
    assert_eq!(results, expected);
    println!("Pipeline complete: {results:?}");
}

Summary and Reference Card

Quick Reference Card

Pattern Decision Guide

Need type safety for primitives?
└── Newtype pattern (Ch3)

Need compile-time state enforcement?
└── Type-state pattern (Ch3)

Need a "tag" with no runtime data?
└── PhantomData (Ch4)

Need to break Rc/Arc reference cycles?
└── Weak<T> / sync::Weak<T> (Ch8)

Need to wait for a condition without busy-looping?
└── Condvar + Mutex (Ch6)

Need to handle "one of N types"?
├── Known closed set → Enum
├── Open set, hot path → Generics
├── Open set, cold path → dyn Trait
└── Completely unknown types → Any + TypeId (Ch2)

Need shared state across threads?
├── Simple counter/flag → Atomics
├── Short critical section → Mutex
├── Read-heavy → RwLock
├── Lazy one-time init → OnceLock / LazyLock (Ch6)
└── Complex state → Actor + Channels

Need to parallelize computation?
├── Collection processing → rayon::par_iter
├── Background task → thread::spawn
└── Borrow local data → thread::scope

Need async I/O or concurrent networking?
├── Basic → tokio + async/await (Ch15)
└── Advanced (streams, middleware) → see Async Rust Training

Need error handling?
├── Library → thiserror (#[derive(Error)])
└── Application → anyhow (Result<T>)

Need to prevent a value from being moved?
└── Pin<T> (Ch8) — required for Futures, self-referential types

Trait Bounds Cheat Sheet

| Bound | Meaning |
|-------|---------|
| T: Clone | Can be duplicated |
| T: Send | Can be moved to another thread |
| T: Sync | &T can be shared between threads |
| T: 'static | Contains no non-static references |
| T: Sized | Size known at compile time (the default) |
| T: ?Sized | Size may not be known ([T], dyn Trait) |
| T: Unpin | Safe to move after pinning |
| T: Default | Has a default value |
| T: Into<U> | Can be converted to U |
| T: AsRef<U> | Can be borrowed as &U |
| T: Deref<Target = U> | Auto-derefs to &U |
| F: Fn(A) -> B | Callable, borrows state immutably |
| F: FnMut(A) -> B | Callable, may mutate state |
| F: FnOnce(A) -> B | Callable at most once, may consume state |

Lifetime Elision Rules

The compiler inserts lifetimes automatically in three cases (so you don’t have to):

#![allow(unused)]
fn main() {
// Rule 1: Each reference parameter gets its own lifetime
// fn foo(x: &str, y: &str)  →  fn foo<'a, 'b>(x: &'a str, y: &'b str)

// Rule 2: If there's exactly ONE input lifetime, it's used for all outputs
// fn foo(x: &str) -> &str   →  fn foo<'a>(x: &'a str) -> &'a str

// Rule 3: If one parameter is &self or &mut self, its lifetime is used
// fn foo(&self, x: &str) -> &str  →  fn foo<'a>(&'a self, x: &str) -> &'a str
}

When you MUST write explicit lifetimes:

  • Multiple input references and a reference output (compiler can’t guess which input)
  • Struct fields that hold references: struct Ref<'a> { data: &'a str }
  • 'static bounds when the returned or stored data must not borrow from inputs (e.g. values sent to spawned threads)
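The first two cases can be illustrated with a short, self-contained sketch:

```rust
// Two reference inputs and a reference output: elision can't decide which
// input the output borrows from, so the lifetime must be written explicitly.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

// A struct holding a reference also needs a named lifetime parameter.
struct Label<'a> {
    text: &'a str,
}

fn main() {
    let a = String::from("alpha");
    let b = String::from("be");
    assert_eq!(longest(&a, &b), "alpha");

    let l = Label { text: &a };
    assert_eq!(l.text, "alpha");
}
```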

Common Derive Traits

#![allow(unused)]
fn main() {
#[derive(
    Debug,          // {:?} formatting
    Clone,          // .clone()
    Copy,           // Implicit copy (only for simple types)
    PartialEq, Eq,  // == comparison
    PartialOrd, Ord, // < > comparison + sorting
    Hash,           // HashMap/HashSet key
    Default,        // Type::default()
)]
struct MyType { /* ... */ }
}

Module Visibility Quick Reference

pub           → visible everywhere
pub(crate)    → visible within the crate
pub(super)    → visible to parent module
pub(in path)  → visible within a specific path
(nothing)     → private to current module + children
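A small runnable sketch (module and function names are invented for illustration) shows how these levels behave in practice:

```rust
mod outer {
    pub mod inner {
        pub(crate) fn crate_visible() -> &'static str { "crate" }
        pub(super) fn parent_visible() -> &'static str { "parent" }
        fn private() -> &'static str { "private" }

        pub fn public() -> &'static str {
            // All three are callable from inside `inner` itself:
            let _ = (crate_visible(), parent_visible(), private());
            "public"
        }
    }

    pub fn from_parent() -> &'static str {
        // `pub(super)` makes this visible to the parent module `outer`:
        inner::parent_visible()
    }
}

fn main() {
    assert_eq!(outer::inner::public(), "public");
    assert_eq!(outer::inner::crate_visible(), "crate"); // same crate → OK
    assert_eq!(outer::from_parent(), "parent");
    // outer::inner::private() would not compile here — private to `inner`.
}
```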

Further Reading

| Resource | Why |
|----------|-----|
| Rust Design Patterns | Catalog of idiomatic patterns and anti-patterns |
| Rust API Guidelines | Official checklist for polished public APIs |
| Rust Atomics and Locks | Mara Bos’s deep dive into concurrency primitives |
| The Rustonomicon | Official guide to unsafe Rust and dark corners |
| Error Handling in Rust | Andrew Gallant’s comprehensive guide |
| Jon Gjengset — Crust of Rust series | Deep dives into iterators, lifetimes, channels, etc. |
| Effective Rust | 35 specific ways to improve your Rust code |

End of Rust Patterns & Engineering How-Tos

Capstone Project: Type-Safe Task Scheduler

This project integrates patterns from across the book into a single, production-style system. You’ll build a type-safe, concurrent task scheduler that uses generics, traits, typestate, channels, error handling, and testing.

Estimated time: 4–6 hours | Difficulty: ★★★

What you’ll practice:

  • Generics and trait bounds (Ch 1–2)
  • Typestate pattern for task lifecycle (Ch 3)
  • PhantomData for zero-cost state markers (Ch 4)
  • Channels for worker communication (Ch 5)
  • Concurrency with scoped threads (Ch 6)
  • Error handling with thiserror (Ch 9)
  • Testing with property-based tests (Ch 13)
  • API design with TryFrom and validated types (Ch 14)

The Problem

Build a task scheduler where:

  1. Tasks have a typed lifecycle: Pending → Running → Completed (or Failed)
  2. Workers pull tasks from a channel, execute them, and report results
  3. The scheduler manages task submission, worker coordination, and result collection
  4. Invalid state transitions are compile-time errors
stateDiagram-v2
    [*] --> Pending: scheduler.submit(task)
    Pending --> Running: worker picks up task
    Running --> Completed: task succeeds
    Running --> Failed: task returns Err
    Completed --> [*]: scheduler.results()
    Failed --> [*]: scheduler.results()

    Pending --> Completed: ❌ can't skip Running
    Completed --> Running: ❌ can't re-run

Step 1: Define the Task Types

Start with the typestate markers and a generic Task:

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// --- State markers (zero-sized) ---
struct Pending;
struct Running;
struct Completed;
struct Failed;

// --- Task ID (newtype for type safety) ---
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TaskId(u64);

// --- The Task struct, parameterized by lifecycle state ---
struct Task<State, R> {
    id: TaskId,
    name: String,
    _state: PhantomData<State>,
    _result: PhantomData<R>,
}
}

Your job: Implement state transitions so that:

  • Task<Pending, R> can transition to Task<Running, R> (via start())
  • Task<Running, R> can transition to Task<Completed, R> or Task<Failed, R>
  • No other transitions compile
💡 Hint

Each transition method should consume self and return the new state:

#![allow(unused)]
fn main() {
impl<R> Task<Pending, R> {
    fn start(self) -> Task<Running, R> {
        Task {
            id: self.id,
            name: self.name,
            _state: PhantomData,
            _result: PhantomData,
        }
    }
}
}
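If you get stuck, the remaining transitions follow the same shape as start(). Here is one possible completion — the complete() and fail() method names are suggestions, not prescribed by the exercise:

```rust
use std::marker::PhantomData;

struct Pending;
struct Running;
struct Completed;
struct Failed;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TaskId(u64);

struct Task<State, R> {
    id: TaskId,
    name: String,
    _state: PhantomData<State>,
    _result: PhantomData<R>,
}

impl<R> Task<Pending, R> {
    fn new(id: TaskId, name: impl Into<String>) -> Self {
        Task { id, name: name.into(), _state: PhantomData, _result: PhantomData }
    }
    fn start(self) -> Task<Running, R> {
        Task { id: self.id, name: self.name, _state: PhantomData, _result: PhantomData }
    }
}

impl<R> Task<Running, R> {
    // Only a Running task can reach a terminal state — each method consumes self.
    fn complete(self) -> Task<Completed, R> {
        Task { id: self.id, name: self.name, _state: PhantomData, _result: PhantomData }
    }
    fn fail(self) -> Task<Failed, R> {
        Task { id: self.id, name: self.name, _state: PhantomData, _result: PhantomData }
    }
}

fn main() {
    let t: Task<Pending, u32> = Task::new(TaskId(1), "demo");
    let done = t.start().complete();
    assert_eq!(done.name, "demo");

    let failed: Task<Failed, u32> = Task::new(TaskId(2), "boom").start().fail();
    assert_eq!(failed.id, TaskId(2));

    // done.complete();                      // ❌ no such method on Task<Completed, _>
    // Task::<Pending, u32>::new(TaskId(3), "x").complete(); // ❌ can't skip Running
}
```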

Step 2: Define the Work Function

Tasks need a function to execute. Use a boxed closure:

#![allow(unused)]
fn main() {
struct WorkItem<R: Send + 'static> {
    id: TaskId,
    name: String,
    work: Box<dyn FnOnce() -> Result<R, String> + Send>,
}
}

Your job: Implement WorkItem::new() that accepts a task name and closure. Add a TaskId generator (simple atomic counter or mutex-protected counter).
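One possible sketch, using an AtomicU64 for the id counter. The NEXT_ID name and the Into<String> parameter are design choices, not requirements:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TaskId(u64);

struct WorkItem<R: Send + 'static> {
    id: TaskId,
    name: String,
    work: Box<dyn FnOnce() -> Result<R, String> + Send>,
}

// Process-wide, monotonically increasing id source. fetch_add is atomic,
// so concurrent callers never receive the same id.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

impl<R: Send + 'static> WorkItem<R> {
    fn new(
        name: impl Into<String>,
        work: impl FnOnce() -> Result<R, String> + Send + 'static,
    ) -> Self {
        WorkItem {
            id: TaskId(NEXT_ID.fetch_add(1, Ordering::Relaxed)),
            name: name.into(),
            work: Box::new(work),
        }
    }
}

fn main() {
    let a = WorkItem::new("first", || Ok::<_, String>(1u32));
    let b = WorkItem::new("second", || Ok(2u32));
    assert_ne!(a.id, b.id);        // ids are unique
    assert_eq!(b.name, "second");
    assert_eq!((a.work)(), Ok(1)); // the boxed closure runs once
}
```

Relaxed ordering is enough here: we only need uniqueness of the counter, not ordering relative to other memory operations.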

Step 3: Error Handling

Define the scheduler’s error types using thiserror:

use thiserror::Error;

#[derive(Error, Debug)]
pub enum SchedulerError {
    #[error("scheduler is shut down")]
    ShutDown,

    #[error("task {0:?} failed: {1}")]
    TaskFailed(TaskId, String),

    #[error("channel send error")]
    ChannelError(#[from] std::sync::mpsc::SendError<()>),

    #[error("worker panicked")]
    WorkerPanic,
}

Step 4: The Scheduler

Build the scheduler using channels (Ch 5) and scoped threads (Ch 6):

#![allow(unused)]
fn main() {
use std::sync::mpsc;

struct Scheduler<R: Send + 'static> {
    sender: Option<mpsc::Sender<WorkItem<R>>>,
    results: mpsc::Receiver<TaskResult<R>>,
    num_workers: usize,
}

struct TaskResult<R> {
    id: TaskId,
    name: String,
    outcome: Result<R, String>,
}
}

Your job: Implement:

  • Scheduler::new(num_workers: usize) -> Self — creates channels and spawns workers
  • Scheduler::submit(&self, item: WorkItem<R>) -> Result<TaskId, SchedulerError>
  • Scheduler::shutdown(self) -> Vec<TaskResult<R>> — drops the sender, joins workers, collects results
💡 Hint — Worker loop
#![allow(unused)]
fn main() {
fn worker_loop<R: Send + 'static>(
    rx: std::sync::Arc<std::sync::Mutex<mpsc::Receiver<WorkItem<R>>>>,
    result_tx: mpsc::Sender<TaskResult<R>>,
    worker_id: usize,
) {
    loop {
        let item = {
            let rx = rx.lock().unwrap();
            rx.recv()
        };
        match item {
            Ok(work_item) => {
                let outcome = (work_item.work)();
                let _ = result_tx.send(TaskResult {
                    id: work_item.id,
                    name: work_item.name,
                    outcome,
                });
            }
            Err(_) => break, // Channel closed
        }
    }
}
}
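If you want a starting point for new()/submit()/shutdown(), here is one possible shape. Note that it adds a handles field the skeleton above omits and simplifies TaskId to a bare u64 with String errors; treat it as a sketch under those assumptions, not the reference solution:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

struct WorkItem<R> {
    id: u64,
    work: Box<dyn FnOnce() -> Result<R, String> + Send>,
}

struct TaskResult<R> {
    id: u64,
    outcome: Result<R, String>,
}

struct Scheduler<R: Send + 'static> {
    sender: Option<mpsc::Sender<WorkItem<R>>>,
    results: mpsc::Receiver<TaskResult<R>>,
    handles: Vec<thread::JoinHandle<()>>,
}

impl<R: Send + 'static> Scheduler<R> {
    fn new(num_workers: usize) -> Self {
        let (tx, rx) = mpsc::channel::<WorkItem<R>>();
        let (result_tx, results) = mpsc::channel();
        let rx = Arc::new(Mutex::new(rx));
        let handles = (0..num_workers)
            .map(|_| {
                let rx = Arc::clone(&rx);
                let result_tx = result_tx.clone();
                thread::spawn(move || loop {
                    // Lock only long enough to receive one item.
                    let item = rx.lock().unwrap().recv();
                    match item {
                        Ok(item) => {
                            let outcome = (item.work)();
                            let _ = result_tx.send(TaskResult { id: item.id, outcome });
                        }
                        Err(_) => break, // job channel closed
                    }
                })
            })
            .collect();
        // The original result_tx drops here, so `results` closes once
        // every worker's clone is gone.
        Scheduler { sender: Some(tx), results, handles }
    }

    fn submit(&self, item: WorkItem<R>) -> Result<u64, String> {
        let id = item.id;
        self.sender
            .as_ref()
            .ok_or_else(|| "scheduler is shut down".to_string())?
            .send(item)
            .map_err(|_| "job channel closed".to_string())?;
        Ok(id)
    }

    fn shutdown(mut self) -> Vec<TaskResult<R>> {
        self.sender.take(); // close the job channel → workers drain and exit
        for h in self.handles.drain(..) {
            h.join().expect("worker panicked");
        }
        // Workers are done and all result senders dropped — drain everything.
        self.results.try_iter().collect()
    }
}

fn main() {
    let s = Scheduler::<u32>::new(2);
    for i in 0..5u64 {
        s.submit(WorkItem { id: i, work: Box::new(move || Ok((i as u32) * 2)) })
            .unwrap();
    }
    let results = s.shutdown();
    assert_eq!(results.len(), 5);
    assert!(results.iter().all(|r| r.outcome.is_ok()));
}
```

The Option<Sender> is what makes shutdown work: taking it drops the only job sender, which closes the channel and lets every worker's recv() return Err.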

Step 5: Integration Test

Write tests that verify:

  1. Happy path: Submit 10 tasks, shut down, verify all 10 results are Ok
  2. Error handling: Submit tasks that fail, verify TaskResult.outcome is Err
  3. Empty scheduler: Create and immediately shut down — no panics
  4. Property test (bonus): Use proptest to verify that for any N tasks (1..100), the scheduler always returns exactly N results
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn happy_path() {
        let scheduler = Scheduler::<String>::new(4);

        for i in 0..10 {
            let item = WorkItem::new(
                format!("task-{i}"),
                move || Ok(format!("result-{i}")),
            );
            scheduler.submit(item).unwrap();
        }

        let results = scheduler.shutdown();
        assert_eq!(results.len(), 10);
        for r in &results {
            assert!(r.outcome.is_ok());
        }
    }

    #[test]
    fn handles_failures() {
        let scheduler = Scheduler::<String>::new(2);

        scheduler.submit(WorkItem::new("good", || Ok("ok".into()))).unwrap();
        scheduler.submit(WorkItem::new("bad", || Err("boom".into()))).unwrap();

        let results = scheduler.shutdown();
        assert_eq!(results.len(), 2);

        let failures: Vec<_> = results.iter()
            .filter(|r| r.outcome.is_err())
            .collect();
        assert_eq!(failures.len(), 1);
    }
}
}

Step 6: Put It All Together

Here’s the main() that demonstrates the full system:

fn main() {
    let scheduler = Scheduler::<String>::new(4);

    // Submit tasks with varying workloads
    for i in 0..20 {
        let item = WorkItem::new(
            format!("compute-{i}"),
            move || {
                // Simulate work
                std::thread::sleep(std::time::Duration::from_millis(10));
                if i % 7 == 0 {
                    Err(format!("task {i} hit a simulated error"))
                } else {
                    Ok(format!("task {i} completed with value {}", i * i))
                }
            },
        );
        // NOTE: .unwrap() is used for brevity — handle SendError in production.
        scheduler.submit(item).unwrap();
    }

    println!("All tasks submitted. Shutting down...");
    let results = scheduler.shutdown();

    let (ok, err): (Vec<_>, Vec<_>) = results.iter()
        .partition(|r| r.outcome.is_ok());

    println!("\n✅ Succeeded: {}", ok.len());
    for r in &ok {
        println!("  {} → {}", r.name, r.outcome.as_ref().unwrap());
    }

    println!("\n❌ Failed: {}", err.len());
    for r in &err {
        println!("  {} → {}", r.name, r.outcome.as_ref().unwrap_err());
    }
}

Evaluation Criteria

| Criterion | Target |
|-----------|--------|
| Type safety | Invalid state transitions don’t compile |
| Concurrency | Workers run in parallel, no data races |
| Error handling | All failures captured in TaskResult, no panics |
| Testing | At least 3 tests; bonus for proptest |
| Code organization | Clean module structure, public API uses validated types |
| Documentation | Key types have doc comments explaining invariants |

Extension Ideas

Once the basic scheduler works, try these enhancements:

  1. Priority queue: Add a Priority newtype (1–10) and process higher-priority tasks first
  2. Retry policy: Failed tasks retry up to N times before being marked permanently failed
  3. Cancellation: Add a cancel(TaskId) method that removes pending tasks
  4. Async version: Port to tokio::spawn with tokio::sync::mpsc channels (Ch 15)
  5. Metrics: Track per-worker task counts, average execution time, and failure rates
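For extension idea 1, a validated Priority newtype built on TryFrom (in the spirit of Ch 14) might start like this; the exact API is up to you:

```rust
use std::convert::TryFrom;

// Newtype that can only be constructed with a value in 1..=10.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Priority(u8);

impl TryFrom<u8> for Priority {
    type Error = String;

    fn try_from(v: u8) -> Result<Self, Self::Error> {
        if (1..=10).contains(&v) {
            Ok(Priority(v))
        } else {
            Err(format!("priority {v} out of range 1..=10"))
        }
    }
}

fn main() {
    assert!(Priority::try_from(5).is_ok());
    assert!(Priority::try_from(0).is_err());
    assert!(Priority::try_from(11).is_err());

    // Ord is derived, so tasks can be sorted highest-priority first:
    let mut ps = vec![Priority::try_from(3).unwrap(), Priority::try_from(9).unwrap()];
    ps.sort_by(|a, b| b.cmp(a));
    assert_eq!(ps[0], Priority::try_from(9).unwrap());
}
```

Because the only way to get a Priority is through try_from, every Priority in the scheduler is valid by construction — the same "validated types" idea the evaluation criteria ask for.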