
Rust Engineering Practices — Beyond cargo build

Speaker Intro

  • Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
  • Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems
  • Started programming in Rust in 2017 (@AWS EC2), and have been in love with the language ever since

A practical guide to the Rust toolchain features that most teams discover too late: build scripts, cross-compilation, benchmarking, code coverage, and safety verification with Miri and Valgrind. Each chapter uses concrete examples drawn from a real hardware-diagnostics codebase — a large multi-crate workspace — so every technique maps directly to production code.

How to Use This Book

This book is designed for self-paced study or team workshops. Each chapter is largely independent — read them in order or jump to the topic you need.

Difficulty Legend

| Symbol | Level | Meaning |
|--------|-------|---------|
| 🟢 | Starter | Straightforward tools with clear patterns — useful on day one |
| 🟡 | Intermediate | Requires understanding of toolchain internals or platform concepts |
| 🔴 | Advanced | Deep toolchain knowledge, nightly features, or multi-tool orchestration |

Pacing Guide

| Part | Chapters | Est. Time | Key Outcome |
|------|----------|-----------|-------------|
| I — Build & Ship | ch01–02 | 3–4 h | Build metadata, cross-compilation, static binaries |
| II — Measure & Verify | ch03–05 | 4–5 h | Statistical benchmarking, coverage gates, Miri/sanitizers |
| III — Harden & Optimize | ch06–10 | 6–8 h | Supply chain security, release profiles, compile-time tools, no_std, Windows |
| IV — Integrate | ch11–13 | 3–4 h | Production CI/CD pipeline, tricks, capstone exercise |
| **Total** | ch01–13 | 16–21 h | Full production engineering pipeline |

Working Through Exercises

Each chapter contains 🏋️ exercises with difficulty indicators. Solutions are provided in expandable <details> blocks — try the exercise first, then check your work.

  • 🟢 exercises can often be done in 10–15 minutes
  • 🟡 exercises require 20–40 minutes and may involve running tools locally
  • 🔴 exercises require significant setup and experimentation (1+ hour)

Prerequisites

| Concept | Where to learn it |
|---------|-------------------|
| Cargo workspace layout | Rust Book ch14.3 |
| Feature flags | Cargo Reference — Features |
| #[cfg(test)] and basic testing | Rust Patterns ch12 |
| unsafe blocks and FFI basics | Rust Patterns ch10 |

Chapter Dependency Map

                 ┌──────────┐
                 │ ch00     │
                 │  Intro   │
                 └────┬─────┘
        ┌─────┬───┬──┴──┬──────┬──────┐
        ▼     ▼   ▼     ▼      ▼      ▼
      ch01  ch03 ch04  ch05   ch06   ch09
      Build Bench Cov  Miri   Deps   no_std
        │     │    │    │      │      │
        │     └────┴────┘      │      ▼
        │          │           │    ch10
        ▼          ▼           ▼   Windows
       ch02      ch07        ch07    │
       Cross    RelProf     RelProf  │
        │          │           │     │
        │          ▼           │     │
        │        ch08          │     │
        │      CompTime        │     │
        └──────────┴───────────┴─────┘
                   │
                   ▼
                 ch11
               CI/CD Pipeline
                   │
                   ▼
                ch12 ─── ch13
              Tricks    Quick Ref

Read in any order: ch01, ch03, ch04, ch05, ch06, ch09 are independent. Read after prerequisites: ch02 (needs ch01), ch07–ch08 (benefit from ch03–ch06), ch10 (benefits from ch09). Read last: ch11 (ties everything together), ch12 (tricks), ch13 (reference).

Annotated Table of Contents

Part I — Build & Ship

| # | Chapter | Difficulty | Description |
|---|---------|------------|-------------|
| 1 | Build Scripts — build.rs in Depth | 🟢 | Compile-time constants, compiling C code, protobuf generation, system library linking, anti-patterns |
| 2 | Cross-Compilation — One Source, Many Targets | 🟡 | Target triples, musl static binaries, ARM cross-compile, cross tool, cargo-zigbuild, GitHub Actions |

Part II — Measure & Verify

| # | Chapter | Difficulty | Description |
|---|---------|------------|-------------|
| 3 | Benchmarking — Measuring What Matters | 🟡 | Criterion.rs, Divan, perf flamegraphs, PGO, continuous benchmarking in CI |
| 4 | Code Coverage — Seeing What Tests Miss | 🟢 | cargo-llvm-cov, cargo-tarpaulin, grcov, Codecov/Coveralls CI integration |
| 5 | Miri, Valgrind, and Sanitizers | 🔴 | MIR interpreter, Valgrind memcheck/Helgrind, ASan/MSan/TSan, cargo-fuzz, loom |

Part III — Harden & Optimize

| # | Chapter | Difficulty | Description |
|---|---------|------------|-------------|
| 6 | Dependency Management and Supply Chain Security | 🟢 | cargo-audit, cargo-deny, cargo-vet, cargo-outdated, cargo-semver-checks |
| 7 | Release Profiles and Binary Size | 🟡 | Release profile anatomy, LTO trade-offs, cargo-bloat, cargo-udeps |
| 8 | Compile-Time and Developer Tools | 🟡 | sccache, mold, cargo-nextest, cargo-expand, cargo-geiger, workspace lints, MSRV |
| 9 | no_std and Feature Verification | 🔴 | cargo-hack, core/alloc/std layers, custom panic handlers, testing no_std code |
| 10 | Windows and Conditional Compilation | 🟡 | #[cfg] patterns, windows-sys/windows crates, cargo-xwin, platform abstraction |

Part IV — Integrate

| # | Chapter | Difficulty | Description |
|---|---------|------------|-------------|
| 11 | Putting It All Together — A Production CI/CD Pipeline | 🟡 | GitHub Actions workflow, cargo-make, pre-commit hooks, cargo-dist, capstone |
| 12 | Tricks from the Trenches | 🟡 | 10 battle-tested patterns: deny(warnings) trap, cache tuning, dep dedup, RUSTFLAGS, more |
| 13 | Quick Reference Card | — | Commands at a glance, 60+ decision table entries, further reading links |

Build Scripts — build.rs in Depth 🟢

What you’ll learn:

  • How build.rs fits into the Cargo build pipeline and when it runs
  • Five production patterns: compile-time constants, C/C++ compilation, protobuf codegen, pkg-config linking, and feature detection
  • Anti-patterns that slow builds or break cross-compilation
  • How to balance traceability with reproducible builds

Cross-references: Cross-Compilation uses build scripts for target-aware builds · no_std & Features extends cfg flags set here · CI/CD Pipeline orchestrates build scripts in automation

Every Cargo package can include a file named build.rs at the crate root. Cargo compiles and executes this file before compiling your crate. The build script communicates back to Cargo through println! instructions on stdout.

What build.rs Is and When It Runs

┌─────────────────────────────────────────────────────────┐
│                    Cargo Build Pipeline                  │
│                                                         │
│  1. Resolve dependencies                                │
│  2. Download crates                                     │
│  3. Compile build.rs  ← ordinary Rust, runs on HOST     │
│  4. Execute build.rs  ← stdout → Cargo instructions     │
│  5. Compile the crate (using instructions from step 4)  │
│  6. Link                                                │
└─────────────────────────────────────────────────────────┘

Key facts:

  • build.rs runs on the host machine, not the target. During cross-compilation, the build script runs on your development machine even when the final binary targets a different architecture.
  • The build script’s scope is limited to its own package. It cannot affect how other crates compile — unless the package declares a links key in Cargo.toml, which enables passing metadata to dependent crates via cargo::metadata=KEY=VALUE.
  • It runs every time Cargo detects a change — unless you emit cargo::rerun-if-changed instructions to limit re-runs.

Note (Rust 1.71+): Since Rust 1.71, Cargo fingerprints the compiled build.rs binary — if the binary is identical, it won’t re-run even if source timestamps changed. However, cargo::rerun-if-changed=build.rs is still valuable: without any rerun-if-changed instruction, Cargo re-runs build.rs whenever any file in the package changes (not just build.rs). Emitting cargo::rerun-if-changed=build.rs limits re-runs to only when build.rs itself changes — a significant compile-time saving in large crates.

  • It can emit cfg flags, environment variables, linker arguments, and file paths that the main crate consumes.
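
The links mechanism looks like this in practice — a sketch, where the crate name `pci-sys` and the `include_dir` key are illustrative, not from the project. With `links = "pci"` in Cargo.toml, each `cargo::metadata=KEY=VALUE` line reaches the build scripts of dependent crates as an environment variable `DEP_PCI_<KEY>`:

```rust
// build.rs for a hypothetical `pci-sys` crate whose Cargo.toml declares
// `links = "pci"`. Dependents' build scripts receive the metadata line
// below as the environment variable DEP_PCI_INCLUDE_DIR.
fn metadata_instructions() -> Vec<String> {
    vec![
        "cargo::rerun-if-changed=build.rs".to_string(),
        // key=value pairs become DEP_<links-name>_<KEY> for dependents
        "cargo::metadata=include_dir=/usr/include/pci".to_string(),
    ]
}

fn main() {
    for line in metadata_instructions() {
        println!("{line}");
    }
}
```

A crate depending on `pci-sys` could then read `std::env::var("DEP_PCI_INCLUDE_DIR")` in its own build.rs.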

The minimal Cargo.toml entry:

[package]
name = "my-crate"
version = "0.1.0"
edition = "2021"
build = "build.rs"       # default — Cargo looks for build.rs automatically
# build = "src/build.rs" # or put it elsewhere

The Cargo Instruction Protocol

Your build script communicates with Cargo by printing instructions to stdout. Since Rust 1.77, the preferred prefix is cargo:: (replacing the older cargo: single-colon form).

| Instruction | Purpose |
|-------------|---------|
| cargo::rerun-if-changed=PATH | Only re-run build.rs when PATH changes |
| cargo::rerun-if-env-changed=VAR | Only re-run when environment variable VAR changes |
| cargo::rustc-link-lib=NAME | Link against native library NAME |
| cargo::rustc-link-search=PATH | Add PATH to the library search path |
| cargo::rustc-cfg=KEY | Set a #[cfg(KEY)] flag for conditional compilation |
| cargo::rustc-cfg=KEY="VALUE" | Set a #[cfg(KEY = "VALUE")] flag |
| cargo::rustc-env=KEY=VALUE | Set an environment variable accessible via env!() |
| cargo::rustc-cdylib-link-arg=FLAG | Pass FLAG to the linker for cdylib targets |
| cargo::warning=MESSAGE | Display a warning during compilation |
| cargo::metadata=KEY=VALUE | Store metadata readable by dependent crates |

// build.rs — minimal example
fn main() {
    // Only re-run if build.rs itself changes
    println!("cargo::rerun-if-changed=build.rs");

    // Set a compile-time environment variable
    let timestamp = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs().to_string())
        .unwrap_or_else(|_| "0".into());
    println!("cargo::rustc-env=BUILD_TIMESTAMP={timestamp}");
}

Pattern 1: Compile-Time Constants

The most common use case: baking build metadata into the binary so you can report it at runtime (git hash, build date, CI job ID).

// build.rs
use std::process::Command;

fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=.git/refs");

    // Git commit hash
    let output = Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .expect("git not found");
    let git_hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
    println!("cargo::rustc-env=GIT_HASH={git_hash}");

    // Build profile (debug or release)
    let profile = std::env::var("PROFILE").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=BUILD_PROFILE={profile}");

    // Target triple
    let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=BUILD_TARGET={target}");
}

// src/main.rs — consuming the build-time values
fn print_version() {
    println!(
        "{} {} (git:{} target:{} profile:{})",
        env!("CARGO_PKG_NAME"),
        env!("CARGO_PKG_VERSION"),
        env!("GIT_HASH"),
        env!("BUILD_TARGET"),
        env!("BUILD_PROFILE"),
    );
}

Built-in Cargo environment variables you get for free, no build.rs needed: CARGO_PKG_NAME, CARGO_PKG_VERSION, CARGO_PKG_AUTHORS, CARGO_PKG_DESCRIPTION, CARGO_MANIFEST_DIR. See the full list.
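
A small sketch of consuming these defensively: env!() is a hard compile error if the variable is unset (for example, when a snippet is compiled outside cargo), while option_env!() degrades gracefully:

```rust
// option_env! yields Option<&'static str>, resolved at compile time,
// so a missing variable becomes a fallback value instead of a build error.
fn describe() -> String {
    let name = option_env!("CARGO_PKG_NAME").unwrap_or("unknown");
    let version = option_env!("CARGO_PKG_VERSION").unwrap_or("0.0.0");
    format!("{name} v{version}")
}

fn main() {
    println!("{}", describe());
}
```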

Pattern 2: Compiling C/C++ Code with the cc Crate

When your Rust crate wraps a C library or needs a small C helper (common in hardware interfaces), the cc crate simplifies compilation inside build.rs.

# Cargo.toml
[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=csrc/");

    cc::Build::new()
        .file("csrc/ipmi_raw.c")
        .file("csrc/smbios_parser.c")
        .include("csrc/include")
        .flag("-Wall")
        .flag("-Wextra")
        .opt_level(2)
        .compile("diag_helpers");
    // This produces libdiag_helpers.a and emits the right
    // cargo::rustc-link-lib and cargo::rustc-link-search instructions.
}

// src/lib.rs — FFI bindings to the compiled C code
extern "C" {
    fn ipmi_raw_command(
        netfn: u8,
        cmd: u8,
        data: *const u8,
        data_len: usize,
        response: *mut u8,
        response_len: *mut usize,
    ) -> i32;
}

/// Safe wrapper around the raw IPMI command interface.
/// Assumes: enum IpmiError { CommandFailed(i32), ... }
pub fn send_ipmi_command(netfn: u8, cmd: u8, data: &[u8]) -> Result<Vec<u8>, IpmiError> {
    let mut response = vec![0u8; 256];
    let mut response_len: usize = response.len();

    // SAFETY: response buffer is large enough and response_len is correctly initialized.
    let rc = unsafe {
        ipmi_raw_command(
            netfn,
            cmd,
            data.as_ptr(),
            data.len(),
            response.as_mut_ptr(),
            &mut response_len,
        )
    };

    if rc != 0 {
        return Err(IpmiError::CommandFailed(rc));
    }
    response.truncate(response_len);
    Ok(response)
}

For C++ code, use .cpp(true) and .flag("-std=c++17"):

// build.rs — C++ variant
fn main() {
    println!("cargo::rerun-if-changed=cppsrc/");

    cc::Build::new()
        .cpp(true)
        .file("cppsrc/vendor_parser.cpp")
        .flag("-std=c++17")
        .flag("-fno-exceptions")    // match Rust's no-exception model
        .compile("vendor_helpers");
}

Pattern 3: Protocol Buffers and Code Generation

Build scripts excel at code generation — turning .proto, .fbs, or .json schema files into Rust source at compile time. Here’s the protobuf pattern using prost-build:

# Cargo.toml
[build-dependencies]
prost-build = "0.13"
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=proto/");

    prost_build::compile_protos(
        &["proto/diagnostics.proto", "proto/telemetry.proto"],
        &["proto/"],
    )
    .expect("Failed to compile protobuf definitions");
}

// src/lib.rs — include the generated code
pub mod diagnostics {
    include!(concat!(env!("OUT_DIR"), "/diagnostics.rs"));
}

pub mod telemetry {
    include!(concat!(env!("OUT_DIR"), "/telemetry.rs"));
}

OUT_DIR is a Cargo-provided directory where build scripts should place generated files. Each crate gets its own OUT_DIR under target/.
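
The same mechanism works for hand-rolled codegen without any external tool — a sketch, where the generated constant and file name are illustrative, not from the project:

```rust
// build.rs — hand-rolled codegen into OUT_DIR (sketch; the generated
// constant and file name are made up for illustration)
use std::{env, fs, path::{Path, PathBuf}};

fn generate_sensor_table(out_dir: &Path) -> PathBuf {
    // Write a Rust source fragment that the crate will include!()
    let code = "pub const SUPPORTED_SENSORS: &[&str] = &[\"temp\", \"fan\", \"volt\"];\n";
    let dest = out_dir.join("sensors.rs");
    fs::write(&dest, code).expect("failed to write generated file");
    dest
}

fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    // Under cargo, OUT_DIR is always set; fall back to a temp dir so this
    // sketch also runs standalone.
    let out_dir = env::var("OUT_DIR").map(PathBuf::from).unwrap_or_else(|_| env::temp_dir());
    let dest = generate_sensor_table(&out_dir);
    println!("generated: {}", dest.display());
    // The crate then consumes it with:
    //   include!(concat!(env!("OUT_DIR"), "/sensors.rs"));
}
```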

Pattern 4: Linking System Libraries with pkg-config

For system libraries that provide .pc files (systemd, OpenSSL, libpci), the pkg-config crate probes the system and emits the right link instructions:

# Cargo.toml
[build-dependencies]
pkg-config = "0.3"
// build.rs
fn main() {
    // Probe for libpci (used for PCIe device enumeration)
    pkg_config::Config::new()
        .atleast_version("3.6.0")
        .probe("libpci")
        .expect("libpci >= 3.6.0 not found — install pciutils-dev");

    // Probe for libsystemd (optional — for sd_notify integration)
    if pkg_config::probe_library("libsystemd").is_ok() {
        println!("cargo::rustc-cfg=has_systemd");
    }
}

// src/lib.rs — conditional compilation based on pkg-config probing
#[cfg(has_systemd)]
mod systemd_notify {
    extern "C" {
        fn sd_notify(unset_environment: i32, state: *const std::ffi::c_char) -> i32;
    }

    pub fn notify_ready() {
        let state = std::ffi::CString::new("READY=1").unwrap();
        // SAFETY: state is a valid null-terminated C string.
        unsafe { sd_notify(0, state.as_ptr()) };
    }
}

#[cfg(not(has_systemd))]
mod systemd_notify {
    pub fn notify_ready() {
        // no-op on systems without systemd
    }
}

Pattern 5: Feature Detection and Conditional Compilation

Build scripts can probe the compilation environment and set cfg flags that the main crate uses for conditional code paths.

CPU architecture and OS detection (safe — these are compile-time constants):

// build.rs — detect CPU features and OS capabilities
fn main() {
    println!("cargo::rerun-if-changed=build.rs");

    let target = std::env::var("TARGET").unwrap();
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap();

    // Enable AVX2-optimized paths on x86_64
    if target.starts_with("x86_64") {
        println!("cargo::rustc-cfg=has_x86_64");
    }

    // Enable ARM NEON paths on aarch64
    if target.starts_with("aarch64") {
        println!("cargo::rustc-cfg=has_aarch64");
    }

    // Detect if /dev/ipmi0 is available. Note: this probes the BUILD
    // machine — appropriate only when build and deploy hosts match.
    if target_os == "linux" && std::path::Path::new("/dev/ipmi0").exists() {
        println!("cargo::rustc-cfg=has_ipmi_device");
    }
}

⚠️ Anti-pattern demonstration — The code below shows a tempting but problematic approach. Do not use this in production.

// build.rs — BAD: runtime hardware detection at build time
fn main() {
    // ANTI-PATTERN: Binary is baked to the BUILD machine's hardware.
    // If you build on a machine with a GPU and deploy to one without,
    // the binary silently assumes a GPU is present.
    if std::process::Command::new("accel-query")
        .arg("--query-gpu=name")
        .arg("--format=csv,noheader")
        .output()
        .is_ok()
    {
        println!("cargo::rustc-cfg=has_accel_device");
    }
}

// src/gpu.rs — code that adapts based on build-time detection
pub fn query_gpu_info() -> GpuResult {
    #[cfg(has_accel_device)]
    {
        run_accel_query()
    }

    #[cfg(not(has_accel_device))]
    {
        GpuResult::NotAvailable("accel-query not found at build time".into())
    }
}

⚠️ Why this is wrong: Runtime device detection is almost always better than build-time detection for optional hardware. The binary produced above is tied to the build machine’s hardware configuration — it will behave differently on the deployment target. Use build-time detection only for capabilities that are truly fixed at compile time (architecture, OS, library availability). For hardware like GPUs, detect at runtime with which accel-query or accel-mgmt probing.
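
A runtime-probing version of the same check might look like this — a sketch, where `accel-query` is the hypothetical vendor CLI from the example above and the `GpuStatus` enum is illustrative:

```rust
use std::process::Command;

pub enum GpuStatus {
    Available(String),
    NotAvailable(String),
}

/// Probe for the accelerator at RUNTIME, on the machine actually running
/// the binary — the deployment target, not the build host.
pub fn detect_gpu() -> GpuStatus {
    match Command::new("accel-query")
        .args(["--query-gpu=name", "--format=csv,noheader"])
        .output()
    {
        Ok(out) if out.status.success() => {
            GpuStatus::Available(String::from_utf8_lossy(&out.stdout).trim().to_string())
        }
        // Covers both "binary not found" and a non-zero exit status.
        _ => GpuStatus::NotAvailable("accel-query missing or failed".into()),
    }
}

fn main() {
    match detect_gpu() {
        GpuStatus::Available(name) => println!("GPU: {name}"),
        GpuStatus::NotAvailable(why) => println!("no GPU ({why})"),
    }
}
```

The binary now behaves correctly on any machine, with or without the accelerator, at the cost of one subprocess spawn at startup.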

Anti-Patterns and Pitfalls

| Anti-Pattern | Why It’s Bad | Fix |
|--------------|--------------|-----|
| No rerun-if-changed | build.rs runs on every build, slowing iteration | Always emit at least cargo::rerun-if-changed=build.rs |
| Network calls in build.rs | Builds fail offline, non-reproducible | Vendor files or use a separate fetch step |
| Writing to src/ | Cargo doesn’t expect source to change during build | Write to OUT_DIR and use include!() |
| Heavy computation | Slows every cargo build | Cache results in OUT_DIR, gate with rerun-if-changed |
| Ignoring cross-compilation | Using Command::new("gcc") without respecting $CC | Use the cc crate, which handles cross-compilation toolchains |
| Panicking without context | unwrap() gives an opaque “build script failed” error | Use .expect("descriptive message") or print cargo::warning= |

Application: Embedding Build Metadata

The project currently uses env!("CARGO_PKG_VERSION") for version reporting. A build script would extend this with richer metadata:

// build.rs — proposed addition
fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=.git/refs");
    println!("cargo::rerun-if-changed=build.rs");

    // Embed git hash for traceability in diagnostic reports
    if let Ok(output) = std::process::Command::new("git")
        .args(["rev-parse", "--short=10", "HEAD"])
        .output()
    {
        let hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
        println!("cargo::rustc-env=APP_GIT_HASH={hash}");
    } else {
        println!("cargo::rustc-env=APP_GIT_HASH=unknown");
    }

    // Embed build timestamp for report correlation
    let timestamp = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs().to_string())
        .unwrap_or_else(|_| "0".into());
    println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");

    // Emit target triple — useful in multi-arch deployment
    let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=APP_TARGET={target}");
}

// src/version.rs — consuming the metadata
pub struct BuildInfo {
    pub version: &'static str,
    pub git_hash: &'static str,
    pub build_epoch: &'static str,
    pub target: &'static str,
}

pub const BUILD_INFO: BuildInfo = BuildInfo {
    version: env!("CARGO_PKG_VERSION"),
    git_hash: env!("APP_GIT_HASH"),
    build_epoch: env!("APP_BUILD_EPOCH"),
    target: env!("APP_TARGET"),
};

impl BuildInfo {
    /// Parse the epoch at runtime when needed (the standard library's
    /// str::parse is not a const fn, so this can't happen in a const context).
    pub fn build_epoch_secs(&self) -> u64 {
        self.build_epoch.parse().unwrap_or(0)
    }
}

impl std::fmt::Display for BuildInfo {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(
            f,
            "DiagTool v{} (git:{} target:{})",
            self.version, self.git_hash, self.target
        )
    }
}

Key insight from the project: The codebase has zero build.rs files across all of its crates because it’s pure Rust — no C dependencies, no codegen, no system library linking. When you need those things, build.rs is the tool, but don’t add it “just because”: the absence of build scripts in a large codebase is a positive signal of clean architecture, not a gap. See Dependency Management for how the project manages its supply chain without custom build logic.

Try It Yourself

  1. Embed git metadata: Create a build.rs that emits APP_GIT_HASH and APP_BUILD_EPOCH as environment variables. Consume them with env!() in main.rs and print the build info. Verify the hash changes after a commit.

  2. Probe a system library: Write a build.rs that uses pkg-config to probe for libz (zlib). Emit cargo::rustc-cfg=has_zlib if found. In main.rs, conditionally print “zlib available” or “zlib not found” based on the cfg flag.

  3. Trigger a build failure intentionally: Remove the rerun-if-changed line from your build.rs and observe how many times it reruns during cargo build and cargo test. Then add it back and compare.

Reproducible Builds

Chapter 1 teaches embedding timestamps and git hashes into binaries. This is useful for traceability, but it conflicts with reproducible builds — the property that building the same source always produces the same binary.

The tension:

| Goal | Achievement | Cost |
|------|-------------|------|
| Traceability | APP_BUILD_EPOCH in binary | Every build is unique — can’t verify integrity |
| Reproducibility | cargo build --locked always produces same output | No build-time metadata |

Practical resolution:

# 1. Always use --locked in CI (ensures Cargo.lock is respected)
cargo build --release --locked
# Fails if Cargo.lock is missing or outdated — catches "works on my machine"

# 2. For reproducibility-critical builds, set SOURCE_DATE_EPOCH
SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) cargo build --release --locked
# Uses the last commit timestamp instead of "now" — same commit = same binary

// In build.rs: respect SOURCE_DATE_EPOCH for reproducibility
let timestamp = std::env::var("SOURCE_DATE_EPOCH")
    .unwrap_or_else(|_| {
        std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .map(|d| d.as_secs().to_string())
            .unwrap_or_else(|_| "0".into())
    });
println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");

Best practice: Use SOURCE_DATE_EPOCH in build scripts so release builds are reproducible (git-hash + locked deps + deterministic timestamp = same binary), while dev builds still get live timestamps for convenience.

Build Pipeline Decision Diagram

flowchart TD
    START["Need compile-time work?"] -->|No| SKIP["No build.rs needed"]
    START -->|Yes| WHAT{"What kind?"}
    
    WHAT -->|"Embed metadata"| P1["Pattern 1\nCompile-Time Constants"]
    WHAT -->|"Compile C/C++"| P2["Pattern 2\ncc crate"]
    WHAT -->|"Code generation"| P3["Pattern 3\nprost-build / tonic-build"]
    WHAT -->|"Link system lib"| P4["Pattern 4\npkg-config"]
    WHAT -->|"Detect features"| P5["Pattern 5\ncfg flags"]
    
    P1 --> RERUN["Always emit\ncargo::rerun-if-changed"]
    P2 --> RERUN
    P3 --> RERUN
    P4 --> RERUN
    P5 --> RERUN
    
    style SKIP fill:#91e5a3,color:#000
    style RERUN fill:#ffd43b,color:#000
    style P1 fill:#e3f2fd,color:#000
    style P2 fill:#e3f2fd,color:#000
    style P3 fill:#e3f2fd,color:#000
    style P4 fill:#e3f2fd,color:#000
    style P5 fill:#e3f2fd,color:#000

🏋️ Exercises

🟢 Exercise 1: Version Stamp

Create a minimal crate with a build.rs that embeds the current git hash and build profile into environment variables. Print them from main(). Verify the output changes between debug and release builds.

Solution
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=build.rs");

    let hash = std::process::Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=GIT_HASH={hash}");
    println!("cargo::rustc-env=BUILD_PROFILE={}", std::env::var("PROFILE").unwrap_or_default());
}
// src/main.rs
fn main() {
    println!("{} v{} (git:{} profile:{})",
        env!("CARGO_PKG_NAME"),
        env!("CARGO_PKG_VERSION"),
        env!("GIT_HASH"),
        env!("BUILD_PROFILE"),
    );
}
cargo run          # shows profile:debug
cargo run --release # shows profile:release

🟡 Exercise 2: Conditional System Library

Write a build.rs that probes for both libz and libpci using pkg-config. Emit a cfg flag for each one found. In main.rs, print which libraries were detected at build time.

Solution
# Cargo.toml
[build-dependencies]
pkg-config = "0.3"
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    if pkg_config::probe_library("zlib").is_ok() {
        println!("cargo::rustc-cfg=has_zlib");
    }
    if pkg_config::probe_library("libpci").is_ok() {
        println!("cargo::rustc-cfg=has_libpci");
    }
}
// src/main.rs
fn main() {
    #[cfg(has_zlib)]
    println!("✅ zlib detected");
    #[cfg(not(has_zlib))]
    println!("❌ zlib not found");

    #[cfg(has_libpci)]
    println!("✅ libpci detected");
    #[cfg(not(has_libpci))]
    println!("❌ libpci not found");
}

Key Takeaways

  • build.rs runs on the host at compile time — always emit cargo::rerun-if-changed to avoid unnecessary rebuilds
  • Use the cc crate (not raw gcc commands) for C/C++ compilation — it handles cross-compilation toolchains correctly
  • Write generated files to OUT_DIR, never to src/ — Cargo doesn’t expect source to change during builds
  • Prefer runtime detection over build-time detection for optional hardware
  • Use SOURCE_DATE_EPOCH to make builds reproducible when embedding timestamps

Cross-Compilation — One Source, Many Targets 🟡

What you’ll learn:

  • How Rust target triples work and how to add them with rustup
  • Building static musl binaries for container/cloud deployment
  • Cross-compiling to ARM (aarch64) with native toolchains, cross, and cargo-zigbuild
  • Setting up GitHub Actions matrix builds for multi-architecture CI

Cross-references: Build Scripts — build.rs runs on HOST during cross-compilation · Release Profiles — LTO and strip settings for cross-compiled release binaries · Windows — Windows cross-compilation and no_std targets

Cross-compilation means building an executable on one machine (the host) that runs on a different machine (the target). The host might be your x86_64 laptop; the target might be an ARM server, a musl-based container, or even a Windows machine. Rust makes this remarkably feasible because rustc is already a cross-compiler — it just needs the right target libraries and a compatible linker.

The Target Triple Anatomy

Every Rust compilation target is identified by a target triple (which often has four parts despite the name):

<arch>-<vendor>-<os>-<env>

Examples:
  x86_64  - unknown - linux   - gnu     ← standard Linux (glibc)
  x86_64  - unknown - linux   - musl    ← static Linux (musl libc)
  aarch64 - unknown - linux   - gnu     ← ARM 64-bit Linux
  x86_64  - pc      - windows - msvc    ← Windows with MSVC
  aarch64 - apple   - darwin            ← macOS on Apple Silicon (three parts)
  x86_64  - unknown - none              ← bare metal (no OS)
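
The triple you compiled for is also observable from inside the program — handy for sanity-checking a cross-build (a minimal sketch; the printed values depend on the active target):

```rust
fn main() {
    // These constants are resolved at compile time from the --target triple.
    println!("arch:   {}", std::env::consts::ARCH);
    println!("os:     {}", std::env::consts::OS);
    println!("family: {}", std::env::consts::FAMILY);
    // cfg! evaluates a target predicate to a bool at compile time.
    println!("musl:   {}", cfg!(target_env = "musl"));
}
```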

List all available targets:

# Show all targets rustc can compile to (~250 targets)
rustc --print target-list | wc -l

# Show installed targets on your system
rustup target list --installed

# Show current default target
rustc -vV | grep host

Installing Toolchains with rustup

# Add target libraries (Rust std for that target)
rustup target add x86_64-unknown-linux-musl
rustup target add aarch64-unknown-linux-gnu

# Now you can cross-compile:
cargo build --target x86_64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu  # needs a linker — see below

What rustup target add gives you: the pre-compiled std, core, and alloc libraries for that target. It does not give you a C linker or C library. For targets that need a C toolchain (most gnu targets), you need to install one separately.

# Ubuntu/Debian — install the cross-linker for aarch64
sudo apt install gcc-aarch64-linux-gnu

# Ubuntu/Debian — install musl toolchain for static builds
sudo apt install musl-tools

# Fedora
sudo dnf install gcc-aarch64-linux-gnu

.cargo/config.toml — Per-Target Configuration

Instead of passing --target on every command, configure defaults in .cargo/config.toml at your project root or home directory:

# .cargo/config.toml

# Default target for this project (optional — omit to keep native default)
# [build]
# target = "x86_64-unknown-linux-musl"

# Linker for aarch64 cross-compilation
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
rustflags = ["-C", "target-feature=+crc"]

# Linker for musl static builds (usually just the system gcc works)
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
rustflags = ["-C", "target-feature=+crc,+aes"]

# ARM 32-bit (Raspberry Pi, embedded)
[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"

# Environment variables for all targets
[env]
# Example: set a custom sysroot
# SYSROOT = "/opt/cross/sysroot"

Config file search order (first match wins):

  1. <project>/.cargo/config.toml
  2. <project>/../.cargo/config.toml (parent directories, walking up)
  3. $CARGO_HOME/config.toml (usually ~/.cargo/config.toml)

Static Binaries with musl

For deploying to minimal containers (Alpine, scratch Docker images) or systems where you can’t control the glibc version, build with musl:

# Install musl target
rustup target add x86_64-unknown-linux-musl
sudo apt install musl-tools  # provides musl-gcc

# Build a fully static binary
cargo build --release --target x86_64-unknown-linux-musl

# Verify it's static
file target/x86_64-unknown-linux-musl/release/diag_tool
# → ELF 64-bit LSB executable, x86-64, statically linked

ldd target/x86_64-unknown-linux-musl/release/diag_tool
# → not a dynamic executable

Static vs dynamic trade-offs:

| Aspect | glibc (dynamic) | musl (static) |
|--------|-----------------|---------------|
| Binary size | Smaller (shared libs) | Larger (~5–15 MB increase) |
| Portability | Needs matching glibc version | Runs anywhere on Linux |
| DNS resolution | Full nsswitch support | Basic resolver (no mDNS) |
| Deployment | Needs sysroot or container | Single binary, no deps |
| Performance | Slightly faster malloc | Slightly slower malloc |
| dlopen() support | Yes | No |

For the project: A static musl build is ideal for deployment to diverse server hardware where you can’t guarantee the host OS version. The single-binary deployment model eliminates “works on my machine” issues.

Cross-Compiling to ARM (aarch64)

ARM servers (AWS Graviton, Ampere Altra, Grace) are increasingly common in data centers. Cross-compiling for aarch64 from an x86_64 host:

# Step 1: Install target + cross-linker
rustup target add aarch64-unknown-linux-gnu
sudo apt install gcc-aarch64-linux-gnu

# Step 2: Configure linker in .cargo/config.toml (see above)

# Step 3: Build
cargo build --release --target aarch64-unknown-linux-gnu

# Step 4: Verify the binary
file target/aarch64-unknown-linux-gnu/release/diag_tool
# → ELF 64-bit LSB executable, ARM aarch64

Running tests for the target architecture requires either:

  • An actual ARM machine
  • QEMU user-mode emulation
# Install QEMU user-mode (runs ARM binaries on x86_64)
sudo apt install qemu-user qemu-user-static binfmt-support

# Now cargo test can run cross-compiled tests through QEMU
cargo test --target aarch64-unknown-linux-gnu
# (Slow — each test binary is emulated. Use for CI validation, not daily dev.)

Configure QEMU as the test runner in .cargo/config.toml:

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
runner = "qemu-aarch64-static -L /usr/aarch64-linux-gnu"

The cross Tool — Docker-Based Cross-Compilation

The cross tool provides a zero-setup cross-compilation experience using pre-configured Docker images:

# Install cross (from crates.io — stable releases)
cargo install cross
# Or from git for latest features (less stable):
# cargo install cross --git https://github.com/cross-rs/cross

# Cross-compile — no toolchain setup needed!
cross build --release --target aarch64-unknown-linux-gnu
cross build --release --target x86_64-unknown-linux-musl
cross build --release --target armv7-unknown-linux-gnueabihf

# Cross-test — QEMU included in the Docker image
cross test --target aarch64-unknown-linux-gnu

How it works: cross replaces cargo and runs the build inside a Docker container that has the correct cross-compilation toolchain pre-installed. Your source is mounted into the container, and the output goes to your normal target/ directory.

Customizing the Docker image with Cross.toml:

# Cross.toml
[target.aarch64-unknown-linux-gnu]
# Use a custom Docker image with extra system libraries
image = "my-registry/cross-aarch64:latest"

# Pre-install system packages
pre-build = [
    "dpkg --add-architecture arm64",
    "apt-get update && apt-get install -y libpci-dev:arm64"
]

[target.aarch64-unknown-linux-gnu.env]
# Pass environment variables into the container
passthrough = ["CI", "GITHUB_TOKEN"]

cross requires Docker (or Podman) but eliminates the need to manually install cross-compilers, sysroots, and QEMU. It’s the recommended approach for CI.

Using Zig as a Cross-Compilation Linker

Zig bundles a C compiler and cross-compilation sysroot for ~40 targets in a single ~40 MB download. This makes it a remarkably convenient cross-linker for Rust:

# Install Zig (single binary, no package manager needed)
# Download from https://ziglang.org/download/
# Or via package manager:
sudo snap install zig --classic --beta  # Ubuntu
brew install zig                          # macOS

# Install cargo-zigbuild
cargo install cargo-zigbuild

Why Zig? The key advantage is glibc version targeting. Zig lets you specify the exact glibc version to link against, ensuring your binary runs on older Linux distributions:

# Build for glibc 2.17 (CentOS 7 / RHEL 7 compatibility)
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17

# Build for aarch64 with glibc 2.28 (Ubuntu 18.04+)
cargo zigbuild --release --target aarch64-unknown-linux-gnu.2.28

# Build for musl (fully static)
cargo zigbuild --release --target x86_64-unknown-linux-musl

The .2.17 suffix is a Zig extension — it tells Zig’s linker to use glibc 2.17 symbol versions, so the resulting binary runs on CentOS 7 and later. No Docker, no sysroot management, no cross-compiler installation.

Comparison: cross vs cargo-zigbuild vs manual:

| Feature | Manual | cross | cargo-zigbuild |
|---|---|---|---|
| Setup effort | High (install toolchain per target) | Low (needs Docker) | Low (single binary) |
| Docker required | No | Yes | No |
| glibc version targeting | No (uses host glibc) | No (uses container glibc) | Yes (exact version) |
| Test execution | Needs QEMU | Included | Needs QEMU |
| macOS → Linux | Difficult | Easy | Easy |
| Linux → macOS | Very difficult | Not supported | Limited |
| Binary size overhead | None | None | None |

CI Pipeline: GitHub Actions Matrix

A production-grade CI workflow that builds for multiple targets:

# .github/workflows/cross-build.yml
name: Cross-Platform Build

on: [push, pull_request]

env:
  CARGO_TERM_COLOR: always

jobs:
  build:
    strategy:
      matrix:
        include:
          - target: x86_64-unknown-linux-gnu
            os: ubuntu-latest
            name: linux-x86_64
          - target: x86_64-unknown-linux-musl
            os: ubuntu-latest
            name: linux-x86_64-static
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-latest
            name: linux-aarch64
            use_cross: true
          - target: x86_64-pc-windows-msvc
            os: windows-latest
            name: windows-x86_64

    runs-on: ${{ matrix.os }}
    name: Build (${{ matrix.name }})

    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}

      - name: Install musl tools
        if: matrix.target == 'x86_64-unknown-linux-musl'
        run: sudo apt-get install -y musl-tools

      - name: Install cross
        if: matrix.use_cross
        run: cargo install cross

      - name: Build (native)
        if: "!matrix.use_cross"
        run: cargo build --release --target ${{ matrix.target }}

      - name: Build (cross)
        if: matrix.use_cross
        run: cross build --release --target ${{ matrix.target }}

      - name: Run tests
        if: "!matrix.use_cross"
        run: cargo test --target ${{ matrix.target }}

      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: diag_tool-${{ matrix.name }}
          path: target/${{ matrix.target }}/release/diag_tool*

Application: Multi-Architecture Server Builds

The binary currently has no cross-compilation setup. For a hardware diagnostics tool deployed across diverse server fleets, the recommended addition:

my_workspace/
├── .cargo/
│   └── config.toml          ← linker configs per target
├── Cross.toml                ← cross tool configuration
└── .github/workflows/
    └── cross-build.yml       ← CI matrix for 3 targets

Recommended .cargo/config.toml:

# .cargo/config.toml for the project

# Release profile optimizations (already in Cargo.toml, shown for reference)
# [profile.release]
# lto = true
# codegen-units = 1
# panic = "abort"
# strip = true

# aarch64 for ARM servers (Graviton, Ampere, Grace)
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"

# musl for portable static binaries
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"

Recommended build targets:

| Target | Use Case | Deploy To |
|---|---|---|
| x86_64-unknown-linux-gnu | Default native build | Standard x86 servers |
| x86_64-unknown-linux-musl | Static binary, any distro | Containers, minimal hosts |
| aarch64-unknown-linux-gnu | ARM servers | Graviton, Ampere, Grace |

Key insight: The [profile.release] in the workspace’s root Cargo.toml already has lto = true, codegen-units = 1, panic = "abort", and strip = true — an ideal release profile for cross-compiled deployment binaries (see Release Profiles for the full impact table). Combined with musl, this produces a single ~10 MB static binary with no runtime dependencies.

Troubleshooting Cross-Compilation

| Symptom | Cause | Fix |
|---|---|---|
| linker 'aarch64-linux-gnu-gcc' not found | Missing cross-linker toolchain | sudo apt install gcc-aarch64-linux-gnu |
| cannot find -lssl (musl target) | System OpenSSL is glibc-linked | Use the vendored feature: openssl = { version = "0.10", features = ["vendored"] } |
| build.rs runs wrong binary | build.rs runs on the HOST, not the target | Check CARGO_CFG_TARGET_OS in build.rs, not cfg!(target_os) |
| Tests pass locally, fail in cross | Docker image missing test fixtures | Mount test data via Cross.toml: [build.env] volumes = ["./TestArea:/TestArea"] |
| undefined reference to __cxa_thread_atexit_impl | Old glibc on target | Use cargo-zigbuild with an explicit glibc version: --target x86_64-unknown-linux-gnu.2.17 |
| Binary segfaults on ARM | Compiled for wrong ARM variant | Verify the target triple matches the hardware: aarch64-unknown-linux-gnu for 64-bit ARM |
| GLIBC_2.XX not found at runtime | Build machine has newer glibc | Use musl for static builds, or cargo-zigbuild for glibc version pinning |
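The host-vs-target build.rs pitfall from the table deserves a concrete sketch. A minimal build.rs along these lines — the libpci link line is a hypothetical example, not part of the project:

```rust
// build.rs (sketch) — build scripts execute on the HOST machine, so
// cfg!(target_os = "...") inside build.rs describes the host. The TARGET
// platform is exposed to build.rs via Cargo environment variables instead.
use std::env;

/// Decide whether a (hypothetical) libpci link flag is needed, based on
/// the *target* OS string Cargo passes in.
fn needs_pci_lib(target_os: &str) -> bool {
    target_os == "linux"
}

fn main() {
    // CARGO_CFG_TARGET_OS is set by Cargo for every build-script invocation.
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    if needs_pci_lib(&target_os) {
        println!("cargo:rustc-link-lib=dylib=pci"); // hypothetical dependency
    }
}
```

Branching on the environment variable keeps the logic correct even when host and target differ, which is the whole point of cross-compilation.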

Cross-Compilation Decision Tree

flowchart TD
    START["Need to cross-compile?"] --> STATIC{"Static binary?"}
    
    STATIC -->|Yes| MUSL["musl target\n--target x86_64-unknown-linux-musl"]
    STATIC -->|No| GLIBC{"Need old glibc?"}
    
    GLIBC -->|Yes| ZIG["cargo-zigbuild\n--target x86_64-unknown-linux-gnu.2.17"]
    GLIBC -->|No| ARCH{"Target arch?"}
    
    ARCH -->|"Same arch"| NATIVE["Native toolchain\nrustup target add + linker"]
    ARCH -->|"ARM/other"| DOCKER{"Docker available?"}
    
    DOCKER -->|Yes| CROSS["cross build\nDocker-based, zero setup"]
    DOCKER -->|No| MANUAL["Manual sysroot\napt install gcc-aarch64-linux-gnu"]
    
    style MUSL fill:#91e5a3,color:#000
    style ZIG fill:#91e5a3,color:#000
    style CROSS fill:#91e5a3,color:#000
    style NATIVE fill:#e3f2fd,color:#000
    style MANUAL fill:#ffd43b,color:#000

🏋️ Exercises

🟢 Exercise 1: Static musl Binary

Build any Rust binary for x86_64-unknown-linux-musl. Verify it’s statically linked using file and ldd.

Solution
rustup target add x86_64-unknown-linux-musl
cargo new hello-static && cd hello-static
cargo build --release --target x86_64-unknown-linux-musl

# Verify
file target/x86_64-unknown-linux-musl/release/hello-static
# Output: ... statically linked ...

ldd target/x86_64-unknown-linux-musl/release/hello-static
# Output: not a dynamic executable

🟡 Exercise 2: GitHub Actions Cross-Build Matrix

Write a GitHub Actions workflow that builds a Rust project for three targets: x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, and aarch64-unknown-linux-gnu. Use a matrix strategy.

Solution
name: Cross-build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target:
          - x86_64-unknown-linux-gnu
          - x86_64-unknown-linux-musl
          - aarch64-unknown-linux-gnu
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}
      - name: Install cross
        run: cargo install cross --locked
      - name: Build
        run: cross build --release --target ${{ matrix.target }}
      - uses: actions/upload-artifact@v4
        with:
          name: binary-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/my-binary

Key Takeaways

  • Rust’s rustc is already a cross-compiler — you just need the right target and linker
  • musl produces fully static binaries with zero runtime dependencies — ideal for containers
  • cargo-zigbuild solves the “which glibc version” problem for enterprise Linux targets
  • cross is the easiest path for ARM and other exotic targets — Docker handles the sysroot
  • Always test with file and ldd to verify the binary matches your deployment target

Benchmarking — Measuring What Matters 🟡

What you’ll learn:

  • Why naive timing with Instant::now() produces unreliable results
  • Statistical benchmarking with Criterion.rs and the lighter Divan alternative
  • Profiling hot spots with perf, flamegraphs, and PGO
  • Setting up continuous benchmarking in CI to catch regressions automatically

Cross-references: Release Profiles — once you find the hot spot, optimize the binary · CI/CD Pipeline — benchmark job in the pipeline · Code Coverage — coverage tells you what’s tested, benchmarks tell you what’s fast

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” — Donald Knuth

The hard part isn’t writing benchmarks — it’s writing benchmarks that produce meaningful, reproducible, actionable numbers. This chapter covers the tools and techniques that get you from “it seems fast” to “we have statistical evidence that PR #347 regressed parsing throughput by 4.2%.”

Why Not std::time::Instant?

The temptation:

// ❌ Naive benchmarking — unreliable results
use std::time::Instant;

fn main() {
    let start = Instant::now();
    let result = parse_device_query_output(&sample_data);
    let elapsed = start.elapsed();
    println!("Parsing took {:?}", elapsed);
    // Problem 1: Compiler may optimize away `result` (dead code elimination)
    // Problem 2: Single sample — no statistical significance
    // Problem 3: CPU frequency scaling, thermal throttling, other processes
    // Problem 4: Cold cache vs warm cache not controlled
}

Problems with manual timing:

  1. Dead code elimination — the compiler may skip the computation entirely if the result isn’t used.
  2. No warm-up — the first run includes cache misses, JIT effects (irrelevant in Rust, but OS page faults apply), and lazy initialization.
  3. No statistical analysis — a single measurement tells you nothing about variance, outliers, or confidence intervals.
  4. No regression detection — you can’t compare against previous runs.
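When a quick hand-rolled check is truly unavoidable, at least mitigate problems 1–3: warm up first, take many samples, route results through black_box, and report the median rather than a single run. A minimal sketch (the sort workload is just a stand-in):

```rust
use std::time::Instant;

/// Median of a set of nanosecond samples — more robust to outliers than
/// a single measurement or the mean.
fn median_nanos(mut samples: Vec<u128>) -> u128 {
    samples.sort_unstable();
    samples[samples.len() / 2]
}

fn main() {
    let data: Vec<u64> = (0..10_000).rev().collect();

    // Warm-up: fault in pages and warm caches before measuring.
    for _ in 0..10 {
        std::hint::black_box(data.clone()).sort_unstable();
    }

    // Many samples instead of one.
    let samples: Vec<u128> = (0..100)
        .map(|_| {
            let mut v = data.clone();
            let start = Instant::now();
            v.sort_unstable();
            std::hint::black_box(&v); // keep the result observable
            start.elapsed().as_nanos()
        })
        .collect();

    println!("median: {} ns", median_nanos(samples));
}
```

Even this still lacks Criterion's outlier detection, confidence intervals, and run-over-run comparison — it is a stopgap, not a replacement.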

Criterion.rs — Statistical Benchmarking

Criterion.rs is the de facto standard for Rust micro-benchmarks. It uses statistical methods to produce reliable measurements and detects performance regressions automatically.

Setup:

# Cargo.toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports", "cargo_bench_support"] }

[[bench]]
name = "parsing_bench"
harness = false  # Use Criterion's harness, not the built-in test harness

A complete benchmark:

#![allow(unused)]
fn main() {
// benches/parsing_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion, BenchmarkId};

/// Data type for parsed GPU information
#[derive(Debug, Clone)]
struct GpuInfo {
    index: u32,
    name: String,
    temp_c: u32,
    power_w: f64,
}

/// The function under test — simulate parsing device-query CSV output
fn parse_gpu_csv(input: &str) -> Vec<GpuInfo> {
    input
        .lines()
        .filter(|line| !line.starts_with('#'))
        .filter_map(|line| {
            let fields: Vec<&str> = line.split(", ").collect();
            if fields.len() >= 4 {
                Some(GpuInfo {
                    index: fields[0].parse().ok()?,
                    name: fields[1].to_string(),
                    temp_c: fields[2].parse().ok()?,
                    power_w: fields[3].parse().ok()?,
                })
            } else {
                None
            }
        })
        .collect()
}

fn bench_parse_gpu_csv(c: &mut Criterion) {
    // Representative test data
    let small_input = "0, Acme Accel-V1-80GB, 32, 65.5\n\
                       1, Acme Accel-V1-80GB, 34, 67.2\n";

    let large_input = (0..64)
        .map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
        .collect::<String>();

    c.bench_function("parse_2_gpus", |b| {
        b.iter(|| parse_gpu_csv(black_box(small_input)))
    });

    c.bench_function("parse_64_gpus", |b| {
        b.iter(|| parse_gpu_csv(black_box(&large_input)))
    });
}

criterion_group!(benches, bench_parse_gpu_csv);
criterion_main!(benches);
}

Running and reading results:

# Run all benchmarks
cargo bench

# Run a specific benchmark by name
cargo bench -- parse_64

# Output:
# parse_2_gpus        time:   [1.2345 µs  1.2456 µs  1.2578 µs]
#                      ▲            ▲           ▲
#                      │       confidence interval
#                   lower 95%    median    upper 95%
#
# parse_64_gpus       time:   [38.123 µs  38.456 µs  38.812 µs]
#                     change: [-1.2345% -0.5678% +0.1234%] (p = 0.12 > 0.05)
#                     No change in performance detected.

What black_box() does: It’s a compiler hint that prevents dead-code elimination and over-aggressive constant folding. The compiler cannot see through black_box, so it must actually compute the result.
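A self-contained illustration, using the standard library's own std::hint::black_box (stable since Rust 1.66):

```rust
use std::hint::black_box;

fn sum_to(n: u64) -> u64 {
    (1..=n).sum()
}

fn main() {
    // Without black_box, the optimizer can const-fold sum_to(1_000) at
    // compile time, and a "benchmark" around it would measure nothing.
    // Wrapping both the input and the output forces the computation to
    // actually happen at runtime.
    let result = black_box(sum_to(black_box(1_000)));
    assert_eq!(result, 500_500);
}
```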

Parameterized Benchmarks and Benchmark Groups

Compare multiple implementations or input sizes:

#![allow(unused)]
fn main() {
// benches/comparison_bench.rs
use criterion::{criterion_group, criterion_main, Criterion, BenchmarkId, Throughput};

fn bench_parsing_strategies(c: &mut Criterion) {
    let mut group = c.benchmark_group("csv_parsing");

    // Test across different input sizes
    for num_gpus in [1, 8, 32, 64, 128] {
        let input = generate_gpu_csv(num_gpus);

        // Set throughput for bytes-per-second reporting
        group.throughput(Throughput::Bytes(input.len() as u64));

        group.bench_with_input(
            BenchmarkId::new("split_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_split(input)),
        );

        group.bench_with_input(
            BenchmarkId::new("regex_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_regex(input)),
        );

        group.bench_with_input(
            BenchmarkId::new("nom_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_nom(input)),
        );
    }
    group.finish();
}

criterion_group!(benches, bench_parsing_strategies);
criterion_main!(benches);
}

Output: Criterion generates an HTML report at target/criterion/report/index.html with violin plots, comparison charts, and regression analysis — open in a browser.

Divan — A Lighter Alternative

Divan is a newer benchmarking framework that uses attribute macros instead of Criterion’s macro DSL:

# Cargo.toml
[dev-dependencies]
divan = "0.1"

[[bench]]
name = "parsing_bench"
harness = false

// benches/parsing_bench.rs
// (GpuInfo and parse_gpu_csv are the same definitions as in the Criterion
// example above, omitted here for brevity.)
use divan::black_box;

const SMALL_INPUT: &str = "0, Acme Accel-V1-80GB, 32, 65.5\n\
                          1, Acme Accel-V1-80GB, 34, 67.2\n";

fn generate_gpu_csv(n: usize) -> String {
    (0..n)
        .map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
        .collect()
}

fn main() {
    divan::main();
}

#[divan::bench]
fn parse_2_gpus() -> Vec<GpuInfo> {
    parse_gpu_csv(black_box(SMALL_INPUT))
}

#[divan::bench(args = [1, 8, 32, 64, 128])]
fn parse_n_gpus(n: usize) -> Vec<GpuInfo> {
    let input = generate_gpu_csv(n);
    parse_gpu_csv(black_box(&input))
}

// Divan output is a clean table:
// ╰─ parse_2_gpus   fastest  │ slowest  │ median   │ mean     │ samples │ iters
//                   1.234 µs │ 1.567 µs │ 1.345 µs │ 1.350 µs │ 100     │ 1600

When to choose Divan over Criterion:

  • Simpler API (attribute macros, less boilerplate)
  • Faster compilation (fewer dependencies)
  • Good for quick perf checks during development

When to choose Criterion:

  • Statistical regression detection across runs
  • HTML reports with charts
  • Established ecosystem, more CI integrations

Profiling with perf and Flamegraphs

Benchmarks tell you how fast — profiling tells you where the time goes.

# Step 1: Build with debug info (release speed, debug symbols)
cargo build --release
# Ensure debug info is available:
# [profile.release]
# debug = true          # Add this temporarily for profiling

# Step 2: Record with perf
perf record --call-graph=dwarf ./target/release/diag_tool --run-diagnostics

# Step 3: Generate a flamegraph
# Install: cargo install flamegraph
# Optional: cargo install addr2line --features=bin (faster symbolization for cargo-flamegraph)
cargo flamegraph --root -- --run-diagnostics
# Opens an interactive SVG flamegraph

# Alternative: use perf + inferno
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg

Reading a flamegraph:

  • Width = time spent in that function (wider = slower)
  • Height = call stack depth (taller ≠ slower, just deeper)
  • Bottom = entry point, Top = leaf functions doing actual work
  • Look for wide plateaus at the top — those are your hot spots

Profile-guided optimization (PGO):

# Step 1: Build with instrumentation
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release

# Step 2: Run representative workloads
./target/release/diag_tool --run-full   # generates profiling data

# Step 3: Merge profiling data
# Use the llvm-profdata that matches rustc's LLVM version:
# $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/bin/llvm-profdata
# Or install it as a rustup component: rustup component add llvm-tools-preview
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data/

# Step 4: Rebuild with profiling feedback
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
# Typical improvement: 5-20% for compute-bound code (parsing, crypto, codegen).
# I/O-bound or syscall-heavy code (like this diagnostics tool) will see much less
# benefit because the CPU is mostly waiting, not executing hot loops.

Tip: Before spending time on PGO, ensure your release profile already has LTO enabled — it typically delivers a bigger win for less effort.

hyperfine — Quick End-to-End Timing

hyperfine benchmarks entire commands, not individual functions. It’s perfect for measuring overall binary performance:

# Install
cargo install hyperfine
# Or: sudo apt install hyperfine  (Ubuntu 23.04+)

# Basic benchmark
hyperfine './target/release/diag_tool --run-diagnostics'

# Compare two implementations
hyperfine './target/release/diag_tool_v1 --run-diagnostics' \
          './target/release/diag_tool_v2 --run-diagnostics'

# Warm-up runs + minimum iterations
hyperfine --warmup 3 --min-runs 10 './target/release/diag_tool --run-all'

# Export results as JSON for CI comparison
hyperfine --export-json bench.json './target/release/diag_tool --run-all'

When to use hyperfine vs Criterion:

  • hyperfine: whole-binary timing, comparing before/after a refactor, I/O-bound workloads
  • Criterion: micro-benchmarks of individual functions, statistical regression detection

Continuous Benchmarking in CI

Detect performance regressions before they ship:

# .github/workflows/bench.yml
name: Benchmarks

on:
  pull_request:
    paths: ['**/*.rs', 'Cargo.toml', 'Cargo.lock']

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable

      - name: Run benchmarks
        # Requires criterion = { features = ["cargo_bench_support"] } for --output-format
        run: cargo bench -- --output-format bencher | tee bench_output.txt

      - name: Store benchmark result
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'cargo'
          output-file-path: bench_output.txt
          github-token: ${{ secrets.GITHUB_TOKEN }}
          auto-push: true
          alert-threshold: '120%'    # Alert if 20% slower
          comment-on-alert: true
          fail-on-alert: true        # Block PR if regression detected

Key CI considerations:

  • Use dedicated benchmark runners (not shared CI) for consistent results
  • Pin the runner to a specific machine type if using cloud CI
  • Store historical data to detect gradual regressions
  • Set thresholds based on your workload’s tolerance (5% for hot paths, 20% for cold)

Application: Parsing Performance

The project has several performance-sensitive parsing paths that would benefit from benchmarks:

| Parsing Hot Spot | Crate | Why It Matters |
|---|---|---|
| accelerator-query CSV/XML output | device_diag | Called per-GPU, up to 8× per run |
| Sensor event parsing | event_log | Thousands of records on busy servers |
| PCIe topology JSON | topology_lib | Complex nested structures, golden-file validated |
| Report JSON serialization | diag_framework | Final report output, size-sensitive |
| Config JSON loading | config_loader | Startup latency |

Recommended first benchmark — the topology parser, which already has golden-file test data:

#![allow(unused)]
fn main() {
// topology_lib/benches/parse_bench.rs (proposed)
use criterion::{criterion_group, criterion_main, Criterion, Throughput};
use std::fs;

fn bench_topology_parse(c: &mut Criterion) {
    let mut group = c.benchmark_group("topology_parse");

    for golden_file in ["S2001", "S1015", "S1035", "S1080"] {
        let path = format!("tests/test_data/{golden_file}.json");
        let data = fs::read_to_string(&path).expect("golden file not found");
        group.throughput(Throughput::Bytes(data.len() as u64));

        group.bench_function(golden_file, |b| {
            b.iter(|| {
                topology_lib::TopologyProfile::from_json_str(
                    criterion::black_box(&data)
                )
            });
        });
    }
    group.finish();
}

criterion_group!(benches, bench_topology_parse);
criterion_main!(benches);
}

Try It Yourself

  1. Write a Criterion benchmark: Pick any parsing function in your codebase. Create a benches/ directory, set up a Criterion benchmark that measures throughput in bytes/second. Run cargo bench and examine the HTML report.

  2. Generate a flamegraph: Build your project with debug = true in [profile.release], then run cargo flamegraph -- <your-args>. Identify the three widest stacks at the top of the flamegraph — those are your hot spots.

  3. Compare with hyperfine: Install hyperfine and benchmark the overall execution time of your binary with different flags. Compare it to the per-function times from Criterion. Where does the time go that Criterion doesn’t see? (Answer: I/O, syscalls, process startup.)

Benchmark Tool Selection

flowchart TD
    START["Want to measure performance?"] --> WHAT{"What level?"}

    WHAT -->|"Single function"| CRITERION["Criterion.rs\nStatistical, regression detection"]
    WHAT -->|"Quick function check"| DIVAN["Divan\nLighter, attribute macros"]
    WHAT -->|"Whole binary"| HYPERFINE["hyperfine\nEnd-to-end, wall-clock"]
    WHAT -->|"Find hot spots"| PERF["perf + flamegraph\nCPU sampling profiler"]

    CRITERION --> CI_BENCH["Continuous benchmarking\nin GitHub Actions"]
    PERF --> OPTIMIZE["Profile-Guided\nOptimization (PGO)"]

    style CRITERION fill:#91e5a3,color:#000
    style DIVAN fill:#91e5a3,color:#000
    style HYPERFINE fill:#e3f2fd,color:#000
    style PERF fill:#ffd43b,color:#000
    style CI_BENCH fill:#e3f2fd,color:#000
    style OPTIMIZE fill:#ffd43b,color:#000

🏋️ Exercises

🟢 Exercise 1: First Criterion Benchmark

Create a crate with a function that sorts a Vec<u64> of 10,000 random elements. Write a Criterion benchmark for it, then switch to .sort_unstable() and observe the performance difference in the HTML report.

Solution
# Cargo.toml
[[bench]]
name = "sort_bench"
harness = false

[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
rand = "0.8"

#![allow(unused)]
fn main() {
// benches/sort_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use rand::Rng;

fn generate_data(n: usize) -> Vec<u64> {
    let mut rng = rand::thread_rng();
    (0..n).map(|_| rng.gen()).collect()
}

fn bench_sort(c: &mut Criterion) {
    let mut group = c.benchmark_group("sort-10k");

    group.bench_function("stable", |b| {
        b.iter_batched(
            || generate_data(10_000),
            |mut data| { data.sort(); black_box(&data); },
            criterion::BatchSize::SmallInput,
        )
    });

    group.bench_function("unstable", |b| {
        b.iter_batched(
            || generate_data(10_000),
            |mut data| { data.sort_unstable(); black_box(&data); },
            criterion::BatchSize::SmallInput,
        )
    });

    group.finish();
}

criterion_group!(benches, bench_sort);
criterion_main!(benches);
}

cargo bench
open target/criterion/sort-10k/report/index.html

🟡 Exercise 2: Flamegraph Hot Spot

Build a project with debug = true in [profile.release], then generate a flamegraph. Identify the top 3 widest stacks.

Solution
# Cargo.toml
[profile.release]
debug = true  # Keep symbols for flamegraph

cargo install flamegraph
cargo flamegraph -- <your-args>
# (cargo-flamegraph builds with the release profile by default)
# Opens flamegraph.svg in browser
# The widest stacks at the top are your hot spots

Key Takeaways

  • Never benchmark with Instant::now() — use Criterion.rs for statistical rigor and regression detection
  • black_box() prevents the compiler from optimizing away your benchmark target
  • hyperfine measures wall-clock time for the whole binary; Criterion measures individual functions — use both
  • Flamegraphs show where time is spent; benchmarks show how much time is spent
  • Continuous benchmarking in CI catches performance regressions before they ship

Code Coverage — Seeing What Tests Miss 🟢

What you’ll learn:

  • Source-based coverage with cargo-llvm-cov (the most accurate Rust coverage tool)
  • Quick coverage checks with cargo-tarpaulin and Mozilla’s grcov
  • Setting up coverage gates in CI with Codecov and Coveralls
  • A coverage-guided testing strategy that prioritizes high-risk blind spots

Cross-references: Miri and Sanitizers — coverage finds untested code, Miri finds UB in tested code · Benchmarking — coverage shows what’s tested, benchmarks show what’s fast · CI/CD Pipeline — coverage gate in the pipeline

Code coverage measures which lines, branches, or functions your tests actually execute. It doesn’t prove correctness (a covered line can still have bugs), but it reliably reveals blind spots — code paths that no test exercises at all.

With 1,006 tests across many crates, the project has substantial test investment. Coverage analysis answers: “Is that investment reaching the code that matters?”

Source-Based Coverage with llvm-cov

Rust uses LLVM, which provides source-based coverage instrumentation — the most accurate coverage method available. The recommended tool is cargo-llvm-cov:

# Install
cargo install cargo-llvm-cov

# Or via rustup component (for the raw llvm tools)
rustup component add llvm-tools-preview

Basic usage:

# Run tests and show per-file coverage summary
cargo llvm-cov

# Generate HTML report (browsable, line-by-line highlighting)
cargo llvm-cov --html
# Output: target/llvm-cov/html/index.html

# Generate LCOV format (for CI integrations)
cargo llvm-cov --lcov --output-path lcov.info

# Workspace-wide coverage (all crates)
cargo llvm-cov --workspace

# Include only specific packages
cargo llvm-cov --package accel_diag --package topology_lib

# Coverage including doc tests
cargo llvm-cov --doctests

Reading the HTML report:

target/llvm-cov/html/index.html
├── Filename          │ Function │ Line   │ Branch │ Region
├─ accel_diag/src/lib.rs │  78.5%  │ 82.3% │ 61.2% │  74.1%
├─ sel_mgr/src/parse.rs│  95.2%  │ 96.8% │ 88.0% │  93.5%
├─ topology_lib/src/.. │  91.0%  │ 93.4% │ 79.5% │  89.2%
└─ ...

Green = covered    Red = not covered    Yellow = partially covered (branch)

Coverage types explained:

| Type | What It Measures | Significance |
|---|---|---|
| Line coverage | Which source lines were executed | Basic "was this code reached?" |
| Branch coverage | Which if/match arms were taken | Catches untested conditions |
| Function coverage | Which functions were called | Finds dead code |
| Region coverage | Which code regions (sub-expressions) were hit | Most granular |
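The line-vs-branch distinction is easy to see in miniature. In this sketch (the thresholds are invented for illustration), a suite that exercises only two of the three match arms can still look well covered line-by-line, while branch coverage flags the unexecuted arm:

```rust
// Sketch: a three-arm classifier. If tests only hit two arms, line coverage
// can still report the function as covered; branch coverage reveals that
// the "critical" arm never ran.
fn classify_temp(temp_c: u32) -> &'static str {
    match temp_c {
        0..=79 => "ok",
        80..=94 => "warm",
        _ => "critical",
    }
}

fn main() {
    // A test suite that stops here leaves the third arm unexercised:
    assert_eq!(classify_temp(40), "ok");
    assert_eq!(classify_temp(85), "warm");
}
```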

cargo-tarpaulin — The Quick Path

cargo-tarpaulin is a Linux-specific coverage tool that’s simpler to set up (no LLVM components needed):

# Install
cargo install cargo-tarpaulin

# Basic coverage report
cargo tarpaulin

# HTML output
cargo tarpaulin --out Html

# With specific options
cargo tarpaulin \
    --workspace \
    --timeout 120 \
    --out Xml Html \
    --output-dir coverage/ \
    --exclude-files "*/tests/*" "*/benches/*" \
    --ignore-panics

# Skip certain crates
cargo tarpaulin --workspace --exclude diag_tool  # exclude the binary crate

tarpaulin vs llvm-cov comparison:

| Feature | cargo-llvm-cov | cargo-tarpaulin |
|---|---|---|
| Accuracy | Source-based (most accurate) | Ptrace-based (occasional overcounting) |
| Platform | Any (LLVM-based) | Linux only |
| Branch coverage | Yes | Limited |
| Doc tests | Yes | No |
| Setup | Needs llvm-tools-preview | Self-contained |
| Speed | Faster (compile-time instrumentation) | Slower (ptrace overhead) |
| Stability | Very stable | Occasional false positives |

Recommendation: Use cargo-llvm-cov for accuracy. Use cargo-tarpaulin when you need a quick check without installing LLVM tools.

grcov — Mozilla’s Coverage Tool

grcov is Mozilla’s coverage aggregator. It consumes raw LLVM profiling data and produces reports in multiple formats:

# Install
cargo install grcov

# Step 1: Build with coverage instrumentation
export RUSTFLAGS="-Cinstrument-coverage"
export LLVM_PROFILE_FILE="target/coverage/%p-%m.profraw"
cargo build --tests

# Step 2: Run tests (generates .profraw files)
cargo test

# Step 3: Aggregate with grcov
grcov target/coverage/ \
    --binary-path target/debug/ \
    --source-dir . \
    --output-types html,lcov \
    --output-path target/coverage/report \
    --branch \
    --ignore-not-existing \
    --ignore "*/tests/*" \
    --ignore "*/.cargo/*"

# Step 4: View report
open target/coverage/report/html/index.html

When to use grcov: It’s most useful when you need to merge coverage from multiple test runs (e.g., unit tests + integration tests + fuzz tests) into a single report.
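
A sketch of that multi-run merge, with illustrative paths and the same RUSTFLAGS instrumentation setup shown above:

```shell
# Sketch (illustrative paths): two test phases, one merged report.
export RUSTFLAGS="-Cinstrument-coverage"

# Unit tests write one batch of profiles...
LLVM_PROFILE_FILE="target/coverage/unit-%p-%m.profraw" cargo test --lib

# ...integration tests write another batch into the same directory.
LLVM_PROFILE_FILE="target/coverage/integ-%p-%m.profraw" cargo test --test '*'

# grcov picks up every .profraw under target/coverage/ and merges them.
grcov target/coverage/ \
    --binary-path target/debug/ \
    --source-dir . \
    --output-types html \
    --output-path target/coverage/merged
```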

Coverage in CI: Codecov and Coveralls

Upload coverage data to a tracking service for historical trends and PR annotations:

# .github/workflows/coverage.yml
name: Code Coverage

on: [push, pull_request]

jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview

      - name: Install cargo-llvm-cov
        uses: taiki-e/install-action@cargo-llvm-cov

      - name: Generate coverage
        run: cargo llvm-cov --workspace --lcov --output-path lcov.info

      - name: Upload to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: lcov.info
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: true

      # Optional: enforce minimum coverage
      - name: Check coverage threshold
        run: |
          cargo llvm-cov --workspace --fail-under-lines 80
          # Fails the build if line coverage drops below 80%

Coverage gates — enforce minimums per crate by reading the JSON output:

# Get per-crate coverage as JSON
cargo llvm-cov --workspace --json | jq '.data[0].totals.lines.percent'

# Fail if below threshold
cargo llvm-cov --workspace --fail-under-lines 80
cargo llvm-cov --workspace --fail-under-functions 70
cargo llvm-cov --workspace --fail-under-regions 60

Coverage-Guided Testing Strategy

Coverage numbers alone are meaningless without a strategy. Here’s how to use coverage data effectively:

Step 1: Triage by risk

High coverage, high risk     → ✅ Good — maintain it
High coverage, low risk      → 🔄 Possibly over-tested — skip if slow
Low coverage, high risk      → 🔴 Write tests NOW — this is where bugs hide
Low coverage, low risk       → 🟡 Track but don't panic

Step 2: Focus on branch coverage, not line coverage

// 100% line coverage, 50% branch coverage — still risky!
pub fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {       // ← tested with temp=110 → Critical
        ThermalState::Critical
    } else if temp_c > 85 { // ← tested with temp=90 → Warning
        ThermalState::Warning
    } else if temp_c < -10 { // ← NEVER TESTED → sensor error case missed
        ThermalState::SensorError
    } else {
        ThermalState::Normal  // ← tested with temp=25 → Normal
    }
}

Step 3: Exclude noise

# Exclude test code from coverage (it's always "covered")
cargo llvm-cov --workspace --ignore-filename-regex 'tests?\.rs$|benches/'

# Exclude generated code
cargo llvm-cov --workspace --ignore-filename-regex 'target/'

In code, mark untestable sections:

// Coverage tools recognize this pattern
#[cfg(not(tarpaulin_include))]  // tarpaulin
fn unreachable_hardware_path() {
    // This path requires actual GPU hardware to trigger
}

// For llvm-cov, use a more targeted approach:
// Simply accept that some paths need integration/hardware tests,
// not unit tests. Track them in a coverage exceptions list.

Complementary Testing Tools

proptest — Property-Based Testing finds edge cases that hand-written tests miss:

[dev-dependencies]
proptest = "1"

use proptest::prelude::*;

proptest! {
    #[test]
    fn parse_never_panics(input in "\\PC*") {
        // proptest generates thousands of random strings
        // If parse_gpu_csv panics on any input, the test fails
        // and proptest minimizes the failing case for you.
        let _ = parse_gpu_csv(&input);
    }

    #[test]
    fn temperature_roundtrip(raw in 0u16..4096) {
        let temp = Temperature::from_raw(raw);
        let md = temp.millidegrees_c();
        // Property: millidegrees should always be derivable from raw
        assert_eq!(md, (raw as i32) * 625 / 10);
    }
}

insta — Snapshot Testing for large structured outputs (JSON, text reports):

[dev-dependencies]
insta = { version = "1", features = ["json"] }

#[test]
fn test_der_report_format() {
    let report = generate_der_report(&test_results);
    // First run: creates a snapshot file. Subsequent runs: compares against it.
    // Run `cargo insta review` to accept changes interactively.
    insta::assert_json_snapshot!(report);
}

When to add proptest/insta: If your unit tests are all “happy path” examples, proptest will find the edge cases you missed. If you’re testing large output formats (JSON reports, DER records), insta snapshots are faster to write and maintain than hand-written assertions.

Application: 1,000+ Tests Coverage Map

The project has 1,000+ tests but no coverage tracking. Adding it reveals how that testing investment is distributed across the workspace. Uncovered paths are prime candidates for Miri and sanitizer verification:

Recommended coverage configuration:

# Quick workspace coverage (proposed CI command)
cargo llvm-cov --workspace \
    --ignore-filename-regex 'tests?\.rs$' \
    --fail-under-lines 75 \
    --html

# Per-crate coverage for targeted improvement
for crate in accel_diag event_log topology_lib network_diag compute_diag fan_diag; do
    echo "=== $crate ==="
    cargo llvm-cov --package "$crate" --json 2>/dev/null | \
        jq -r '.data[0].totals | "Lines: \(.lines.percent | round)%  Branches: \(.branches.percent | round)%"'
done

Expected high-coverage crates (based on test density):

  • topology_lib — 922-line golden-file test suite
  • event_log — registry with create_test_record() helpers
  • cable_diag — make_test_event() / make_test_context() patterns

Expected coverage gaps (based on code inspection):

  • Error handling arms in IPMI communication paths
  • GPU hardware-specific branches (require actual GPU)
  • dmesg parsing edge cases (platform-dependent output)

The 80/20 rule of coverage: Getting from 0% to 80% coverage is straightforward. Getting from 80% to 95% requires increasingly contrived test scenarios. Getting from 95% to 100% requires #[cfg(not(...))] exclusions and is rarely worth the effort. Target 80% line coverage and 70% branch coverage as a practical floor.

Troubleshooting Coverage

| Symptom | Cause | Fix |
|---------|-------|-----|
| llvm-cov shows 0% for all files | Instrumentation not applied | Ensure you run cargo llvm-cov, not cargo test + llvm-cov separately |
| Coverage counts unreachable!() as uncovered | Those branches exist in compiled code | Use #[cfg(not(tarpaulin_include))] or add to exclusion regex |
| Test binary crashes under coverage | Instrumentation + sanitizer conflict | Don’t combine cargo llvm-cov with -Zsanitizer=address; run them separately |
| Coverage differs between llvm-cov and tarpaulin | Different instrumentation techniques | Use llvm-cov as source of truth (compiler-native); file issues for large discrepancies |
| error: profraw file is malformed | Test binary crashed mid-execution | Fix the test failure first; profraw files are corrupt when the process exits abnormally |
| Branch coverage seems impossibly low | Optimizer creates branches for match arms, unwrap, etc. | Focus on line coverage for practical thresholds; branch coverage is inherently lower |

Try It Yourself

  1. Measure coverage on your project: Run cargo llvm-cov --workspace --html and open the report. Find the three files with the lowest coverage. Are they untested, or inherently hard to test (hardware-dependent code)?

  2. Set a coverage gate: Add cargo llvm-cov --workspace --fail-under-lines 60 to your CI. Intentionally comment out a test and verify CI fails. Then raise the threshold to your project’s actual coverage level minus 2%.

  3. Branch vs. line coverage: Write a function with a 3-arm match and test only 2 arms. Compare line coverage (may show 66%) vs. branch coverage (may show 50%). Which metric is more useful for your project?

Coverage Tool Selection

flowchart TD
    START["Need code coverage?"] --> ACCURACY{"Priority?"}
    
    ACCURACY -->|"Most accurate"| LLVM["cargo-llvm-cov\nSource-based, compiler-native"]
    ACCURACY -->|"Quick check"| TARP["cargo-tarpaulin\nLinux only, fast"]
    ACCURACY -->|"Multi-run aggregate"| GRCOV["grcov\nMozilla, combines profiles"]
    
    LLVM --> CI_GATE["CI coverage gate\n--fail-under-lines 80"]
    TARP --> CI_GATE
    
    CI_GATE --> UPLOAD{"Upload to?"}
    UPLOAD -->|"Codecov"| CODECOV["codecov/codecov-action"]
    UPLOAD -->|"Coveralls"| COVERALLS["coverallsapp/github-action"]
    
    style LLVM fill:#91e5a3,color:#000
    style TARP fill:#e3f2fd,color:#000
    style GRCOV fill:#e3f2fd,color:#000
    style CI_GATE fill:#ffd43b,color:#000

🏋️ Exercises

🟢 Exercise 1: First Coverage Report

Install cargo-llvm-cov, run it on any Rust project, and open the HTML report. Find the three files with the lowest line coverage.

Solution
cargo install cargo-llvm-cov
cargo llvm-cov --workspace --html --open
# The report sorts files by coverage — lowest at the bottom
# Look for files under 50% — those are your blind spots

🟡 Exercise 2: CI Coverage Gate

Add a coverage gate to a GitHub Actions workflow that fails if line coverage drops below 60%. Verify it works by commenting out a test.

Solution
# .github/workflows/coverage.yml
name: Coverage
on: [push, pull_request]
jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview
      - run: cargo install cargo-llvm-cov
      - run: cargo llvm-cov --workspace --fail-under-lines 60

Comment out a test, push, and watch the workflow fail.

Key Takeaways

  • cargo-llvm-cov is the most accurate coverage tool for Rust — it uses the compiler’s own instrumentation
  • Coverage doesn’t prove correctness, but zero coverage proves zero testing — use it to find blind spots
  • Set a coverage gate in CI (e.g., --fail-under-lines 80) to prevent regressions
  • Don’t chase 100% coverage — focus on high-risk code paths (error handling, unsafe, parsing)
  • Never combine coverage instrumentation with sanitizers in the same run

Miri, Valgrind, and Sanitizers — Verifying Unsafe Code 🔴

What you’ll learn:

  • Miri as a MIR interpreter — what it catches (aliasing, UB, leaks) and what it can’t (FFI, syscalls)
  • Valgrind memcheck, Helgrind (data races), Callgrind (profiling), and Massif (heap)
  • LLVM sanitizers: ASan, MSan, TSan, LSan with nightly -Zbuild-std
  • cargo-fuzz for crash discovery and loom for concurrency model checking
  • A decision tree for choosing the right verification tool

Cross-references: Code Coverage — coverage finds untested paths, Miri verifies the tested ones · no_std & Features — no_std code often requires unsafe that Miri can verify · CI/CD Pipeline — Miri job in the pipeline

Safe Rust guarantees memory safety and data-race freedom at compile time. But the moment you write unsafe — for FFI, hand-rolled data structures, or performance tricks — those guarantees become your responsibility. This chapter covers the tools that verify your unsafe code actually upholds the safety contracts it claims.

Miri — An Interpreter for Unsafe Rust

Miri is an interpreter for Rust’s Mid-level Intermediate Representation (MIR). Instead of compiling to machine code, Miri executes your program step-by-step with exhaustive checks for undefined behavior at every operation.

# Install Miri (nightly-only component)
rustup +nightly component add miri

# Run your test suite under Miri
cargo +nightly miri test

# Run a specific binary under Miri
cargo +nightly miri run

# Run a specific test
cargo +nightly miri test -- test_name

How Miri works:

Source → rustc → MIR → Miri interprets MIR
                        │
                        ├─ Tracks every pointer's provenance
                        ├─ Validates every memory access
                        ├─ Checks alignment at every deref
                        ├─ Detects use-after-free
                        ├─ Detects data races (with threads)
                        └─ Enforces Stacked Borrows / Tree Borrows rules

What Miri Catches (and What It Cannot)

Miri detects:

| Category | Example | Would Crash at Runtime? |
|----------|---------|-------------------------|
| Out-of-bounds access | ptr.add(100).read() past allocation | Sometimes (depends on page layout) |
| Use after free | Reading a dropped Box through raw pointer | Sometimes (depends on allocator) |
| Double free | Calling drop_in_place twice | Usually |
| Unaligned access | (ptr as *const u32).read() on odd address | On some architectures |
| Invalid values | transmute::<u8, bool>(2) | Silently wrong |
| Dangling references | &*ptr where ptr is freed | No (silent corruption) |
| Data races | Two threads, one writing, no synchronization | Intermittent, hard to reproduce |
| Stacked Borrows violation | Aliasing &mut references | No (silent corruption) |

Miri does NOT detect:

| Limitation | Why |
|------------|-----|
| Logic bugs | Miri checks memory safety, not correctness |
| Concurrency deadlocks | Miri checks data races, not livelocks |
| Performance issues | Interpretation is 10-100× slower than native |
| OS/hardware interaction | Miri can’t emulate syscalls, device I/O |
| All FFI calls | Can’t interpret C code (only Rust MIR) |
| Exhaustive path coverage | Only tests the paths your test suite reaches |

A concrete example — catching unsound code that “works” in practice:

#[cfg(test)]
mod tests {
    #[test]
    fn test_miri_catches_ub() {
        // This "works" in release builds but is undefined behavior
        let mut v = vec![1, 2, 3];
        let ptr = v.as_ptr();

        // Push may reallocate, invalidating ptr
        v.push(4);

        // ❌ UB: ptr may be dangling after reallocation
        // Miri will catch this even if the allocator happens to
        // not move the buffer.
        // let _val = unsafe { *ptr };
        // Error: Miri would report:
        //   "pointer to alloc1234 was dereferenced after this
        //    allocation got freed"

        // ✅ Correct: get a fresh pointer after mutation
        let ptr = v.as_ptr();
        let val = unsafe { *ptr };
        assert_eq!(val, 1);
    }
}

Running Miri on a Real Crate

Practical Miri workflow for a crate with unsafe:

# Step 1: Run all tests under Miri
cargo +nightly miri test 2>&1 | tee miri_output.txt

# Step 2: If Miri reports errors, isolate them
cargo +nightly miri test -- failing_test_name

# Step 3: Use Miri's backtrace for diagnosis
MIRIFLAGS="-Zmiri-backtrace=full" cargo +nightly miri test

# Step 4: Choose a borrow model
# Stacked Borrows (default, stricter):
cargo +nightly miri test

# Tree Borrows (experimental, more permissive):
MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test

Miri flags for common scenarios:

# Disable isolation (allow file system access, env vars)
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test

# Memory leak detection is ON by default in Miri.
# To suppress leak errors (e.g., for intentional leaks):
# MIRIFLAGS="-Zmiri-ignore-leaks" cargo +nightly miri test

# Seed the RNG for reproducible results with randomized tests
MIRIFLAGS="-Zmiri-seed=42" cargo +nightly miri test

# Enable strict provenance checking
MIRIFLAGS="-Zmiri-strict-provenance" cargo +nightly miri test

# Multiple flags
MIRIFLAGS="-Zmiri-disable-isolation -Zmiri-backtrace=full -Zmiri-strict-provenance" \
    cargo +nightly miri test

Miri in CI:

# .github/workflows/miri.yml
name: Miri
on: [push, pull_request]

jobs:
  miri:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri

      - name: Run Miri
        run: cargo miri test --workspace
        env:
          MIRIFLAGS: "-Zmiri-backtrace=full"
          # Leak checking is on by default.
          # Skip tests that use system calls Miri can't handle
          # (file I/O, networking, etc.)

Performance note: Miri is 10-100× slower than native execution. A test suite that runs in 5 seconds natively may take 5 minutes under Miri. In CI, run Miri on a focused subset: crates with unsafe code only.
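
The subset advice above can be made concrete with cfg attributes — a sketch with illustrative names, not the project's actual tests:

```rust
// Shaping a test suite so Miri runs only what it can interpret quickly.

// Pure parsing logic: cheap enough to run under Miri on every push.
fn parse_record(line: &str) -> Option<(u8, u8)> {
    let mut parts = line.split(',');
    Some((parts.next()?.parse().ok()?, parts.next()?.parse().ok()?))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_pair() {
        assert_eq!(parse_record("3,7"), Some((3, 7)));
    }

    // Filesystem-heavy test: skipped under Miri via cfg_attr.
    #[test]
    #[cfg_attr(miri, ignore)]
    fn parses_large_fixture() {
        // would read a multi-megabyte golden file here
    }
}

fn main() {
    assert_eq!(parse_record("3,7"), Some((3, 7)));
    assert_eq!(parse_record("oops"), None); // non-numeric field → None
}
```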

Valgrind and Its Rust Integration

Valgrind is the classic C/C++ memory checker. It works on compiled Rust binaries too, checking for memory errors at the machine-code level.

# Install Valgrind
sudo apt install valgrind  # Debian/Ubuntu
sudo dnf install valgrind  # Fedora

# Build with debug info (Valgrind needs symbols)
cargo build --tests
# or for release with debug info:
# cargo build --release
# [profile.release]
# debug = true

# Run a specific test binary under Valgrind
valgrind --tool=memcheck \
    --leak-check=full \
    --show-leak-kinds=all \
    --track-origins=yes \
    ./target/debug/deps/my_crate-abc123 --test-threads=1

# Run the main binary
valgrind --tool=memcheck \
    --leak-check=full \
    --error-exitcode=1 \
    ./target/debug/diag_tool --run-diagnostics

Valgrind tools beyond memcheck:

| Tool | Command | What It Detects |
|------|---------|-----------------|
| Memcheck | --tool=memcheck | Memory leaks, use-after-free, buffer overflows |
| Helgrind | --tool=helgrind | Data races and lock-order violations |
| DRD | --tool=drd | Data races (different detection algorithm) |
| Callgrind | --tool=callgrind | CPU instruction profiling (path-level) |
| Massif | --tool=massif | Heap memory profiling over time |
| Cachegrind | --tool=cachegrind | Cache miss analysis |

Using Callgrind for instruction-level profiling:

# Record instruction counts (more stable than wall-clock time)
valgrind --tool=callgrind \
    --callgrind-out-file=callgrind.out \
    ./target/release/diag_tool --run-diagnostics

# Visualize with KCachegrind
kcachegrind callgrind.out
# or the text-based alternative:
callgrind_annotate callgrind.out | head -100

Miri vs Valgrind — when to use which:

| Aspect | Miri | Valgrind |
|--------|------|----------|
| Checks Rust-specific UB | ✅ Stacked/Tree Borrows | ❌ Not aware of Rust rules |
| Checks C FFI code | ❌ Can’t interpret C | ✅ Checks all machine code |
| Needs nightly | ✅ Yes | ❌ No |
| Speed | 10-100× slower | 10-50× slower |
| Platform | Any (interprets MIR) | Linux, macOS (runs native code) |
| Data race detection | ✅ Yes | ✅ Yes (Helgrind/DRD) |
| Leak detection | ✅ Yes | ✅ Yes (more thorough) |
| False positives | Very rare | Occasional (especially with allocators) |

Use both:

  • Miri for pure-Rust unsafe code (Stacked Borrows, provenance)
  • Valgrind for FFI-heavy code and whole-program leak analysis

AddressSanitizer, MemorySanitizer, ThreadSanitizer

LLVM sanitizers are compile-time instrumentation passes that insert runtime checks. They’re faster than Valgrind (2-5× overhead vs 10-50×) and catch different classes of bugs.

# Required: install Rust source for rebuilding std with sanitizer instrumentation
rustup component add rust-src --toolchain nightly
# AddressSanitizer (ASan) — buffer overflows, use-after-free, stack overflows
RUSTFLAGS="-Zsanitizer=address" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# MemorySanitizer (MSan) — uninitialized memory reads
RUSTFLAGS="-Zsanitizer=memory" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# ThreadSanitizer (TSan) — data races
RUSTFLAGS="-Zsanitizer=thread" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# LeakSanitizer (LSan) — memory leaks (included in ASan by default)
RUSTFLAGS="-Zsanitizer=leak" \
    cargo +nightly test --target x86_64-unknown-linux-gnu

Note: ASan, MSan, and TSan require -Zbuild-std to rebuild the standard library with sanitizer instrumentation. LSan does not.

Sanitizer comparison:

| Sanitizer | Overhead | Catches | Nightly? | -Zbuild-std? |
|-----------|----------|---------|----------|--------------|
| ASan | 2× memory, 2× CPU | Buffer overflow, use-after-free, stack overflow | Yes | Yes |
| MSan | 3× memory, 3× CPU | Uninitialized reads | Yes | Yes |
| TSan | 5-10× memory, 5× CPU | Data races | Yes | Yes |
| LSan | Minimal | Memory leaks | Yes | No |

Practical example — catching a data race with TSan:

use std::sync::Arc;
use std::thread;

fn racy_counter() -> u64 {
    // ❌ UB: unsynchronized shared mutable state
    let data = Arc::new(std::cell::UnsafeCell::new(0u64));
    let mut handles = vec![];

    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                // SAFETY: UNSOUND — data race!
                unsafe {
                    *data.get() += 1;
                }
            }
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    // Value should be 4000 but may be anything due to race
    unsafe { *data.get() }
}

// Both Miri and TSan catch this:
// Miri:  "Data race detected between (1) write and (2) write"
// TSan:  "WARNING: ThreadSanitizer: data race"
//
// Fix: use AtomicU64 or Mutex<u64>

cargo-fuzz — Coverage-Guided Fuzzing (finds crashes in parsers and decoders):

# Install
cargo install cargo-fuzz

# Initialize a fuzz target
cargo fuzz init
cargo fuzz add parse_gpu_csv

// fuzz/fuzz_targets/parse_gpu_csv.rs
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    if let Ok(s) = std::str::from_utf8(data) {
        // The fuzzer generates millions of inputs looking for panics/crashes.
        let _ = diag_tool::parse_gpu_csv(s);
    }
});
# Run the fuzzer (runs until interrupted or crash found)
cargo +nightly fuzz run parse_gpu_csv -- -max_total_time=300  # 5 minutes

# Minimize a crash
cargo +nightly fuzz tmin parse_gpu_csv artifacts/parse_gpu_csv/crash-...

When to fuzz: Any function that parses untrusted/semi-trusted input (sensor output, config files, network data, JSON/CSV). Fuzzing found real bugs in every major Rust parser crate (serde, regex, image).

loom — Concurrency Model Checker (exhaustively tests atomic orderings):

[dev-dependencies]
loom = "0.7"

#[cfg(loom)]
mod tests {
    use loom::sync::atomic::{AtomicUsize, Ordering};
    use loom::thread;

    #[test]
    fn test_counter_is_atomic() {
        loom::model(|| {
            let counter = loom::sync::Arc::new(AtomicUsize::new(0));
            let c1 = counter.clone();
            let c2 = counter.clone();

            let t1 = thread::spawn(move || { c1.fetch_add(1, Ordering::SeqCst); });
            let t2 = thread::spawn(move || { c2.fetch_add(1, Ordering::SeqCst); });

            t1.join().unwrap();
            t2.join().unwrap();

            // loom explores ALL possible thread interleavings
            assert_eq!(counter.load(Ordering::SeqCst), 2);
        });
    }
}

When to use loom: When you have lock-free data structures or custom synchronization primitives. Loom exhaustively explores thread interleavings — it’s a model checker, not a stress test. Not needed for Mutex/RwLock-based code.

When to Use Which Tool

Decision tree for unsafe verification:

Is the code pure Rust (no FFI)?
├─ Yes → Use Miri (catches Rust-specific UB, Stacked Borrows)
│        Also run ASan in CI for defense-in-depth
└─ No (calls C/C++ code via FFI)
   ├─ Memory safety concerns?
   │  └─ Yes → Use Valgrind memcheck AND ASan
   ├─ Concurrency concerns?
   │  └─ Yes → Use TSan (faster) or Helgrind (more thorough)
   └─ Memory leak concerns?
      └─ Yes → Use Valgrind --leak-check=full

Recommended CI matrix:

# Run all tools in parallel for fast feedback
jobs:
  miri:
    runs-on: ubuntu-latest
    steps:
      - uses: dtolnay/rust-toolchain@nightly
        with: { components: miri }
      - run: cargo miri test --workspace

  asan:
    runs-on: ubuntu-latest
    steps:
      - uses: dtolnay/rust-toolchain@nightly
      - run: |
          RUSTFLAGS="-Zsanitizer=address" \
          cargo test -Zbuild-std --target x86_64-unknown-linux-gnu

  valgrind:
    runs-on: ubuntu-latest
    steps:
      - run: sudo apt-get install -y valgrind
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo build --tests
      - run: |
          for test_bin in $(find target/debug/deps -maxdepth 1 -executable -type f ! -name '*.d'); do
            valgrind --error-exitcode=1 --leak-check=full "$test_bin" --test-threads=1
          done

Application: Zero Unsafe — and When You’ll Need It

The project contains zero unsafe blocks across 90K+ lines of Rust. This is a remarkable achievement for a systems-level diagnostics tool and demonstrates that safe Rust is sufficient for:

  • IPMI communication (via std::process::Command to ipmitool)
  • GPU queries (via std::process::Command to accel-query)
  • PCIe topology parsing (pure JSON/text parsing)
  • SEL record management (pure data structures)
  • DER report generation (JSON serialization)
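
The subprocess pattern behind those bullets can be sketched in a few lines of entirely safe Rust — query_tool and the echo stand-in for ipmitool are illustrative, not the project's actual API:

```rust
use std::process::Command;

// Shelling out to a diagnostic CLI keeps the crate 100% safe Rust:
// the OS process boundary replaces unsafe FFI.
fn query_tool(cmd: &str, args: &[&str]) -> std::io::Result<String> {
    let out = Command::new(cmd).args(args).output()?;
    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
}

fn main() -> std::io::Result<()> {
    // Stand-in for `query_tool("ipmitool", &["sel", "list"])`, so the
    // sketch runs without IPMI hardware.
    let text = query_tool("echo", &["sel", "list"])?;
    assert_eq!(text.trim(), "sel list");
    Ok(())
}
```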

When will the project need unsafe?

The likely triggers for introducing unsafe:

| Scenario | Why unsafe | Recommended Verification |
|----------|------------|--------------------------|
| Direct ioctl-based IPMI | libc::ioctl() bypasses ipmitool subprocess | Miri + Valgrind |
| Direct GPU driver queries | accel-mgmt FFI instead of accel-query parsing | Valgrind (C library) |
| Memory-mapped PCIe config | mmap for direct config-space reads | ASan + Valgrind |
| Lock-free SEL buffer | AtomicPtr for concurrent event collection | Miri + TSan |
| Embedded/no_std variant | Raw pointer manipulation for bare-metal | Miri |

Preparation: Before introducing unsafe, add the verification tools to CI:

# Cargo.toml — add a feature flag for unsafe optimizations
[features]
default = []
direct-ipmi = []     # Enable direct ioctl IPMI instead of ipmitool subprocess
direct-accel-api = []     # Enable accel-mgmt FFI instead of accel-query parsing

// src/ipmi.rs — gated behind a feature flag
#[cfg(feature = "direct-ipmi")]
mod direct {
    //! Direct IPMI device access via /dev/ipmi0 ioctl.
    //!
    //! # Safety
    //! This module uses `unsafe` for ioctl system calls.
    //! Verified with: Miri (where possible), Valgrind memcheck, ASan.

    use std::os::unix::io::RawFd;

    // ... unsafe ioctl implementation ...
}

#[cfg(not(feature = "direct-ipmi"))]
mod subprocess {
    //! IPMI via ipmitool subprocess (default, fully safe).
    // ... current implementation ...
}

Key insight: Keep unsafe behind feature flags so it can be verified independently. Run cargo +nightly miri test --features direct-ipmi in CI to continuously verify the unsafe paths without affecting the safe default build.

cargo-careful — Extra UB Checks at Near-Native Speed

cargo-careful runs your code with extra standard library checks enabled — catching some undefined behavior that normal builds ignore, at near-native speed rather than Miri’s 10-100× slowdown. It does require a nightly toolchain, since it rebuilds std with debug assertions:

# Install (requires nightly, but runs your code at near-native speed)
cargo install cargo-careful

# Run tests with extra UB checks (catches uninitialized memory, invalid values)
cargo +nightly careful test

# Run a binary with extra checks
cargo +nightly careful run -- --run-diagnostics

What cargo-careful catches that normal builds don’t:

  • Reads of uninitialized memory in MaybeUninit and zeroed()
  • Creating invalid bool, char, or enum values via transmute
  • Unaligned pointer reads/writes
  • copy_nonoverlapping with overlapping ranges

Where it fits in the verification ladder:

Least overhead                                          Most thorough
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤
│  (0× overhead)  (~1.5× overhead)   (10-100×)  (2×)     (10-50×)   │
│  Safe Rust only  Catches some UB    Pure-Rust  FFI+Rust FFI+Rust   │

Recommendation: Add cargo +nightly careful test to CI as a fast safety check. It runs at near-native speed (unlike Miri) and catches real bugs that safe Rust abstractions mask.

Troubleshooting Miri and Sanitizers

| Symptom | Cause | Fix |
|---------|-------|-----|
| Miri does not support FFI | Miri is a Rust interpreter; it can’t execute C code | Use Valgrind or ASan for FFI code instead |
| error: unsupported operation: can't call foreign function | Miri hit an extern "C" call | Mock the FFI boundary or gate behind #[cfg(miri)] |
| Stacked Borrows violation | Aliasing rule violation — even if code “works” | Miri is correct; refactor to avoid aliasing &mut with & |
| Sanitizer says DEADLYSIGNAL | ASan detected buffer overflow | Check array indexing, slice operations, and pointer arithmetic |
| LeakSanitizer: detected memory leaks | Box::leak(), forget(), or missing drop() | Intentional: suppress with __lsan_disable(); unintentional: fix the leak |
| Miri is extremely slow | Miri interprets, doesn’t compile — 10-100× slower | Run only on --lib tests or tag slow tests with #[cfg_attr(miri, ignore)] |
| TSan: false positive with atomics | TSan doesn’t understand Rust’s atomic ordering model perfectly | Add TSAN_OPTIONS=suppressions=tsan.supp with specific suppressions |

Try It Yourself

  1. Trigger a Miri UB detection: Write an unsafe function that creates two &mut references to the same i32 (aliasing violation). Run cargo +nightly miri test and observe the “Stacked Borrows” error. Fix it with UnsafeCell or separate allocations.

  2. Run ASan on a deliberate bug: Create a test that does unsafe out-of-bounds array access. Build with RUSTFLAGS="-Zsanitizer=address" and observe ASan’s report. Note how it pinpoints the exact line.

  3. Benchmark Miri overhead: Time cargo test --lib vs cargo +nightly miri test --lib on the same test suite. Calculate the slowdown factor. Based on this, decide which tests to run under Miri in CI and which to skip with #[cfg_attr(miri, ignore)].

Safety Verification Decision Tree

flowchart TD
    START["Have unsafe code?"] -->|No| SAFE["Safe Rust — no\nverification needed"]
    START -->|Yes| KIND{"What kind?"}
    
    KIND -->|"Pure Rust unsafe"| MIRI["Miri\nMIR interpreter\ncatches aliasing, UB, leaks"]
    KIND -->|"FFI / C interop"| VALGRIND["Valgrind memcheck\nor ASan"]
    KIND -->|"Concurrent unsafe"| CONC{"Lock-free?"}
    
    CONC -->|"Atomics/lock-free"| LOOM["loom\nModel checker for atomics"]
    CONC -->|"Mutex/shared state"| TSAN["TSan or Miri\n(both detect data races)"]
    
    MIRI --> CI_MIRI["CI: cargo +nightly miri test"]
    VALGRIND --> CI_VALGRIND["CI: valgrind --leak-check=full"]
    
    style SAFE fill:#91e5a3,color:#000
    style MIRI fill:#e3f2fd,color:#000
    style VALGRIND fill:#ffd43b,color:#000
    style LOOM fill:#ff6b6b,color:#000
    style TSAN fill:#ffd43b,color:#000

🏋️ Exercises

🟡 Exercise 1: Trigger a Miri UB Detection

Write an unsafe function that creates two &mut references to the same i32 (aliasing violation). Run cargo +nightly miri test and observe the Stacked Borrows error. Fix it.

Solution
#[cfg(test)]
mod tests {
    #[test]
    fn aliasing_ub() {
        let mut x: i32 = 42;
        let ptr = &mut x as *mut i32;
        unsafe {
            // BUG: Two &mut references to the same location
            let _a = &mut *ptr;
            let _b = &mut *ptr; // Miri: Stacked Borrows violation!
        }
    }
}

Fix: use separate allocations or UnsafeCell:

use std::cell::UnsafeCell;

#[test]
fn no_aliasing_ub() {
    let x = UnsafeCell::new(42);
    unsafe {
        let a = &mut *x.get();
        *a = 100;
    }
}

🔴 Exercise 2: ASan Out-of-Bounds Detection

Create a test with unsafe out-of-bounds array access. Build with RUSTFLAGS="-Zsanitizer=address" on nightly and observe ASan’s report.

Solution
#[test]
fn oob_access() {
    let arr = [1u8, 2, 3, 4, 5];
    let ptr = arr.as_ptr();
    unsafe {
        let _val = *ptr.add(10); // Out of bounds!
    }
}
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std \
  --target x86_64-unknown-linux-gnu -- oob_access
# ASan report: stack-buffer-overflow at <exact address>

Key Takeaways

  • Miri is the tool for pure-Rust unsafe — it catches aliasing violations, use-after-free, and leaks that compile and pass tests
  • Valgrind is the tool for FFI/C interop — it works on the final binary without recompilation
  • Sanitizers (ASan, TSan, MSan) require nightly but run at near-native speed — ideal for large test suites
  • loom is purpose-built for verifying lock-free concurrent data structures
  • Run Miri in CI on every push; run sanitizers on a nightly schedule to avoid slowing the main pipeline

Dependency Management and Supply Chain Security 🟢

What you’ll learn:

  • Scanning for known vulnerabilities with cargo-audit
  • Enforcing license, advisory, and source policies with cargo-deny
  • Supply chain trust verification with Mozilla’s cargo-vet
  • Tracking outdated dependencies and detecting breaking API changes
  • Visualizing and deduplicating your dependency tree

Cross-references: Release Profiles — cargo-udeps trims unused dependencies found here · CI/CD Pipeline — audit and deny jobs in the pipeline · Build Scripts — build-dependencies are part of your supply chain too

A Rust binary doesn’t just contain your code — it contains every transitive dependency in your Cargo.lock. A vulnerability, license violation, or malicious crate anywhere in that tree becomes your problem. This chapter covers the tools that make dependency management auditable and automated.

cargo-audit — Known Vulnerability Scanning

cargo-audit checks your Cargo.lock against the RustSec Advisory Database, which tracks known vulnerabilities in published crates.

# Install
cargo install cargo-audit

# Scan for known vulnerabilities
cargo audit

# Output:
# Crate:     chrono
# Version:   0.4.19
# Title:     Potential segfault in localtime_r invocations
# Date:      2020-11-10
# ID:        RUSTSEC-2020-0159
# URL:       https://rustsec.org/advisories/RUSTSEC-2020-0159
# Solution:  Upgrade to >= 0.4.20

# Check and fail CI if vulnerabilities exist
cargo audit --deny warnings

# Generate JSON output for automated processing
cargo audit --json

# Fix vulnerabilities by updating Cargo.lock
# (experimental — requires installing with `cargo install cargo-audit --features fix`)
cargo audit fix

CI integration:

# .github/workflows/audit.yml
name: Security Audit
on:
  schedule:
    - cron: '0 0 * * *'  # Daily check — advisories appear continuously
  push:
    paths: ['Cargo.lock']

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: rustsec/audit-check@v2
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

cargo-deny — Comprehensive Policy Enforcement

cargo-deny goes far beyond vulnerability scanning. It enforces policies across four dimensions:

  1. Advisories — known vulnerabilities (like cargo-audit)
  2. Licenses — allowed/denied license list
  3. Bans — forbidden crates or duplicate versions
  4. Sources — allowed registries and git sources
# Install
cargo install cargo-deny

# Initialize configuration
cargo deny init
# Creates deny.toml with documented defaults

# Run all checks
cargo deny check

# Run specific checks
cargo deny check advisories
cargo deny check licenses
cargo deny check bans
cargo deny check sources

Example deny.toml:

# deny.toml

[advisories]
vulnerability = "deny"        # Fail on known vulnerabilities
unmaintained = "warn"         # Warn on unmaintained crates
yanked = "deny"               # Fail on yanked crates
notice = "warn"               # Warn on informational advisories

[licenses]
unlicensed = "deny"           # All crates must have a license
allow = [
    "MIT",
    "Apache-2.0",
    "BSD-2-Clause",
    "BSD-3-Clause",
    "ISC",
    "Unicode-DFS-2016",
]
copyleft = "deny"             # No GPL/LGPL/AGPL in this project
default = "deny"              # Deny anything not explicitly allowed

[bans]
multiple-versions = "warn"    # Warn if same crate appears at 2 versions
wildcards = "deny"            # No path = "*" in dependencies
highlight = "all"             # Show all duplicates, not just first

# Ban specific problematic crates
deny = [
    # openssl-sys pulls in C OpenSSL — prefer rustls
    { name = "openssl-sys", wrappers = ["native-tls"] },
]

# Allow specific duplicate versions (when unavoidable)
[[bans.skip]]
name = "syn"
version = "1.0"               # syn 1.x and 2.x often coexist

[sources]
unknown-registry = "deny"     # Only allow crates.io
unknown-git = "deny"          # No random git dependencies
allow-registry = ["https://github.com/rust-lang/crates.io-index"]

License enforcement is particularly valuable for commercial projects:

# Check which licenses are in your dependency tree
cargo deny list

# Output:
# MIT          — 127 crates
# Apache-2.0   — 89 crates
# BSD-3-Clause — 12 crates
# MPL-2.0      — 3 crates   ← might need legal review
# Unicode-DFS  — 1 crate

cargo-vet — Supply Chain Trust Verification

cargo-vet (from Mozilla) addresses a different question: not “does this crate have known bugs?” but “has a trusted human actually reviewed this code?”

# Install
cargo install cargo-vet

# Initialize (creates supply-chain/ directory)
cargo vet init

# Check which crates need review
cargo vet

# After reviewing a crate, certify it:
cargo vet certify serde 1.0.203
# Records that you've audited serde 1.0.203 for your criteria

# Import audits from trusted organizations
cargo vet import mozilla
cargo vet import google
cargo vet import bytecode-alliance

How it works:

supply-chain/
├── audits.toml       ← Your team's audit certifications
├── config.toml       ← Trust configuration and criteria
└── imports.lock      ← Pinned imports from other organizations

cargo-vet is most valuable for organizations with strict supply-chain requirements (government, finance, infrastructure). For most teams, cargo-deny provides sufficient protection.
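For a sense of what a certification looks like, here is a sketch of an audits.toml entry — the crate, reviewer, and notes are illustrative:

```toml
# supply-chain/audits.toml — one recorded audit (illustrative)
[[audits.serde]]
who = "Jane Doe <jane@example.com>"
criteria = "safe-to-deploy"
version = "1.0.203"
notes = "Reviewed build.rs and all unsafe blocks."
```

Subsequent `cargo vet` runs treat this version as certified, so only new or updated crates require review.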

cargo-outdated and cargo-semver-checks

cargo-outdated — find dependencies that have newer versions:

cargo install cargo-outdated

cargo outdated --workspace
# Output:
# Name        Project  Compat  Latest   Kind
# serde       1.0.193  1.0.203 1.0.203  Normal
# regex       1.9.6    1.10.4  1.10.4   Normal
# thiserror   1.0.50   1.0.61  2.0.3    Normal  ← major version available

cargo-semver-checks — detect breaking API changes before publishing. Essential for library crates:

cargo install cargo-semver-checks

# Check if your changes are semver-compatible
cargo semver-checks

# Output:
# ✗ Function `parse_gpu_csv` is now private (was public)
#   → This is a BREAKING change. Bump MAJOR version.
#
# ✗ Struct `GpuInfo` has a new required field `power_limit_w`
#   → This is a BREAKING change. Bump MAJOR version.
#
# ✓ Function `parse_gpu_csv_v2` was added (non-breaking)

cargo-tree — Dependency Visualization and Deduplication

cargo tree is built into Cargo (no installation needed) and is invaluable for understanding your dependency graph:

# Full dependency tree
cargo tree

# Find why a specific crate is included
cargo tree --invert --package openssl-sys
# Shows all paths from your crate to openssl-sys

# Find duplicate versions
cargo tree --duplicates
# Output:
# syn v1.0.109
# └── serde_derive v1.0.193
#
# syn v2.0.48
# ├── thiserror-impl v1.0.56
# └── tokio-macros v2.2.0

# Show only direct dependencies
cargo tree --depth 1

# Show dependency features
cargo tree --format "{p} {f}"

# Count total dependencies
cargo tree | wc -l

Deduplication strategy: When cargo tree --duplicates shows the same crate at two major versions, check if you can update the dependency chain to unify them. Each duplicate adds compile time and binary size.

Application: Multi-Crate Dependency Hygiene

The workspace uses [workspace.dependencies] for centralized version management — an excellent practice. Combined with cargo tree --duplicates for size analysis, this prevents version drift and reduces binary bloat:

# Root Cargo.toml — all versions pinned in one place
[workspace.dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["preserve_order"] }
regex = "1.10"
thiserror = "1.0"
anyhow = "1.0"
rayon = "1.8"

Recommended additions for the project:

# Add to CI pipeline:
cargo deny init              # One-time setup
cargo deny check             # Every PR — licenses, advisories, bans
cargo audit --deny warnings  # Every push — vulnerability scanning
cargo outdated --workspace   # Weekly — track available updates

Recommended deny.toml for the project:

[advisories]
vulnerability = "deny"
yanked = "deny"

[licenses]
allow = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "Unicode-DFS-2016"]
copyleft = "deny"     # Hardware diagnostics tool — no copyleft

[bans]
multiple-versions = "warn"   # Track duplicates, don't block yet
wildcards = "deny"

[sources]
unknown-registry = "deny"
unknown-git = "deny"

Supply Chain Audit Pipeline

flowchart LR
    PR["Pull Request"] --> AUDIT["cargo audit\nKnown CVEs"]
    AUDIT --> DENY["cargo deny check\nLicenses + Bans + Sources"]
    DENY --> OUTDATED["cargo outdated\nWeekly schedule"]
    OUTDATED --> SEMVER["cargo semver-checks\nLibrary crates only"]
    
    AUDIT -->|"Fail"| BLOCK["❌ Block merge"]
    DENY -->|"Fail"| BLOCK
    SEMVER -->|"Breaking change"| BUMP["Bump major version"]
    
    style BLOCK fill:#ff6b6b,color:#000
    style BUMP fill:#ffd43b,color:#000
    style PR fill:#e3f2fd,color:#000

🏋️ Exercises

🟢 Exercise 1: Audit Your Dependencies

Run cargo audit and cargo deny init && cargo deny check on any Rust project. How many advisories are found? How many license categories are in your tree?

Solution
cargo audit
# Note any advisories — often chrono, time, or older crates

cargo deny init
cargo deny list
# Shows license breakdown: MIT (N), Apache-2.0 (N), etc.

cargo deny check
# Shows full audit across all four dimensions

🟡 Exercise 2: Find and Eliminate Duplicate Dependencies

Run cargo tree --duplicates on a workspace. Find a crate that appears at two versions. Can you update Cargo.toml to unify them? Measure the compile-time and binary-size impact.

Solution
cargo tree --duplicates
# Typical: syn 1.x and syn 2.x

# Find who pulls in the old version:
cargo tree --invert --package syn@1.0.109
# Output: serde_derive 1.0.xxx -> syn 1.0.109

# Check if a newer serde_derive uses syn 2.x:
cargo update -p serde_derive
cargo tree --duplicates
# If syn 1.x is gone, you've eliminated a duplicate

# Measure impact:
time cargo build --release  # Before and after
cargo bloat --release --crates | head -20

Key Takeaways

  • cargo audit catches known CVEs — run it on every push and on a daily schedule
  • cargo deny enforces four policy dimensions: advisories, licenses, bans, and sources
  • Use [workspace.dependencies] to centralize version management across a multi-crate workspace
  • cargo tree --duplicates reveals bloat; each duplicate adds compile time and binary size
  • cargo-vet is for high-security environments; cargo-deny is sufficient for most teams

Release Profiles and Binary Size 🟡

What you’ll learn:

  • Release profile anatomy: LTO, codegen-units, panic strategy, strip, opt-level
  • Thin vs Fat vs Cross-Language LTO trade-offs
  • Binary size analysis with cargo-bloat
  • Dependency trimming with cargo-udeps, cargo-machete and cargo-shear

Cross-references: Compile-Time Tools — the other half of optimization · Benchmarking — measure runtime before you optimize · Dependencies — trimming deps reduces both size and compile time

The default cargo build --release is already good. But for production deployment — especially single-binary tools deployed to thousands of servers — there’s a significant gap between “good” and “optimized.” This chapter covers the profile knobs and the tools to measure binary size.

Release Profile Anatomy

Cargo profiles control how rustc compiles your code. The defaults are conservative — designed for broad compatibility, not maximum performance:

# Cargo.toml — Cargo's built-in defaults (what you get if you specify nothing)

[profile.release]
opt-level = 3        # Optimization level (0=none, 1=basic, 2=good, 3=aggressive)
lto = false          # Link-time optimization OFF
codegen-units = 16   # Parallel compilation units (faster compile, less optimization)
panic = "unwind"     # Stack unwinding on panic (larger binary, catch_unwind works)
strip = "none"       # Keep all symbols and debug info
overflow-checks = false  # No integer overflow checks in release
debug = false        # No debug info in release

Production-optimized profile (what the project already uses):

[profile.release]
lto = true           # Full cross-crate optimization
codegen-units = 1    # Single codegen unit — maximum optimization opportunity
panic = "abort"      # No unwinding overhead — smaller, faster
strip = true         # Remove all symbols — smaller binary

The impact of each setting:

| Setting | Default → Optimized | Binary Size | Runtime Speed | Compile Time |
|---|---|---|---|---|
| lto | false → true | -10 to -20% | +5 to +20% | 2-5× slower |
| codegen-units | 16 → 1 | -5 to -10% | +5 to +10% | 1.5-2× slower |
| panic | "unwind" → "abort" | -5 to -10% | Negligible | Negligible |
| strip | "none" → true | -50 to -70% | None | None |
| opt-level | 3 → "s" | -10 to -30% | -5 to -10% | Similar |
| opt-level | 3 → "z" | -15 to -40% | -10 to -20% | Similar |

Additional profile tweaks:

[profile.release]
# All of the above, plus:
overflow-checks = true    # Keep overflow checks even in release (safety > speed)
debug = "line-tables-only" # Minimal debug info for backtraces without full DWARF
rpath = false             # Don't embed runtime library paths
incremental = false       # Disable incremental compilation (cleaner builds)

# For size-optimized builds (embedded, WASM):
# opt-level = "z"         # Optimize for size aggressively
# strip = "symbols"       # Strip symbols but keep debug sections

Per-crate profile overrides — optimize hot crates, leave others alone:

# Dev builds: optimize dependencies but not your code (fast recompile)
[profile.dev.package."*"]
opt-level = 2          # Optimize all dependencies in dev mode

# Release builds: override specific crate optimization
[profile.release.package.serde_json]
opt-level = 3          # Maximum optimization for JSON parsing
codegen-units = 1

# Test profile: match release behavior for accurate integration tests
[profile.test]
opt-level = 1          # Some optimization to avoid timeout in slow tests

LTO in Depth — Thin vs Fat vs Cross-Language

Link-Time Optimization lets LLVM optimize across crate boundaries — inlining functions from serde_json into your parsing code, removing dead code from regex, etc. Without LTO, each crate is a separate optimization island.

[profile.release]
# Option 1: Fat LTO (default when lto = true)
lto = true
# All code merged into one LLVM module → maximum optimization
# Slowest compile, smallest/fastest binary

# Option 2: Thin LTO
lto = "thin"
# Each crate stays separate but LLVM does cross-module optimization
# Faster compile than fat LTO, nearly as good optimization
# Best trade-off for most projects

# Option 3: No LTO
lto = false
# Only intra-crate optimization
# Fastest compile, larger binary

# Option 4: Off (explicit)
lto = "off"
# Same as false

Fat LTO vs Thin LTO:

| Aspect | Fat LTO (true) | Thin LTO ("thin") |
|---|---|---|
| Optimization quality | Best | ~95% of fat |
| Compile time | Slow (all code in one module) | Moderate (parallel modules) |
| Memory usage | High (all LLVM IR in memory) | Lower (streaming) |
| Parallelism | None (single module) | Good (per-module) |
| Recommended for | Final release builds | CI builds, development |
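One way to get both recommendations from the table — thin LTO for day-to-day builds and fat LTO for shipped artifacts — is a custom profile that inherits from release (the profile name `dist` here is arbitrary):

```toml
# Cargo.toml — build final artifacts with `cargo build --profile dist`
[profile.release]
lto = "thin"          # CI / development builds: fast, ~95% of the benefit

[profile.dist]
inherits = "release"  # Start from the release settings...
lto = true            # ...but use fat LTO for the final binary
codegen-units = 1
```

Custom profiles with `inherits` are stable Cargo features; artifacts land in `target/dist/`.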

Cross-language LTO — optimize across Rust and C boundaries:

[profile.release]
lto = true

# Cargo.toml — for crates using the cc crate
[build-dependencies]
cc = "1.0"
// build.rs — enable cross-language (linker-plugin) LTO
fn main() {
    // The cc crate respects CFLAGS from the environment.
    // For cross-language LTO, compile C code with:
    //   -flto=thin -O2
    cc::Build::new()
        .file("csrc/fast_parser.c")
        .flag("-flto=thin")
        .opt_level(2)
        .compile("fast_parser");
}
# Enable linker-plugin LTO (requires compatible LLD or gold linker)
RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" \
    cargo build --release

Cross-language LTO allows LLVM to inline C functions into Rust callers and vice versa. This is most impactful for FFI-heavy code where small C functions are called frequently (e.g., IPMI ioctl wrappers).

Binary Size Analysis with cargo-bloat

cargo-bloat answers: “What functions and crates are taking up the most space in my binary?”

# Install
cargo install cargo-bloat

# Show largest functions
cargo bloat --release -n 20
# Output:
#  File  .text     Size          Crate    Name
#  2.8%   5.1%  78.5KiB  serde_json       serde_json::de::Deserializer::parse_...
#  2.1%   3.8%  58.2KiB  regex_syntax     regex_syntax::ast::parse::ParserI::p...
#  1.5%   2.7%  42.1KiB  accel_diag         accel_diag::vendor::parse_smi_output
#  ...

# Show by crate (which dependencies are biggest)
cargo bloat --release --crates
# Output:
#  File  .text     Size Crate
# 12.3%  22.1%  340KiB serde_json
#  8.7%  15.6%  240KiB regex
#  6.2%  11.1%  170KiB std
#  5.1%   9.2%  141KiB accel_diag
#  ...

# Compare two builds (before/after optimization)
cargo bloat --release --crates > before.txt
# ... make changes ...
cargo bloat --release --crates > after.txt
diff before.txt after.txt

Common bloat sources and fixes:

| Bloat Source | Typical Size | Fix |
|---|---|---|
| regex (full engine) | 200-400 KB | Use regex-lite if you don't need Unicode |
| serde_json (full) | 200-350 KB | Consider simd-json or sonic-rs if perf matters |
| Generics monomorphization | Varies | Use dyn Trait at API boundaries |
| Formatting machinery (Display, Debug) | 50-150 KB | #[derive(Debug)] on large enums adds up |
| Panic message strings | 20-80 KB | panic = "abort" removes unwinding, strip removes strings |
| Unused features | Varies | Disable default features: serde = { version = "1", default-features = false } |

Trimming Dependencies with cargo-udeps

cargo-udeps finds dependencies declared in Cargo.toml that your code doesn’t actually use:

# Install (requires nightly)
cargo install cargo-udeps

# Find unused dependencies
cargo +nightly udeps --workspace
# Output:
# unused dependencies:
# `diag_tool v0.1.0`
# └── "tempfile" (dev-dependency)
#
# `accel_diag v0.1.0`
# └── "once_cell"    ← was needed before LazyLock, now dead

Every unused dependency:

  • Increases compile time
  • Increases binary size
  • Adds supply chain risk
  • Adds potential license complications

Alternative: cargo-machete — faster, heuristic-based approach:

cargo install cargo-machete
cargo machete
# Faster but may have false positives (heuristic, not compilation-based)

Alternative: cargo-shear — sweet spot between cargo-udeps and cargo-machete:

cargo install cargo-shear
cargo shear --fix
# Slower than cargo-machete but much faster than cargo-udeps
# Far fewer false positives than cargo-machete

Size Optimization Decision Tree

flowchart TD
    START["Binary too large?"] --> STRIP{"strip = true?"}
    STRIP -->|"No"| DO_STRIP["Add strip = true\n-50 to -70% size"]
    STRIP -->|"Yes"| LTO{"LTO enabled?"}
    LTO -->|"No"| DO_LTO["Add lto = true\ncodegen-units = 1"]
    LTO -->|"Yes"| BLOAT["Run cargo-bloat\n--crates"]
    BLOAT --> BIG_DEP{"Large dependency?"}
    BIG_DEP -->|"Yes"| REPLACE["Replace with lighter\nalternative or disable\ndefault features"]
    BIG_DEP -->|"No"| UDEPS["cargo-udeps\nRemove unused deps"]
    UDEPS --> OPT_LEVEL{"Need smaller?"}
    OPT_LEVEL -->|"Yes"| SIZE_OPT["opt-level = 's' or 'z'"]

    style DO_STRIP fill:#91e5a3,color:#000
    style DO_LTO fill:#e3f2fd,color:#000
    style REPLACE fill:#ffd43b,color:#000
    style SIZE_OPT fill:#ff6b6b,color:#000

🏋️ Exercises

🟢 Exercise 1: Measure LTO Impact

Build a project with default release settings, then with lto = true + codegen-units = 1 + strip = true. Compare binary size and compile time.

Solution
# Default release
cargo build --release
ls -lh target/release/my-binary
time cargo build --release  # Note time

# Optimized release — add to Cargo.toml:
# [profile.release]
# lto = true
# codegen-units = 1
# strip = true
# panic = "abort"

cargo clean
cargo build --release
ls -lh target/release/my-binary  # Typically 30-50% smaller
time cargo build --release       # Typically 2-3× slower to compile

🟡 Exercise 2: Find Your Biggest Crate

Run cargo bloat --release --crates on a project. Identify the largest dependency. Can you reduce it by disabling default features or switching to a lighter alternative?

Solution
cargo install cargo-bloat
cargo bloat --release --crates
# Output:
#  File  .text     Size Crate
# 12.3%  22.1%  340KiB serde_json
#  8.7%  15.6%  240KiB regex

# For regex — try regex-lite if you don't need Unicode:
# regex-lite = "0.1"  # ~10× smaller than full regex

# For serde — disable default features if you don't need std:
# serde = { version = "1", default-features = false, features = ["derive"] }

cargo bloat --release --crates  # Compare after changes

Key Takeaways

  • lto = true + codegen-units = 1 + strip = true + panic = "abort" is the production release profile
  • Thin LTO (lto = "thin") gives ~95% of Fat LTO’s optimization at a fraction of the compile cost
  • cargo-bloat --crates tells you exactly which dependencies are eating binary space
  • cargo-udeps, cargo-machete and cargo-shear find dead dependencies that waste compile time and binary size
  • Per-crate profile overrides let you optimize hot crates without slowing the whole build

Compile-Time and Developer Tools 🟡

What you’ll learn:

  • Compilation caching with sccache for local and CI builds
  • Faster linking with mold (3-10× faster than the default linker)
  • cargo-nextest: a faster, more informative test runner
  • Developer visibility tools: cargo-expand, cargo-geiger, cargo-watch
  • Workspace lints, MSRV policy, and documentation-as-CI

Cross-references: Release Profiles — LTO and binary size optimization · CI/CD Pipeline — these tools integrate into your pipeline · Dependencies — fewer deps = faster compiles

Compile-Time Optimization: sccache, mold, cargo-nextest

Long compile times are the #1 developer pain point in Rust. These tools collectively can cut iteration time by 50-80%:

sccache — Shared compilation cache:

# Install
cargo install sccache

# Configure as the Rust wrapper
export RUSTC_WRAPPER=sccache

# Or set permanently in .cargo/config.toml:
# [build]
# rustc-wrapper = "sccache"

# First build: normal speed (populates cache)
cargo build --release  # 3 minutes

# Clean + rebuild: cache hits for unchanged crates
cargo clean && cargo build --release  # 45 seconds

# Check cache statistics
sccache --show-stats
# Compile requests        1,234
# Cache hits               987 (80%)
# Cache misses             247

sccache supports shared caches (S3, GCS, Azure Blob) for team-wide and CI cache sharing.

mold — A faster linker:

Linking is often the slowest phase. mold is 3-5× faster than lld and 10-20× faster than the default GNU ld:

# Install
sudo apt install mold  # Ubuntu 22.04+
# Note: mold is for ELF targets (Linux). macOS uses Mach-O, not ELF.
# The macOS linker (ld64) is already quite fast; if you need faster:
# brew install sold     # sold = mold for Mach-O (experimental, less mature)
# In practice, macOS link times are rarely a bottleneck.
# .cargo/config.toml — use mold for linking
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]

# Optional: tune mold via environment variables — see
# https://github.com/rui314/mold/blob/main/docs/mold.md#environment-variables
export MOLD_JOBS=1

# Verify mold is being used
cargo build -v 2>&1 | grep mold

cargo-nextest — A faster test runner:

# Install
cargo install cargo-nextest

# Run tests (parallel by default, per-test timeout, retry)
cargo nextest run

# Key advantages over cargo test:
# - Each test runs in its own process → better isolation
# - Parallel execution with smart scheduling
# - Per-test timeouts (no more hanging CI)
# - JUnit XML output for CI
# - Retry failed tests

# Configuration
cargo nextest run --retries 2 --fail-fast

# Archive test binaries (useful for CI: build once, test on multiple machines)
cargo nextest archive --archive-file tests.tar.zst
cargo nextest run --archive-file tests.tar.zst
# .config/nextest.toml
[profile.default]
retries = 0
slow-timeout = { period = "60s", terminate-after = 3 }
fail-fast = true

[profile.ci]
retries = 2
fail-fast = false
junit = { path = "test-results.xml" }

Combined dev configuration:

# .cargo/config.toml — optimize the development inner loop
[build]
rustc-wrapper = "sccache"       # Cache compilation artifacts

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]  # Faster linking

# Dev profile: optimize deps but not your code
# (put in Cargo.toml)
# [profile.dev.package."*"]
# opt-level = 2

cargo-expand and cargo-geiger — Visibility Tools

cargo-expand — see what macros generate:

cargo install cargo-expand

# Expand all macros in a specific module
cargo expand --lib accel_diag::vendor

# Expand a specific derive
# Given: #[derive(Debug, Serialize, Deserialize)]
# cargo expand shows the generated impl blocks
cargo expand --lib --tests

Invaluable for debugging #[derive] macro output, macro_rules! expansions, and understanding what serde generates for your types.

In addition to cargo-expand, you can also use rust-analyzer to expand macros:

  1. Move cursor to the macro you want to check.
  2. Open command palette (e.g. F1 on VSCode).
  3. Search for rust-analyzer: Expand macro recursively at caret.

cargo-geiger — count unsafe usage across your dependency tree:

cargo install cargo-geiger

cargo geiger
# Output:
# Metric output format: x/y
#   x = unsafe code used by the build
#   y = total unsafe code found in the crate
#
# Functions  Expressions  Impls  Traits  Methods
# 0/0        0/0          0/0    0/0     0/0      ✅ my_crate
# 0/5        0/23         0/2    0/0     0/3      ✅ serde
# 3/3        14/14        0/0    0/0     2/2      ❗ libc
# 15/15      142/142      4/4    0/0     12/12    ☢️ ring

# The symbols:
# ✅ = no unsafe used
# ❗ = some unsafe used
# ☢️ = heavily unsafe

For a project with a zero-unsafe policy, cargo geiger shows how much unsafe code each dependency contains, making it easy to see exactly which crates bring unsafe into your build.

Workspace Lints — [workspace.lints]

Since Rust 1.74, you can configure Clippy and compiler lints centrally in Cargo.toml — no more #![deny(...)] at the top of every crate:

# Root Cargo.toml — lint configuration for all crates
[workspace.lints.clippy]
unwrap_used = "warn"         # Prefer ? or expect("reason")
dbg_macro = "deny"           # No dbg!() in committed code
todo = "warn"                # Track incomplete implementations
large_enum_variant = "warn"  # Catch accidental size bloat

[workspace.lints.rust]
unsafe_code = "deny"         # Enforce zero-unsafe policy
missing_docs = "warn"        # Encourage documentation
# Each crate's Cargo.toml — opt into workspace lints
[lints]
workspace = true

This replaces scattered #![deny(clippy::unwrap_used)] attributes and ensures consistent policy across the entire workspace.

Auto-fixing Clippy warnings:

# Let Clippy automatically fix machine-applicable suggestions
cargo clippy --fix --workspace --all-targets --allow-dirty

# Fix and also apply suggestions that may change behavior (review carefully!)
cargo clippy --fix --workspace --all-targets --allow-dirty -- -W clippy::pedantic

Tip: Run cargo clippy --fix before committing. It handles trivial issues (unused imports, redundant clones, type simplifications) that are tedious to fix by hand.

MSRV Policy and rust-version

Minimum Supported Rust Version (MSRV) ensures your crate compiles on older toolchains. This matters when deploying to systems with frozen Rust versions.

# Cargo.toml
[package]
name = "diag_tool"
version = "0.1.0"
rust-version = "1.75"    # Minimum Rust version required
# Verify MSRV compliance
cargo +1.75.0 check --workspace

# Automated MSRV discovery
cargo install cargo-msrv
cargo msrv find
# Output: Minimum Supported Rust Version is 1.75.0

# Verify in CI
cargo msrv verify

MSRV in CI:

jobs:
  msrv:
    name: Check MSRV
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@master
        with:
          toolchain: "1.75.0"    # Match rust-version in Cargo.toml
      - run: cargo check --workspace

MSRV strategy:

  • Binary applications (like the diagnostics tool in this book): Use latest stable. No MSRV needed.
  • Library crates (published to crates.io): Set MSRV to oldest Rust version that supports all features you use. Commonly N-2 (two versions behind current).
  • Enterprise deployments: Set MSRV to match the oldest Rust version installed on your fleet.

Application: Production Binary Profile

The project already has an excellent release profile:

# Current workspace Cargo.toml
[profile.release]
lto = true           # ✅ Full cross-crate optimization
codegen-units = 1    # ✅ Maximum optimization
panic = "abort"      # ✅ No unwinding overhead
strip = true         # ✅ Remove symbols for deployment

[profile.dev]
opt-level = 0        # ✅ Fast compilation
debug = true         # ✅ Full debug info

Recommended additions:

# Optimize dependencies in dev mode (faster test execution)
[profile.dev.package."*"]
opt-level = 2

# Test profile: some optimization to prevent timeout in slow tests
[profile.test]
opt-level = 1

# Keep overflow checks in release (safety)
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
overflow-checks = true    # ← add this: catch integer overflows
debug = "line-tables-only" # ← add this: backtraces without full DWARF

Recommended developer tooling:

# .cargo/config.toml (proposed)
[build]
rustc-wrapper = "sccache"  # 80%+ cache hit after first build

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]  # 3-5× faster linking

Expected impact on the project:

| Metric | Current | With Additions |
|---|---|---|
| Release binary | ~10 MB (stripped, LTO) | Same |
| Dev build time | ~45s | ~25s (sccache + mold) |
| Rebuild (1 file change) | ~15s | ~5s (sccache + mold) |
| Test execution | cargo test | cargo nextest — 2× faster |
| Dep vulnerability scanning | None | cargo audit in CI |
| License compliance | Manual | cargo deny automated |
| Unused dependency detection | Manual | cargo udeps in CI |

cargo-watch — Auto-Rebuild on File Changes

cargo-watch re-runs a command every time a source file changes — essential for tight feedback loops:

# Install
cargo install cargo-watch

# Re-check on every save (instant feedback)
cargo watch -x check

# Run clippy + tests on change
cargo watch -x 'clippy --workspace --all-targets' -x 'test --workspace --lib'

# Watch only specific crates (faster for large workspaces)
cargo watch -w accel_diag/src -x 'test -p accel_diag'

# Clear screen between runs
cargo watch -c -x check

Tip: Combine with mold + sccache from above for sub-second re-check times on incremental changes.

cargo doc and Workspace Documentation

For a large workspace, generated documentation is essential for discoverability. cargo doc uses rustdoc to produce HTML docs from doc-comments and type signatures:

# Generate docs for all workspace crates (opens in browser)
cargo doc --workspace --no-deps --open

# Include private items (useful during development)
cargo doc --workspace --no-deps --document-private-items

# Check doc-links without generating HTML (fast CI check)
cargo doc --workspace --no-deps 2>&1 | grep -E 'warning|error'

Intra-doc links — link between types across crates without URLs:

/// Runs GPU diagnostics using [`GpuConfig`] settings.
///
/// See [`crate::accel_diag::run_diagnostics`] for the implementation.
/// Returns [`DiagResult`] which can be serialized to the
/// [`DerReport`](crate::core_lib::DerReport) format.
pub fn run_accel_diag(config: &GpuConfig) -> DiagResult {
    // ...
}

Show platform-specific APIs in docs:

#![allow(unused)]
fn main() {
// Cargo.toml: [package.metadata.docs.rs]
// all-features = true
// rustdoc-args = ["--cfg", "docsrs"]

/// Windows-only: read battery status via Win32 API.
///
/// Only available on `cfg(windows)` builds.
#[cfg(windows)]
#[doc(cfg(windows))]  // Shows "Available on Windows only" badge in docs
pub fn get_battery_status() -> Option<u8> {
    // ...
}
}

CI documentation check:

# Add to CI workflow
- name: Check documentation
  run: RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
  # Treats broken intra-doc links as errors

For the project: With many crates, cargo doc --workspace is the best way for new team members to discover the API surface. Add RUSTDOCFLAGS="-D warnings" to CI to catch broken doc-links before merge.

Compile-Time Decision Tree

flowchart TD
    START["Compile too slow?"] --> WHERE{"Where's the time?"}

    WHERE -->|"Recompiling\nunchanged crates"| SCCACHE["sccache\nShared compilation cache"]
    WHERE -->|"Linking phase"| MOLD["mold linker\n3-10× faster linking"]
    WHERE -->|"Running tests"| NEXTEST["cargo-nextest\nParallel test runner"]
    WHERE -->|"Everything"| COMBO["All of the above +\ncargo-udeps to trim deps"]

    SCCACHE --> CI_CACHE{"CI or local?"}
    CI_CACHE -->|"CI"| S3["S3/GCS shared cache"]
    CI_CACHE -->|"Local"| LOCAL["Local disk cache\nauto-configured"]

    style SCCACHE fill:#91e5a3,color:#000
    style MOLD fill:#e3f2fd,color:#000
    style NEXTEST fill:#ffd43b,color:#000
    style COMBO fill:#b39ddb,color:#000

🏋️ Exercises

🟢 Exercise 1: Set Up sccache + mold

Install sccache and mold, configure them in .cargo/config.toml, then measure the compile time improvement on a clean rebuild.

Solution
# Install
cargo install sccache
sudo apt install mold  # Ubuntu 22.04+

# Configure .cargo/config.toml:
cat > .cargo/config.toml << 'EOF'
[build]
rustc-wrapper = "sccache"

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
EOF

# First build (populates cache)
time cargo build --release  # e.g., 180s

# Clean + rebuild (cache hits)
cargo clean
time cargo build --release  # e.g., 45s

sccache --show-stats
# Cache hits should be 60-80%+

🟡 Exercise 2: Switch to cargo-nextest

Install cargo-nextest and run your test suite. Compare wall-clock time with cargo test. What’s the speedup?

Solution
cargo install cargo-nextest

# Standard test runner
time cargo test --workspace 2>&1 | tail -5

# nextest (parallel per-test-binary execution)
time cargo nextest run --workspace 2>&1 | tail -5

# Typical speedup: 2-5× for large workspaces
# nextest also provides:
# - Per-test timing
# - Retries for flaky tests
# - JUnit XML output for CI
cargo nextest run --workspace --retries 2

Key Takeaways

  • sccache with S3/GCS backend shares compilation cache across team and CI
  • mold is the fastest ELF linker — link times drop from seconds to milliseconds
  • cargo-nextest runs tests in parallel per-binary with better output and retry support
  • cargo-geiger counts unsafe usage — run it before accepting new dependencies
  • [workspace.lints] centralizes Clippy and rustc lint configuration across a multi-crate workspace

no_std and Feature Verification 🔴

What you’ll learn:

  • Verifying feature combinations systematically with cargo-hack
  • The three layers of Rust: core vs alloc vs std and when to use each
  • Building no_std crates with custom panic handlers and allocators
  • Testing no_std code on host and with QEMU

Cross-references: Windows & Conditional Compilation — the platform half of this topic · Cross-Compilation — cross-compiling to ARM and embedded targets · Miri and Sanitizers — verifying unsafe code in no_std environments · Build Scripts — cfg flags emitted by build.rs

Rust runs everywhere from 8-bit microcontrollers to cloud servers. This chapter covers the foundation: stripping the standard library with #![no_std] and verifying that your feature combinations actually compile.

Verifying Feature Combinations with cargo-hack

cargo-hack tests all feature combinations systematically — essential for crates with #[cfg(...)] code:

# Install
cargo install cargo-hack

# Check that every feature compiles individually
cargo hack check --each-feature --workspace

# The nuclear option: test ALL feature combinations (exponential!)
# Only practical for crates with <8 features.
cargo hack check --feature-powerset --workspace

# Practical compromise: test each feature alone + all features + no features
cargo hack check --each-feature --workspace --no-dev-deps
cargo check --workspace --all-features
cargo check --workspace --no-default-features

Why this matters for the project:

If you add platform features (linux, windows, direct-ipmi, direct-accel-api), cargo-hack catches combinations that break:

# Example: features that gate platform code
[features]
default = ["linux"]
linux = []                          # Linux-specific hardware access
windows = ["dep:windows-sys"]       # Windows-specific APIs
direct-ipmi = []                    # unsafe IPMI ioctl (ch05)
direct-accel-api = []               # unsafe accel-mgmt FFI (ch05)
# Verify all features compile in isolation AND together
cargo hack check --each-feature -p diag_tool
# Catches: "feature 'windows' doesn't compile without 'direct-ipmi'"
# Catches: "#[cfg(feature = \"linux\")] has a typo — it's 'lnux'"

CI integration:

# Add to CI pipeline (fast — just compilation checks)
- name: Feature matrix check
  run: cargo hack check --each-feature --workspace --no-dev-deps

Rule of thumb: Run cargo hack check --each-feature in CI for any crate with 2+ features. Run --feature-powerset only for core library crates with <8 features — it’s exponential ($2^n$ combinations).

no_std — When and Why

#![no_std] tells the compiler: “don’t link the standard library.” Your crate can only use core (and optionally alloc). Why would you want this?

ScenarioWhy no_std
Embedded firmware (ARM Cortex-M, RISC-V)No OS, no heap, no file system
UEFI diagnostics toolPre-boot environment, no OS APIs
Kernel modulesKernel space can’t use userspace std
WebAssembly (WASM)Minimize binary size, no OS dependencies
BootloadersRun before any OS exists
Shared library with C interfaceAvoid Rust runtime in callers

For hardware diagnostics, no_std becomes relevant when building:

  • UEFI-based pre-boot diagnostic tools (before the OS loads)
  • BMC firmware diagnostics (resource-constrained ARM SoCs)
  • Kernel-level PCIe diagnostics (kernel module or eBPF probe)

core vs alloc vs std — The Three Layers

┌─────────────────────────────────────────────────────────────┐
│ std                                                         │
│  Everything in core + alloc, PLUS:                          │
│  • File I/O (std::fs, std::io)                              │
│  • Networking (std::net)                                    │
│  • Threads (std::thread)                                    │
│  • Time (std::time)                                         │
│  • Environment (std::env)                                   │
│  • Process (std::process)                                   │
│  • OS-specific (std::os::unix, std::os::windows)            │
├─────────────────────────────────────────────────────────────┤
│ alloc          (available with #![no_std] + extern crate    │
│                 alloc, if you have a global allocator)       │
│  • String, Vec, Box, Rc, Arc                                │
│  • BTreeMap, BTreeSet                                       │
│  • format!() macro                                          │
│  • Collections and smart pointers that need heap            │
├─────────────────────────────────────────────────────────────┤
│ core           (always available, even in #![no_std])        │
│  • Primitive types (u8, bool, char, etc.)                    │
│  • Option, Result                                           │
│  • Iterator, slice, array, str (slices, not String)         │
│  • Traits: Clone, Copy, Debug, Display, From, Into          │
│  • Atomics (core::sync::atomic)                             │
│  • Cell, RefCell (core::cell)  — Pin (core::pin)            │
│  • core::fmt (formatting without allocation)                │
│  • core::mem, core::ptr (low-level memory operations)       │
│  • Math: core::num, basic arithmetic                        │
└─────────────────────────────────────────────────────────────┘

What you lose without std:

  • No HashMap (std’s default hasher needs OS entropy — use BTreeMap from alloc, or the hashbrown crate)
  • No println!() (requires stdout — use core::fmt::Write to a buffer)
  • No std::error::Error (its replacement, core::error::Error, was stabilized in Rust 1.81, but much of the ecosystem hasn’t migrated)
  • No file I/O, no networking, no threads (unless provided by a platform HAL)
  • No Mutex (use spin::Mutex or platform-specific locks)
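
The substitutions in practice: the sketch below runs on the host under std, but every API it uses comes from core and alloc only — BTreeMap stands in for HashMap, and core::fmt::Write into a String stands in for println!. (Sensor names and values here are illustrative.)

```rust
extern crate alloc; // implicit under std; explicit in a #![no_std] crate

use alloc::collections::BTreeMap;
use alloc::string::String;
use core::fmt::Write;

fn main() {
    // BTreeMap needs only Ord, not a hasher — works in no_std + alloc
    let mut sensors: BTreeMap<&str, i32> = BTreeMap::new();
    sensors.insert("cpu", 75_000); // millidegrees C
    sensors.insert("gpu0", 82_000);

    // core::fmt::Write needs no stdout and no std
    let mut out = String::new();
    for (name, millideg) in &sensors {
        let _ = writeln!(out, "{name}: {}.{:03}°C", millideg / 1000, millideg % 1000);
    }
    assert!(out.contains("cpu: 75.000°C"));
    assert!(out.contains("gpu0: 82.000°C"));
}
```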

Building a no_std Crate

#![allow(unused)]
fn main() {
// src/lib.rs — a no_std library crate
#![no_std]

// Optionally use heap allocation
extern crate alloc;
use alloc::string::String;
use alloc::vec::Vec;
use core::fmt;

/// Temperature reading from a thermal sensor.
/// This struct works in any environment — bare metal to Linux.
#[derive(Clone, Copy, Debug)]
pub struct Temperature {
    /// Raw sensor value (0.0625°C per LSB for typical I2C sensors)
    raw: u16,
}

impl Temperature {
    pub const fn from_raw(raw: u16) -> Self {
        Self { raw }
    }

    /// Convert to degrees Celsius (fixed-point, no FPU required)
    pub const fn millidegrees_c(&self) -> i32 {
        (self.raw as i32) * 625 / 10 // 0.0625°C resolution
    }

    pub fn degrees_c(&self) -> f32 {
        self.raw as f32 * 0.0625
    }
}

impl fmt::Display for Temperature {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let md = self.millidegrees_c();
        // Handle sign correctly for values between -0.999°C and -0.001°C
        // where md / 1000 == 0 but the value is negative.
        if md < 0 && md > -1000 {
            write!(f, "-0.{:03}°C", (-md) % 1000)
        } else {
            write!(f, "{}.{:03}°C", md / 1000, (md % 1000).abs())
        }
    }
}

/// Parse space-separated temperature values.
/// Uses alloc — requires a global allocator.
pub fn parse_temperatures(input: &str) -> Vec<Temperature> {
    input
        .split_whitespace()
        .filter_map(|s| s.parse::<u16>().ok())
        .map(Temperature::from_raw)
        .collect()
}

/// Format without allocation — writes directly to a buffer.
/// Works in `core`-only environments (no alloc, no heap).
pub fn format_temp_into(temp: &Temperature, buf: &mut [u8]) -> usize {
    use core::fmt::Write;
    struct SliceWriter<'a> {
        buf: &'a mut [u8],
        pos: usize,
    }
    impl<'a> Write for SliceWriter<'a> {
        fn write_str(&mut self, s: &str) -> fmt::Result {
            let bytes = s.as_bytes();
            let remaining = self.buf.len() - self.pos;
            if bytes.len() > remaining {
                // Buffer full — signal the error instead of silently truncating.
                // Callers can check the returned pos for partial writes.
                return Err(fmt::Error);
            }
            self.buf[self.pos..self.pos + bytes.len()].copy_from_slice(bytes);
            self.pos += bytes.len();
            Ok(())
        }
    }
    let mut w = SliceWriter { buf, pos: 0 };
    let _ = write!(w, "{}", temp);
    w.pos
}
}
# Cargo.toml for a no_std crate
[package]
name = "thermal-sensor"
version = "0.1.0"
edition = "2021"

[features]
default = ["alloc"]
alloc = []    # Enable Vec, String, etc.
std = ["alloc"]  # Enable full std (implies alloc)

[dependencies]
# Use no_std-compatible crates
serde = { version = "1.0", default-features = false, features = ["derive"] }
# ↑ default-features = false drops std dependency!

Key crate pattern: Many popular crates (serde, log, rand, embedded-hal) support no_std via default-features = false. Always check whether a dependency requires std before using it in a no_std context. Note that some crates (e.g., regex) require at least alloc and don’t work in core-only environments.
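
The same pattern applies to your own crate: gate no_std itself on the std feature. A sketch of the conventional lib.rs prelude, assuming the alloc/std features defined in the Cargo.toml above:

```rust
// src/lib.rs — no_std by default; std only when the feature is enabled
#![cfg_attr(not(feature = "std"), no_std)]

// Pull in the alloc crate only when heap collections are wanted
#[cfg(feature = "alloc")]
extern crate alloc;
```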

Custom Panic Handlers and Allocators

In #![no_std] binaries (not libraries), you must provide a panic handler and optionally a global allocator:

// src/main.rs — a no_std binary (e.g., UEFI diagnostic)
#![no_std]
#![no_main]

extern crate alloc;

use core::panic::PanicInfo;

// Required: what to do on panic (no stack unwinding available)
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    // In embedded: blink an LED, write to UART, hang
    // In UEFI: write to console, halt
    // Minimal: just loop forever
    loop {
        core::hint::spin_loop();
    }
}

// Required if using alloc: provide a global allocator
use alloc::alloc::{GlobalAlloc, Layout};

struct BumpAllocator {
    // Simple bump allocator for embedded/UEFI
    // In practice, use a crate like `linked_list_allocator` or `embedded-alloc`
}

// WARNING: This is a non-functional placeholder! Calling alloc() will return
// null, causing immediate UB (the global allocator contract requires non-null
// returns for non-zero-sized allocations). In real code, use an established
// allocator crate:
//   - embedded-alloc (embedded targets)
//   - linked_list_allocator (UEFI / OS kernels)
//   - talc (general-purpose no_std)
unsafe impl GlobalAlloc for BumpAllocator {
    /// # Safety
    /// Layout must have non-zero size. Returns null (placeholder — will crash).
    unsafe fn alloc(&self, _layout: Layout) -> *mut u8 {
        // PLACEHOLDER — will crash! Replace with real allocation logic.
        core::ptr::null_mut()
    }
    /// # Safety
    /// `_ptr` must have been returned by `alloc` with a compatible layout.
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // No-op for bump allocator
    }
}

#[global_allocator]
static ALLOCATOR: BumpAllocator = BumpAllocator {};

// Entry point (platform-specific, not fn main)
// For UEFI: #[entry] or efi_main
// For embedded: #[cortex_m_rt::entry]

Testing no_std Code

Tests run on the host machine, which has std. The trick: your library stays no_std, but the test binary links std through libtest, so the harness itself runs normally:

#![allow(unused)]
fn main() {
// Your crate: #![no_std] in src/lib.rs
// But tests run under std automatically:

#[cfg(test)]
mod tests {
    use super::*;
    // libtest links std into the test binary; core/alloc APIs and assert!
    // work as-is. To use std paths (println!, HashMap) here, add
    // #[cfg(test)] extern crate std; at the crate root.

    #[test]
    fn test_temperature_conversion() {
        let temp = Temperature::from_raw(800); // 50.0°C
        assert_eq!(temp.millidegrees_c(), 50000);
        assert!((temp.degrees_c() - 50.0).abs() < 0.01);
    }

    #[test]
    fn test_format_into_buffer() {
        let temp = Temperature::from_raw(800);
        let mut buf = [0u8; 32];
        let len = format_temp_into(&temp, &mut buf);
        let s = core::str::from_utf8(&buf[..len]).unwrap();
        assert_eq!(s, "50.000°C");
    }
}
}

Testing on the actual target (when std isn’t available at all):

# Use defmt-test for on-device testing (embedded ARM)
# Use uefi-test-runner for UEFI targets
# Use QEMU for cross-architecture tests without hardware

# Run no_std library tests on host (always works):
cargo test --lib

# Verify no_std compilation against a no_std target:
cargo check --target thumbv7em-none-eabihf  # ARM Cortex-M
cargo check --target riscv32imac-unknown-none-elf  # RISC-V

no_std Decision Tree

flowchart TD
    START["Does your code need\nthe standard library?"] --> NEED_FS{"File system,\nnetwork, threads?"}
    NEED_FS -->|"Yes"| USE_STD["Use std\nNormal application"]
    NEED_FS -->|"No"| NEED_HEAP{"Need heap allocation?\nVec, String, Box"}
    NEED_HEAP -->|"Yes"| USE_ALLOC["#![no_std]\nextern crate alloc"]
    NEED_HEAP -->|"No"| USE_CORE["#![no_std]\ncore only"]
    
    USE_ALLOC --> VERIFY["cargo-hack\n--each-feature"]
    USE_CORE --> VERIFY
    USE_STD --> VERIFY
    VERIFY --> TARGET{"Target has OS?"}
    TARGET -->|"Yes"| HOST_TEST["cargo test --lib\nStandard testing"]
    TARGET -->|"No"| CROSS_TEST["QEMU / defmt-test\nOn-device testing"]
    
    style USE_STD fill:#91e5a3,color:#000
    style USE_ALLOC fill:#ffd43b,color:#000
    style USE_CORE fill:#ff6b6b,color:#000

🏋️ Exercises

🟡 Exercise 1: Feature Combination Verification

Install cargo-hack and run cargo hack check --each-feature --workspace on a project with multiple features. Does it find any broken combinations?

Solution
cargo install cargo-hack

# Check each feature individually
cargo hack check --each-feature --workspace --no-dev-deps

# If a feature combination fails:
# error[E0433]: failed to resolve: use of undeclared crate or module `std`
# → This means a feature gate is missing a #[cfg] guard

# Check all features + no features + each individually:
cargo hack check --each-feature --workspace
cargo check --workspace --all-features
cargo check --workspace --no-default-features

🔴 Exercise 2: Build a no_std Library

Create a library crate that compiles with #![no_std]. Implement a simple stack-allocated ring buffer. Verify it compiles for thumbv7em-none-eabihf (ARM Cortex-M).

Solution
#![allow(unused)]
fn main() {
// lib.rs
#![no_std]

pub struct RingBuffer<const N: usize> {
    data: [u8; N],
    head: usize,
    len: usize,
}

impl<const N: usize> RingBuffer<N> {
    pub const fn new() -> Self {
        Self { data: [0; N], head: 0, len: 0 }
    }

    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == N { return false; }
        let idx = (self.head + self.len) % N;
        self.data[idx] = byte;
        self.len += 1;
        true
    }

    pub fn pop(&mut self) -> Option<u8> {
        if self.len == 0 { return None; }
        let byte = self.data[self.head];
        self.head = (self.head + 1) % N;
        self.len -= 1;
        Some(byte)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn push_pop() {
        let mut rb = RingBuffer::<4>::new();
        assert!(rb.push(1));
        assert!(rb.push(2));
        assert_eq!(rb.pop(), Some(1));
        assert_eq!(rb.pop(), Some(2));
        assert_eq!(rb.pop(), None);
    }
}
}
rustup target add thumbv7em-none-eabihf
cargo check --target thumbv7em-none-eabihf
# ✅ Compiles for bare-metal ARM

Key Takeaways

  • cargo-hack --each-feature is essential for any crate with conditional compilation — run it in CI
  • core → alloc → std are layered: each adds capabilities but requires more runtime support
  • Custom panic handlers and allocators are required for bare-metal no_std binaries
  • Test no_std libraries on the host with cargo test --lib — no hardware needed
  • Run --feature-powerset only for core libraries with <8 features — it’s $2^n$ combinations

Windows and Conditional Compilation 🟡

What you’ll learn:

  • Windows support patterns: windows-sys/windows crates, cargo-xwin
  • Conditional compilation with #[cfg] — checked by the compiler, not the preprocessor
  • Platform abstraction architecture: when #[cfg] blocks suffice vs when to use traits
  • Cross-compiling for Windows from Linux

Cross-references: no_std & Features — cargo-hack and feature verification · Cross-Compilation — general cross-build setup · Build Scripts — cfg flags emitted by build.rs

Windows Support — Platform Abstractions

Rust’s #[cfg()] attributes and Cargo features allow a single codebase to target both Linux and Windows cleanly. The project already demonstrates this pattern in platform::run_command:

#![allow(unused)]
fn main() {
// Real pattern from the project — platform-specific shell invocation
pub fn exec_cmd(cmd: &str, timeout_secs: Option<u64>) -> Result<CommandResult, CommandError> {
    #[cfg(windows)]
    let mut child = Command::new("cmd")
        .args(["/C", cmd])
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    #[cfg(not(windows))]
    let mut child = Command::new("sh")
        .args(["-c", cmd])
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // ... rest is platform-independent ...
}
}

Available cfg predicates:

#![allow(unused)]
fn main() {
// Operating system
#[cfg(target_os = "linux")]         // Linux specifically
#[cfg(target_os = "windows")]       // Windows
#[cfg(target_os = "macos")]         // macOS
#[cfg(unix)]                        // Linux, macOS, BSDs, etc.
#[cfg(windows)]                     // Windows (shorthand)

// Architecture
#[cfg(target_arch = "x86_64")]      // x86 64-bit
#[cfg(target_arch = "aarch64")]     // ARM 64-bit
#[cfg(target_arch = "x86")]         // x86 32-bit

// Pointer width (portable alternative to arch)
#[cfg(target_pointer_width = "64")] // Any 64-bit platform
#[cfg(target_pointer_width = "32")] // Any 32-bit platform

// Environment / C library
#[cfg(target_env = "gnu")]          // glibc
#[cfg(target_env = "musl")]         // musl libc
#[cfg(target_env = "msvc")]         // MSVC on Windows

// Endianness
#[cfg(target_endian = "little")]
#[cfg(target_endian = "big")]

// Combinations with any(), all(), not()
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
#[cfg(any(target_os = "linux", target_os = "macos"))]
#[cfg(not(windows))]
}
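
A related tool the attribute form doesn’t cover: the cfg! macro evaluates the same predicates to a compile-time bool. Unlike #[cfg], it doesn’t remove code — both branches must still type-check on every platform, which surfaces cross-platform breakage early. A minimal sketch:

```rust
// cfg!() picks the shell pair at compile time; both arms always type-check.
fn shell() -> (&'static str, &'static str) {
    if cfg!(windows) {
        ("cmd", "/C")
    } else {
        ("sh", "-c")
    }
}

fn main() {
    let (prog, flag) = shell();
    // Exactly one pair is returned for any given build target
    assert!(matches!((prog, flag), ("cmd", "/C") | ("sh", "-c")));
    println!("{prog} {flag}");
}
```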

The windows-sys and windows Crates

For calling Windows APIs directly:

# Cargo.toml — use windows-sys for raw FFI (lighter, no abstraction)
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [
    "Win32_Foundation",
    "Win32_System_Services",
    "Win32_System_Registry",
    "Win32_System_Power",
] }
# NOTE: windows-sys uses semver-incompatible releases (0.48 → 0.52 → 0.59).
# Pin to a single minor version — each release may remove or rename API bindings.
# Check https://github.com/microsoft/windows-rs for the latest version
# before starting a new project.

# Or use the windows crate for safe wrappers (heavier, more ergonomic)
# windows = { version = "0.59", features = [...] }
#![allow(unused)]
fn main() {
// src/platform/windows.rs
#[cfg(windows)]
mod win {
    use windows_sys::Win32::System::Power::{
        GetSystemPowerStatus, SYSTEM_POWER_STATUS,
    };

    pub fn get_battery_status() -> Option<u8> {
        // windows-sys structs don't implement Default; an all-zero value is
        // a valid initializer for this plain-data struct.
        let mut status: SYSTEM_POWER_STATUS = unsafe { core::mem::zeroed() };
        // SAFETY: GetSystemPowerStatus writes to the provided buffer.
        // The buffer is correctly sized and aligned.
        let ok = unsafe { GetSystemPowerStatus(&mut status) };
        if ok != 0 {
            Some(status.BatteryLifePercent)
        } else {
            None
        }
    }
}
}

windows-sys vs windows crate:

Aspectwindows-syswindows
API styleRaw FFI (unsafe calls)Safe Rust wrappers
Binary sizeMinimal (just extern declarations)Larger (wrapper code)
Compile timeFastSlower
ErgonomicsC-style, manual safetyRust-idiomatic
Error handlingRaw BOOL / HRESULTResult<T, windows::core::Error>
Use whenPerformance-critical, thin wrapperApplication code, ease of use

Cross-Compiling for Windows from Linux

# Option 1: MinGW (GNU ABI)
rustup target add x86_64-pc-windows-gnu
sudo apt install gcc-mingw-w64-x86-64
cargo build --target x86_64-pc-windows-gnu
# Produces a .exe — runs on Windows, links against msvcrt

# Option 2: MSVC ABI via xwin (for full MSVC compatibility)
cargo install cargo-xwin
cargo xwin build --target x86_64-pc-windows-msvc
# Uses Microsoft's CRT and SDK headers downloaded automatically

# Option 3: Zig-based cross-compilation
cargo zigbuild --target x86_64-pc-windows-gnu
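
If the MinGW build fails at link time because cargo can’t locate the cross-linker (distro packaging varies), pin it explicitly — a sketch assuming the Debian/Ubuntu mingw-w64 toolchain names; adjust for your distribution:

```toml
# .cargo/config.toml — usually unnecessary (rustc defaults to this linker),
# but explicit configuration makes the CI environment reproducible
[target.x86_64-pc-windows-gnu]
linker = "x86_64-w64-mingw32-gcc"
ar = "x86_64-w64-mingw32-ar"
```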

GNU vs MSVC ABI on Windows:

Aspectx86_64-pc-windows-gnux86_64-pc-windows-msvc
LinkerMinGW ldMSVC link.exe or lld-link
C runtimemsvcrt.dll (universal)ucrtbase.dll (modern)
C++ interopGCC ABIMSVC ABI
Cross-compile from LinuxEasy (MinGW)Possible (cargo-xwin)
Windows API supportFullFull
Debug info formatDWARFPDB
Recommended forSimple tools, CI buildsFull Windows integration

Conditional Compilation Patterns

Pattern 1: Platform module selection

#![allow(unused)]
fn main() {
// src/platform/mod.rs — compile different modules per OS
#[cfg(target_os = "linux")]
mod linux;
#[cfg(target_os = "linux")]
pub use linux::*;

#[cfg(target_os = "windows")]
mod windows;
#[cfg(target_os = "windows")]
pub use windows::*;

// Both modules implement the same public API:
// pub fn get_cpu_temperature() -> Result<f64, PlatformError>
// pub fn list_pci_devices() -> Result<Vec<PciDevice>, PlatformError>
}

Pattern 2: Feature-gated platform support

# Cargo.toml
[features]
default = ["linux"]
linux = []              # Linux-specific hardware access
windows = ["dep:windows-sys"]  # Windows-specific APIs

[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [...], optional = true }
#![allow(unused)]
fn main() {
// Compile error if someone tries to build for Windows without the feature:
#[cfg(all(target_os = "windows", not(feature = "windows")))]
compile_error!("Enable the 'windows' feature to build for Windows");
}

Pattern 3: Trait-based platform abstraction

#![allow(unused)]
fn main() {
/// Platform-independent interface for hardware access.
pub trait HardwareAccess {
    type Error: std::error::Error;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
    fn read_gpu_temperature(&self, gpu_index: u32) -> Result<f64, Self::Error>;
    fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error>;
    fn send_ipmi_command(&self, cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error>;
}

#[cfg(target_os = "linux")]
pub struct LinuxHardware;

#[cfg(target_os = "linux")]
impl HardwareAccess for LinuxHardware {
    type Error = LinuxHwError;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        // Read from /sys/class/thermal/thermal_zone0/temp
        let raw = std::fs::read_to_string("/sys/class/thermal/thermal_zone0/temp")?;
        Ok(raw.trim().parse::<f64>()? / 1000.0)
    }
    // ...
}

#[cfg(target_os = "windows")]
pub struct WindowsHardware;

#[cfg(target_os = "windows")]
impl HardwareAccess for WindowsHardware {
    type Error = WindowsHwError;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        // Read via WMI (Win32_TemperatureProbe) or Open Hardware Monitor
        todo!("WMI temperature query")
    }
    // ...
}

/// Create the platform-appropriate implementation
pub fn create_hardware() -> impl HardwareAccess {
    #[cfg(target_os = "linux")]
    { LinuxHardware }
    #[cfg(target_os = "windows")]
    { WindowsHardware }
}
}

Platform Abstraction Architecture

For a project that targets multiple platforms, organize code into three layers:

┌──────────────────────────────────────────────────────┐
│ Application Logic (platform-independent)             │
│  diag_tool, accel_diag, network_diag, event_log, etc.│
│  Uses only the platform abstraction trait            │
├──────────────────────────────────────────────────────┤
│ Platform Abstraction Layer (trait definitions)       │
│  trait HardwareAccess { ... }                        │
│  trait CommandRunner { ... }                         │
│  trait FileSystem { ... }                            │
├──────────────────────────────────────────────────────┤
│ Platform Implementations (cfg-gated)                 │
│  ┌──────────────┐  ┌──────────────┐                  │
│  │ Linux impl   │  │ Windows impl │                  │
│  │ /sys, /proc  │  │ WMI, Registry│                  │
│  │ ipmitool     │  │ ipmiutil     │                  │
│  │ lspci        │  │ devcon       │                  │
│  └──────────────┘  └──────────────┘                  │
└──────────────────────────────────────────────────────┘

Testing the abstraction: Mock the platform trait for unit tests:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    struct MockHardware {
        cpu_temp: f64,
        gpu_temps: Vec<f64>,
    }

    impl HardwareAccess for MockHardware {
        type Error = std::io::Error;

        fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
            Ok(self.cpu_temp)
        }

        fn read_gpu_temperature(&self, index: u32) -> Result<f64, Self::Error> {
            self.gpu_temps.get(index as usize)
                .copied()
                .ok_or_else(|| std::io::Error::new(
                    std::io::ErrorKind::NotFound,
                    format!("GPU {index} not found")
                ))
        }

        fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error> {
            Ok(vec![]) // Mock returns empty
        }

        fn send_ipmi_command(&self, _cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error> {
            Ok(IpmiResponse::default())
        }
    }

    #[test]
    fn test_thermal_check_with_mock() {
        let hw = MockHardware {
            cpu_temp: 75.0,
            gpu_temps: vec![82.0, 84.0],
        };
        let result = run_thermal_diagnostic(&hw);
        assert!(result.is_ok());
    }
}
}

Application: Linux-First, Windows-Ready

The project is already partially Windows-ready. Use cargo-hack to verify all feature combinations, and cross-compile to test on Windows from Linux:

Already done:

  • platform::run_command uses #[cfg(windows)] for shell selection
  • Tests use #[cfg(windows)] / #[cfg(not(windows))] for platform-appropriate test commands

Recommended evolution path for Windows support:

Phase 1: Extract platform abstraction trait (current → 2 weeks)
  ├─ Define HardwareAccess trait in core_lib
  ├─ Wrap current Linux code behind LinuxHardware impl
  └─ All diagnostic modules depend on trait, not Linux specifics

Phase 2: Add Windows stubs (2 weeks)
  ├─ Implement WindowsHardware with TODO stubs
  ├─ CI builds for x86_64-pc-windows-msvc (compile check only)
  └─ Tests pass with MockHardware on all platforms

Phase 3: Windows implementation (ongoing)
  ├─ IPMI via ipmiutil.exe or OpenIPMI Windows driver
  ├─ GPU via accel-mgmt (accel-api.dll) — same API as Linux
  ├─ PCIe via Windows Setup API (SetupDiEnumDeviceInfo)
  └─ NIC via WMI (Win32_NetworkAdapter)

Cross-platform CI addition:

# Add to CI matrix
- target: x86_64-pc-windows-msvc
  os: windows-latest
  name: windows-x86_64

This ensures the codebase compiles on Windows even before full Windows implementation is complete — catching cfg mistakes early.

Key insight: The abstraction doesn’t need to be perfect on day one. Start with #[cfg] blocks in leaf functions (like exec_cmd already does), then refactor to traits when you have two or more platform implementations. Premature abstraction is worse than #[cfg] blocks.

Conditional Compilation Decision Tree

flowchart TD
    START["Platform-specific code?"] --> HOW_MANY{"How many platforms?"}
    
    HOW_MANY -->|"2 (Linux + Windows)"| CFG_BLOCKS["#[cfg] blocks\nin leaf functions"]
    HOW_MANY -->|"3+"| TRAIT_APPROACH["Platform trait\n+ per-platform impl"]
    
    CFG_BLOCKS --> WINAPI{"Need Windows APIs?"}
    WINAPI -->|"Minimal"| WIN_SYS["windows-sys\nRaw FFI bindings"]
    WINAPI -->|"Rich (COM, etc)"| WIN_RS["windows crate\nSafe idiomatic wrappers"]
    WINAPI -->|"None\n(just #[cfg])"| NATIVE["cfg(windows)\ncfg(unix)"]
    
    TRAIT_APPROACH --> CI_CHECK["cargo-hack\n--each-feature"]
    CFG_BLOCKS --> CI_CHECK
    CI_CHECK --> XCOMPILE["Cross-compile in CI\ncargo-xwin or\nnative runners"]
    
    style CFG_BLOCKS fill:#91e5a3,color:#000
    style TRAIT_APPROACH fill:#ffd43b,color:#000
    style WIN_SYS fill:#e3f2fd,color:#000
    style WIN_RS fill:#e3f2fd,color:#000

🏋️ Exercises

🟢 Exercise 1: Platform-Conditional Module

Create a module with #[cfg(unix)] and #[cfg(windows)] implementations of a get_hostname() function. Verify both compile with cargo check and cargo check --target x86_64-pc-windows-msvc.

Solution
#![allow(unused)]
fn main() {
// src/hostname.rs
#[cfg(unix)]
pub fn get_hostname() -> String {
    use std::fs;
    fs::read_to_string("/etc/hostname")
        .unwrap_or_else(|_| "unknown".to_string())
        .trim()
        .to_string()
}

#[cfg(windows)]
pub fn get_hostname() -> String {
    use std::env;
    env::var("COMPUTERNAME").unwrap_or_else(|_| "unknown".to_string())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn hostname_is_not_empty() {
        let name = get_hostname();
        assert!(!name.is_empty());
    }
}
}
# Verify Linux compilation
cargo check

# Verify Windows compilation (cross-check)
rustup target add x86_64-pc-windows-msvc
cargo check --target x86_64-pc-windows-msvc

🟡 Exercise 2: Cross-Compile for Windows with cargo-xwin

Install cargo-xwin and build a simple binary for x86_64-pc-windows-msvc from Linux. Verify the output is a .exe.

Solution
cargo install cargo-xwin
rustup target add x86_64-pc-windows-msvc

cargo xwin build --release --target x86_64-pc-windows-msvc
# Downloads Windows SDK headers/libs automatically

file target/x86_64-pc-windows-msvc/release/my-binary.exe
# Output: PE32+ executable (console) x86-64, for MS Windows

# You can also test with Wine:
wine target/x86_64-pc-windows-msvc/release/my-binary.exe

Key Takeaways

  • Start with #[cfg] blocks in leaf functions; refactor to a platform trait once you maintain more than one real implementation
  • windows-sys is for raw FFI; the windows crate provides safe, idiomatic wrappers
  • cargo-xwin cross-compiles to Windows MSVC ABI from Linux — no Windows machine needed
  • Always check --target x86_64-pc-windows-msvc in CI even if you only ship on Linux
  • Combine #[cfg] with Cargo features for optional platform support (e.g., feature = "windows")
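The last bullet can be sketched as follows. The `win-extras` feature name is illustrative; it would be declared under `[features]` in Cargo.toml:

```rust
// Gate an API on both the target OS and an opt-in Cargo feature.
// `win-extras` is a hypothetical feature declared in Cargo.toml.
#[cfg(all(windows, feature = "win-extras"))]
pub fn platform_extras() -> &'static str {
    "windows + win-extras"
}

// Provide a fallback so the symbol exists on every configuration.
#[cfg(not(all(windows, feature = "win-extras")))]
pub fn platform_extras() -> &'static str {
    "unavailable"
}

fn main() {
    println!("{}", platform_extras());
}
```

The fallback arm keeps the API total, so callers compile on every platform/feature combination and cargo-hack's feature matrix stays green.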

Putting It All Together — A Production CI/CD Pipeline 🟡

What you’ll learn:

  • Structuring a multi-stage GitHub Actions CI workflow (check → test → coverage → security → cross → release)
  • Caching strategies with rust-cache and save-if tuning
  • Running Miri and sanitizers on a nightly schedule
  • Task automation with Makefile.toml and pre-commit hooks
  • Automated releases with cargo-dist

Cross-references: Build Scripts · Cross-Compilation · Benchmarking · Coverage · Miri/Sanitizers · Dependencies · Release Profiles · Compile-Time Tools · no_std · Windows

Individual tools are useful. A pipeline that orchestrates them automatically on every push is transformative. This chapter assembles the tools from chapters 1–10 into a cohesive CI/CD workflow.

The Complete GitHub Actions Workflow

A single workflow file that runs all verification stages in parallel:

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  CARGO_TERM_COLOR: always
  CARGO_ENCODED_RUSTFLAGS: "-Dwarnings"  # Treat warnings as errors
  # NOTE: CARGO_ENCODED_RUSTFLAGS is the 0x1f-separated form of RUSTFLAGS and
  # takes precedence over it; its scope is the same. Either way, warnings in
  # registry dependencies won't fail the build, because Cargo compiles
  # non-workspace crates with --cap-lints allow.

jobs:
  # ─── Stage 1: Fast feedback (< 2 min) ───
  check:
    name: Check + Clippy + Format
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: clippy, rustfmt

      - uses: Swatinem/rust-cache@v2  # Cache dependencies

      - name: Check Cargo.lock
        run: cargo fetch --locked

      - name: Check doc
        run: RUSTDOCFLAGS='-Dwarnings' cargo doc --workspace --all-features --no-deps

      - name: Check compilation
        run: cargo check --workspace --all-targets --all-features

      - name: Clippy lints
        run: cargo clippy --workspace --all-targets --all-features -- -D warnings

      - name: Formatting
        run: cargo fmt --all -- --check

  # ─── Stage 2: Tests (< 5 min) ───
  test:
    name: Test (${{ matrix.os }})
    needs: check
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2

      - name: Run tests
        run: cargo test --workspace

      - name: Run doc tests
        run: cargo test --workspace --doc

  # ─── Stage 3: Cross-compilation (< 10 min) ───
  cross:
    name: Cross (${{ matrix.target }})
    needs: check
    strategy:
      matrix:
        include:
          - target: x86_64-unknown-linux-musl
            os: ubuntu-latest
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-latest
            use_cross: true
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}

      - name: Install musl-tools
        if: contains(matrix.target, 'musl')
        run: sudo apt-get install -y musl-tools

      - name: Install cross
        if: matrix.use_cross
        uses: taiki-e/install-action@cross

      - name: Build (native)
        if: "!matrix.use_cross"
        run: cargo build --release --target ${{ matrix.target }}

      - name: Build (cross)
        if: matrix.use_cross
        run: cross build --release --target ${{ matrix.target }}

      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: binary-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/diag_tool

  # ─── Stage 4: Coverage (< 10 min) ───
  coverage:
    name: Code Coverage
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview
      - uses: taiki-e/install-action@cargo-llvm-cov

      - name: Generate coverage
        run: cargo llvm-cov --workspace --lcov --output-path lcov.info

      - name: Enforce minimum coverage
        run: cargo llvm-cov report --fail-under-lines 80  # reuses the run above

      - name: Upload to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: lcov.info
          token: ${{ secrets.CODECOV_TOKEN }}

  # ─── Stage 5: Safety verification (< 15 min) ───
  miri:
    name: Miri
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri

      - name: Run Miri
        run: cargo miri test --workspace
        env:
          MIRIFLAGS: "-Zmiri-backtrace=full"

  # ─── Stage 6: Benchmarks (PR only, < 10 min) ───
  bench:
    name: Benchmarks
    if: github.event_name == 'pull_request'
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable

      - name: Run benchmarks
        run: cargo bench -- --output-format bencher | tee bench.txt

      - name: Compare with baseline
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'cargo'
          output-file-path: bench.txt
          github-token: ${{ secrets.GITHUB_TOKEN }}
          alert-threshold: '115%'
          comment-on-alert: true

Pipeline execution flow:

                    ┌─────────┐
                    │  check  │  ← clippy + fmt + cargo check (2 min)
                    └────┬────┘
           ┌─────────┬──┴──┬──────────┬──────────┐
           ▼         ▼     ▼          ▼          ▼
       ┌──────┐  ┌──────┐ ┌────────┐ ┌──────┐ ┌──────┐
       │ test │  │cross │ │coverage│ │ miri │ │bench │
       │ (2×) │  │ (2×) │ │        │ │      │ │(PR)  │
       └──────┘  └──────┘ └────────┘ └──────┘ └──────┘
         3 min    8 min     8 min     12 min    5 min

Total wall-clock: ~14 min (parallel after check gate)
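The workflow above runs Miri on every push; to honor the "nightly schedule" bullet from the chapter intro, the heavy jobs can live in a separate scheduled workflow instead. A sketch (the cron time is arbitrary):

```yaml
# .github/workflows/nightly.yml
name: Nightly Safety
on:
  schedule:
    - cron: "0 3 * * *"   # 03:00 UTC every day
  workflow_dispatch:       # allow manual runs

jobs:
  miri:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri
      - run: cargo miri test --workspace
```

Moving Miri (and sanitizers) here keeps per-push feedback fast while still catching UB regressions within a day.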

CI Caching Strategies

Swatinem/rust-cache@v2 is the standard Rust CI cache action. It caches ~/.cargo and target/ between runs, but large workspaces need tuning:

# Basic (what we use above)
- uses: Swatinem/rust-cache@v2

# Tuned for a large workspace:
- uses: Swatinem/rust-cache@v2
  with:
    # Separate caches per job — prevents test artifacts bloating build cache
    prefix-key: "v1-rust"
    key: ${{ matrix.os }}-${{ matrix.target || 'default' }}
    # Only save cache on main branch (PRs read but don't write)
    save-if: ${{ github.ref == 'refs/heads/main' }}
    # Cache Cargo registry + git checkouts + target dir
    cache-targets: true
    cache-all-crates: true

Cache invalidation gotchas:

| Problem | Fix |
|---|---|
| Cache grows unbounded (>5 GB) | Set `prefix-key: "v2-rust"` to force a fresh cache |
| Different features pollute cache | Use `key: ${{ hashFiles('**/Cargo.lock') }}` |
| PR cache overwrites main | Set `save-if: ${{ github.ref == 'refs/heads/main' }}` |
| Cross-compilation targets bloat | Use a separate key per target triple |

Sharing cache between jobs:

The check job saves the cache; downstream jobs (test, cross, coverage) read it. With save-if on main only, PR runs get the benefit of cached dependencies without writing stale caches.

Measured impact on a large workspace: cold build ~4 min → cached build ~45 sec. The cache action alone saves ~25 min of CI time per pipeline run (summed across all parallel jobs).

Makefile.toml with cargo-make

cargo-make provides a portable task runner that works across platforms (unlike make/Makefile):

# Install
cargo install cargo-make
# Makefile.toml — at workspace root

[config]
default_to_workspace = false

# ─── Developer workflows ───

[tasks.dev]
description = "Full local verification (same checks as CI)"
dependencies = ["check", "test", "clippy", "fmt-check"]

[tasks.check]
command = "cargo"
args = ["check", "--workspace", "--all-targets"]

[tasks.test]
command = "cargo"
args = ["test", "--workspace"]

[tasks.clippy]
command = "cargo"
args = ["clippy", "--workspace", "--all-targets", "--", "-D", "warnings"]

[tasks.fmt]
command = "cargo"
args = ["fmt", "--all"]

[tasks.fmt-check]
command = "cargo"
args = ["fmt", "--all", "--", "--check"]

# ─── Coverage ───

[tasks.coverage]
description = "Generate HTML coverage report"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--html", "--open"]

[tasks.coverage-ci]
description = "Generate LCOV for CI upload"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--lcov", "--output-path", "lcov.info"]

# ─── Benchmarks ───

[tasks.bench]
description = "Run all benchmarks"
command = "cargo"
args = ["bench"]

# ─── Cross-compilation ───

[tasks.build-musl]
description = "Build static binary (musl)"
command = "cargo"
args = ["build", "--release", "--target", "x86_64-unknown-linux-musl"]

[tasks.build-arm]
description = "Build for aarch64 (requires cross)"
command = "cross"
args = ["build", "--release", "--target", "aarch64-unknown-linux-gnu"]

[tasks.build-all]
description = "Build for all deployment targets"
dependencies = ["build-musl", "build-arm"]

# ─── Safety verification ───

[tasks.miri]
description = "Run Miri on all tests"
toolchain = "nightly"
command = "cargo"
args = ["miri", "test", "--workspace"]

[tasks.audit]
description = "Check for known vulnerabilities"
install_crate = "cargo-audit"
command = "cargo"
args = ["audit"]

# ─── Release ───

[tasks.release-dry]
description = "Preview what cargo-release would do"
install_crate = "cargo-release"
command = "cargo"
args = ["release", "--workspace", "--dry-run"]

Usage:

# Equivalent of CI pipeline, locally
cargo make dev

# Generate and view coverage
cargo make coverage

# Build for all targets
cargo make build-all

# Run safety checks
cargo make miri

# Check for vulnerabilities
cargo make audit

Pre-Commit Hooks: Custom Scripts and cargo-husky

Catch issues before they reach CI. The recommended approach is a custom git hook — it’s simple, transparent, and has no external dependencies:

#!/bin/sh
# .githooks/pre-commit

set -e

echo "=== Pre-commit checks ==="

# Fast checks first
echo "→ cargo fmt --check"
cargo fmt --all -- --check

echo "→ cargo check"
cargo check --workspace --all-targets

echo "→ cargo clippy"
cargo clippy --workspace --all-targets -- -D warnings

echo "→ cargo test (lib only, fast)"
cargo test --workspace --lib

echo "=== All checks passed ==="
# Install the hook
git config core.hooksPath .githooks
chmod +x .githooks/pre-commit

Alternative: cargo-husky (auto-installs hooks via build script):

⚠️ Note: cargo-husky has not been updated since 2022. It still works but is effectively unmaintained. Consider the custom hook approach above for new projects.

# cargo-husky is a dev-dependency, not an installed binary: its build script
# installs the git hooks automatically on the next `cargo test`.
# Cargo.toml — add to dev-dependencies of root crate
[dev-dependencies]
cargo-husky = { version = "1", default-features = false, features = [
    "precommit-hook",
    "run-cargo-check",
    "run-cargo-clippy",
    "run-cargo-fmt",
    "run-cargo-test",
] }

Release Workflow: cargo-release and cargo-dist

cargo-release — automates version bumping, tagging, and publishing:

# Install
cargo install cargo-release
# release.toml — at workspace root
[workspace]
consolidate-commits = true
pre-release-commit-message = "chore: release {{version}}"
tag-message = "v{{version}}"
tag-name = "v{{version}}"

# Don't publish internal crates
[[package]]
name = "core_lib"
release = false

[[package]]
name = "diag_framework"
release = false

# Only publish the main binary
[[package]]
name = "diag_tool"
release = true
# Preview release
cargo release patch --dry-run

# Execute release (bumps version, commits, tags, optionally publishes)
cargo release patch --execute
# 0.1.0 → 0.1.1

cargo release minor --execute
# 0.1.1 → 0.2.0

cargo-dist — generates downloadable release binaries for GitHub Releases:

# Install
cargo install cargo-dist

# Initialize (creates CI workflow + metadata)
cargo dist init

# Preview what would be built
cargo dist plan

# Generate the release (usually done by CI on tag push)
cargo dist build
# Cargo.toml additions from `cargo dist init`
[workspace.metadata.dist]
cargo-dist-version = "0.28.0"
ci = "github"
targets = [
    "x86_64-unknown-linux-gnu",
    "x86_64-unknown-linux-musl",
    "aarch64-unknown-linux-gnu",
    "x86_64-pc-windows-msvc",
]
install-path = "CARGO_HOME"

This generates a GitHub Actions workflow that, on tag push:

  1. Builds the binary for all target platforms
  2. Creates a GitHub Release with downloadable .tar.gz / .zip archives
  3. Generates shell/PowerShell installer scripts
  4. Publishes to crates.io (if configured)

Try It Yourself — Capstone Exercise

This exercise ties together every chapter. You will build a complete engineering pipeline for a fresh Rust workspace:

  1. Create a new workspace with two crates: a library (core_lib) and a binary (cli). Add a build.rs that embeds the git hash and build timestamp using SOURCE_DATE_EPOCH (ch01).

  2. Set up cross-compilation for x86_64-unknown-linux-musl and aarch64-unknown-linux-gnu. Verify both targets build with cargo zigbuild or cross (ch02).

  3. Add a benchmark using Criterion or Divan for a function in core_lib. Run it locally and record a baseline (ch03).

  4. Measure code coverage with cargo llvm-cov. Set a minimum threshold of 80% and verify it passes (ch04).

  5. Run cargo +nightly careful test and cargo miri test. Add a test that exercises unsafe code if you have any (ch05).

  6. Configure cargo-deny with a deny.toml that bans openssl and enforces MIT/Apache-2.0 licensing (ch06).

  7. Optimize the release profile with lto = "thin", strip = true, and codegen-units = 1. Measure binary size before/after with cargo bloat (ch07).

  8. Add cargo hack --each-feature verification. Create a feature flag for an optional dependency and ensure it compiles alone (ch09).

  9. Write the GitHub Actions workflow (this chapter) with all 6 stages. Add Swatinem/rust-cache@v2 with save-if tuning.

Success criteria: Push to GitHub → all CI stages green → cargo dist plan shows your release targets. You now have a production-grade Rust pipeline.
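For step 1, a minimal build.rs sketch (a starting point, not the full chapter 1 pattern; it assumes `git` is on PATH and falls back to "unknown" outside a repository):

```rust
// build.rs — embed the short git hash and a reproducible timestamp.
use std::process::Command;

fn git_short_hash() -> String {
    Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()
        .filter(|o| o.status.success())
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|| "unknown".to_string())
}

fn main() {
    println!("cargo:rustc-env=GIT_HASH={}", git_short_hash());
    // Honor SOURCE_DATE_EPOCH for reproducible builds; fall back to "0"
    // rather than wall-clock time so the default stays reproducible.
    let ts = std::env::var("SOURCE_DATE_EPOCH").unwrap_or_else(|_| "0".into());
    println!("cargo:rustc-env=BUILD_TIMESTAMP={ts}");
    println!("cargo:rerun-if-env-changed=SOURCE_DATE_EPOCH");
}
```

In the binary, read the values with `env!("GIT_HASH")` and `env!("BUILD_TIMESTAMP")`.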

CI Pipeline Architecture

flowchart LR
    subgraph "Stage 1 — Fast Feedback < 2 min"
        CHECK["cargo check\ncargo clippy\ncargo fmt"]
    end

    subgraph "Stage 2 — Tests < 5 min"
        TEST["cargo nextest\ncargo test --doc"]
    end

    subgraph "Stage 3 — Coverage"
        COV["cargo llvm-cov\nfail-under 80%"]
    end

    subgraph "Stage 4 — Security"
        SEC["cargo audit\ncargo deny check"]
    end

    subgraph "Stage 5 — Cross-Build"
        CROSS["musl static\naarch64 + x86_64"]
    end

    subgraph "Stage 6 — Release (tag only)"
        REL["cargo dist\nGitHub Release"]
    end

    CHECK --> TEST --> COV --> SEC --> CROSS --> REL

    style CHECK fill:#91e5a3,color:#000
    style TEST fill:#91e5a3,color:#000
    style COV fill:#e3f2fd,color:#000
    style SEC fill:#ffd43b,color:#000
    style CROSS fill:#e3f2fd,color:#000
    style REL fill:#b39ddb,color:#000

Key Takeaways

  • Structure CI as parallel stages: fast checks first, expensive jobs behind gates
  • Swatinem/rust-cache@v2 with save-if: ${{ github.ref == 'refs/heads/main' }} prevents PR cache thrashing
  • Run Miri and heavier sanitizers on a nightly `schedule:` trigger, not on every push
  • Makefile.toml (cargo make) bundles multi-tool workflows into a single command for local dev
  • cargo-dist automates cross-platform release builds — stop writing platform matrix YAML by hand

Tricks from the Trenches 🟡

What you’ll learn:

  • Battle-tested patterns that don’t fit neatly into one chapter
  • Common pitfalls and their fixes — from CI flake to binary bloat
  • Quick-win techniques you can apply to any Rust project today

Cross-references: Every chapter in this book — these tricks cut across all topics

This chapter collects engineering patterns that come up repeatedly in production Rust codebases. Each trick is self-contained — read them in any order.


1. The deny(warnings) Trap

Problem: #![deny(warnings)] in source code breaks builds when Clippy adds new lints — your code that compiled yesterday fails today.

Fix: Enforce warnings-as-errors from CI configuration instead of a source-level attribute, so local and downstream builds keep compiling even when a new toolchain adds lints:

# CI: treat warnings as errors without touching source
env:
  CARGO_ENCODED_RUSTFLAGS: "-Dwarnings"

Or use [workspace.lints] for finer control:

# Cargo.toml
[workspace.lints.rust]
unsafe_code = "deny"

[workspace.lints.clippy]
all = { level = "deny", priority = -1 }
pedantic = { level = "warn", priority = -1 }

See Compile-Time Tools, Workspace Lints for the full pattern.


2. Compile Once, Test Everywhere

Problem: cargo test recompiles when switching between --lib, --doc, and --test because they use different profiles.

Fix: Use cargo nextest for unit/integration tests and run doc-tests separately:

cargo nextest run --workspace        # Fast: parallel, cached
cargo test --workspace --doc         # Doc-tests (nextest can't run these)
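For CI, retries and full failure reporting can be encoded in a nextest profile (a sketch; lives in .config/nextest.toml):

```toml
# .config/nextest.toml
[profile.ci]
retries = 2        # re-run flaky tests up to twice before reporting failure
fail-fast = false  # keep going so one failure doesn't hide others
```

Run it with `cargo nextest run --profile ci`.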

See Compile-Time Tools for cargo-nextest setup.


3. Feature Flag Hygiene

Problem: A library crate has default = ["std"] but nobody tests --no-default-features. One day an embedded user reports it doesn’t compile.

Fix: Add cargo-hack to CI:

- name: Feature matrix
  run: |
    cargo hack check --each-feature --no-dev-deps
    cargo check --no-default-features
    cargo check --all-features
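For reference, a feature layout that this matrix protects (crate and feature names are illustrative):

```toml
# Cargo.toml of the library crate
[dependencies]
serde = { version = "1", optional = true, default-features = false }

[features]
default = ["std"]
std = []                  # gates std-only code behind #[cfg(feature = "std")]
serde = ["dep:serde"]     # optional integration, off by default
```

cargo-hack then checks `--no-default-features`, `std` alone, and `serde` alone, which is exactly where the embedded user's breakage hides.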

See no_std and Feature Verification for the full pattern.


4. The Lock File Debate — Commit or Ignore?

Rule of thumb:

| Crate Type | Commit Cargo.lock? | Why |
|---|---|---|
| Binary / application | Yes | Reproducible builds |
| Library | No (.gitignore) | Let downstream choose versions |
| Workspace with both | Yes | Binary wins |

Add a CI check to ensure the lock file stays up-to-date:

- name: Check lock file
  run: cargo fetch --locked  # Fails if Cargo.lock is out of sync with Cargo.toml

5. Debug Builds with Optimized Dependencies

Problem: Debug builds are painfully slow because dependencies (especially serde, regex) aren’t optimized.

Fix: Optimize deps in dev profile while keeping your code unoptimized for fast recompilation:

# Cargo.toml
[profile.dev.package."*"]
opt-level = 2  # Optimize all dependencies in dev mode

This slows the first build slightly but makes runtime dramatically faster during development. Particularly impactful for database-backed services and parsers.
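A related knob if that first build gets too slow: keep build scripts and proc-macros unoptimized while still optimizing runtime dependencies (sketch):

```toml
# Cargo.toml
[profile.dev.package."*"]
opt-level = 2          # optimized dependencies at runtime

[profile.dev.build-override]
opt-level = 0          # don't spend time optimizing build scripts / proc-macros
```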

See Release Profiles for per-crate profile overrides.


6. CI Cache Thrashing

Problem: Swatinem/rust-cache@v2 saves a new cache on every PR, bloating storage and slowing restore times.

Fix: Only save cache from main, restore from anywhere:

- uses: Swatinem/rust-cache@v2
  with:
    save-if: ${{ github.ref == 'refs/heads/main' }}

For workspaces with multiple binaries, add a shared-key:

- uses: Swatinem/rust-cache@v2
  with:
    shared-key: "ci-${{ matrix.target }}"
    save-if: ${{ github.ref == 'refs/heads/main' }}

See CI/CD Pipeline for the full workflow.


7. RUSTFLAGS vs CARGO_ENCODED_RUSTFLAGS

Problem: RUSTFLAGS is split on spaces, so a flag whose value contains a space gets mangled. A related misconception: -Dwarnings does not fail CI on third-party warnings, because Cargo compiles registry dependencies (their build scripts included) with --cap-lints allow; only your own workspace crates are affected.

Fix: Use CARGO_ENCODED_RUSTFLAGS, the 0x1f-separated variant that takes precedence over RUSTFLAGS. The two have identical scope (with --target set, neither applies to host artifacts like build scripts and proc-macros by default). For lint policy, prefer workspace lints over either variable:

# Space-separated: a flag whose value contains a space gets split
RUSTFLAGS="-Dwarnings" cargo build

# 0x1f-separated: robust for flags with embedded spaces
CARGO_ENCODED_RUSTFLAGS="-Dwarnings" cargo build

# PREFERRED: workspace lints (Cargo.toml), versioned with the code
[workspace.lints.rust]
warnings = "deny"
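The encoded form matters when passing more than one flag, or a flag whose value contains a space: join the flags with the ASCII unit separator (0x1f). A shell sketch (the specific flags are illustrative):

```shell
# Join two flags with the 0x1f unit separator; \037 is octal for 0x1f,
# which printf interprets in its format string.
CARGO_ENCODED_RUSTFLAGS="$(printf -- '-Dwarnings\037-Clink-arg=-Wl,-O1')"
export CARGO_ENCODED_RUSTFLAGS
# cargo build   # each 0x1f-delimited chunk reaches rustc as one argument
```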

8. Reproducible Builds with SOURCE_DATE_EPOCH

Problem: Embedding chrono::Utc::now() in build.rs makes builds non-reproducible — every build produces a different binary hash.

Fix: Honor SOURCE_DATE_EPOCH:

// build.rs
fn main() {
    let timestamp = std::env::var("SOURCE_DATE_EPOCH")
        .ok()
        .and_then(|s| s.parse::<i64>().ok())
        .unwrap_or_else(|| chrono::Utc::now().timestamp());
    println!("cargo:rustc-env=BUILD_TIMESTAMP={timestamp}");
}

See Build Scripts for the full build.rs patterns.


9. The cargo tree Deduplication Workflow

Problem: cargo tree --duplicates shows 5 versions of syn and 3 of tokio-util. Compile time is painful.

Fix: Systematic deduplication:

# Step 1: Find duplicates
cargo tree --duplicates

# Step 2: Find who pulls the old version
cargo tree --invert --package syn@1.0.109

# Step 3: Update the culprit
cargo update -p serde_derive  # Might pull in syn 2.x

# Step 4: If no update available, pin in [patch]
# [patch.crates-io]
# old-crate = { git = "...", branch = "syn2-migration" }

# Step 5: Verify
cargo tree --duplicates  # Should be shorter

See Dependency Management for cargo-deny and supply chain security.


10. Pre-Push Smoke Test

Problem: You push, CI takes 10 minutes, fails on a formatting issue.

Fix: Run the fast checks locally before push:

# Makefile.toml (cargo-make)
[tasks.pre-push]
description = "Local smoke test before pushing"
script = '''
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace --lib
'''
cargo make pre-push  # < 30 seconds
git push

Or use a git pre-push hook:

#!/bin/sh
# .git/hooks/pre-push  (make it executable: chmod +x .git/hooks/pre-push)
cargo fmt --all -- --check && cargo clippy --workspace -- -D warnings

See CI/CD Pipeline for Makefile.toml patterns.


🏋️ Exercises

🟢 Exercise 1: Apply Three Tricks

Pick three tricks from this chapter and apply them to an existing Rust project. Which had the biggest impact?

Solution

Typical high-impact combination:

  1. [profile.dev.package."*"] opt-level = 2 — Immediate improvement in dev-mode runtime (2-10× faster for parsing-heavy code)

  2. -Dwarnings via CI env (or [workspace.lints]) — moves the warnings-as-errors policy out of source code, so new toolchain lints don't break existing checkouts

  3. cargo-hack --each-feature — Usually finds at least one broken feature combination in any project with 3+ features

# Apply trick 5:
echo '[profile.dev.package."*"]' >> Cargo.toml
echo 'opt-level = 2' >> Cargo.toml

# Apply trick 7 in CI:
# Replace RUSTFLAGS with CARGO_ENCODED_RUSTFLAGS

# Apply trick 3:
cargo install cargo-hack
cargo hack check --each-feature --no-dev-deps

🟡 Exercise 2: Deduplicate Your Dependency Tree

Run cargo tree --duplicates on a real project. Eliminate at least one duplicate. Measure compile-time before and after.

Solution
# Before
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l  # Count duplicate lines

# Find and fix one duplicate
cargo tree --duplicates
cargo tree --invert --package <duplicate-crate>@<old-version>
cargo update -p <parent-crate>

# After
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l  # Should be fewer

# Typical result: 5-15% compile time reduction per eliminated
# duplicate (especially for heavy crates like syn, tokio)

Key Takeaways

  • Prefer [workspace.lints] or a CI-set CARGO_ENCODED_RUSTFLAGS over #![deny(warnings)] in source; registry dependencies are compiled with --cap-lints allow, so their warnings never fail your build
  • [profile.dev.package."*"] opt-level = 2 is the single highest-impact dev experience trick
  • Cache tuning (save-if on main only) prevents CI cache bloat on active repositories
  • cargo tree --duplicates + cargo update is a free compile-time win — do it monthly
  • Run fast checks locally with cargo make pre-push to avoid CI round-trip waste

Quick Reference Card

Cheat Sheet: Commands at a Glance

# ─── Build Scripts ───
cargo build                          # Compiles build.rs first, then crate
cargo build -vv                      # Verbose — shows build.rs output

# ─── Cross-Compilation ───
rustup target add x86_64-unknown-linux-musl
cargo build --release --target x86_64-unknown-linux-musl
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17
cross build --release --target aarch64-unknown-linux-gnu

# ─── Benchmarking ───
cargo bench                          # Run all benchmarks
cargo bench -- parse                 # Run benchmarks matching "parse"
cargo flamegraph -- <binary args>    # Generate flamegraph (args after -- go to your binary)
perf record -g ./target/release/bin  # Record perf data
perf report                          # View perf data interactively

# ─── Coverage ───
cargo llvm-cov --html                # HTML report
cargo llvm-cov --lcov --output-path lcov.info
cargo llvm-cov --workspace --fail-under-lines 80
cargo tarpaulin --out Html           # Alternative tool

# ─── Safety Verification ───
cargo +nightly miri test             # Run tests under Miri
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
valgrind --leak-check=full ./target/debug/binary
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# ─── Audit & Supply Chain ───
cargo audit                          # Known vulnerability scan
cargo audit --deny warnings          # Fail CI on any advisory
cargo deny check                     # License + advisory + ban + source checks
cargo deny list                      # List all licenses in dep tree
cargo vet                            # Supply chain trust verification
cargo outdated --workspace           # Find outdated dependencies
cargo semver-checks                  # Detect breaking API changes
cargo geiger                         # Count unsafe in dependency tree

# ─── Binary Optimization ───
cargo bloat --release --crates       # Size contribution per crate
cargo bloat --release -n 20          # 20 largest functions
cargo +nightly udeps --workspace     # Find unused dependencies
cargo machete                        # Fast unused dep detection
cargo expand --lib module::name      # See macro expansions
cargo msrv find                      # Discover minimum Rust version
cargo clippy --fix --workspace --allow-dirty  # Auto-fix lint warnings

# ─── Compile-Time Optimization ───
export RUSTC_WRAPPER=sccache         # Shared compilation cache
sccache --show-stats                 # Cache hit statistics
cargo nextest run                    # Faster test runner
cargo nextest run --retries 2        # Retry flaky tests

# ─── Platform Engineering ───
cargo check --target thumbv7em-none-eabihf   # Verify no_std builds
cargo build --target x86_64-pc-windows-gnu   # Cross-compile to Windows
cargo xwin build --target x86_64-pc-windows-msvc  # MSVC ABI cross-compile
cfg!(target_os = "linux")                    # (Rust) compile-time bool; unlike #[cfg], both branches still compile

# ─── Release ───
cargo release patch --dry-run        # Preview release
cargo release patch --execute        # Bump, commit, tag, publish
cargo dist plan                      # Preview distribution artifacts

Decision Table: Which Tool When

| Goal | Tool | When to Use |
|---|---|---|
| Embed git hash / build info | build.rs | Binary needs traceability |
| Compile C code with Rust | cc crate in build.rs | FFI to small C libraries |
| Generate code from schemas | prost-build / tonic-build | Protobuf, gRPC, FlatBuffers |
| Link system library | pkg-config in build.rs | OpenSSL, libpci, systemd |
| Static Linux binary | --target x86_64-unknown-linux-musl | Container/cloud deployment |
| Target old glibc | cargo-zigbuild | RHEL 7, CentOS 7 compatibility |
| ARM server binary | cross or cargo-zigbuild | Graviton/Ampere deployment |
| Statistical benchmarks | Criterion.rs | Performance regression detection |
| Quick perf check | Divan | Development-time profiling |
| Find hot spots | cargo flamegraph / perf | After benchmark identifies slow code |
| Line/branch coverage | cargo-llvm-cov | CI coverage gates, gap analysis |
| Quick coverage check | cargo-tarpaulin | Local development |
| Rust UB detection | Miri | Pure-Rust unsafe code |
| C FFI memory safety | Valgrind memcheck | Mixed Rust/C codebases |
| Data race detection | TSan or Miri | Concurrent unsafe code |
| Buffer overflow detection | ASan | unsafe pointer arithmetic |
| Leak detection | Valgrind or LSan | Long-running services |
| Local CI equivalent | cargo-make | Developer workflow automation |
| Pre-commit checks | cargo-husky or git hooks | Catch issues before push |
| Automated releases | cargo-release + cargo-dist | Version management + distribution |
| Dependency auditing | cargo-audit / cargo-deny | Supply chain security |
| License compliance | cargo-deny (licenses) | Commercial / enterprise projects |
| Supply chain trust | cargo-vet | High-security environments |
| Find outdated deps | cargo-outdated | Scheduled maintenance |
| Detect breaking changes | cargo-semver-checks | Library crate publishing |
| Dependency tree analysis | cargo tree --duplicates | Dedup and trim dep graph |
| Binary size analysis | cargo-bloat | Size-constrained deployments |
| Find unused deps | cargo-udeps / cargo-machete | Trim compile time and size |
| LTO tuning | lto = true or "thin" | Release binary optimization |
| Size-optimized binary | opt-level = "z" + strip = true | Embedded / WASM / containers |
| Unsafe usage audit | cargo-geiger | Security policy enforcement |
| Macro debugging | cargo-expand | Derive / macro_rules debugging |
| Faster linking | mold linker | Developer inner loop |
| Compilation cache | sccache | CI and local build speed |
| Faster tests | cargo-nextest | CI and local test speed |
| MSRV compliance | cargo-msrv | Library publishing |
| no_std library | #![no_std] + default-features = false | Embedded, UEFI, WASM |
| Windows cross-compile | cargo-xwin / MinGW | Linux → Windows builds |
| Platform abstraction | #[cfg] + trait pattern | Multi-OS codebases |
| Windows API calls | windows-sys / windows crate | Native Windows functionality |
| End-to-end timing | hyperfine | Whole-binary benchmarks, before/after comparison |
| Property-based testing | proptest | Edge case discovery, parser robustness |
| Snapshot testing | insta | Large structured output verification |
| Coverage-guided fuzzing | cargo-fuzz | Crash discovery in parsers |
| Concurrency model checking | loom | Lock-free data structures, atomic ordering |
| Feature combination testing | cargo-hack | Crates with multiple #[cfg] features |
| Fast UB checks (near-native) | cargo-careful | CI safety gate, lighter than Miri |
| Auto-rebuild on save | cargo-watch | Developer inner loop, tight feedback |
| Workspace documentation | cargo doc + rustdoc | API discovery, onboarding, doc-link CI |
| Reproducible builds | --locked + SOURCE_DATE_EPOCH | Release integrity verification |
| CI cache tuning | Swatinem/rust-cache@v2 | Build time reduction (cold → cached) |
| Workspace lint policy | [workspace.lints] in Cargo.toml | Consistent Clippy/compiler lints across all crates |
| Auto-fix lint warnings | cargo clippy --fix | Automated cleanup of trivial issues |

Further Reading

| Topic | Resource |
|---|---|
| Cargo build scripts | Cargo Book — Build Scripts |
| Cross-compilation | Rust Cross-Compilation |
| cross tool | cross-rs/cross |
| cargo-zigbuild | cargo-zigbuild docs |
| Criterion.rs | Criterion User Guide |
| Divan | Divan docs |
| cargo-llvm-cov | cargo-llvm-cov |
| cargo-tarpaulin | tarpaulin docs |
| Miri | Miri GitHub |
| Sanitizers in Rust | rustc Sanitizer docs |
| cargo-make | cargo-make book |
| cargo-release | cargo-release docs |
| cargo-dist | cargo-dist docs |
| Profile-guided optimization | Rust PGO guide |
| Flamegraphs | cargo-flamegraph |
| cargo-deny | cargo-deny docs |
| cargo-vet | cargo-vet docs |
| cargo-audit | cargo-audit |
| cargo-bloat | cargo-bloat |
| cargo-udeps | cargo-udeps |
| cargo-geiger | cargo-geiger |
| cargo-semver-checks | cargo-semver-checks |
| cargo-nextest | nextest docs |
| sccache | sccache |
| mold linker | mold |
| cargo-msrv | cargo-msrv |
| LTO | rustc Codegen Options |
| Cargo Profiles | Cargo Book — Profiles |
| no_std | Rust Embedded Book |
| windows-sys crate | windows-rs |
| cargo-xwin | cargo-xwin docs |
| cargo-hack | cargo-hack |
| cargo-careful | cargo-careful |
| cargo-watch | cargo-watch |
| Rust CI cache | Swatinem/rust-cache |
| Rustdoc book | Rustdoc Book |
| Conditional compilation | Rust Reference — cfg |
| Embedded Rust | Awesome Embedded Rust |
| hyperfine | hyperfine |
| proptest | proptest |
| insta | insta snapshot testing |
| cargo-fuzz | cargo-fuzz |
| loom | loom concurrency testing |

Generated as a companion reference to Rust Patterns and Type-Driven Correctness.

Version 1.3 — Added cargo-hack, cargo-careful, cargo-watch, cargo doc, reproducible builds, CI caching strategies, capstone exercise, and chapter dependency diagram for completeness.