Overview

This page is a tour through the major pieces of bunsen, in roughly the order you’d encounter them when building a model on top of burn. Each section below has a dedicated chapter elsewhere in the book; this page is the orientation map.

Tensor Contracts

Shape errors in tensor code are hard to diagnose: a bad reshape produces the wrong meaning, not an exception, and the eventual failure points at a symptom three layers downstream. bunsen::contracts lets you write the shape of a tensor the way a paper would,

$x \in \mathbb{R}^{B \times C \times H \times W}$

and then check it at module boundaries, in one step that both asserts the pattern and unpacks the named dimensions you’ll need:

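#![allow(unused)]
fn main() {
    use bunsen::contracts::unpack_shape_contract;

    // A plain shape array stands in for a tensor's dims:
    // [batch, height, width, channels].
    let tensor = [12, 3 * 4, 5 * 4, 3];

    // Assert the pattern and unpack the selected dimensions in one step.
    let [b, h_wins, w_wins, c] = unpack_shape_contract!(
        [
            "batch",
            "height" = "h_wins" * "window_size",
            "width" = "w_wins" * "window_size",
            "channels",
        ],
        // The shape to check.
        &tensor,
        // The keys to unpack, in the order they are bound on the left.
        &["batch", "h_wins", "w_wins", "channels"],
        // Known parameter bindings.
        &[("window_size", 4)],
    );
}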

If the shape doesn’t match, the failure is loud and specific — it names the offending dimension, the pattern it failed against, and the parameter bindings in scope. The check is fast enough (~160 ns per unpack on a four-dimensional shape) to stay enabled in release builds.
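To make the arithmetic behind the pattern concrete, here is a hand-written sketch in plain Rust of what a check like the one above has to do. It does not use bunsen at all: `unpack_windows` and its error strings are hypothetical, written only for illustration, and say nothing about how the macro is implemented.

```rust
/// Hypothetical hand-rolled equivalent of the window pattern above.
/// Returns [batch, h_wins, w_wins, channels], or a descriptive error.
fn unpack_windows(shape: &[usize], window_size: usize) -> Result<[usize; 4], String> {
    if window_size == 0 {
        return Err("window_size must be non-zero".to_string());
    }
    // The pattern names exactly four dimensions.
    let [batch, height, width, channels] = match shape {
        &[b, h, w, c] => [b, h, w, c],
        _ => {
            return Err(format!(
                "expected rank 4, got rank {}: {:?}",
                shape.len(),
                shape
            ))
        }
    };
    // "height" = "h_wins" * "window_size" only has a solution
    // when height divides evenly by window_size.
    if height % window_size != 0 {
        return Err(format!(
            "height={height} is not a multiple of window_size={window_size} in {shape:?}"
        ));
    }
    if width % window_size != 0 {
        return Err(format!(
            "width={width} is not a multiple of window_size={window_size} in {shape:?}"
        ));
    }
    Ok([batch, height / window_size, width / window_size, channels])
}

fn main() {
    let shape = [12, 3 * 4, 5 * 4, 3];
    let [b, h_wins, w_wins, c] =
        unpack_windows(&shape, 4).expect("shape matches the pattern");
    println!("b={b} h_wins={h_wins} w_wins={w_wins} c={c}");

    // A height of 13 cannot be split into windows of 4.
    assert!(unpack_windows(&[12, 13, 20, 3], 4).is_err());
}
```

Writing this by hand for every module boundary is the repetitive work that a declarative pattern is meant to replace.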

Contracts are the shared vocabulary the rest of bunsen is built on; almost every block in the crate uses them at its module boundaries.

See Contracts for the pattern grammar, the full macro surface, the cost-control mechanisms, and the rationale behind the design.

Ops — pure tensor functions

bunsen::ops is the functional layer: pure functions over tensors that extend burn::tensor::Tensor’s surface without owning any trainable parameters. It covers range generators, clamping, dropout, noise, RMSNorm, repeat-interleave, and a substantial collection of convolution-shape arithmetic and functional-conv helpers.
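As an illustration of what convolution-shape arithmetic means here, the sketch below computes the output length of a convolution along one axis using the standard formula. The function name `conv_output_size` and its signature are assumptions made for this example, not bunsen's API.

```rust
/// Output length of a convolution along one axis, or `None` when the
/// arguments are degenerate or the dilated kernel does not fit inside
/// the padded input.
///
/// Standard formula: floor((input + 2*padding - dilation*(kernel-1) - 1) / stride) + 1
fn conv_output_size(
    input: usize,
    kernel: usize,
    stride: usize,
    padding: usize,
    dilation: usize,
) -> Option<usize> {
    if kernel == 0 || stride == 0 || dilation == 0 {
        return None;
    }
    // Span covered by the kernel once dilation gaps are included.
    let effective_kernel = dilation * (kernel - 1) + 1;
    let padded = input + 2 * padding;
    if padded < effective_kernel {
        return None;
    }
    Some((padded - effective_kernel) / stride + 1)
}

fn main() {
    // 7-wide kernel, stride 2, padding 3 on a 224-long axis.
    assert_eq!(conv_output_size(224, 7, 2, 3, 1), Some(112));
    // 3-wide kernel, stride 1, padding 1 preserves the length.
    assert_eq!(conv_output_size(32, 3, 1, 1, 1), Some(32));
    // Dilation 2 makes a 3-wide kernel span 5 positions.
    assert_eq!(conv_output_size(10, 3, 1, 0, 2), Some(6));
    // The kernel is larger than the unpadded input.
    assert_eq!(conv_output_size(2, 5, 1, 0, 1), None);
}
```

Because the function takes and returns plain integers, it can be unit-tested without constructing a tensor or choosing a backend.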

A typical block’s forward is a parametric layer (a Linear, a Conv2d) wrapped in two or three ops calls. Lifting the non-parametric work out makes the block readable and makes the underlying math testable in isolation.
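The point about testing the math in isolation can be sketched with RMS normalization over a plain slice. This is a simplified stand-in operating on `&[f32]`, not on a burn tensor; the name `rms_norm`, the placement of `eps` inside the square root, and the absence of a learnable gain are all assumptions of this example and may differ from bunsen's own op.

```rust
/// RMS-normalize a vector: y_i = x_i / sqrt(mean(x^2) + eps).
/// Returns an empty vector for empty input.
fn rms_norm(x: &[f32], eps: f32) -> Vec<f32> {
    if x.is_empty() {
        return Vec::new();
    }
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let scale = 1.0 / (mean_sq + eps).sqrt();
    x.iter().map(|v| v * scale).collect()
}

fn main() {
    let y = rms_norm(&[3.0, 4.0], 0.0);

    // With eps = 0 the output has a mean square of 1.
    let mean_sq = y.iter().map(|v| v * v).sum::<f32>() / y.len() as f32;
    assert!((mean_sq - 1.0).abs() < 1e-5);

    // Normalization rescales but preserves ratios between elements.
    assert!((y[0] / y[1] - 0.75).abs() < 1e-5);
}
```

A block that owns a learnable scale could call a function of this kind and then multiply by its parameter, which keeps the parameter handling and the numerical definition in separate, separately testable places.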

   contracts        validate shapes between layers
       │
       ▼
   ops              pure functions over Tensors
       │
       ▼
   blocks           stateful parameter-owning Modules
       │
       ▼
   kits             whole models built from blocks

The ops surface also includes small value-object types (ClampOp, NoiseConfig) designed to be embedded directly into a #[derive(Config)] struct of a downstream module.

Blocks — reusable Module components

bunsen::blocks is the stateful layer: burn::module::Module building blocks that own parameters and can be trained. They are organized by domain:

Transformers — multi-head causal self-attention with grouped-query support, a KV cache for autoregressive decoding, scaled-dot-product attention helpers, and rotary positional embeddings.
Images — conv composites (ConvNorm2d, CNA2d), ViT-style patch tokenization (PatchEmbed), TF-style same-padding pooling, and stochastic regularization layers (DropBlock, DropPath).
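For readers unfamiliar with TF-style same padding, the sketch below shows the usual per-axis arithmetic: the output length is ceil(input / stride), and any odd padding remainder goes on the trailing side. The function `same_padding` is a hypothetical helper written for this page; it is not taken from bunsen and the crate's pooling layers may organize this differently.

```rust
/// Per-axis TF-style "SAME" padding.
/// Returns (output_len, pad_before, pad_after).
fn same_padding(
    input: usize,
    kernel: usize,
    stride: usize,
    dilation: usize,
) -> (usize, usize, usize) {
    assert!(input > 0 && kernel > 0 && stride > 0 && dilation > 0);
    // Span covered by the kernel once dilation gaps are included.
    let effective_kernel = dilation * (kernel - 1) + 1;
    // ceil(input / stride)
    let output = (input + stride - 1) / stride;
    // Input length required for `output` full windows.
    let needed = (output - 1) * stride + effective_kernel;
    // Padding is never negative.
    let total = needed.saturating_sub(input);
    // The extra element, if any, goes after the data.
    let before = total / 2;
    (output, before, total - before)
}

fn main() {
    // Odd input with a 2-wide window and stride 2: one trailing pad.
    assert_eq!(same_padding(7, 2, 2, 1), (4, 0, 1));
    // 3-wide window, stride 1: symmetric padding, length preserved.
    assert_eq!(same_padding(5, 3, 1, 1), (5, 1, 1));
    // Window that already tiles the input: no padding needed.
    assert_eq!(same_padding(8, 2, 2, 1), (4, 0, 0));
}
```

The asymmetric split is what distinguishes this scheme from a fixed symmetric `padding` argument, which cannot express a total padding of 1.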

Every block follows the patterns documented in Building Reusable Modules: Meta traits for cross-module introspection, Contract→Structure config splits where the user-facing knobs differ from the implementation parameter list, and inline shape contracts at module boundaries.

Kits — complete domain implementations

bunsen::kits is for whole things you pick up and use end-to-end. The current categories:

bimm — Bunsen/Burn Image Models, an incremental port of the timm ecosystem. Currently includes the ResNet family with pretrained-weight loaders and the Swin Transformer V2 family.
gpts — full GPT / LLM variants. Currently includes NanoChat, a compact GPT suitable for experimentation and fine-tuning.
sims — iterative tensor simulations. Currently includes Conway’s Game of Life in 2D and 3D, and a D2Q9 lattice-Boltzmann fluid solver.

Kits compose the lower layers: each one uses contracts, ops, and blocks (and, where training matters, burner) to deliver a full domain solution rather than a building block. They’re also where to look for worked examples of every convention in real code.

Burner — `burn`-adjacent infrastructure

bunsen::burner is the infrastructure layer: the pieces that sit next to burn itself, not on top of its tensor surface. Most code that uses bunsen won’t import from burner at all — you reach for it when you need to:

introspect a model generically — the reflection layer turns a Module into a queryable XML document with an XPath query API, for “select every rank-2 weight under the transformer blocks” problems;
compose optimizers — the GroupOptimizerAdaptor{N} family mounts multiple optimizers on a single module (e.g. Muon for matrix parameters, AdamW for the rest), each driving a disjoint group of parameters, with per-group learning-rate selectors;
carry tensor metadata in non-generic code paths (TensorParamDesc);
work with burn::record outside what the derive macros provide for free.

The reflection and group-optimizer pieces compose: the canonical pattern is to walk a model with XmlModuleTree, slice it into parameter groups with XPath, and hand the groups to a GroupOptimizerAdaptor. The NanoChat training demo (demos/chat/examples/train) does exactly that.
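The core idea of that pattern, splitting parameters into disjoint groups by a structural predicate, can be sketched without any of the real machinery. The example below uses no XmlModuleTree, XPath, or GroupOptimizerAdaptor; `ParamSketch`, `Group`, the `"blocks."` path prefix, and the parameter names are all hypothetical stand-ins chosen for illustration.

```rust
/// Hypothetical parameter description: a dotted path and a shape.
#[derive(Debug, Clone, PartialEq)]
struct ParamSketch {
    path: String,
    shape: Vec<usize>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Group {
    /// Rank-2 weights under the transformer blocks.
    Matrix,
    /// Everything else.
    Other,
}

/// Every parameter lands in exactly one group.
fn assign_group(p: &ParamSketch) -> Group {
    if p.shape.len() == 2 && p.path.starts_with("blocks.") {
        Group::Matrix
    } else {
        Group::Other
    }
}

/// Split parameters into (matrix, other); the two lists are disjoint
/// and together cover the input.
fn partition(params: &[ParamSketch]) -> (Vec<&ParamSketch>, Vec<&ParamSketch>) {
    params
        .iter()
        .partition(|p| assign_group(p) == Group::Matrix)
}

fn param(path: &str, shape: &[usize]) -> ParamSketch {
    ParamSketch {
        path: path.to_string(),
        shape: shape.to_vec(),
    }
}

fn main() {
    let params = vec![
        param("embed.weight", &[1000, 64]),
        param("blocks.0.attn.q.weight", &[64, 64]),
        param("blocks.0.norm.gamma", &[64]),
    ];
    let (matrix, other) = partition(&params);

    assert_eq!(matrix.len(), 1);
    assert_eq!(matrix[0].path, "blocks.0.attn.q.weight");
    assert_eq!(other.len(), 2);
    // No parameter is dropped or counted twice.
    assert_eq!(matrix.len() + other.len(), params.len());
}
```

In the arrangement described above, one optimizer would drive the first list and another the second; the disjointness check at the end is the property that makes that safe.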
