Image Blocks
bunsen::blocks::images collects the building blocks used by
2-D vision models — convolutional composites, patch
tokenization, same-padding pooling, and stochastic-depth-style
regularization layers.
API: https://docs.rs/bunsen/latest/bunsen/blocks/images/
Conv composites
The conv
submodule packages the conv-plus-something composites that show up
across ResNet, EfficientNet, and friends.
ConvNorm2d
ConvNorm2d
is the standard Conv2d + BatchNorm pairing with a single
forward. Beyond the convenience of one module instead of two, it
carries zero_init_norm(), the “zero-initialize the last batch
norm in a residual branch” trick used by ResNet and successors so
that each residual block starts out as an identity mapping.
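The reason zero-initializing the last norm gives an identity start can be seen with plain arithmetic. The sketch below is not bunsen's implementation: norm_affine and residual_add are hypothetical helpers on plain slices, and the normalized branch output is passed in directly rather than computed.

```rust
/// Affine step of a norm layer: y = gamma * x_hat + beta.
/// `x_hat` stands in for the already-normalized branch output (an assumption
/// made to keep the sketch short).
fn norm_affine(x_hat: &[f32], gamma: f32, beta: f32) -> Vec<f32> {
    x_hat.iter().map(|v| gamma * v + beta).collect()
}

/// Residual sum: skip input plus the branch output.
fn residual_add(skip: &[f32], branch: &[f32]) -> Vec<f32> {
    skip.iter().zip(branch).map(|(s, b)| s + b).collect()
}
```

With gamma and beta both zero the branch contributes nothing, so the sum returns the skip input unchanged. Any activation applied after the sum, as in the original ResNet, still applies.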
CNA2d
CNA2d
is the more general Conv / Norm / Activation block. Beyond the
basic forward, it provides:
- match_norm_features(): adapts a generic NormalizationConfig
  (BatchNorm::new(0), RmsNorm::new(0), etc.) to the right channel
  count after the conv. Lets callers pass a norm config without
  knowing the channel count yet.
- map_forward(f): runs the conv and norm, then hands the
  intermediate tensor to a user closure before the activation.
  Useful for inserting attention, channel reweighting, or
  per-residual side effects without copying out the rest of the
  block.
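The ordering that map_forward(f) describes (conv, then norm, then the user closure, then the activation) can be sketched with ordinary closures. This is an illustration only: cna_map_forward is a hypothetical function over Vec<f32>, not the CNA2d signature, and every stage is a stand-in closure rather than a real layer.

```rust
/// Hypothetical stand-in for a conv -> norm -> hook -> activation pipeline.
/// The hook sees the normalized tensor before the activation runs.
fn cna_map_forward<C, N, H, A>(x: Vec<f32>, conv: C, norm: N, hook: H, act: A) -> Vec<f32>
where
    C: Fn(Vec<f32>) -> Vec<f32>,
    N: Fn(Vec<f32>) -> Vec<f32>,
    H: FnOnce(Vec<f32>) -> Vec<f32>,
    A: Fn(Vec<f32>) -> Vec<f32>,
{
    let normed = norm(conv(x));
    act(hook(normed))
}
```

Passing an identity closure as the hook reduces this to a plain conv, norm, activation forward pass.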
Patching
patching
holds patch tokenization, the entry point for transformer-style
vision models.
PatchEmbed
PatchEmbed
is ViT-style patch tokenization: it takes an
image, splits it into non-overlapping patches, and
projects each patch into an embedding, producing a sequence of
tokens of width embed_dim.
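As a rough picture of what patch tokenization computes, the sketch below splits a single-channel, row-major image into non-overlapping patches and projects each one with a weight matrix. The names patchify and project are hypothetical, and the sketch assumes the image size is an exact multiple of the patch size. It is not PatchEmbed's API, which operates on tensors.

```rust
/// Splits a row-major single-channel image into non-overlapping
/// `patch` x `patch` tiles, each flattened to length `patch * patch`.
/// Assumes (and checks) that both dimensions divide evenly.
fn patchify(image: &[f32], height: usize, width: usize, patch: usize) -> Vec<Vec<f32>> {
    assert_eq!(image.len(), height * width);
    assert!(patch > 0 && height % patch == 0 && width % patch == 0);
    let mut patches = Vec::with_capacity((height / patch) * (width / patch));
    for py in (0..height).step_by(patch) {
        for px in (0..width).step_by(patch) {
            let mut flat = Vec::with_capacity(patch * patch);
            for y in py..py + patch {
                let start = y * width + px;
                flat.extend_from_slice(&image[start..start + patch]);
            }
            patches.push(flat);
        }
    }
    patches
}

/// Projects one flattened patch into an embedding using a weight matrix
/// laid out as [embed_dim][patch * patch] (bias omitted).
fn project(patch: &[f32], weight: &[Vec<f32>]) -> Vec<f32> {
    weight
        .iter()
        .map(|row| row.iter().zip(patch).map(|(w, v)| w * v).sum::<f32>())
        .collect()
}
```

A 4 x 4 image with patch size 2 yields four tokens, and each token's width is the number of rows in the weight matrix.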
Pooling
pool
holds the pooling layers that don’t fit burn’s defaults.
AvgPool2dSame
AvgPool2dSame
is average pooling with TensorFlow-style same padding: the padding is
asymmetric where needed so that each output spatial dimension equals
ceil(input / stride). Helpers get_same_padding and
pad_same are exposed for the underlying arithmetic, useful when
you’re matching a Keras / TF-Slim reference implementation.
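The padding arithmetic along one axis can be sketched as follows. The function same_padding_1d is a hypothetical name used for illustration; it is not claimed to match the signature of get_same_padding. It follows the TensorFlow convention of putting the extra element at the end when the total padding is odd, and it ignores dilation.

```rust
/// Returns (output_size, pad_before, pad_after) for one spatial axis
/// under TensorFlow-style "same" padding (dilation not handled).
fn same_padding_1d(input: usize, kernel: usize, stride: usize) -> (usize, usize, usize) {
    assert!(input > 0 && kernel > 0 && stride > 0);
    // ceil(input / stride) using integer arithmetic.
    let output = (input + stride - 1) / stride;
    // Input extent the window positions actually need to cover.
    let needed = (output - 1) * stride + kernel;
    let total = needed.saturating_sub(input);
    // An odd total puts the extra element after the data.
    let before = total / 2;
    (output, before, total - before)
}
```

For example, an input of 4 with kernel 3 and stride 2 needs one padded element in total, and it lands on the trailing side.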
Stochastic regularization
drop
collects regularization layers that drop structured pieces of the
activations rather than individual scalars.
DropBlock
DropBlock
is structured spatial dropout from
Ghiasi et al., 2018: instead
of dropping independent pixels, it drops contiguous blocks of
activations. For convnets this acts as a substantially stronger
regularizer than plain dropout, because adjacent pixels are highly
correlated and independent dropout barely removes information.
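To make "contiguous blocks" concrete, the sketch below expands a set of seed positions into a keep-mask. It is a simplified illustration, not the DropBlock layer: block_keep_mask is a hypothetical helper, seeds are given explicitly instead of being sampled, each seed is treated as the top-left corner of its block, and the rescaling of kept activations is left out.

```rust
/// Builds a keep-mask (true = keep) in which every seed removes a
/// `block` x `block` square. Seeds are top-left corners here (a
/// simplification), and squares are clipped at the feature-map border.
fn block_keep_mask(
    height: usize,
    width: usize,
    block: usize,
    seeds: &[(usize, usize)],
) -> Vec<Vec<bool>> {
    let mut keep = vec![vec![true; width]; height];
    for &(sy, sx) in seeds {
        for y in sy..(sy + block).min(height) {
            for x in sx..(sx + block).min(width) {
                keep[y][x] = false;
            }
        }
    }
    keep
}
```

A single seed with block size 2 removes a 2 x 2 neighborhood, whereas plain dropout with the same seed would remove one activation.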
DropPath
DropPath
is stochastic depth from
Huang et al., 2016: with some
probability, the entire residual branch is zeroed for a given
sample, so the network sees a shorter effective depth on each
training step.
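The per-sample behaviour can be sketched on plain vectors. This is an assumption-laden illustration rather than the DropPath layer: drop_path is a hypothetical function, the uniform draws are passed in so the result is deterministic, and surviving samples are rescaled by 1 / keep_prob, a common convention that bunsen may or may not follow.

```rust
/// Applies stochastic depth to a batch of residual-branch outputs.
/// `draws` holds one uniform [0, 1) value per sample, supplied by the
/// caller. Kept samples are scaled by 1 / keep_prob (assumed convention).
fn drop_path(branch: &[Vec<f32>], draws: &[f32], drop_prob: f32, training: bool) -> Vec<Vec<f32>> {
    assert_eq!(branch.len(), draws.len());
    assert!((0.0..1.0).contains(&drop_prob));
    if !training || drop_prob == 0.0 {
        return branch.to_vec();
    }
    let keep_prob = 1.0 - drop_prob;
    branch
        .iter()
        .zip(draws)
        .map(|(sample, &u)| {
            // Keep the sample when its draw lands in the kept fraction.
            let scale = if u < keep_prob { 1.0 / keep_prob } else { 0.0 };
            sample.iter().map(|v| v * scale).collect::<Vec<f32>>()
        })
        .collect()
}
```

The whole branch output for a dropped sample becomes zero, so only the skip connection carries that sample through the block. Outside training the input passes through untouched.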
Supporting types
- progressive_dpr: rate table that linearly ramps drop rates over a
  network’s depth, matching the Swin V2 / timm convention of giving
  deeper blocks higher drop probabilities.
- SizeConfig: a small enum describing a size as either Default, a
  Ratio(f64), or a Fixed(usize). Used by the drop layers when the
  effective region size is relative to a wrapped layer’s spatial
  dimensions.
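The linear ramp behind a progressive drop-rate table can be sketched as below. The function linear_drop_rates is a hypothetical name, and its signature is an assumption rather than that of progressive_dpr. It mirrors the timm habit of spacing rates evenly from 0 up to a maximum across all blocks.

```rust
/// Returns one drop rate per block, rising linearly from 0.0 at the
/// first block to `max_rate` at the last.
fn linear_drop_rates(max_rate: f64, depth: usize) -> Vec<f64> {
    match depth {
        0 => Vec::new(),
        // A single block gets the start of the ramp.
        1 => vec![0.0],
        _ => (0..depth)
            .map(|i| max_rate * i as f64 / (depth - 1) as f64)
            .collect(),
    }
}
```

With a maximum of 0.2 over five blocks, the first block never drops its branch and the last block drops it with probability 0.2.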