bunsen::kits::gpts

Full GPT / LLM variants. Where bunsen::blocks provides reusable transformer sub-modules, gpts is for whole language-model architectures: end-to-end models, tokenizer wiring, and the training/inference surface around them.

API: https://docs.rs/bunsen/latest/bunsen/kits/gpts/

Current models

nanochat

A compact GPT in the spirit of the “nano” GPT lineage: small enough to train on modest hardware, opinionated enough to serve as a reference implementation.

The model lives in bunsen::kits::gpts::nanochat and is split into:

  • the per-layer MLP,
  • the transformer block (attention + MLP + norms),
  • the full model wrapper that stacks the blocks and adds embedding and head layers.
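To make the three-part split concrete, here is a minimal, dependency-free sketch of how an MLP, a block, and a model wrapper can compose. It is not the bunsen API: every name in it (Mlp, Block, Model, causal_mix, toy_model, toy_weights) is hypothetical, the pre-norm residual ordering is an assumption, and attention is replaced by a causal running mean so that the example stays short.

```rust
// Structural sketch only. All names are hypothetical and are NOT the bunsen API.
type Seq = Vec<Vec<f32>>; // activations laid out as [time][channels]

/// RMS normalization over the channel dimension (no learned scale).
fn rms_norm(x: &[f32]) -> Vec<f32> {
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv = 1.0 / (mean_sq + 1e-6).sqrt();
    x.iter().map(|v| v * inv).collect()
}

/// Dense matrix-vector product; `w` is laid out as [out][in].
fn matvec(w: &[Vec<f32>], x: &[f32]) -> Vec<f32> {
    w.iter()
        .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f32>())
        .collect()
}

/// Element-wise sum, used for the residual connections.
fn add(a: &[f32], b: &[f32]) -> Vec<f32> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Stand-in for attention (an assumption, not real attention):
/// position t receives the mean of positions 0..=t, so it never sees the future.
fn causal_mix(x: &Seq) -> Seq {
    let width = x.first().map_or(0, |row| row.len());
    let mut acc = vec![0.0f32; width];
    x.iter()
        .enumerate()
        .map(|(t, row)| {
            for (a, v) in acc.iter_mut().zip(row) {
                *a += v;
            }
            acc.iter().map(|a| a / (t as f32 + 1.0)).collect::<Vec<f32>>()
        })
        .collect()
}

/// Part 1: the per-layer MLP (expand, ReLU, project back).
struct Mlp { up: Vec<Vec<f32>>, down: Vec<Vec<f32>> }

impl Mlp {
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        let hidden: Vec<f32> = matvec(&self.up, x).into_iter().map(|v| v.max(0.0)).collect();
        matvec(&self.down, &hidden)
    }
}

/// Part 2: the transformer block (token mixing + MLP + norms).
struct Block { mlp: Mlp }

impl Block {
    fn forward(&self, x: &Seq) -> Seq {
        // Assumed pre-norm residual around the token-mixing step.
        let normed: Seq = x.iter().map(|r| rms_norm(r)).collect();
        let mixed = causal_mix(&normed);
        let h: Seq = x.iter().zip(&mixed).map(|(a, m)| add(a, m)).collect();
        // Assumed pre-norm residual around the MLP.
        h.iter().map(|r| add(r, &self.mlp.forward(&rms_norm(r)))).collect()
    }
}

/// Part 3: the model wrapper (embedding, stacked blocks, output head).
struct Model { embed: Vec<Vec<f32>>, blocks: Vec<Block>, head: Vec<Vec<f32>> }

impl Model {
    /// Maps token ids to one row of vocabulary logits per position.
    fn forward(&self, tokens: &[usize]) -> Seq {
        let mut x: Seq = tokens.iter().map(|&t| self.embed[t].clone()).collect();
        for block in &self.blocks {
            x = block.forward(&x);
        }
        x.iter().map(|r| matvec(&self.head, &rms_norm(r))).collect()
    }
}

/// Deterministic placeholder weights so the sketch needs no RNG crate.
fn toy_weights(rows: usize, cols: usize) -> Vec<Vec<f32>> {
    (0..rows)
        .map(|i| (0..cols).map(|j| ((i * cols + j) % 7) as f32 * 0.1 - 0.3).collect::<Vec<f32>>())
        .collect()
}

/// Builds a tiny model with a 4x MLP expansion (the 4x factor is an assumption).
fn toy_model(vocab: usize, d_model: usize, n_layers: usize) -> Model {
    let blocks = (0..n_layers)
        .map(|_| Block {
            mlp: Mlp { up: toy_weights(4 * d_model, d_model), down: toy_weights(d_model, 4 * d_model) },
        })
        .collect();
    Model { embed: toy_weights(vocab, d_model), blocks, head: toy_weights(vocab, d_model) }
}
```

Calling toy_model(5, 4, 2).forward(&[0, 1, 2]) returns three rows of five logits each, one row per input token. Because the mixing step is causal, the logits at a given position depend only on tokens at or before that position. For the actual types and signatures, refer to the API documentation linked above.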

gpts is a work in progress; further GPT/LLM variants will land here as the port from upstream reference implementations progresses.