Feature Flags
These are the Cargo features available on wordchipper. Features can be enabled in your
Cargo.toml:
[dependencies]
wordchipper = { version = "0.8", features = ["client"] }
The default features are std, fast-hash, and parallel.
features = ["std"]
Enabled by default.
Provide standard library integration: regex-based spanning via regex and fancy-regex, file I/O,
and std::collections::HashMap. Building with default-features = false removes the standard
library dependency, leaving a pure no_std crate that uses hashbrown for hash maps and logos DFA
lexers for pre-tokenization.
features = ["fast-hash"]
Enabled by default.
Use foldhash for faster hash maps. When combined with std, the standard library's HashMap is
used with foldhash's hasher. Without std, hashbrown::HashMap is used with foldhash's hasher.
This feature has no std requirement, so it works in no_std environments.
features = ["parallel"]
Enabled by default. Implies concurrent.
Enable rayon-based parallelism for batch encoding and decoding. Control the thread count with the
RAYON_NUM_THREADS environment variable.
features = ["concurrent"]
Implies std.
Enable the thread pool (PoolToy) and concurrency utilities used for concurrent encoder access.
The parallel feature enables this automatically; use concurrent directly if you want the thread
pool without pulling in rayon.
features = ["client"]
Implies download and datagym (both of which imply std).
Everything needed to load pretrained vocabularies: downloading from the network and parsing DataGym-format files.
features = ["download"]
Implies std.
Download and cache vocabulary files from the internet.
features = ["datagym"]
Implies std.
Load DataGym-format vocabularies (used by older OpenAI models like GPT-2). Pulls in serde_json.
features = ["tracing"]
Add tracing instrumentation points throughout the encoding pipeline. Only useful for profiling
the library itself.
features = ["testing"]
Export test utilities for downstream crates to use in their own test suites.