Experiments¶
The experiments module provides the dataset, model, loss, optimizer, and
checkpoint wrappers that form the foundation of a Krum training loop. It
handles model construction, dataset loading, loss composition, optimization,
checkpointing, and device/dtype configuration.
Components¶
| Component | Role | Key Class |
|---|---|---|
| Configuration | Device, dtype, and memory-transfer settings | experiments.Configuration |
| Model | Model wrapper with parameter flattening and gradient handling | experiments.Model |
| Dataset | Infinite-batch wrapper around torchvision and custom datasets | experiments.Dataset |
| Loss | Derivable loss with regularization and composition support | experiments.Loss |
| Criterion | Non-derivable evaluation metric (top-k, sigmoid) | experiments.Criterion |
| Optimizer | Optimizer wrapper with learning-rate control | experiments.Optimizer |
| Checkpoint | Save/restore state dictionaries for models and optimizers | experiments.Checkpoint |
API Reference¶
Experiment components for model training, dataset loading, and evaluation.
This module groups the building blocks of a Krum training loop:
- experiments.configuration – device and dtype configuration.
- experiments.model – model wrapper with parameter flattening.
- experiments.dataset – dataset loading and infinite batch sampling.
- experiments.loss – derivable loss composition.
- experiments.optimizer – optimizer wrapper with LR control.
- experiments.checkpoint – save and restore state dictionaries.
Custom models and datasets can be added under experiments/models/ and
experiments/datasets/; they are discovered automatically at import time.
Example:¶
from experiments import (
Configuration, Model, Dataset,
Loss, Criterion, Optimizer,
make_datasets,
)
config = Configuration(device="cuda:0")
model = Model("resnet18", config, num_classes=10)
trainset, testset = make_datasets("cifar10", train_batch=128)
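Continuing the example, one training step with the remaining wrappers might look as follows (a sketch: the wrapper names are documented below, but the hyper-parameters are illustrative):

loss = Loss("crossentropy")
criterion = Criterion("top-k")
optimizer = Optimizer("sgd", model, lr=0.1)

# One step: sample a batch, backpropagate, and apply the update.
gradient = model.backprop(trainset, loss)
model.update(gradient, optimizer)

# Evaluate on a batch from the test set (mean criterion value).
score = model.eval(testset, criterion)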
- class experiments.Checkpoint[source]
Bases: object
Collection of state dictionaries with saving/loading helpers.
This class can snapshot any object implementing the state_dict/load_state_dict protocol (e.g. torch.nn.Module, torch.optim.Optimizer). It also knows how to unwrap Model and Optimizer wrappers automatically.
Example:¶
>>> ckpt = Checkpoint()
>>> ckpt.snapshot(model, deepcopy=True)
>>> ckpt.restore(model)
- load(filepath, overwrite=False)[source]
Load checkpoint data from a file.
- Parameters:
filepath (str or pathlib.Path) – Path to the saved checkpoint.
overwrite (bool, optional) – Whether to overwrite any existing snapshots.
- Returns:
Checkpoint – Self, for chaining.
- Raises:
tools.UserException – If the checkpoint is non-empty and overwrite is False.
- restore(instance, nothrow=False)[source]
Restore an instance from its stored snapshot.
- save(filepath, overwrite=False)[source]
Save the current checkpoint to a file.
- Parameters:
filepath (str or pathlib.Path) – Destination path.
overwrite (bool, optional) – Whether to overwrite an existing file.
- Returns:
Checkpoint – Self, for chaining.
- Raises:
tools.UserException – If the file exists and overwrite is False.
- snapshot(instance, overwrite=False, deepcopy=False, nowarnref=False)[source]
Take (or overwrite) a snapshot of an instance’s state dictionary.
- Parameters:
instance (object) – Instance to snapshot. Must support state_dict().
overwrite (bool, optional) – Whether to overwrite an existing snapshot for the same class.
deepcopy (bool, optional) – Whether to deep-copy the state dictionary instead of shallow-copying.
nowarnref (bool, optional) – Suppress the debug warning when restoring a reference is the intended behavior.
- Returns:
Checkpoint – Self, for chaining.
- Raises:
tools.UserException – If a snapshot already exists and overwrite is False.
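A full save/restore round trip, as a sketch (the file name is illustrative, and model and optimizer are assumed to be existing wrapper instances):

from experiments import Checkpoint

ckpt = Checkpoint()
ckpt.snapshot(model, deepcopy=True)
ckpt.snapshot(optimizer)
ckpt.save("run.ckpt", overwrite=True)

# Later: reload from disk and restore both wrappers.
ckpt = Checkpoint().load("run.ckpt")
ckpt.restore(model)
ckpt.restore(optimizer)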
- class experiments.Configuration(device=None, dtype=None, noblock=False, relink=False)[source]
Bases: Mapping
Immutable tensor configuration holder.
This class bundles device, dtype, and memory-transfer options into a single immutable mapping. It is used throughout experiments to ensure every created or moved tensor uses the same configuration.
- Parameters:
device (str, torch.device, or None, optional) – Target device. None defaults to "cuda" when available, otherwise "cpu". Strings such as "cuda:0" are resolved automatically.
dtype (torch.dtype or None, optional) – Tensor datatype. None uses PyTorch’s current default dtype.
noblock (bool, optional) – Whether to use non-blocking host-to-device transfers.
relink (bool, optional) – Whether to relink instead of copying during parameter assignments.
Example:¶
>>> import torch
>>> from experiments import Configuration
>>> config = Configuration(device="cpu", dtype=torch.float32)
>>> config["device"]
device(type='cpu')
- default_device = 'cpu'
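The configuration is then threaded through tensor-creating calls; a sketch reusing the trainset from the module example above:

config = Configuration(device="cuda:0", noblock=True)
inputs, labels = trainset.sample(config)  # batch placed per the configuration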
- class experiments.Criterion(name_build, *args, **kwargs)[source]
Bases: object
Non-derivable evaluation metric wrapper.
Available criteria:
- "top-k" – top-k classification accuracy.
- "sigmoid" – binary accuracy with sigmoid threshold at 0.5.
All criteria return a 1-D tensor [num_correct, batch_size].
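Constructing the built-in criteria, as a sketch (only the documented names are passed; any extra arguments a criterion may accept are not shown):

top_k = Criterion("top-k")     # top-k classification accuracy
binary = Criterion("sigmoid")  # binary accuracy, sigmoid threshold at 0.5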
- class experiments.Dataset(data, name=None, root=None, *args, **kwargs)[source]
Bases: object
Unified dataset wrapper producing infinite batches.
This class can wrap:
- A torchvision dataset loaded by name.
- A custom generator yielding batches forever.
- A single fixed batch repeated forever.
- Parameters:
data (str, generator, or object) – Dataset name, infinite generator, or single batch.
name (str or None, optional) – User-defined name for debugging.
root (str or pathlib.Path or None, optional) – Cache root directory. None uses the default.
*args (object) – Forwarded to the dataset constructor when data is a string.
**kwargs (object) – Forwarded to the dataset constructor when data is a string.
- Raises:
tools.UnavailableException – If data is an unknown dataset name.
TypeError – If constructor arguments are invalid.
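The three wrapping modes side by side, as a sketch (the dataset name follows the module example; tensor shapes are illustrative):

import torch
from experiments import Dataset

# 1. A torchvision dataset loaded by name.
named = Dataset("cifar10")

# 2. A custom generator yielding batches forever.
def batches():
    while True:
        yield torch.randn(32, 10), torch.randint(0, 2, (32,))
generated = Dataset(batches(), name="synthetic")

# 3. A single fixed batch, repeated forever.
fixed = Dataset((torch.randn(32, 10), torch.randint(0, 2, (32,))), name="fixed")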
- epoch(config=None)[source]
Return a finite epoch iterator.
Note
Only works for DataLoader-based datasets.
- Parameters:
config (experiments.Configuration or None, optional) – Target configuration for tensor placement.
- Returns:
generator – Finite iterator over one epoch.
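For DataLoader-based datasets this allows a conventional finite pass; a sketch using names from the module example:

for inputs, labels in testset.epoch(config):
    outputs = model.run(inputs)  # evaluation-mode forward pass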
- classmethod get_default_root()[source]
Lazily initialize and return the default dataset cache directory.
- Returns:
pathlib.Path – Path to the dataset cache. Falls back to the system temp directory if the default does not exist.
- sample(config=None)[source]
Sample the next batch.
- Parameters:
config (experiments.Configuration or None, optional) – Target configuration for tensor placement.
- Returns:
tuple – Next batch, optionally moved to the target device.
- class experiments.Loss(name_build, *args, **kwargs)[source]
Bases: object
Derivable loss function wrapper with composition support.
Losses can be added (loss1 + loss2) and scaled (0.5 * loss). All standard PyTorch losses are available by lower-case name. Additionally, "l1" and "l2" provide parameter-norm regularization.
- Parameters:
name_build (str or callable) – Loss name (e.g. "crossentropy", "mse") or a callable with signature (output, target, params) -> tensor.
*args (object) – Forwarded to the loss constructor when name_build is a string.
**kwargs (object) – Forwarded to the loss constructor when name_build is a string.
- Raises:
tools.UnavailableException – If name_build is an unknown string.
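Composition in practice, as a sketch (the regularization weight is illustrative):

# Cross-entropy plus a scaled L2 parameter-norm penalty.
loss = Loss("crossentropy") + 1e-4 * Loss("l2")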
- class experiments.Model(name_build, config=None, init_multi=None, init_multi_args=None, init_mono=None, init_mono_args=None, *args, **kwargs)[source]
Bases: object
Unified model wrapper with parameter and gradient management.
Models are resolved by lower-case name from torchvision.models and from custom modules under experiments/models/. Parameters are automatically flattened into a contiguous vector accessible via get() and set().
Data parallelism (torch.nn.DataParallel) is enabled automatically when the model is placed on a non-indexed CUDA device.
- Parameters:
name_build (str or callable) – Model name (e.g. "resnet18", "torchvision-resnet18") or a constructor function.
config (experiments.Configuration, optional) – Target device and dtype configuration.
init_multi (str or callable or None, optional) – Weight initializer for tensors of dimension >= 2.
init_multi_args (dict or None, optional) – Keyword arguments for init_multi when it is a string.
init_mono (str or callable or None, optional) – Weight initializer for tensors of dimension == 1.
init_mono_args (dict or None, optional) – Keyword arguments for init_mono when it is a string.
*args (object) – Forwarded to the model constructor.
**kwargs (object) – Forwarded to the model constructor.
- Raises:
tools.UnavailableException – If name_build is an unknown string.
tools.UserException – If the built object is not a torch.nn.Module.
- backprop(dataset=None, loss=None, outloss=False, **kwargs)[source]
Compute gradient on a batch from the given dataset.
- Parameters:
dataset (experiments.Dataset or None, optional) – Dataset to sample from. None uses the default trainset.
loss (experiments.Loss or None, optional) – Loss function. None uses the default loss.
outloss (bool, optional) – Whether to also return the loss value.
**kwargs (object) – Forwarded to loss.backward().
- Returns:
torch.Tensor or tuple[torch.Tensor, torch.Tensor] – Flat gradient, optionally paired with the loss value.
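Flat gradients make aggregation a plain tensor operation, which is the point of this design in a Krum loop; a sketch with the robust rule simplified to an average (the actual Krum rule lives outside this module):

import torch

# Stand-ins for per-worker gradients: several batches from the trainset.
grads = [model.backprop(trainset, loss) for _ in range(5)]

# A real loop would apply a robust aggregation rule here instead.
aggregated = torch.stack(grads).mean(dim=0)

model.update(aggregated, optimizer)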
- property config
Return the immutable configuration.
- Returns:
experiments.Configuration – Model configuration.
- default(name, new=None, erase=False)[source]
Get and/or set a named default.
- Parameters:
name (str) – Default key (e.g. "trainset", "loss", "optimizer").
new (object or None, optional) – New value to set. Ignored unless new is not None or erase is True.
erase (bool, optional) – Force the value to None.
- Returns:
object – Current (or old) value of the default.
- Raises:
tools.UnavailableException – If name is not a known default.
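Registering defaults up front shortens the training loop; a sketch using names from the module example:

model.default("trainset", trainset)
model.default("loss", loss)
model.default("optimizer", optimizer)

gradient = model.backprop()  # falls back to the registered defaults
model.update(gradient)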
- eval(dataset=None, criterion=None)[source]
Evaluate the model on a batch from the given dataset.
- Parameters:
dataset (experiments.Dataset or None, optional) – Dataset to sample from. None uses the default testset.
criterion (experiments.Criterion or None, optional) – Criterion function. None uses the default criterion.
- Returns:
torch.Tensor – Mean criterion value over the sampled batch.
- get()[source]
Get a reference to the flat parameter vector.
- Returns:
torch.Tensor – Flat parameter tensor. Future calls to set() will modify it in place.
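Because get() hands back a live reference, flat-vector manipulation is direct; a sketch (the scaling factor is illustrative):

flat = model.get()      # reference to the flat parameter vector
model.set(flat * 0.99)  # overwrite with a scaled copy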
- get_gradient()[source]
Get (or create) the flat gradient vector.
- Returns:
torch.Tensor – Flat gradient tensor. Future calls to set_gradient() will modify it in place.
- loss(dataset=None, loss=None, training=None)[source]
Estimate loss on a batch from the given dataset.
- Parameters:
dataset (experiments.Dataset or None, optional) – Dataset to sample from. None uses the default trainset.
loss (experiments.Loss or None, optional) – Loss function. None uses the default loss.
training (bool or None, optional) – Whether this is a training run. None guesses from torch.is_grad_enabled().
- Returns:
torch.Tensor – Scalar loss value.
- run(data, training=False)[source]
Forward pass through the model.
- Parameters:
data (torch.Tensor) – Input tensor.
training (bool, optional) – Whether to use training mode (enables dropout, batch-norm updates, etc.). Defaults to evaluation mode.
- Returns:
torch.Tensor – Model output.
- set(params, relink=None)[source]
Overwrite parameters with the given flat vector.
- Parameters:
params (torch.Tensor) – New flat parameter vector.
relink (bool or None, optional) – Whether to relink instead of copying. None uses the configuration default.
- set_gradient(gradient, relink=None)[source]
Overwrite the gradient with the given flat vector.
- Parameters:
gradient (torch.Tensor) – New flat gradient.
relink (bool or None, optional) – Whether to relink instead of copying. None uses the configuration default.
- update(gradient, optimizer=None, relink=None)[source]
Update parameters using the given gradient and optimizer.
- Parameters:
gradient (torch.Tensor) – Flat gradient to apply.
optimizer (experiments.Optimizer or None, optional) – Optimizer wrapper. None uses the default optimizer.
relink (bool or None, optional) – Whether to relink the gradient. None uses the configuration default.
- class experiments.Optimizer(name_build, model, *args, **kwargs)[source]
Bases: object
Optimizer wrapper with name resolution and LR control.
- Parameters:
name_build (str or callable) – Optimizer name (e.g. "adam", "sgd") or a constructor function. Names are resolved against torch.optim.
model (experiments.Model) – Model whose parameters will be optimized.
*args (object) – Additional positional arguments forwarded to the optimizer constructor.
**kwargs (object) – Additional keyword arguments forwarded to the optimizer constructor.
- Raises:
tools.UnavailableException – If name_build is a string that does not match any known optimizer.
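Keyword arguments pass straight through to the torch.optim constructor; a sketch with standard Adam parameters:

optimizer = Optimizer("adam", model, lr=1e-3)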
- class experiments.Storage[source]
Bases: dict
Plain dictionary that implements the state_dict protocol.
This allows arbitrary key/value data to be snapshotted and restored alongside models and optimizers using Checkpoint.
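A sketch of checkpointing loop metadata alongside the model state (the key names are illustrative):

from experiments import Checkpoint, Storage

store = Storage(step=0, best_score=0.0)
ckpt = Checkpoint()
ckpt.snapshot(model)
ckpt.snapshot(store)  # plain dict data rides along with the model state
ckpt.save("state.ckpt", overwrite=True)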
- experiments.batch_dataset(inputs, labels, train=False, batch_size=None, split=0.75)[source]
Batch a raw tensor dataset into infinite sampler generators.
- Parameters:
inputs (torch.Tensor) – Input data tensor.
labels (torch.Tensor) – Label tensor with the same first-dimension size as inputs.
train (bool, optional) – Whether to build a training set (adds shuffling) or a test set.
batch_size (int or None, optional) – Batch size. None or 0 uses the full split size.
split (float or int, optional) – Fraction of samples for training when < 1, or absolute count when >= 1.
- Returns:
generator – Infinite sampler generator.
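Building train and test samplers from raw tensors and wrapping them, as a sketch (shapes are illustrative; passing the same split value keeps the two partitions consistent):

import torch
from experiments import Dataset, batch_dataset

inputs = torch.randn(1000, 20)
labels = torch.randint(0, 2, (1000,))

train_gen = batch_dataset(inputs, labels, train=True, batch_size=64, split=0.75)
test_gen = batch_dataset(inputs, labels, train=False, batch_size=0, split=0.75)

trainset = Dataset(train_gen, name="raw-train")
testset = Dataset(test_gen, name="raw-test")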
- experiments.get_default_transform(dataset, train)[source]
Return the default transform for a torchvision dataset.
- experiments.make_datasets(dataset, train_batch=None, test_batch=None, train_transforms=None, test_transforms=None, num_workers=1, **custom_args)[source]
Build training and testing dataset wrappers.
- Parameters:
dataset (str) – Case-sensitive dataset name.
train_batch (int or None, optional) – Training batch size. None or 0 for full-batch.
test_batch (int or None, optional) – Testing batch size. None or 0 for full-batch.
train_transforms (callable or None, optional) – Transform for the training set. None uses the default.
test_transforms (callable or None, optional) – Transform for the testing set. None uses the default.
num_workers (int or tuple[int, int], optional) – Number of workers for the training and testing loaders. An int applies to both; a tuple specifies (train_workers, test_workers).
**custom_args (object) – Additional keyword arguments forwarded to the dataset constructor.
- Returns:
tuple[Dataset, Dataset] – Training and testing dataset wrappers.
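Overriding the default training transform, as a sketch (assumes torchvision; the augmentation choice is illustrative):

import torchvision.transforms as T
from experiments import make_datasets

augment = T.Compose([T.RandomHorizontalFlip(), T.ToTensor()])
trainset, testset = make_datasets(
    "cifar10",
    train_batch=128,
    test_batch=1024,
    train_transforms=augment,
    num_workers=(4, 1),
)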
- experiments.make_sampler(loader)[source]
Create an infinite sampler from a DataLoader.
- Parameters:
loader (torch.utils.data.DataLoader) – Finite data loader.
- Yields:
tuple – Batches, transparently restarting the loader when exhausted.
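A sketch with a standard finite DataLoader (the tensor dataset is illustrative):

import torch
from torch.utils.data import DataLoader, TensorDataset
from experiments import make_sampler

loader = DataLoader(
    TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,))),
    batch_size=16,
    shuffle=True,
)
sampler = make_sampler(loader)
inputs, labels = next(sampler)  # restarts transparently when exhausted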