Welcome to the new Krum documentation!

Datasets

LIBSVM dataset loaders.

This module provides builder functions that the experiments.Dataset loader registers automatically because they are listed in __all__. On first use, each builder downloads the raw LIBSVM file and caches a pre-processed PyTorch tensor version of it. Every call returns an infinite-batch generator over the requested split.
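The raw files use the LIBSVM sparse text format, where each line reads `<label> <index>:<value> ...` with 1-based feature indices. The sketch below illustrates how one such line could be turned into a dense row during pre-processing. It is not the module's actual code: `parse_libsvm_line` and `num_features` are hypothetical names, and plain Python lists stand in for tensors.

```python
def parse_libsvm_line(line, num_features):
    """Parse one LIBSVM-formatted line into (label, dense feature list).

    Hypothetical helper for illustration only.
    """
    parts = line.split()
    label = float(parts[0])
    features = [0.0] * num_features  # features absent from the line stay 0
    for item in parts[1:]:
        index, value = item.split(":")
        # LIBSVM feature indices are 1-based; shift to 0-based positions.
        features[int(index) - 1] = float(value)
    return label, features


label, features = parse_libsvm_line("1 1:0.5 3:2", num_features=4)
print(label, features)  # 1.0 [0.5, 0.0, 2.0, 0.0]
```

A real loader would presumably stack such rows into tensors and save them to the cache directory so that later calls skip the parsing step.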

Example:

>>> from experiments import Dataset
>>> dataset = Dataset("svm-phishing", train=True, download=True)
>>> inputs, labels = dataset.sample()

See Also:

experiments.batch_dataset: helper used internally to create the infinite sampler from raw tensors.

experiments.datasets.svm.phishing(train=True, batch_size=None, root=None, download=False, *args, **kwargs)

Phishing dataset builder returning an infinite-batch generator.

Parameters:

train (bool, optional) – Whether to return the training split. If False, the test split is returned instead.
batch_size (int or None, optional) – Number of samples per batch. None or 0 yields the full split in a single batch.
root (pathlib.Path or str or None, optional) – Cache directory. None defaults to experiments.dataset.Dataset.get_default_root().
download (bool, optional) – Whether to allow downloading the raw file if the cache is missing.
*args (object) – Ignored (kept for API compatibility).
**kwargs (object) – Ignored (kept for API compatibility).
Returns:

generator – Infinite sampler yielding (inputs, labels) tuples.

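To make "infinite sampler" concrete, here is a minimal sketch of a generator that never terminates and yields (inputs, labels) tuples. It is an assumption-laden illustration, not the library's implementation: `infinite_batches` is a hypothetical name, plain lists stand in for tensors, and uniform sampling without replacement inside each batch is an assumed strategy that the documentation does not specify.

```python
import random


def infinite_batches(inputs, labels, batch_size=None, seed=None):
    """Yield (inputs, labels) batches forever (hypothetical sketch)."""
    rng = random.Random(seed)
    n = len(inputs)
    if not batch_size:
        # None or 0: yield the full split as a single batch, repeatedly.
        while True:
            yield inputs, labels
    while True:
        # Assumed strategy: draw distinct indices uniformly at random.
        idx = rng.sample(range(n), min(batch_size, n))
        yield [inputs[i] for i in idx], [labels[i] for i in idx]


gen = infinite_batches([[0.0], [1.0], [2.0], [3.0]], [0, 1, 0, 1], batch_size=2, seed=0)
batch_inputs, batch_labels = next(gen)  # one batch of 2 samples
full = infinite_batches([[0.0], [1.0]], [0, 1])
full_inputs, full_labels = next(full)   # the whole split at once
```

Because the generator never raises StopIteration, callers pull batches with `next(...)` or bound the loop themselves rather than iterating to exhaustion.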
Notes:

The dataset is split at position 8400 (≈ 76 % train / 24 % test). The split point was chosen for good divisibility (\(8400 = 2^4 \times 3 \times 5^2 \times 7\)).
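The divisibility remark can be checked directly. The short script below verifies the factorisation of the split position and lists which of a few candidate batch sizes divide the training split evenly; the candidate list is chosen here purely for illustration.

```python
SPLIT = 8400  # split position stated in the notes

# Confirm the prime factorisation quoted in the notes.
assert SPLIT == 2**4 * 3 * 5**2 * 7

# Every batch size that divides the training split with no ragged last batch.
divisors = [d for d in range(1, SPLIT + 1) if SPLIT % d == 0]
print(len(divisors))  # 60

# Illustrative candidates: powers of two above 16 do not divide 8400.
candidates = (16, 25, 32, 50, 64, 100)
even_fit = [b for b in candidates if SPLIT % b == 0]
print(even_fit)  # [16, 25, 50, 100]
```

With 60 divisors, many common batch sizes partition the training split into equal batches, which is presumably the practical motivation for the chosen split point.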

See also

For the dataset wrapper that loads these constructors, see Dataset.