Krum
Krum aggregation rule, single-gradient selection.
- Reference:
Peva Blanchard, El Mahdi El Mhamdi, Rachid Guerraoui, and Julien Stainer. “Machine learning with adversaries: Byzantine tolerant gradient descent.” In Advances in Neural Information Processing Systems 30 (NIPS 2017).
- class aggregators.krum.Krum
Bases: MultiKrum
Krum aggregation rule, single-gradient selection.
For each worker gradient, Krum computes a score: the sum of its distances to its \(n - f - 2\) closest neighbors. It returns the gradient with the smallest score, the one most consistent with the other honest workers. This is MultiKrum with \(m = 1\).
- classmethod aggregate(gradients: Sequence[Tensor] | Tensor, /, out: Tensor | None = None, *, n: int, f: int, **specialized: Any) -> Tensor
Aggregate the gradients.
- Parameters:
gradients – Sequence of 1-D tensors containing gradients from workers.
out – Optional pre-allocated tensor to write the result into.
n – Total number of workers.
f – Number of Byzantine workers to tolerate.
**specialized – Additional keyword arguments.
- Returns:
Aggregated gradient of shape (d,), where d is the length of each input gradient.
- Raises:
ValueError – If \(n < 1\), \(f < 0\), \(f > n\), \(n < 2f + 3\), or
len(gradients) != n.
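
The selection rule described above can be sketched in plain Python. This is an illustration only, not the code behind aggregators.krum.Krum: the function name krum_sketch is hypothetical, gradients are plain lists of floats rather than tensors, and the use of unsquared Euclidean distance is an assumption (the description says only "distances", while the referenced paper scores with squared distances). The argument checks mirror the conditions listed under Raises.

```python
import math
from typing import List, Sequence


def krum_sketch(gradients: Sequence[Sequence[float]], n: int, f: int) -> List[float]:
    """Return the gradient with the smallest Krum score (illustrative sketch)."""
    # Same conditions as the ValueError cases documented above.
    if n < 1 or f < 0 or f > n or n < 2 * f + 3 or len(gradients) != n:
        raise ValueError("invalid n, f, or number of gradients")

    k = n - f - 2  # number of closest neighbors that enter each score
    scores = []
    for i, g in enumerate(gradients):
        # Distances from gradient i to every other gradient, ascending.
        # Assumption: plain Euclidean distance (not squared).
        dists = sorted(
            math.dist(g, h) for j, h in enumerate(gradients) if j != i
        )
        scores.append(sum(dists[:k]))

    # Index of the lowest score; ties go to the earliest worker.
    best = min(range(n), key=scores.__getitem__)
    return list(gradients[best])


# Four mutually close gradients and one outlier; n = 5 satisfies n >= 2f + 3 for f = 1.
grads = [
    [1.0, 1.0],
    [1.1, 0.9],
    [0.9, 1.1],
    [1.2, 1.2],
    [100.0, -100.0],
]
selected = krum_sketch(grads, n=5, f=1)
```

Here each score sums the two smallest distances (\(n - f - 2 = 2\)). The outlier is far from every other gradient, so its score is by far the largest and it cannot be selected; the first gradient has the two closest neighbors and is the one returned.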