Brute

Brute aggregation rule, most-clumped-subset baseline.

Reference:

El Mahdi El Mhamdi, Rachid Guerraoui, and Sébastien Rouault. “The Hidden Vulnerability of Distributed Learning in Byzantium.” In Proceedings of the 35th International Conference on Machine Learning (ICML 2018).

class aggregators.brute.Brute[source]

Bases: Aggregator

Brute aggregation rule, most-clumped subset selection.

For every subset \(R\) of size \(n - f\) of the \(n\) submitted gradients \(V_1, \dots, V_n\), define its clumping score as \(\max_{i, j \in R} \|V_i - V_j\|^2\), the largest squared distance between any two of its members. The aggregator picks the subset with the smallest clumping score and returns the mean of its gradients. This enumerates all \(\binom{n}{n-f}\) subsets, so it is only feasible when that count is small (the paper uses \(6\) honest + \(5\) Byzantine workers, giving \(\binom{11}{6} = 462\) subsets).
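The selection procedure described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: it uses lists of floats instead of tensors, and the names `sq_dist` and `brute_sketch` are hypothetical. Ties between subsets are broken in favour of the first one enumerated, which is also an assumption.

```python
from itertools import combinations
from math import comb


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def brute_sketch(gradients, f):
    """Mean of the (n - f)-subset with the smallest clumping score (illustrative only)."""
    n = len(gradients)
    k = n - f
    # Precompute pairwise squared distances once; every subset reuses them.
    dist = [[sq_dist(gradients[i], gradients[j]) for j in range(n)] for i in range(n)]

    best_subset, best_score = None, float("inf")
    for subset in combinations(range(n), k):
        # Clumping score: largest squared distance between any two members.
        score = max((dist[i][j] for i, j in combinations(subset, 2)), default=0.0)
        if score < best_score:  # strict comparison keeps the first subset on ties
            best_subset, best_score = subset, score

    # Coordinate-wise mean of the selected gradients.
    d = len(gradients[0])
    return [sum(gradients[i][c] for i in best_subset) / k for c in range(d)]


# Three nearby gradients and one far-away outlier; tolerate f = 1.
grads = [[0.0, 0.0], [0.0, 3.0], [3.0, 0.0], [100.0, 100.0]]
result = brute_sketch(grads, f=1)

# Number of subsets enumerated in the 6 honest + 5 Byzantine setting.
subset_count = comb(11, 6)
```

In this toy input the outlier is excluded because every subset containing it has a much larger clumping score than the subset of the three nearby gradients. Precomputing the distance matrix keeps the per-subset work to a maximum over already-known values, but the number of subsets still grows combinatorially with \(n\).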

classmethod aggregate(gradients: Sequence[Tensor] | Tensor, /, out: Tensor | None = None, *, n: int, f: int, **specialized: Any) Tensor[source]

Aggregate gradients by selecting the most-clumped \(n - f\) subset.

Parameters:
  • gradients – Sequence of 1-D tensors containing gradients from workers.

  • out – Optional pre-allocated tensor to write the result into.

  • n – Total number of workers.

  • f – Number of Byzantine workers to tolerate.

  • **specialized – Additional keyword arguments.

Returns:

Mean of the selected ``n - f`` subset, of shape ``(d,)``.

Raises:

ValueError – If \(n\), \(f\), or the number of gradients is invalid.
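The reference above does not spell out which values count as invalid. The sketch below shows one plausible set of checks, chosen only so that a non-empty subset of size \(n - f\) exists; the function name `check_brute_args` and the exact conditions are assumptions, and the library may enforce stricter bounds.

```python
def check_brute_args(num_gradients, n, f):
    """Raise ValueError unless an (n - f)-subset of the gradients can be formed.

    Hypothetical helper: the conditions below are assumptions, not the
    library's documented validation rules.
    """
    if not isinstance(n, int) or not isinstance(f, int):
        raise ValueError(f"n and f must be integers, got n={n!r}, f={f!r}")
    if n < 1:
        raise ValueError(f"n must be at least 1, got {n}")
    if not 0 <= f < n:
        # f == n would leave an empty subset with no mean to return.
        raise ValueError(f"f must satisfy 0 <= f < n, got f={f}, n={n}")
    if num_gradients != n:
        raise ValueError(f"expected {n} gradients, got {num_gradients}")


# The 6 honest + 5 Byzantine setting passes these checks.
check_brute_args(11, n=11, f=5)
```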

See also

For a vector-level medoid, see GeoMed. For stronger two-stage resilience, see Bulyan.