Jobs Utilities

Experiment job management helpers.

This module provides utilities for running and managing experiment jobs in a reproducible manner.

Classes and Functions

Job orchestration

  • Command: encapsulates a command with seed, device, and result-directory arguments.

  • Jobs: manages parallel execution of experiments on multiple devices.

Helpers

  • dict_to_cmdlist: converts dictionaries into command-line argument lists.

  • move_directory: moves an existing directory aside with versioning.

Example:

from tools import Command, Jobs, dict_to_cmdlist

cmd = Command(["python", "train.py", "--lr", "0.01"])
jobs = Jobs("./results", devices=["cuda:0", "cuda:1"])
jobs.submit("exp1", cmd)
jobs.wait()
jobs.close()

class tools.jobs.Command(command: Iterable[str])

Bases: object

Command wrapper that adds standard runtime arguments.

This class wraps a base command and automatically appends seed, device, and result-directory arguments when building the final command line.

Parameters:
  • command (iterable of str) – Base command as an iterable of strings (e.g. ["python", "train.py"]). The iterable is copied on instantiation.

Attributes:

  • _basecmd (list of str) – Internal copy of the base command.

build(seed: int | str, device: str, resdir: Path | str) → list[str]

Build the final command line.

Parameters:
  • seed (int or str) – Seed to use for the experiment.

  • device (str) – Device on which to run the experiment (e.g. "cuda:0").

  • resdir (pathlib.Path or str) – Target directory path for results.

Returns:

list of str

Final command list ready to be passed to subprocess.run.
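The reference does not state which flags build appends, so the following is only a minimal sketch of the documented behavior. SketchCommand and the flag names --seed, --device, and --result-directory are assumptions made for illustration, not the library's actual names.

```python
from pathlib import Path
from typing import Iterable, Union


class SketchCommand:
    """Illustrative stand-in for Command; the flag names are assumptions."""

    def __init__(self, command: Iterable[str]) -> None:
        # Copy the iterable so later changes by the caller do not leak in.
        self._basecmd = list(command)

    def build(
        self, seed: Union[int, str], device: str, resdir: Union[Path, str]
    ) -> list[str]:
        # Start from a fresh copy and append the three runtime arguments.
        cmd = list(self._basecmd)
        cmd += ["--seed", str(seed)]                 # hypothetical flag name
        cmd += ["--device", device]                  # hypothetical flag name
        cmd += ["--result-directory", str(resdir)]   # hypothetical flag name
        return cmd


base = ["python", "train.py", "--lr", "0.01"]
cmd = SketchCommand(base)
base.append("--oops")  # has no effect on cmd, because the base was copied
argv = cmd.build(seed=1, device="cuda:0", resdir=Path("results") / "exp1")
print(argv)
```

Returning a list rather than a single string lets the result go to subprocess.run without shell quoting.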

class tools.jobs.Jobs(res_dir: Path | str, devices: list[str] | None = None, devmult: int = 1, seeds: Sequence[int] | None = None)

Bases: object

Job execution manager for parallel experiments.

Manages parallel execution of experiments across multiple devices, with support for result tracking and error handling.

Parameters:
  • res_dir (pathlib.Path or str) – Directory to store results.

  • devices (list of str, optional) – List of device names (e.g. ["cuda:0", "cuda:1"]). Defaults to ["cpu"] if none specified.

  • devmult (int, optional) – Number of parallel jobs per device. Default is 1.

  • seeds (sequence of int, optional) – Seeds to use for repeating experiments. Default is range(1, 6).

Attributes:

  • _res_dir (pathlib.Path) – Resolved result directory.

  • _jobs (list of tuple or None) – Pending job queue as (name, seed, command) tuples, or None when the manager has been closed.

  • _workers (list of threading.Thread) – Worker thread pool, one entry per active slot.

  • _devices (list of str) – Devices used for execution.

  • _seeds (tuple of int) – Seeds used for repeating experiments.

  • _lock (threading.Lock) – Main lock protecting shared state.

  • _cvready (threading.Condition) – Condition variable to signal that new jobs are available or that workers must shut down.

  • _cvdone (threading.Condition) – Condition variable to signal that all submitted jobs have been processed.
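The attributes above suggest one lock shared by two condition variables, with one worker thread per device slot. The sketch below shows one way such a pool could fit together. SketchPool, its results list, and the use of plain callables as jobs are assumptions for illustration; the real Jobs class may differ in every detail.

```python
import threading


class SketchPool:
    """Illustrative worker pool: one thread per (device, slot) pair."""

    def __init__(self, devices, devmult=1):
        self._lock = threading.Lock()
        self._cvready = threading.Condition(self._lock)  # jobs available / shutdown
        self._cvdone = threading.Condition(self._lock)   # everything processed
        self._jobs = []     # pending callables; None once the pool is closed
        self._running = 0   # jobs currently executing
        self.results = []   # hypothetical result store
        self._workers = [
            threading.Thread(target=self._work, args=(dev,), daemon=True)
            for dev in devices
            for _ in range(devmult)
        ]
        for worker in self._workers:
            worker.start()

    def _work(self, device):
        while True:
            with self._lock:
                # Sleep until a job is queued or the pool is closed.
                self._cvready.wait_for(
                    lambda: self._jobs is None or len(self._jobs) > 0
                )
                if self._jobs is None:
                    return
                job = self._jobs.pop(0)
                self._running += 1
            try:
                result = job(device)  # run outside the lock
            except Exception as exc:  # keep the worker alive on failure
                result = exc
            with self._lock:
                self.results.append(result)
                self._running -= 1
                if not self._jobs and self._running == 0:
                    self._cvdone.notify_all()

    def submit(self, job):
        with self._lock:
            if self._jobs is None:
                raise RuntimeError("pool is closed")
            self._jobs.append(job)
            self._cvready.notify()

    def wait(self):
        with self._lock:
            self._cvdone.wait_for(lambda: not self._jobs and self._running == 0)

    def close(self):
        with self._lock:
            self._jobs = None  # discard anything not yet started
            self._cvready.notify_all()
        for worker in self._workers:
            worker.join()


pool = SketchPool(["cpu:0", "cpu:1"], devmult=2)
for i in range(8):
    pool.submit(lambda device, i=i: (i, device))
pool.wait()
pool.close()
print(sorted(index for index, _ in pool.results))
```

Sharing a single lock between both conditions keeps the queue, the running counter, and the closed flag consistent with each other, at the cost of serializing all bookkeeping.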

close() → None

Close and wait for the worker pool, discarding not-yet-started submissions.

get_seeds() → tuple[int, ...]

Get the seeds used for repeating the experiments.

Returns:

tuple of int

Seeds used by this manager.

submit(name: str, command: Command) → None

Submit a job for execution.

The job is repeated for every seed configured in the manager.

Parameters:
  • name (str) – Job identifier.

  • command (Command) – Command builder to execute.

Raises:

  • RuntimeError – If the manager has already been closed.
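As a rough illustration of the per-seed repetition and the closed-manager check, the sketch below expands one submission into (name, seed, command) tuples, the queue shape given in the Jobs attributes. SketchQueue is a hypothetical name, and the sketch omits locking and worker notification.

```python
class SketchQueue:
    """Illustrative queue: one entry per configured seed."""

    def __init__(self, seeds=range(1, 6)):
        self._seeds = tuple(seeds)
        self._jobs = []  # becomes None once closed

    def submit(self, name, command):
        if self._jobs is None:
            raise RuntimeError("manager has already been closed")
        # Repeat the same command once for every seed.
        self._jobs.extend((name, seed, command) for seed in self._seeds)

    def close(self):
        self._jobs = None


queue = SketchQueue()
queue.submit("exp1", ["python", "train.py"])
pending = list(queue._jobs)
print([(name, seed) for name, seed, _ in pending])

queue.close()
try:
    queue.submit("exp2", ["python", "train.py"])
    closed_error = None
except RuntimeError as exc:
    closed_error = exc
```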

wait(predicate: Callable[[], bool] | None = None) → None

Wait for all the submitted jobs to be processed.

Parameters:

predicate (callable returning bool, optional) – Optional custom predicate. If provided, waiting stops when the predicate returns True.
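The sketch below illustrates predicate-based waiting with threading.Condition.wait_for. It assumes that a custom predicate replaces the default "all jobs processed" test and that it is evaluated while the lock is held; the reference confirms neither point. The progress dictionary and the helper names are hypothetical.

```python
import threading
import time

lock = threading.Lock()
cvdone = threading.Condition(lock)
progress = {"finished": 0, "total": 3}


def all_done() -> bool:
    return progress["finished"] == progress["total"]


def wait(predicate=None) -> None:
    # Assumption: a custom predicate replaces the default test.
    with lock:
        cvdone.wait_for(predicate or all_done)


def finish_one() -> None:
    with lock:
        progress["finished"] += 1
        cvdone.notify_all()  # wake waiters so they re-check their predicate


def fake_jobs() -> None:
    for _ in range(progress["total"]):
        time.sleep(0.01)  # stand-in for real work
        finish_one()


worker = threading.Thread(target=fake_jobs)
worker.start()

wait(lambda: progress["finished"] >= 1)  # returns once any job has finished
seen_early = progress["finished"]

wait()  # default: returns once every job has finished
seen_final = progress["finished"]
worker.join()
```

A custom predicate of this kind allows early stopping, for example once the first result is available, without changing how workers signal completion.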

tools.jobs.dict_to_cmdlist(dp: dict[str, Any]) → list[str]

Convert a dictionary into command-line arguments.

This helper is useful for turning experiment configurations into CLI arguments.

Parameters:
  • dp (dict of str to Any) – Dictionary mapping parameter names to values.

Returns:

list of str

Command-line arguments such as ["--lr", "0.01", "--batch", "32"].

Notes:
  • Boolean values are included only when they are True.

  • Lists and tuples expand to repeated --name value pairs.

Example:

>>> dict_to_cmdlist({"lr": ...})
['--lr', '0.01', '--batch', '32', '--debug']
>>> dict_to_cmdlist({"layers": ...})
['--layers', '64', '--layers', '128']
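The conversion rules in the notes can be sketched as follows. dict_to_cmdlist_sketch is a hypothetical reimplementation written from the documented behavior only, and the input dictionaries are assumptions chosen to reproduce the outputs shown in the example.

```python
from typing import Any


def dict_to_cmdlist_sketch(dp: dict[str, Any]) -> list[str]:
    """Illustrative conversion of a config dict into CLI arguments."""
    args: list[str] = []
    for name, value in dp.items():
        flag = f"--{name}"
        if isinstance(value, bool):
            # Booleans become bare flags and appear only when True.
            if value:
                args.append(flag)
        elif isinstance(value, (list, tuple)):
            # Sequences expand to repeated "--name value" pairs.
            for item in value:
                args += [flag, str(item)]
        else:
            args += [flag, str(value)]
    return args


scalars = dict_to_cmdlist_sketch(
    {"lr": 0.01, "batch": 32, "debug": True, "verbose": False}
)
repeated = dict_to_cmdlist_sketch({"layers": [64, 128]})
print(scalars)   # ['--lr', '0.01', '--batch', '32', '--debug']
print(repeated)  # ['--layers', '64', '--layers', '128']
```

The bool check has to come before the generic branch; otherwise True would be emitted as the string "True".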

See also

For general utilities (parsing, timing, registries), see Miscellaneous Utilities. For PyTorch tensor helpers, see PyTorch Utilities.