decent_bench.metrics.metric_library
Collection of pre-defined table and plot metrics.
- class decent_bench.metrics.metric_library.Regret(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Global regret.
- Table:
Global regret using the agents’/clients’ final x.
- Plot:
Global regret (y-axis) per iteration (x-axis).
Global regret is defined as:
\[\frac{1}{N} \sum_i (f_i(\mathbf{\bar{x}}) - f_i(\mathbf{x}^\star))\]
where \(f_i\) is agent i’s local cost function, \(\mathbf{\bar{x}}\) is the mean x across all \(N\) agents, and \(\mathbf{x}^\star\) is the optimal x defined in the problem.
Note
Available only when problem.x_optimal is provided.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
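The global regret formula above can be illustrated with a small standalone sketch. It does not use the decent_bench API: the helper names (mean_vector, global_regret, quadratic) and the two quadratic costs are assumptions made purely for illustration.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Cost = Callable[[Vector], float]


def mean_vector(xs: Sequence[Vector]) -> list[float]:
    """Component-wise mean of the agents' x vectors (x-bar in the formula)."""
    return [sum(column) / len(xs) for column in zip(*xs)]


def global_regret(costs: Sequence[Cost], xs: Sequence[Vector], x_star: Vector) -> float:
    """Average over agents of f_i(x_bar) - f_i(x_star)."""
    x_bar = mean_vector(xs)
    return sum(f(x_bar) - f(x_star) for f in costs) / len(costs)


def quadratic(center: Vector) -> Cost:
    """Hypothetical local cost f(x) = ||x - center||^2, for illustration only."""
    return lambda x: sum((xi - ci) ** 2 for xi, ci in zip(x, center))


costs = [quadratic([0.0, 0.0]), quadratic([2.0, 2.0])]
x_star = [1.0, 1.0]            # minimizer of the average of the two costs
xs = [[1.0, 1.0], [3.0, 1.0]]  # one x per agent, so x_bar = [2.0, 1.0]

regret = global_regret(costs, xs, x_star)
print(regret)  # prints 1.0
```

When every agent sits at the optimum, x_bar equals x_star and the regret is zero.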
- class decent_bench.metrics.metric_library.GradientNorm(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Global gradient norm.
- Table:
Gradient norm using the agents’/clients’ final x.
- Plot:
Gradient norm (y-axis) per iteration (x-axis).
Gradient norm is defined as:
\[\| \frac{1}{N} \sum_i \nabla f_i(\mathbf{\bar{x}}) \|\]
where \(N\) is the number of agents, \(f_i\) is agent i’s local cost function, and \(\mathbf{\bar{x}}\) is the mean x across all agents.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
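As a rough illustration of the gradient norm definition above, the following standalone sketch averages the local gradients at the mean x and takes the norm of the result. It assumes the Euclidean norm, which the definition does not state explicitly, and all helper names are hypothetical rather than part of decent_bench.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Gradient = Callable[[Vector], list[float]]


def mean_vector(xs: Sequence[Vector]) -> list[float]:
    """Component-wise mean of a collection of vectors."""
    return [sum(column) / len(xs) for column in zip(*xs)]


def gradient_norm(gradients: Sequence[Gradient], xs: Sequence[Vector]) -> float:
    """Norm of the average local gradient, evaluated at the mean x."""
    x_bar = mean_vector(xs)
    avg_grad = mean_vector([g(x_bar) for g in gradients])
    return math.sqrt(sum(c * c for c in avg_grad))  # Euclidean norm (assumed)


def quadratic_gradient(center: Vector) -> Gradient:
    """Gradient of the hypothetical cost f(x) = ||x - center||^2."""
    return lambda x: [2.0 * (xi - ci) for xi, ci in zip(x, center)]


gradients = [quadratic_gradient([0.0, 0.0]), quadratic_gradient([2.0, 2.0])]
xs = [[1.0, 1.0], [3.0, 1.0]]  # x_bar = [2.0, 1.0]

norm = gradient_norm(gradients, xs)
print(norm)  # prints 2.0
```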
- class decent_bench.metrics.metric_library.XError(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Distance to optimal solution.
- Table:
Distance to optimal solution using the mean of the agents’/clients’ final x.
- Plot:
Distance to optimal solution (y-axis) per iteration (x-axis).
X error is defined as:
\[\|\mathbf{\bar{x}} - \mathbf{x}^\star\|\]
where \(\mathbf{\bar{x}}\) is the mean x across all agents/clients, and \(\mathbf{x}^\star\) is the optimal x defined in the problem.
Note
Available only when problem.x_optimal is provided.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
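The X error definition above amounts to a single distance between the mean x and the optimum. A minimal standalone sketch, assuming the Euclidean norm and using hypothetical helper names that are not part of decent_bench:

```python
import math
from typing import Sequence

Vector = Sequence[float]


def mean_vector(xs: Sequence[Vector]) -> list[float]:
    """Component-wise mean of the agents' x vectors."""
    return [sum(column) / len(xs) for column in zip(*xs)]


def x_error(xs: Sequence[Vector], x_star: Vector) -> float:
    """Euclidean distance between the mean x and the optimal x."""
    return math.dist(mean_vector(xs), x_star)


xs = [[1.0, 1.0], [3.0, 1.0]]  # x_bar = [2.0, 1.0]
x_star = [1.0, 1.0]

error = x_error(xs, x_star)
print(error)  # prints 1.0
```

Note that the error can be zero even when individual agents disagree, as long as their mean coincides with the optimum; the consensus error captures that disagreement separately.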
- class decent_bench.metrics.metric_library.ConsensusError(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Distance to consensus.
- Table:
Distance of the agents’/clients’ states from their current average.
- Plot:
Distance to consensus (y-axis) per iteration (x-axis).
The consensus error per agent/client is defined as:
\[\{ \|\mathbf{x}_i - \bar{\mathbf{x}}\|, \|\mathbf{x}_j - \bar{\mathbf{x}}\|, ... \}\]
where \(\mathbf{x}_i\) is agent/client i’s current state, \(\bar{\mathbf{x}}\) is the average of all agents’/clients’ states, and \(\| \cdot \|\) is the 2-norm.
See also
RuntimeConsensusError for the runtime version.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
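The per-agent consensus error above yields one value per agent rather than a single scalar. The following standalone sketch shows that shape; the function names are hypothetical and nothing here calls into decent_bench.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def mean_vector(xs: Sequence[Vector]) -> list[float]:
    """Component-wise mean of the agents' states."""
    return [sum(column) / len(xs) for column in zip(*xs)]


def consensus_errors(xs: Sequence[Vector]) -> list[float]:
    """2-norm distance of each agent's state from the average state."""
    x_bar = mean_vector(xs)
    return [math.dist(x, x_bar) for x in xs]


xs = [[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]]  # average state is [2.0, 1.0]

errors = consensus_errors(xs)
print(errors)  # one distance per agent
```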
- class decent_bench.metrics.metric_library.XUpdates(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of x iterations/updates.
- Table:
Number of x iterations/updates per agent.
- Plot:
Number of x iterations/updates (y-axis) per iteration (x-axis). Will be a flat line as the number of x iterations/updates is only calculated at the end of the trial, not per iteration.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[int]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.FunctionCalls(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of function calls.
- Table:
Number of function calls per agent.
- Plot:
Number of function calls (y-axis) per iteration (x-axis). Will be a flat line as the number of function calls is only calculated at the end of the trial, not per iteration.
Note
Can be a floating point number if EmpiricalRiskCost is used with a batch size other than the full dataset size.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.GradientCalls(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of gradient calls.
- Table:
Number of gradient calls per agent.
- Plot:
Number of gradient calls (y-axis) per iteration (x-axis). Will be a flat line as the number of gradient calls is only calculated at the end of the trial, not per iteration.
Note
Can be a floating point number if EmpiricalRiskCost is used with a batch size other than the full dataset size.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.HessianCalls(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of Hessian calls.
- Table:
Number of Hessian calls per agent.
- Plot:
Number of Hessian calls (y-axis) per iteration (x-axis). Will be a flat line as the number of Hessian calls is only calculated at the end of the trial, not per iteration.
Note
Can be a floating point number if EmpiricalRiskCost is used with a batch size other than the full dataset size.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.ProximalCalls(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of proximal calls.
- Table:
Number of proximal calls per agent.
- Plot:
Number of proximal calls (y-axis) per iteration (x-axis). Will be a flat line as the number of proximal calls is only calculated at the end of the trial, not per iteration.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.SentMessages(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of sent messages.
- Table:
Number of sent messages per agent. For federated networks, this includes the server.
- Plot:
Number of sent messages (y-axis) per iteration (x-axis). Will be a flat line as the number of sent messages is calculated at the end of the trial, not per iteration.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.ReceivedMessages(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of received messages.
- Table:
Number of received messages per agent. For federated networks, this includes the server.
- Plot:
Number of received messages (y-axis) per iteration (x-axis). Will be a flat line as the number of received messages is calculated at the end of the trial, not per iteration.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.SentMessagesDropped(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Number of sent messages dropped.
- Table:
Number of sent messages dropped per agent. For federated networks, this includes the server.
- Plot:
Number of sent messages dropped (y-axis) per iteration (x-axis). Will be a flat line as the number of sent messages dropped is calculated at the end of the trial, not per iteration.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.Accuracy(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Accuracy of the agents’/clients’ predictions.
- Table:
Accuracy of the agents’/clients’ final x.
- Plot:
Accuracy (y-axis) per iteration (x-axis).
Accuracy is calculated as the mean accuracy across agents/clients, where each agent’s/client’s accuracy is calculated using its recorded x at that iteration.
Only available for EmpiricalRiskCost and integer targets.
Accuracy measures the proportion of correct predictions:
\[\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}}\]
where TP, TN, FP, and FN are true positives, true negatives, false positives, and false negatives, respectively.
Note
Available only when:
- problem.test_data is provided,
- all agent costs are EmpiricalRiskCost,
- target labels are integer-valued.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
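The accuracy formula above reduces to the fraction of predictions that match the true labels, averaged over agents. A small standalone sketch under that reading follows; the function names and the label data are made up for illustration and are not decent_bench code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions equal to the true label: (TP + TN) / total."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_accuracy(y_true: Sequence[int], preds_per_agent: Sequence[Sequence[int]]) -> float:
    """Mean of the per-agent accuracies."""
    scores = [accuracy(y_true, preds) for preds in preds_per_agent]
    return sum(scores) / len(scores)


y_true = [0, 1, 1, 0]
predictions_per_agent = [
    [0, 1, 0, 0],  # agent 1: 3 of 4 correct
    [0, 1, 1, 0],  # agent 2: 4 of 4 correct
]

score = mean_accuracy(y_true, predictions_per_agent)
print(score)  # prints 0.875
```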
- class decent_bench.metrics.metric_library.MSE(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Mean squared error of the agents’/clients’ predictions.
- Table:
Mean squared error of the agents’/clients’ final x.
- Plot:
Mean Squared Error (MSE) (y-axis) per iteration (x-axis).
MSE is calculated as the mean MSE across agents/clients, where each agent’s/client’s MSE is calculated using its recorded x at that iteration.
Only available for EmpiricalRiskCost.
MSE measures the average squared difference between predictions and true values:
\[\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2\]
where \(\hat{y}_i\) are the predicted values, \(y_i\) are the true values, and \(n\) is the number of samples.
Note
Available only when problem.test_data is provided and all agent costs are EmpiricalRiskCost.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
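To make the MSE definition above concrete, here is a standalone sketch that computes the MSE per agent and then averages across agents, as the description states. The helper names and the sample values are illustrative assumptions, not decent_bench code.

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between predictions and true values."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_mse(y_true: Sequence[float], preds_per_agent: Sequence[Sequence[float]]) -> float:
    """Mean of the per-agent MSE values."""
    scores = [mse(y_true, preds) for preds in preds_per_agent]
    return sum(scores) / len(scores)


y_true = [1.0, 2.0, 3.0]
predictions_per_agent = [
    [1.0, 2.0, 5.0],  # agent 1: squared errors 0, 0, 4
    [0.0, 2.0, 3.0],  # agent 2: squared errors 1, 0, 0
]

score = mean_mse(y_true, predictions_per_agent)
print(f"{score:.2e}")  # formatted with the same '.2e' spec as the default fmt
```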
- class decent_bench.metrics.metric_library.Precision(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Precision of the agents’/clients’ predictions.
- Table:
Precision of the agents’/clients’ final x.
- Plot:
Precision (y-axis) per iteration (x-axis).
Precision is calculated as the mean precision across agents/clients, where each agent’s/client’s precision is calculated using its recorded x at that iteration.
Only available for EmpiricalRiskCost and integer targets.
Precision measures the proportion of positive predictions that are correct:
\[\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}\]
where TP is the number of true positives and FP is the number of false positives.
Note
Available only when:
- problem.test_data is provided,
- all agent costs are EmpiricalRiskCost,
- target labels are integer-valued.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
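The precision formula above can be sketched for the binary case as follows. This is an illustration only: treating label 1 as the positive class and returning 0.0 when nothing is predicted positive are assumptions, and how the library handles multi-class targets is not stated here.

```python
import math
from typing import Sequence


def precision(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """TP / (TP + FP) for one positive class; 0.0 if nothing is predicted positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]  # TP = 2, FP = 1

value = precision(y_true, y_pred)
print(value)  # two of the three positive predictions are correct
```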
- class decent_bench.metrics.metric_library.Recall(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Recall of the agents’/clients’ predictions.
- Table:
Recall of the agents’/clients’ final x.
- Plot:
Recall (y-axis) per iteration (x-axis).
Recall is calculated as the mean recall across agents/clients, where each agent’s/client’s recall is calculated using its recorded x at that iteration.
Only available for EmpiricalRiskCost and integer targets.
Recall measures the proportion of actual positives that are correctly identified:
\[\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}\]
where TP is the number of true positives and FN is the number of false negatives.
Note
Available only when:
- problem.test_data is provided,
- all agent costs are EmpiricalRiskCost,
- target labels are integer-valued.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
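A binary-case sketch of the recall formula above is given below. As with any such illustration, the choice of label 1 as the positive class and the 0.0 fallback when there are no actual positives are assumptions; the library's treatment of multi-class targets is not described here.

```python
import math
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """TP / (TP + FN) for one positive class; 0.0 if there are no actual positives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


y_true = [1, 1, 1, 0]
y_pred = [1, 0, 0, 0]  # TP = 1, FN = 2

value = recall(y_true, y_pred)
print(value)  # one of the three actual positives is found
```

On this data every positive prediction is correct, so precision would be 1.0 while recall is only one third, which is why the two metrics are reported separately.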
- class decent_bench.metrics.metric_library.Loss(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Loss of the agents’/clients’ predictions.
- Table:
Loss of the agents’/clients’ final x.
- Plot:
Loss (y-axis) per iteration (x-axis).
Loss is calculated as the mean loss across agents/clients, where each agent’s/client’s loss is calculated using its recorded x at that iteration.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.ClientDriftFromServer(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Distance between client local models and the server model.
- Table:
Distance of the clients’ final states from the final server state.
- Plot:
Client drift from server (y-axis) per iteration (x-axis).
The client drift per client is defined as:
\[\{ \|\mathbf{x}_i - \mathbf{x}_s\|, \|\mathbf{x}_j - \mathbf{x}_s\|, ... \}\]
where \(\mathbf{x}_i\) is client i’s current state and \(\mathbf{x}_s\) is the current server state.
Note
Available only for FedNetwork.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
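The client drift definition above mirrors the consensus error, with the server state taking the place of the average. A minimal standalone sketch, assuming the Euclidean norm and using hypothetical names that are not part of decent_bench:

```python
import math
from typing import Sequence

Vector = Sequence[float]


def client_drifts(client_states: Sequence[Vector], server_state: Vector) -> list[float]:
    """Euclidean distance of each client's state from the server state."""
    return [math.dist(x, server_state) for x in client_states]


client_states = [[1.0, 2.0], [4.0, 6.0]]
server_state = [1.0, 2.0]

drifts = client_drifts(client_states, server_state)
print(drifts)  # prints [0.0, 5.0]
```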
- class decent_bench.metrics.metric_library.FractionSelectedClients(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Fraction of clients selected by the federated algorithm to perform local training.
- Table:
Fraction of selected clients over the algorithm run.
Note
Available only for FedNetwork.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, _: BenchmarkProblem, __: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.ServerMSE(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Mean squared error of the server model’s predictions.
- Table:
Mean squared error of the final server x.
- Plot:
Server MSE (y-axis) per iteration (x-axis).
Note
Available only for FedNetwork with problem.test_data and empirical-risk client costs.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values
- class decent_bench.metrics.metric_library.ServerAccuracy(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)
Bases: Metric
Accuracy of the server model’s predictions.
- Table:
Accuracy of the final server x.
- Plot:
Server accuracy (y-axis) per iteration (x-axis).
Note
Available only for FedNetwork with problem.test_data, empirical-risk client costs, and integer-valued targets.
- is_available(problem: BenchmarkProblem) -> tuple[bool, str | None]
Check whether this metric can be computed for the given problem.
Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.
- Parameters:
problem – the benchmark problem being evaluated
- Returns:
A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
- compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) -> list[float]
Evaluate the metric on the results of a trial.
- Parameters:
network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x
- Returns:
a sequence of metric values