decent_bench.metrics#

class decent_bench.metrics.AgentMetricsView(id: UUID, cost: Cost, x_history: AgentHistory, n_x_updates: int, n_function_calls: float, n_gradient_calls: float, n_hessian_calls: float, n_proximal_calls: float, n_sent_messages: float, n_received_messages: float, n_sent_messages_dropped: float, n_times_selected: int)[source]#

Bases: object

Immutable view of an agent that exposes useful properties for calculating metrics.

id: UUID#

cost: Cost#

x_history: AgentHistory#

n_x_updates: int#

n_function_calls: float#

n_gradient_calls: float#

n_hessian_calls: float#

n_proximal_calls: float#

n_sent_messages: float#

n_received_messages: float#

n_sent_messages_dropped: float#

n_times_selected: int#

static from_agent(agent: Agent) → AgentMetricsView[source]#: Create from agent.

class decent_bench.metrics.ComputationalCost(function: float = 1.0, gradient: float = 1.0, hessian: float = 1.0, proximal: float = 1.0, communication: float = 1.0)[source]#

Bases: object

Computational costs associated with an algorithm for plot metrics.

function: float = 1.0#

gradient: float = 1.0#

hessian: float = 1.0#

proximal: float = 1.0#

communication: float = 1.0#
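As a rough illustration of how such weights could be combined with call counters like those on AgentMetricsView, the sketch below computes a weighted total. `CostWeights` and `weighted_cost` are hypothetical stand-ins written for this example; this section does not state how decent_bench itself aggregates the costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostWeights:
    """Stand-in with the same fields and defaults as ComputationalCost."""

    function: float = 1.0
    gradient: float = 1.0
    hessian: float = 1.0
    proximal: float = 1.0
    communication: float = 1.0


def weighted_cost(w: CostWeights, n_function: float, n_gradient: float,
                  n_hessian: float, n_proximal: float, n_messages: float) -> float:
    """Hypothetical aggregation: each counter scaled by its per-call weight."""
    return (w.function * n_function
            + w.gradient * n_gradient
            + w.hessian * n_hessian
            + w.proximal * n_proximal
            + w.communication * n_messages)


# Gradients cost twice a function call, Hessians ten times, messages half.
weights = CostWeights(gradient=2.0, hessian=10.0, communication=0.5)
total = weighted_cost(weights, n_function=3, n_gradient=4,
                      n_hessian=1, n_proximal=0, n_messages=6)
print(total)
```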

class decent_bench.metrics.Metric(fmt: str = '.2e', x_log: bool = False, y_log: bool = True)[source]#

Bases: ABC

Abstract base class for metrics.

To create a new metric, subclass this class and implement the abstract property description and the abstract method compute().

Parameters:

fmt – format string used to format the values in the table; defaults to ".2e". Common formats include:
- ".2e": scientific notation with 2 decimal places
- ".3f": fixed-point notation with 3 decimal places
- ".4g": general format with 4 significant digits
- ".1%": percentage format with 1 decimal place
In each case the integer specifies the precision. See the str.format() documentation for details on the format string options.
x_log – whether to apply log scaling to the x-axis in plots.
y_log – whether to apply log scaling to the y-axis in plots.
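The format strings listed for fmt are standard Python format specifications, so their effect can be checked with the built-in format() function. The sample values below are arbitrary and chosen only for illustration.

```python
# Apply each documented format spec with the built-in format().
scientific = format(0.000123456, ".2e")  # scientific notation, 2 decimal places
fixed = format(3.14159, ".3f")           # fixed-point, 3 decimal places
general = format(1234.5678, ".4g")       # general format, 4 significant digits
percent = format(0.256, ".1%")           # percentage, 1 decimal place

print(scientific, fixed, general, percent)
# prints: 1.23e-04 3.142 1235 25.6%
```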

abstract property description: str#: Metric description used as the table row label and y-axis label in plots.

is_available(problem: BenchmarkProblem) → tuple[bool, str | None][source]#

Check whether this metric can be computed for the given problem.

Override in subclasses that have availability preconditions (e.g. requiring problem.x_optimal or problem.test_data). The default implementation always returns available.

Parameters: problem – the benchmark problem being evaluated
Returns: A tuple (available, reason). When available is True, reason is None. When available is False, reason contains a human-readable explanation.
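The (available, reason) contract can be sketched as follows. `StubProblem` and `NeedsOptimum` are hypothetical stand-ins, not decent_bench classes; only the return convention described above is taken from the documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubProblem:
    """Hypothetical stand-in for BenchmarkProblem; only x_optimal is modeled."""

    x_optimal: Optional[list] = None


class NeedsOptimum:
    """Hypothetical metric-like class showing only the is_available() contract."""

    def is_available(self, problem: StubProblem) -> tuple:
        if problem.x_optimal is None:
            # Unavailable: return False together with a human-readable reason.
            return False, "problem.x_optimal is not set"
        # Available: the reason must be None.
        return True, None


metric = NeedsOptimum()
unavailable = metric.is_available(StubProblem())
available = metric.is_available(StubProblem(x_optimal=[0.0, 0.0]))
print(unavailable, available)
```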

abstractmethod compute(network: NetworkMetricsView, problem: BenchmarkProblem, iteration: int) → Sequence[float][source]#

Evaluate the metric on the results of a trial.

Parameters:

network – the snapshotted network view being evaluated.
problem – the benchmark problem being evaluated
iteration – the iteration at which to compute the metric, or -1 to use the agents’ final x

Returns:

a sequence of metric values
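A subclass might look roughly like the sketch below. To keep it runnable without the library, `MetricSketch`, `StubAgent`, and `StubNetwork` are stand-ins that mirror only the documented signatures. The assumption that x_history behaves like a mapping from iteration to a scalar is made for this example and is not taken from the AgentHistory API.

```python
from abc import ABC, abstractmethod
from collections.abc import Sequence


class MetricSketch(ABC):
    """Stand-in mirroring the documented Metric interface (not the real class)."""

    def __init__(self, fmt: str = ".2e", x_log: bool = False, y_log: bool = True) -> None:
        self.fmt, self.x_log, self.y_log = fmt, x_log, y_log

    @property
    @abstractmethod
    def description(self) -> str: ...

    @abstractmethod
    def compute(self, network, problem, iteration: int) -> Sequence: ...


class StubAgent:
    """Hypothetical agent view: x_history maps iteration -> scalar x."""

    def __init__(self, x_history: dict) -> None:
        self.x_history = x_history


class StubNetwork:
    """Hypothetical network view exposing only agents()."""

    def __init__(self, agents) -> None:
        self._agents = list(agents)

    def agents(self) -> list:
        return self._agents


class DistanceToTarget(MetricSketch):
    """Per-agent |x - target| at an iteration; -1 selects each agent's final x."""

    def __init__(self, target: float) -> None:
        super().__init__(fmt=".3f", y_log=False)
        self.target = target

    @property
    def description(self) -> str:
        return "distance to target"

    def compute(self, network, problem, iteration: int) -> list:
        values = []
        for agent in network.agents():
            key = max(agent.x_history) if iteration == -1 else iteration
            values.append(abs(agent.x_history[key] - self.target))
        return values


network = StubNetwork([StubAgent({0: 4.0, 1: 2.5}), StubAgent({0: 0.0, 1: 1.5})])
metric = DistanceToTarget(target=2.0)
final_values = metric.compute(network, problem=None, iteration=-1)
first_values = metric.compute(network, problem=None, iteration=0)
print(metric.description, final_values, first_values)
```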

class decent_bench.metrics.NetworkMetricsView(graph: networkx.Graph[AgentMetricsView], network_type: NetworkType, _server: AgentMetricsView | None = None)[source]#

Bases: object

Immutable view of a network that exposes useful properties for calculating metrics.

The underlying data structure is a frozen nx.Graph whose nodes are AgentMetricsView objects. Create the object with from_network(), passing a FedNetwork or P2PNetwork.

Available methods are:
- agents() and connected_agents(agent)
- Fed-only: clients(), server(), and coordinator()
- P2P-only: neighbors(agent)

graph: networkx.Graph[AgentMetricsView]#

network_type: NetworkType#

static from_network(network: decent_bench.networks.Network) → NetworkMetricsView[source]#: Create a network metrics view from a network.

agents() → list[AgentMetricsView][source]#: Return agents exposed by network semantics (clients for federated, all nodes for P2P).

clients() → list[AgentMetricsView][source]#: Return clients in a federated network (alias of agents()).

server() → AgentMetricsView[source]#: Return the server node in a federated network.

coordinator() → AgentMetricsView[source]#: Alias for server().

connected_agents(agent: AgentMetricsView) → list[AgentMetricsView][source]#: Return agents in the network connected to an agent.

neighbors(agent: AgentMetricsView) → list[AgentMetricsView][source]#: Return neighbors in a peer-to-peer network.

property iterations: list[int]#: List of iterations reached by any agent (plus server) in the network.
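The difference between federated and peer-to-peer semantics for agents() can be illustrated with a small stand-in. `ViewSketch` and `NetType` are hypothetical, a plain adjacency dict replaces the frozen networkx graph, and raising ValueError for server() on a P2P network is an assumption of this sketch rather than documented behavior.

```python
from enum import Enum
from typing import Optional


class NetType(Enum):
    """Stand-in with the same values as NetworkType."""

    P2P = "p2p"
    FEDERATED = "federated"


class ViewSketch:
    """Hypothetical view over string node ids stored in an adjacency dict."""

    def __init__(self, adjacency: dict, network_type: NetType,
                 server: Optional[str] = None) -> None:
        self.adjacency = adjacency
        self.network_type = network_type
        self._server = server

    def agents(self) -> list:
        nodes = sorted(self.adjacency)
        if self.network_type is NetType.FEDERATED:
            # Federated: only clients are exposed; the server is excluded.
            return [n for n in nodes if n != self._server]
        # P2P: every node is an agent.
        return nodes

    def server(self) -> str:
        if self.network_type is not NetType.FEDERATED or self._server is None:
            raise ValueError("server() is only defined for federated networks")
        return self._server


fed = ViewSketch(
    {"server": {"c1", "c2"}, "c1": {"server"}, "c2": {"server"}},
    NetType.FEDERATED,
    server="server",
)
p2p = ViewSketch({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}, NetType.P2P)
print(fed.agents(), fed.server(), p2p.agents())
```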

class decent_bench.metrics.NetworkType(*values)[source]#

Bases: Enum

Supported network types for metric views.

P2P = 'p2p'#

FEDERATED = 'federated'#

class decent_bench.metrics.RuntimeMetric(update_interval: int, save_path: str | Path | None = None)[source]#

Bases: ABC

Abstract base class for runtime metrics.

Runtime metrics are computed during algorithm execution to provide live feedback for early stopping or monitoring. Unlike post-hoc metrics, they don’t store historical data and are designed to be lightweight.

To create a new runtime metric, subclass this class and implement the description, x_log, and y_log properties and the compute() method.

Parameters:

update_interval – Number of iterations between metric updates. Do not update more frequently than necessary, as this can slow down the algorithm.
save_path – Path to save the plot to when the metric is updated. If None, the plot is not saved.

Note

The compute() method should be efficient as it’s called during algorithm execution. Avoid expensive computations or operations that might significantly slow down the algorithm.

abstract property description: str#: Description of the metric, used as the y-axis label.

abstract property x_log: bool#: Whether the x-axis should be logarithmic.

abstract property y_log: bool#: Whether the y-axis should be logarithmic.

property update_interval: int#

Number of iterations between metric updates.

Returns: Number of iterations between updates.

abstractmethod compute(problem: BenchmarkProblem, agents: Sequence[Agent], iteration: int) → float[source]#

Compute the metric value for the current iteration.

Parameters:

problem – benchmark problem being solved
agents – sequence of agents with their current state
iteration – current iteration number

Returns:

The computed metric value as a float.
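The sketch below shows the shape a runtime metric subclass might take. `RuntimeMetricSketch` is a stand-in that mirrors only the documented abstract members and omits all plotting. `StubAgent` with a scalar attribute x is a hypothetical simplification of the real Agent.

```python
from abc import ABC, abstractmethod
from collections.abc import Sequence


class RuntimeMetricSketch(ABC):
    """Stand-in mirroring the documented RuntimeMetric interface (no plotting)."""

    def __init__(self, update_interval: int) -> None:
        self._update_interval = update_interval

    @property
    def update_interval(self) -> int:
        return self._update_interval

    @property
    @abstractmethod
    def description(self) -> str: ...

    @property
    @abstractmethod
    def x_log(self) -> bool: ...

    @property
    @abstractmethod
    def y_log(self) -> bool: ...

    @abstractmethod
    def compute(self, problem, agents: Sequence, iteration: int) -> float: ...


class StubAgent:
    """Hypothetical agent holding a single scalar state x."""

    def __init__(self, x: float) -> None:
        self.x = x


class Disagreement(RuntimeMetricSketch):
    """Spread between the largest and smallest current x; cheap to evaluate."""

    @property
    def description(self) -> str:
        return "max x - min x"

    @property
    def x_log(self) -> bool:
        return False

    @property
    def y_log(self) -> bool:
        return True

    def compute(self, problem, agents: Sequence, iteration: int) -> float:
        xs = [agent.x for agent in agents]
        return max(xs) - min(xs)


metric = Disagreement(update_interval=10)
value = metric.compute(problem=None, agents=[StubAgent(1.0), StubAgent(3.5), StubAgent(2.0)], iteration=0)
print(metric.description, value)
```

Keeping compute() to a single pass over the agents follows the note about avoiding expensive work during algorithm execution.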

initialize_plot(algorithm_name: str, trial: int, queue: queue.Queue[Any]) → None[source]#

Initialize the plot for this metric.

Sends an initialization message to the plotter process to create the figure.

Parameters:

algorithm_name – name of the algorithm being run
trial – trial number (0-indexed)
queue – multiprocessing queue for sending data to the plotter

update_plot(problem: BenchmarkProblem, agents: Sequence[Agent], iteration: int) → None[source]#

Update the plot with a new data point.

Computes the metric value and sends it to the centralized plotter via queue.

Parameters:

problem – benchmark problem being solved
agents – sequence of agents with their current state
iteration – current iteration number

should_update(iteration: int) → bool[source]#

Check if the metric should be updated at this iteration.

Parameters: iteration – current iteration number
Returns: True if the metric should be updated, False otherwise.
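One plausible rule for this check is a modulo test against update_interval. This is an assumption made for illustration: the documentation states only that update_interval is the number of iterations between updates, not how should_update() is implemented.

```python
def should_update(iteration: int, update_interval: int) -> bool:
    """Assumed rule (not confirmed by the docs): update every update_interval-th iteration."""
    return iteration % update_interval == 0


# With an interval of 3, iterations 0, 3, 6, and 9 trigger an update.
updated = [i for i in range(10) if should_update(i, update_interval=3)]
print(updated)
```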

class decent_bench.metrics.RuntimeMetricPlotter(queue: queue.Queue[Any], context: SpawnContext | DefaultContext)[source]#

Bases: object

Centralized plotter for runtime metrics that runs in its own process.

This class handles all matplotlib plotting operations in a separate process, receiving data from other processes via a queue. This avoids X server conflicts when using multiprocessing.

Note

This class is not intended to be instantiated by users. It is automatically created and managed by the benchmark infrastructure.

start() → None[source]#: Start the plotter in a separate process using the provided context.

run() → None[source]#: Process loop that continuously handles updates from the queue.

create_figure(metric_id: str, description: str, x_log: bool, y_log: bool, save_path: Path | None) → None[source]#

Create a figure for a metric.

Parameters:

metric_id – Unique identifier for the metric
description – Human-readable description for the y-axis label
x_log – Whether the x-axis should be logarithmic
y_log – Whether the y-axis should be logarithmic
save_path – Path to save the plot when updated (if None, no saving is performed)

update(metric_id: str, algorithm_name: str, trial: int, iteration: int, value: float) → None[source]#

Update a plot with new data.

Parameters:

metric_id – Unique identifier for the metric
algorithm_name – Name of the algorithm
trial – Trial number
iteration – Current iteration
value – Metric value

shutdown() → None[source]#: Signal the plotter process to stop and wait for it to finish.

Note

This method can be called multiple times safely. If the process is already stopped, it will do nothing. If the process is still running, it will send a stop signal and wait for it to finish. If the process does not stop within a reasonable time, it will be forcefully terminated.
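The queue-driven design described for this class can be sketched as a consumer loop that dispatches messages until it sees a stop signal. The message tuples, the `STOP` sentinel, and the `run` function below are hypothetical, and the loop runs synchronously on a queue.Queue instead of in a separate process with matplotlib, so it shows only the dispatch pattern, not the library's actual protocol.

```python
import queue

STOP = None  # hypothetical sentinel meaning "shut down"


def run(messages: queue.Queue) -> dict:
    """Consume messages until STOP; return the collected series per metric id."""
    series = {}
    while True:
        msg = messages.get()
        if msg is STOP:
            break
        kind, metric_id, *payload = msg
        if kind == "create":
            # Analogous to create_figure(): register an empty series.
            series[metric_id] = []
        elif kind == "update":
            # Analogous to update(): append one (iteration, value) point.
            iteration, value = payload
            series[metric_id].append((iteration, value))
    return series


q = queue.Queue()
q.put(("create", "loss"))
q.put(("update", "loss", 0, 1.0))
q.put(("update", "loss", 10, 0.25))
q.put(STOP)
series = run(q)
print(series)
```

Because producers only put messages on the queue, all plotting state stays in the single consumer, which is the property the class description relies on to avoid conflicts between processes.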