netket.logging.HDF5Log

class netket.logging.HDF5Log

Bases: AbstractCallback

HDF5 logger that can be passed to Monte Carlo drivers through the keyword argument out in order to serialize the output data of the simulation.

The logger supports scalar numbers, NumPy/JAX arrays, and netket.stats.Stats objects. Each logged quantity is stored as its own group within an HDF5 file, under the main group data/:

  • scalars are stored as a group with one dataset values of shape (n_steps,) containing the logged values,

  • arrays are stored in the same way, but with values having shape (n_steps, *array_shape),

  • netket.stats.Stats objects are stored as a group containing each field (Mean, Variance, etc.) as a separate dataset.

Importantly, each group has a dataset iters, which tracks the iteration number of the logged quantity.

Data can be deserialized by calling f = h5py.File(filename, 'r') and indexing the file like a dictionary, e.g. f['data/energy/Mean'].
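As a rough sketch of the layout described above, the helper below pulls the iters dataset and one data field out of a logged group. The name read_logged and the quantity name acceptance are hypothetical and not part of NetKet. To keep the example runnable without a file on disk, a nested dict with made-up numbers stands in for the open h5py.File, which is indexed the same way.

```python
def read_logged(root, name, field="values"):
    """Return (iters, data) for one logged quantity.

    `root` is anything indexable like an open h5py.File.
    `read_logged` is a hypothetical helper, not part of NetKet.
    """
    group = root["data"][name]
    # `[:]` reads a whole (non-scalar) h5py dataset; on a list it makes a copy.
    return group["iters"][:], group[field][:]


# Stand-in for an open file, following the documented layout (made-up numbers).
fake_file = {
    "data": {
        "energy": {  # a netket.stats.Stats entry: one dataset per field
            "iters": [0, 1, 2],
            "Mean": [-1.0, -1.5, -1.8],
            "Variance": [0.5, 0.2, 0.1],
        },
        "acceptance": {  # a scalar entry: a single `values` dataset
            "iters": [0, 1, 2],
            "values": [0.4, 0.5, 0.6],
        },
    }
}

iters, mean = read_logged(fake_file, "energy", field="Mean")
_, acceptance = read_logged(fake_file, "acceptance")
```

With a real output file, the same helper would be called inside a with h5py.File("output.h5", "r") as f: block, passing f in place of fake_file.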

This class is a full AbstractCallback and can be passed either as out=logger or inside the callbacks=[..., logger] list.

Tip

Use the metadata argument to attach a flat dict of hyper-parameters (learning rate, system size, model type, …) to the output file. They are stored as HDF5 attributes on the metadata/ group and travel with the file, making it easy to correlate results without relying on external bookkeeping.
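Because metadata is expected to be a flat dict, a nested configuration has to be flattened before it is passed in. The following is a minimal sketch of one way to do that; flatten_config and the "." key separator are illustrative assumptions, not something NetKet provides or requires.

```python
def flatten_config(config, prefix="", sep="."):
    """Flatten nested dicts into a single-level dict with joined keys.

    Hypothetical helper; the "." separator is an arbitrary choice.
    """
    flat = {}
    for key, value in config.items():
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the joined key as the prefix.
            flat.update(flatten_config(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


config = {"optimizer": {"learning_rate": 0.01}, "model": {"alpha": 1}, "L": 20}
metadata = flatten_config(config)
# metadata == {"optimizer.learning_rate": 0.01, "model.alpha": 1, "L": 20}
```

The resulting dict could then be passed as metadata=metadata when constructing the logger.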

Examples

Basic usage as an output logger.

>>> import netket as nk
>>> # gs is assumed to be an already constructed driver (not shown here)
>>> logger = nk.logging.HDF5Log("output")
>>> gs.run(n_iter=300, out=logger)
>>> # data lives in output.h5

Attaching metadata to record hyper-parameters.

>>> import netket as nk
>>> # gs is assumed to be an already constructed driver (not shown here)
>>> logger = nk.logging.HDF5Log(
...     "output",
...     metadata={"learning_rate": 0.01, "alpha": 1, "L": 20},
... )
>>> gs.run(n_iter=300, out=logger)
>>> # after the run, with f = h5py.File("output.h5", "r"):
>>> # f['metadata'].attrs['learning_rate'] == '0.01'

Using the logger as a callback.

>>> import netket as nk
>>> # gs is assumed to be an already constructed driver (not shown here)
>>> logger = nk.logging.HDF5Log("output")
>>> gs.run(n_iter=300, callbacks=[logger])

Note

The API of this logger is covered by our Semantic Versioning API guarantees. However, the structure of the logged files is not, and might change in the future. If you think that we could improve the output format of this logger, please open an issue on the NetKet repository and let us know.

Inheritance
Inheritance diagram of netket.logging.HDF5Log
__init__(path, mode='write', save_params=<object object>, save_params_every=<object object>, write_every=50, chunk_size=None, metadata=None)

Construct an HDF5 logger.

Parameters:
  • path (str) – the name of the output file, without the extension

  • mode (str) – specifies the behaviour when a file already exists at this path. Options are:

      • [w]rite: (default) overwrites the file if it already exists;

      • [a]ppend: appends to an existing file, otherwise creates one;

      • [x] or fail: fails if the file already exists.

  • save_params – deprecated, has no effect.

  • save_params_every – deprecated, has no effect.

  • write_every (int) – number of iterations between flushes of the HDF5 file to disk

  • chunk_size (int | None) – number of log entries per HDF5 chunk. If omitted, chunking is chosen adaptively to target a moderate chunk size in bytes.

  • metadata (dict | None) – optional flat dict of key/value pairs stored once at run start as HDF5 attributes on the metadata/ group.
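To give a feel for what choosing a chunk size "to target a moderate chunk size in bytes" can mean, here is a small arithmetic sketch. It is not NetKet's actual rule: the function name entries_per_chunk and the 64 KiB target are assumptions made purely for illustration.

```python
import math


def entries_per_chunk(entry_shape, itemsize, target_bytes=64 * 1024):
    """Number of log entries that fit in roughly `target_bytes`.

    Hypothetical heuristic for illustration; the 64 KiB target is an
    assumption, not a value taken from NetKet.
    """
    # math.prod(()) == 1, so a scalar entry occupies `itemsize` bytes.
    bytes_per_entry = itemsize * math.prod(entry_shape)
    # Always keep at least one entry per chunk, even for very large arrays.
    return max(1, target_bytes // bytes_per_entry)


scalar_chunk = entries_per_chunk((), itemsize=8)            # float64 scalar
array_chunk = entries_per_chunk((16, 16), itemsize=8)       # 16x16 float64 array
huge_chunk = entries_per_chunk((1024, 1024), itemsize=8)    # larger than the target
```

Under these assumptions, small entries are grouped many to a chunk while a large array gets a chunk to itself. Passing an explicit chunk_size bypasses any such heuristic.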

Attributes
callback_order
Methods
__call__(step, log_data, variational_state=None)

Call self as a function.

before_parameter_update(step, log_data, driver)

Called after all update logic has been computed and the step has been accepted, but before the driver applies the parameter update.

At this point:

  • The loss and its gradient have been computed by compute_loss_and_update().

  • The step has been accepted (not rejected by on_compute_update_end()).

  • driver.step_count still refers to the current step — it has not yet been incremented.

  • The variational state parameters have not yet changed.

This is the right place to estimate additional observables, add data to log_data, or take a snapshot of the state for logging. Callbacks with a lower callback_order run first, so observables callbacks (order 0) are guaranteed to populate log_data before logger callbacks (order 10) read it.
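The ordering rule can be illustrated with two toy callbacks. This is only a sketch of the idea: the classes AddObservable and RecordingLogger are hypothetical stand-ins (real callbacks would derive from AbstractCallback), and the explicit sorted loop merely imitates what a driver is described as doing.

```python
class AddObservable:
    """Toy observable callback (order 0): adds a value to log_data."""

    callback_order = 0

    def before_parameter_update(self, step, log_data, driver):
        log_data["extra"] = step * 2  # placeholder for a real estimate


class RecordingLogger:
    """Toy logger callback (order 10): records what it sees in log_data."""

    callback_order = 10

    def __init__(self):
        self.seen = []

    def before_parameter_update(self, step, log_data, driver):
        self.seen.append(dict(log_data))


logger = RecordingLogger()
callbacks = [logger, AddObservable()]  # deliberately listed in the "wrong" order
log_data = {"energy": -1.0}

# Lower callback_order runs first, so the observable is added before logging.
for cb in sorted(callbacks, key=lambda cb: cb.callback_order):
    cb.before_parameter_update(3, log_data, driver=None)
```

After the loop, the logger has recorded both the original energy entry and the extra entry added by the lower-order callback.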

flush()

Writes buffered data to disk.
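The interplay between write_every and flush() can be pictured with a toy buffered logger. This is a sketch of the general buffering pattern under simplifying assumptions, not NetKet's implementation: BufferedLog is hypothetical, and a plain list stands in for the file on disk.

```python
class BufferedLog:
    """Toy logger that buffers entries and writes them every `write_every` calls."""

    def __init__(self, write_every=50):
        self.write_every = write_every
        self.buffer = []    # entries held in memory
        self.written = []   # stands in for the file on disk
        self._calls = 0

    def __call__(self, step, log_data):
        self.buffer.append((step, dict(log_data)))
        self._calls += 1
        if self._calls % self.write_every == 0:
            self.flush()

    def flush(self):
        """Move everything that is still buffered to 'disk'."""
        self.written.extend(self.buffer)
        self.buffer.clear()


log = BufferedLog(write_every=3)
for step in range(7):
    log(step, {"energy": -1.0 * step})

pending = len(log.buffer)         # entries not yet written after 7 calls
written_before = len(log.written)
log.flush()                       # an explicit flush writes the remainder
```

In this toy model, seven calls with write_every=3 leave one entry in memory until flush() is called explicitly, which is why a final flush matters when the number of iterations is not a multiple of write_every.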

on_compute_update_end(step, log_data, driver)

Callback called at the end of the compute update phase, after computing the loss and its gradient.

This is called before the parameters are updated, so it can be used to implement custom logic for rejecting a step based on the computed loss or gradient.

Return type:

bool

Returns:

A boolean indicating whether to reject the step (i.e. repeat it with the same parameters). If it returns None, it is treated as False.
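As an illustration of the return-value convention, the sketch below rejects a step whenever the loss is not finite. It rests on assumptions: the class RejectNonFinite is hypothetical, a real callback would derive from AbstractCallback, and the "loss" key in log_data is invented for the example.

```python
import math


class RejectNonFinite:
    """Toy callback: ask for the step to be rejected if the loss is NaN or inf."""

    def on_compute_update_end(self, step, log_data, driver):
        loss = log_data.get("loss")  # "loss" is a hypothetical key
        if loss is None:
            # No opinion: returning None is treated like False (step accepted).
            return None
        return not math.isfinite(loss)


cb = RejectNonFinite()
reject_nan = cb.on_compute_update_end(0, {"loss": float("nan")}, driver=None)
reject_ok = cb.on_compute_update_end(1, {"loss": -1.25}, driver=None)
no_opinion = cb.on_compute_update_end(2, {}, driver=None)
```

Here the first call requests a rejection, the second accepts the step, and the third returns None, which the driver treats as False.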

on_compute_update_start(step, log_data, driver)
on_run_end(step, driver)
on_run_error(step, error, driver)
on_run_start(step, driver)
on_step_end(step, log_data, driver)
on_step_start(step, log_data, driver)
replace(**kwargs)

Replace the values of the fields of the object with the values of the keyword arguments. If the object is a dataclass, dataclasses.replace will be used. Otherwise, a new object will be created with the same type as the original object.

Return type:

TypeVar(P, bound=Pytree)

Parameters:
  • self (P)

  • kwargs (Any)
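For the dataclass case mentioned above, replace behaves like the standard-library dataclasses.replace: it returns a new object and leaves the original untouched. The sketch below shows that behaviour on a small stand-in dataclass; LogSettings is hypothetical and is not the logger itself.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LogSettings:
    """Hypothetical stand-in dataclass, used only to demonstrate replace."""

    write_every: int = 50
    chunk_size: Optional[int] = None


original = LogSettings()
# Returns a new instance with the given field changed; other fields are copied.
updated = replace(original, write_every=10)
```

The original instance keeps write_every=50, while the new one carries the changed field and the unchanged chunk_size.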