netket.callbacks.AutoSlurmRequeue
- class netket.callbacks.AutoSlurmRequeue
Bases: AbstractCallback
A callback that automatically requeues a Slurm job if it is about to run out of time.
This callback should be used together with a form of checkpointing to ensure that the job can be requeued without losing progress.

- __init__(before=datetime.timedelta(seconds=300), max_requeue_count=3)
Initialize the auto-requeue callback.
- Parameters:
before (timedelta) – How long before the job ends to check for requeueing (default: 5 minutes). This should be a timedelta object or a number of seconds, and it should be at least as long as the time it takes one iteration to run.
max_requeue_count (int) – Maximum number of times the job should be requeued.
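As a rough illustration of how the two parameters interact, the sketch below decides whether a requeue is due, given the time left in the job. It is plain Python and does not touch NetKet or Slurm: `should_requeue`, `time_left` and `requeue_count` are hypothetical names for this example, and the exact comparisons are assumptions, so the callback's real implementation may differ.

```python
from datetime import timedelta


def should_requeue(time_left, before=timedelta(seconds=300),
                   requeue_count=0, max_requeue_count=3):
    """Hypothetical helper: return True if a requeue should be requested now."""
    # Accept a plain number of seconds as well as a timedelta,
    # mirroring the two forms documented for `before`.
    if not isinstance(before, timedelta):
        before = timedelta(seconds=before)
    # Assumption: stop requeueing once the limit has been reached.
    if requeue_count >= max_requeue_count:
        return False
    # Assumption: requeue once the remaining time drops to the margin.
    return time_left <= before
```

If a single iteration takes longer than `before`, the check may first run after the job has already been killed, which is why the margin should be at least one iteration long.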
- Attributes
callback_order
An integer representing the order in which this callback should be called.
Lower numbers are called first, and higher numbers are called later.
This can be redefined in subclasses to change the order in which callbacks are called. (Default: 0 for all callbacks, 10 for loggers.)
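The ordering rule can be pictured with a small stand-alone sketch. The two classes and `sort_callbacks` below are hypothetical stand-ins, not NetKet classes; only the values 0 and 10 come from the text above. How the driver breaks ties between equal orders is not stated here, so preserving the original relative position is an assumption.

```python
class ObservableCallback:
    callback_order = 0   # documented default for callbacks


class LoggerCallback:
    callback_order = 10  # documented default for loggers


def sort_callbacks(callbacks):
    # Lower callback_order runs first. sorted() is stable, so callbacks
    # with the same order keep their relative position (an assumption
    # about tie-breaking, not documented behaviour).
    return sorted(callbacks, key=lambda cb: cb.callback_order)


logger, obs = LoggerCallback(), ObservableCallback()
ordered = sort_callbacks([logger, obs])  # the observable callback comes first
```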
- Methods
- before_parameter_update(step, log_data, driver)
Called after all update logic has been computed and the step has been accepted, but before the driver applies the parameter update.
At this point:
  - The loss and its gradient have been computed by compute_loss_and_update().
  - The step has been accepted (not rejected by on_compute_update_end()).
  - driver.step_count still refers to the current step; it has not yet been incremented.
  - The variational state parameters have not yet changed.
This is the right place to estimate additional observables, add data to log_data, or take a snapshot of the state for logging. Callbacks with a lower callback_order run first, so observables callbacks (order 0) are guaranteed to populate log_data before logger callbacks (order 10) read it.
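The timing guarantees above can be sketched with a toy driver. `ToyDriver`, its `parameters` dict and `SnapshotCallback` are hypothetical and exist only for this example; the hook name and its `(step, log_data, driver)` signature are the only parts taken from this page.

```python
import copy


class ToyDriver:
    """Stand-in for a driver; not NetKet's class."""

    def __init__(self):
        self.step_count = 0
        self.parameters = {"w": 1.0}


class SnapshotCallback:
    callback_order = 0

    def before_parameter_update(self, step, log_data, driver):
        # step_count has not been incremented and the parameters are
        # still the pre-update ones, so both can be recorded as-is.
        log_data["snapshot_step"] = driver.step_count
        log_data["snapshot_params"] = copy.deepcopy(driver.parameters)


driver = ToyDriver()
log_data = {}
SnapshotCallback().before_parameter_update(driver.step_count, log_data, driver)

# Only after the hook returns does the toy driver apply its update.
driver.parameters["w"] -= 0.1
driver.step_count += 1
```

The deep copy matters in this sketch because the toy driver mutates its parameters in place; without it the snapshot would change along with them.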
- on_compute_update_end(step, log_data, driver)
Called at the end of the compute update phase, after the loss and its gradient have been computed.
This is called before the parameters are updated, so it can be used to implement custom logic for rejecting a step based on the computed loss or gradient.
- Return type:
- Returns:
A boolean indicating whether to reject the step (i.e. repeat it with the same parameters). If it returns None, it is treated as False.
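A minimal sketch of how such return values could be interpreted follows. `step_is_rejected`, `RejectLargeLoss` and `NoOpinion` are hypothetical names, the loss is stored as a plain float for simplicity, and rejecting as soon as any single callback returns True is an assumption; only the rule that None counts as False comes from the text above.

```python
def step_is_rejected(callbacks, step, log_data, driver):
    """Hypothetical helper combining the callbacks' verdicts."""
    rejected = False
    for cb in callbacks:
        result = cb.on_compute_update_end(step, log_data, driver)
        # None is treated as False, as documented.
        rejected = rejected or bool(result)
    return rejected


class RejectLargeLoss:
    def __init__(self, threshold):
        self.threshold = threshold

    def on_compute_update_end(self, step, log_data, driver):
        # Ask for the step to be repeated if the loss looks unreasonable.
        return log_data["loss"] > self.threshold


class NoOpinion:
    def on_compute_update_end(self, step, log_data, driver):
        return None  # no explicit verdict, so the step is not rejected


callbacks = [NoOpinion(), RejectLargeLoss(threshold=10.0)]
accepted_case = step_is_rejected(callbacks, 0, {"loss": 1.0}, driver=None)
rejected_case = step_is_rejected(callbacks, 0, {"loss": 50.0}, driver=None)
```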