netket.stats.statistics
- netket.stats.statistics(data)
Returns statistics of a given array (or matrix, see below) containing a stream of data. This is particularly useful for analyzing Markov Chain data, but it can also be used for other types of time series.
- Parameters:
data (vector or matrix) – The input data. It can be real or complex valued.
* If a vector, it is assumed to be a time series of data (not necessarily independent).
* If a matrix, it is assumed that the rows data[i] contain independent time series.
- Returns:
A dictionary-compatible class containing the average (.mean, ["Mean"]), variance (.variance, ["Variance"]), the Monte Carlo standard error of the mean (.error_of_mean, ["Sigma"]), an estimate of the autocorrelation time (.tau_corr, ["TauCorr"]), and the Gelman-Rubin split-Rhat diagnostic (.R_hat, ["R_hat"]).
If the flag NETKET_EXPERIMENTAL_FFT_AUTOCORRELATION is set, the autocorrelation is computed exactly using an FFT, and an extra field tau_corr_max is inserted in the statistics object.
These properties can be accessed using either the attribute or the dictionary-style syntax (both indicated above).
The split-Rhat diagnostic is based on comparing intra-chain and inter-chain statistics of the sample and is thus only available for 2d-array inputs where the rows are independently sampled MCMC chains. For an ideal MCMC sample, R_hat should be 1.0. A value that deviates substantially from 1.0 indicates MCMC convergence issues. Thresholds such as R_hat > 1.1 or even R_hat > 1.01 have been suggested in the literature for when to discard a sample. (See, e.g., Gelman et al., Bayesian Data Analysis, or Vehtari et al., arXiv:1903.08008.)
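To make the comparison of intra-chain and inter-chain statistics concrete, here is a minimal sketch of the textbook split-Rhat estimator in plain Python. It is an illustration only and is not taken from NetKet's source: the function name split_r_hat is hypothetical, the data is assumed to be real valued, and the value NetKet reports may differ in normalization details.

```python
import random
from math import sqrt
from statistics import fmean, variance


def split_r_hat(chains):
    """Textbook split-R-hat for a list of equally long, real-valued chains.

    Hypothetical helper for illustration; not NetKet's implementation.
    """
    half = len(chains[0]) // 2
    if half < 2:
        raise ValueError("each chain needs at least 4 samples")
    # Split every chain into two halves; for odd lengths the middle
    # sample is dropped.
    halves = []
    for chain in chains:
        halves.append(chain[:half])
        halves.append(chain[-half:])
    # Intra-chain statistic: mean of the sample variances of the halves.
    within = fmean([variance(h) for h in halves])
    # Inter-chain statistic: sample variance of the means of the halves.
    between = variance([fmean(h) for h in halves])
    # Pooled variance estimate, compared against the within-chain variance.
    # Note: constant chains (within == 0) would raise ZeroDivisionError.
    var_plus = (half - 1) / half * within + between
    return sqrt(var_plus / within)


rng = random.Random(0)
# Four independent chains drawn from the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# The same chains, but with one of them shifted to mimic a stuck chain.
stuck = [row[:] for row in mixed]
stuck[0] = [x + 3.0 for x in stuck[0]]

print(split_r_hat(mixed))  # close to 1 for well-mixed chains
print(split_r_hat(stuck))  # well above the 1.1 threshold
```

Splitting each chain in half means that a slow drift within a single chain also shows up as disagreement between halves, which is why the estimator needs rows of independent chains rather than a single vector.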
- Return type: