Training Tasks — colvarsfinder.core
- Author:
Wei Zhang
- Year:
2022
- Copyright:
GNU Public License v3
This module implements classes for learning collective variables (CVs).
The following training tasks derived from the base class TrainingTask
are implemented:
- AutoEncoderTask, which finds collective variables by training an autoencoder.
- RegAutoEncoderTask, which finds collective variables by training a regularized autoencoder.
- EigenFunctionTask, which finds collective variables by computing eigenfunctions of either the infinitesimal generator or the transfer operator.
Base class
- class colvarsfinder.core.TrainingTask(traj_obj, pp_layer, model, model_path, learning_rate, load_model_filename, save_model_every_step, k, batch_size, num_epochs, test_ratio, optimizer_name, device, plot_class, plot_frequency, verbose, debug_mode)[source]
Abstract base class of training tasks.
- Parameters:
traj_obj (colvarsfinder.utils.WeightedTrajectory) – an object that holds trajectory data and weights
pp_layer (torch.nn.Module) – preprocessing layer. It corresponds to the function \(r\) described in Representation of collective variables
model – neural network to be trained
model_path (str) – directory to save training results
learning_rate (float) – learning rate
load_model_filename (str) – filename of a trained model, used to restart from a previous training
save_model_every_step (int) – how often to save model
k (int) – number of collective variables to be learned
batch_size (int) – size of mini-batch
num_epochs (int) – number of training epochs
test_ratio – float in \((0,1)\), ratio of the amount of data used as test data
optimizer_name (str) – name of optimizer used to train neural networks. either ‘Adam’ or ‘SGD’
device (torch.device) – computing device, either CPU or GPU
plot_class – plot callback class
plot_frequency – how often (epoch) to call plot function specified by plot_class
verbose (bool) – print more information if true
debug_mode (bool) – if true, write model to file during the training
- traj_obj
same as the input parameter
- preprocessing_layer
same as the input parameter pp_layer
- model
same as the input parameter
- model_path
same as the input parameter
- learning_rate
same as the input parameter
- load_model_filename
same as the input parameter
- save_model_every_step
same as the input parameter
- k
same as the input parameter
- batch_size
same as the input parameter
- num_epochs
same as the input parameter
- test_ratio
same as the input parameter
- optimizer_name
same as the input parameter
- optimizer
either torch.optim.Adam or torch.optim.SGD
- device
same as the input parameter
- plot_class
same as the input parameter
- plot_frequency
same as the input parameter
- verbose
same as the input parameter
- debug_mode
same as the input parameter
- abstract colvar_model()[source]
- Returns:
neural network that represents collective variables given preprocessing_layer and model.
- Return type:
This function is called by save_model().
- init_model_and_optimizer()[source]
Initialize model and optimizer.
The previously saved model will be loaded for initialization if load_model_filename points to an existing file.
The attribute optimizer is set to torch.optim.Adam if optimizer_name = ‘Adam’ (case-insensitive); otherwise, it is set to torch.optim.SGD.
This function shall be called in the constructor of derived classes.
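The optimizer selection rule described for init_model_and_optimizer() can be sketched in plain Python. The helper name `resolve_optimizer_name` is hypothetical and not part of colvarsfinder. It only mirrors the documented rule (a case-insensitive match on ‘Adam’, SGD otherwise) and returns the class name as a string instead of constructing an optimizer.

```python
def resolve_optimizer_name(optimizer_name: str) -> str:
    """Return the name of the optimizer class chosen by the documented rule.

    Hypothetical helper for illustration only: 'Adam' in any letter case
    selects Adam, and every other string falls back to SGD.
    """
    if optimizer_name.lower() == "adam":
        return "torch.optim.Adam"
    return "torch.optim.SGD"


selected = [resolve_optimizer_name(name) for name in ("Adam", "ADAM", "SGD", "AdamW")]
print(selected)
```

If the implementation follows the documented rule literally, an unsupported or misspelled name such as ‘AdamW’ does not raise an error but selects SGD, so inspecting the optimizer attribute after construction may be worthwhile.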
- save_model(epoch, description='latest')[source]
Save model to file.
- Parameters:
epoch (int) – current epoch
description (str) – name of subdirectory to save files
The state_dict of the trained model will be saved at model.pt under the subdirectory specified by description in the output directory. Weights and biases of each layer are also saved in text files.
The neural network representing collective variables corresponding to model (e.g. the encoder part of an autoencoder) is first constructed by calling colvar_model(), then compiled to a torch.jit.ScriptModule, which is finally saved under the output directory. If the device is GPU, both CPU and CUDA versions will be saved.
This function is called by train().
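As a small illustration of the file layout described for save_model(), the sketch below builds the expected location of model.pt. It assumes that the "output directory" is the model_path argument; the helper `state_dict_path` and the directory name `results/run1` are hypothetical, and the names of the other saved files (text files, compiled modules) are not specified here, so they are left out.

```python
from pathlib import PurePosixPath


def state_dict_path(model_path: str, description: str = "latest") -> PurePosixPath:
    """Expected location of the saved state_dict: <model_path>/<description>/model.pt.

    Hypothetical helper; colvarsfinder builds this path internally, and
    treating model_path as the output directory is an assumption.
    """
    return PurePosixPath(model_path) / description / "model.pt"


latest_path = state_dict_path("results/run1")
print(latest_path)
```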
Learning CVs by training autoencoder
- class colvarsfinder.core.AutoEncoderTask(traj_obj, pp_layer, model, model_path, learning_rate=0.01, load_model_filename=None, save_model_every_step=10, batch_size=1000, num_epochs=10, test_ratio=0.2, optimizer_name='Adam', device=device(type='cpu'), plot_class=None, plot_frequency=0, verbose=True, debug_mode=True)[source]
Bases:
TrainingTask
Class for training autoencoders using reconstruction loss.
- Parameters:
traj_obj (colvarsfinder.utils.WeightedTrajectory) – trajectory data
pp_layer (torch.nn.Module) – preprocessing layer. It corresponds to the function \(r:\mathbb{R}^{d}\rightarrow \mathbb{R}^{d_r}\) described in Representation of collective variables
model (colvarsfinder.nn.AutoEncoder) – neural network to be trained
model_path (str) – directory to save training results
learning_rate (float) – learning rate
load_model_filename (str) – filename of a trained neural network, used to restart from a previous training if provided
save_model_every_step (int) – how often to save model
batch_size (int) – size of mini-batch
num_epochs (int) – number of training epochs
test_ratio – float in \((0,1)\), ratio of the amount of data used as test data
optimizer_name (str) – name of optimizer used for training. either ‘Adam’ or ‘SGD’
device (torch.device) – computing device, either CPU or GPU
plot_class – plot callback class
plot_frequency – how often (epoch) to call plot function
verbose (bool) – print more information if true
debug_mode (bool) – if true, save model to file during the training, otherwise only the latest or best is kept.
This task trains autoencoders using the standard reconstruction loss discussed in Loss function for training autoencoder. The neural networks representing the encoder \(f_{enc}:\mathbb{R}^{d_r}\rightarrow \mathbb{R}^k\) and the decoder \(f_{dec}:\mathbb{R}^{k}\rightarrow \mathbb{R}^{d_r}\) are stored in model.encoder and model.decoder, respectively.
Note
When plot_class is not None, the following line is executed in AutoEncoderTask.train():
self.plot_class.plot(self.colvar_model(), epoch=epoch)
Accordingly, plot_class should be an object of some class with such a member function.
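A minimal sketch of an object that satisfies this requirement is shown below. The class name `RecordingPlotter` is hypothetical; the only thing taken from the documentation is the call signature `plot(colvar_model, epoch=epoch)`. Instead of drawing anything, the callback just records the epochs at which it was invoked.

```python
class RecordingPlotter:
    """Hypothetical plot callback that records the epochs at which it is called.

    A real callback would evaluate `colvar_model` on trajectory data or on
    a grid and draw a figure; here the call is only logged.
    """

    def __init__(self):
        self.epochs = []

    def plot(self, colvar_model, epoch):
        # Signature matches the documented call:
        #     self.plot_class.plot(self.colvar_model(), epoch=epoch)
        self.epochs.append(epoch)


plotter = RecordingPlotter()
# Stand-in for the training loop: call the callback with a dummy model.
for epoch in range(3):
    plotter.plot(lambda x: x, epoch=epoch)
print(plotter.epochs)
```

Since the documented call invokes plot on the object itself, an instance such as `plotter` (rather than the class) would be passed as plot_class.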
- model
same as the input parameter
- preprocessing_layer
same as the input parameter pp_layer
- loss_list
list of loss values on training data and test data during the training
- train_loss_df
dataframe of training loss for each epoch
- Type:
- test_loss_df
dataframe of test loss for each epoch
- Type:
- colvar_model()[source]
- Returns:
neural network that represents collective variables. It is the concatenation of preprocessing_layer and model.encoder.
- Return type:
This function is called by TrainingTask.save_model() in the base class.
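The "concatenation" returned by colvar_model() is a composition: the preprocessing layer is applied first and the encoder second. The sketch below illustrates this with plain Python callables. The class `Compose` and the toy functions `preprocess` and `encoder` are hypothetical stand-ins; with PyTorch modules, torch.nn.Sequential plays a similar chaining role.

```python
class Compose:
    """Apply `first`, then `second`: x -> second(first(x)).

    Hypothetical stand-in for chaining preprocessing_layer and model.encoder.
    """

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def __call__(self, x):
        return self.second(self.first(x))


def preprocess(x):
    # Toy preprocessing r: R^3 -> R^2 (drop the last coordinate).
    return x[:2]


def encoder(features):
    # Toy encoder f_enc: R^2 -> R^1 (sum of the features).
    return [sum(features)]


colvar = Compose(preprocess, encoder)
print(colvar([1.0, 2.0, 10.0]))  # [3.0]
```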
Learning CVs by training regularized autoencoder
- class colvarsfinder.core.RegAutoEncoderTask(traj_obj, pp_layer, model, model_path, eig_weights=[], learning_rate=0.01, load_model_filename=None, save_model_every_step=10, batch_size=1000, num_epochs=10, test_ratio=0.2, optimizer_name='Adam', alpha=1.0, gamma=[0.0, 0.0], eta=[0.0, 0.0, 0.0], lag_tau_ae=0, lag_tau_reg=0, beta=1.0, device=device(type='cpu'), plot_class=None, plot_frequency=0, freeze_encoder=False, verbose=True, debug_mode=True)[source]
Bases:
TrainingTask
Class for training regularized autoencoders.
- Parameters:
traj_obj (colvarsfinder.utils.WeightedTrajectory) – trajectory data
pp_layer (torch.nn.Module) – preprocessing layer. It corresponds to the function \(r:\mathbb{R}^{d}\rightarrow \mathbb{R}^{d_r}\) described in Representation of collective variables
model (colvarsfinder.nn.RegAutoEncoder) – neural network to be trained
model_path (str) – directory to save training results
eig_weights (list of floats) – weights \((\omega_i)_{1\le i\le K}\) in the regularization part of the loss function involving \(K\) eigenfunctions
learning_rate (float) – learning rate
load_model_filename (str) – filename of a trained neural network, used to restart from a previous training if provided
save_model_every_step (int) – how often to save model
batch_size (int) – size of mini-batch
num_epochs (int) – number of training epochs
test_ratio – float in \((0,1)\), ratio of the amount of data used as test data
optimizer_name (str) – name of optimizer used in training. either ‘Adam’ or ‘SGD’
alpha (float) – weight of the reconstruction loss
gamma (list of two floats) – weights \(\gamma_1,\gamma_2\) in the regularization loss involving eigenfunctions (i.e. variational objective and penalty)
eta (list of three floats) – weights \(\eta_1,\eta_2,\eta_3\) in the regularization loss, related to constraints on the (squared, integrated) gradient norm, the norm, and the orthogonality of the encoders
lag_tau_ae (float) – lag-time \(\tau_1\) in the reconstruction loss. A positive value corresponds to a time-lagged autoencoder, while zero corresponds to a standard autoencoder
lag_tau_reg (float) – lag-time \(\tau_2\) in the regularization loss involving eigenfunctions. A positive value corresponds to computing eigenfunctions of the transfer operator, while zero corresponds to computing eigenfunctions of the generator
beta (float) – inverse of temperature, only relevant when the regularization loss corresponds to generator (i.e. lag_tau_reg=0)
device (torch.device) – computing device, either CPU or GPU
plot_class – plot callback class
plot_frequency – how often (epoch) to call plot function of plot_class
freeze_encoder (bool) – fix parameters of encoder if true
verbose (bool) – print more information if true
debug_mode (bool) – if true, write model to file during the training
This task trains a regularized autoencoder using the generalized loss discussed in Loss function for regularized autoencoders. The neural networks representing the encoder \(f_{enc}:\mathbb{R}^{d_r}\rightarrow \mathbb{R}^k\), the decoder \(f_{dec}:\mathbb{R}^{k}\rightarrow \mathbb{R}^{d_r}\), and the regularizers \(\widetilde{f}_1,\cdots, \widetilde{f}_K:\mathbb{R}^k\rightarrow \mathbb{R}\) are stored in model.encoder, model.decoder, and model.reg, respectively.
- model
same as the input parameter
- preprocessing_layer
same as the input parameter pp_layer
- loss_list
list of loss values evaluated on training data and test data during the training
- train_loss_df
dataframe of training loss for each epoch
- Type:
- test_loss_df
dataframe of test loss for each epoch
- Type:
Note
Make sure that lag_tau_ae= \(i\Delta t\) and lag_tau_reg= \(j\Delta t\) for some integers \(i,j\), where \(\Delta t\) is the time interval between two consecutive states in the trajectory data stored in traj_obj. For MD systems, the unit of both lag-times is ns, the same as the unit of dt in colvarsfinder.utils.WeightedTrajectory.
- colvar_model()[source]
- Returns:
neural network that represents collective variables. It is the concatenation of preprocessing_layer and the encoder model.encoder.
- Return type:
This function is called by TrainingTask.save_model() in the base class.
- reg_eigen_loss(X, weight, X_lagged, weight_lagged)[source]
- Parameters:
X – data, input tensor
X_lagged – data, time-lagged input tensor
weight – weights of data X
weight_lagged – weights of data X_lagged
- Returns:
eigenvalues, variational objective, penalty, and order of eigenfunctions (list of indices)
X_lagged and weight_lagged are only used when computing eigenfunctions of transfer operator.
- reg_enc_grad_loss(X, weight)[source]
- Parameters:
X – input PyTorch tensor
weight – weights of states in X
- Returns:
squared \(l^2\)-norm of encoder’s gradients
- reg_enc_norm_loss(X, weight)[source]
- Parameters:
X – data, input tensor
weight – weights of data
- Returns:
penalty on the variances of encoder’s components
- reg_enc_orthognal_loss(X, weight)[source]
- Parameters:
X – data, input tensor
weight – weights of data
- Returns:
penalty on the orthogonality (covariance) among encoder’s components
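To make the two encoder penalties more concrete, the sketch below computes a weighted covariance matrix of encoder outputs and derives one possible variance penalty and one possible orthogonality penalty from it. The exact formulas used by reg_enc_norm_loss and reg_enc_orthognal_loss are not given in this section, so the forms below (squared deviation of each variance from 1, and squared off-diagonal covariances) are assumptions chosen for illustration, and all function names are hypothetical.

```python
def weighted_covariance(values, weights):
    """Weighted covariance matrix of k-dimensional samples.

    values:  list of n samples, each a list of k floats (e.g. encoder outputs).
    weights: list of n non-negative weights; they are normalized to sum to 1.
    """
    total = sum(weights)
    w = [wi / total for wi in weights]
    k = len(values[0])
    mean = [sum(wi * v[a] for wi, v in zip(w, values)) for a in range(k)]
    return [
        [
            sum(wi * (v[a] - mean[a]) * (v[b] - mean[b]) for wi, v in zip(w, values))
            for b in range(k)
        ]
        for a in range(k)
    ]


def norm_penalty(cov):
    # Assumed form: squared deviation of each component's variance from 1.
    return sum((cov[a][a] - 1.0) ** 2 for a in range(len(cov)))


def orthogonality_penalty(cov):
    # Assumed form: squared off-diagonal covariances, each pair counted once.
    k = len(cov)
    return sum(cov[a][b] ** 2 for a in range(k) for b in range(a + 1, k))


samples = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]

# Equal weights: unit variances and uncorrelated components, so no penalty.
cov_uniform = weighted_covariance(samples, [1.0, 1.0, 1.0, 1.0])
print(norm_penalty(cov_uniform), orthogonality_penalty(cov_uniform))

# Only the first two samples carry weight: the components become fully correlated.
cov_skewed = weighted_covariance(samples, [1.0, 1.0, 0.0, 0.0])
print(norm_penalty(cov_skewed), orthogonality_penalty(cov_skewed))
```

The second case shows how the weights change the statistics: the same samples that are uncorrelated under equal weights produce a nonzero orthogonality penalty once the weight is concentrated on correlated states.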
- reg_model()[source]
- Returns:
neural network that represents the regularizers, i.e. eigenfunctions. It is the concatenation of preprocessing_layer and colvarsfinder.nn.RegModel.
- Return type:
Learning CVs by computing eigenfunctions of transfer operator or generator
- class colvarsfinder.core.EigenFunctionTask(traj_obj, pp_layer, model, model_path, alpha, eig_weights, diag_coeff=None, beta=1.0, lag_tau=0, learning_rate=0.01, load_model_filename=None, save_model_every_step=10, sort_eigvals_in_training=True, k=1, batch_size=1000, num_epochs=10, test_ratio=0.2, optimizer_name='Adam', device=device(type='cpu'), plot_class=None, plot_frequency=0, verbose=True, debug_mode=True)[source]
Bases:
TrainingTask
Class for training eigenfunctions of transfer operator or generator.
- Parameters:
traj_obj (colvarsfinder.utils.WeightedTrajectory) – an object that holds trajectory data and weights
pp_layer (torch.nn.Module) – preprocessing layer. It corresponds to the function \(r\) in Representation of collective variables
model (colvarsfinder.nn.EigenFunctions) – feedforward neural network to be trained. It corresponds to functions \(g_1, \dots, g_k\) described in Loss function for training eigenfunctions
model_path (str) – directory to save training results
alpha (float) – penalty constant \(\alpha\) in the loss function
eig_weights (list of floats) – \(k\) weights \(\omega_1 \ge \omega_2 \ge \dots \ge \omega_k > 0\) in the loss functions in Loss function for training eigenfunctions
diag_coeff (torch.Tensor) – 1D PyTorch tensor of length \(d\), which contains diagonal entries of the matrix \(a\) in Loss function for training eigenfunctions
beta (float) – \((k_BT)^{-1}\) for MD systems, only relevant when lag_tau=0 (the case of generator)
lag_tau (float) – lag-time \(\tau\) in the loss function. Positive value corresponds to learning eigenfunctions of transfer operator, while zero corresponds to generator. The unit is ns for MD systems.
learning_rate (float) – learning rate
load_model_filename (str) – filename of a trained model, used to restart from a previous training if provided
save_model_every_step (int) – how often to save model
sort_eigvals_in_training (bool) – whether or not to reorder the \(k\) eigenfunctions according to estimated eigenvalues (such that the first corresponds to the slowest scale)
k (int) – number of eigenfunctions to be learned
batch_size (int) – size of mini-batch
num_epochs (int) – number of training epochs
test_ratio – float in \((0,1)\), ratio of the amount of data used as test data
optimizer_name (str) – name of optimizer used to train neural networks. either ‘Adam’ or ‘SGD’
device (torch.device) – computing device, either CPU or GPU
plot_class – plot callback class
plot_frequency – how often (epoch) to call plot function
verbose (bool) – print more information if true
debug_mode (bool) – if true, write model to file during the training
- model
same as the input parameter
- preprocessing_layer
same as the input parameter pp_layer
- loss_list
list of loss values evaluated on training data and test data during the training
- train_loss_df
dataframe of training loss for each epoch
- Type:
- test_loss_df
dataframe of test loss for each epoch
- Type:
Note
Make sure that lag_tau= \(i\Delta t\) for some integer \(i\), where \(\Delta t\) is the time interval between two consecutive states in the trajectory data stored in traj_obj. For MD systems, the unit of the lag-time is ns, the same as the unit of dt in colvarsfinder.utils.WeightedTrajectory.
See Loss function for training eigenfunctions.
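Two of the input constraints stated for this class can be checked before a task is constructed: lag_tau must be an integer multiple of the trajectory time step, and eig_weights must contain \(k\) positive, non-increasing weights. The sketch below is one way to do this in plain Python. The helpers `lag_in_steps` and `check_eig_weights` are hypothetical and not part of colvarsfinder, and the tolerance values are arbitrary choices.

```python
import math


def lag_in_steps(lag_tau, dt, rel_tol=1e-9):
    """Return the integer i with lag_tau = i * dt.

    Raises ValueError if lag_tau is negative or not an integer multiple
    of dt (up to floating-point tolerance).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    if lag_tau < 0:
        raise ValueError("lag_tau must be non-negative")
    steps = round(lag_tau / dt)
    if not math.isclose(steps * dt, lag_tau, rel_tol=rel_tol, abs_tol=1e-12):
        raise ValueError(f"lag_tau={lag_tau} is not an integer multiple of dt={dt}")
    return steps


def check_eig_weights(eig_weights, k):
    """Check that there are k weights with omega_1 >= ... >= omega_k > 0."""
    if len(eig_weights) != k:
        raise ValueError(f"expected {k} weights, got {len(eig_weights)}")
    if any(w <= 0 for w in eig_weights):
        raise ValueError("all weights must be positive")
    if any(a < b for a, b in zip(eig_weights, eig_weights[1:])):
        raise ValueError("weights must be non-increasing")


# Example: frames saved every 0.001 ns (1 ps) and a lag-time of 0.01 ns.
steps = lag_in_steps(lag_tau=0.01, dt=0.001)
check_eig_weights([1.0, 0.8, 0.5], k=3)
print(steps)
```

A value of lag_tau=0 passes the check with zero steps, which corresponds to the generator case described above.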
- colvar_model()[source]
- Returns:
neural network that represents collective variables \(\xi=(g_1\circ r, \dots, g_k\circ r)^T\), built from preprocessing_layer that represents \(r\) and model that represents \(g_1, g_2, \cdots, g_k\). See Loss function for training eigenfunctions.
- Return type:
- get_reordered_eigenfunctions(model, cvec)[source]
- Parameters:
model (colvarsfinder.nn.EigenFunctions) – model whose module list colvarsfinder.nn.EigenFunctions.eigen_funcs is to be reordered.
cvec (list of int) – a permutation of \([0, 1, \dots, k-1]\)
- Returns:
a new object of colvarsfinder.nn.EigenFunctions, created by deep copy, whose module list is reordered according to cvec.
Functions in model may not be sorted according to the magnitude of eigenvalues. This function returns a sorted model that can then be saved to file.
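The reordering step can be illustrated on a plain Python list. The sketch below assumes the convention that position \(i\) of the result holds the entry originally at index cvec[i]; this convention and the helper name `reorder_by_permutation` are assumptions, and dictionaries stand in for the network modules.

```python
import copy


def reorder_by_permutation(items, cvec):
    """Return deep copies of `items` such that position i holds items[cvec[i]].

    Hypothetical helper; the index convention is an assumption.
    """
    if sorted(cvec) != list(range(len(items))):
        raise ValueError("cvec must be a permutation of 0, ..., k-1")
    return [copy.deepcopy(items[i]) for i in cvec]


# Dictionaries stand in for the entries of the eigen_funcs module list.
eigen_funcs = [{"name": "g_a"}, {"name": "g_b"}, {"name": "g_c"}]
reordered = reorder_by_permutation(eigen_funcs, [2, 0, 1])
print([f["name"] for f in reordered])  # ['g_c', 'g_a', 'g_b']
```

Because the entries are deep-copied, later changes to the reordered list leave the original list untouched, which matches the description of the return value as a new object.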
- loss_func(X, weight, X_lagged, weight_lagged)[source]
- Parameters:
X – data, input tensor
X_lagged – data, time-lagged input tensor
weight – weights of data X
weight_lagged – weights of data X_lagged
- Returns:
total loss, eigenvalues, variational objective, penalty, and ordering of eigenfunctions (a list of indices)
X_lagged and weight_lagged are only used when computing eigenfunctions of transfer operator.