OCDocker.OCScore.Dimensionality.AutoencoderOptimizer module¶
Module to perform the optimization of the Autoencoder.
It is imported as:
from OCDocker.OCScore.Dimensionality.AutoencoderOptimizer import AutoencoderOptimizer
- class OCDocker.OCScore.Dimensionality.AutoencoderOptimizer.AutoencoderDataset(*args, **kwargs)[source]¶
Bases: Dataset
Dataset class for the Autoencoder. It supplies the samples that the DataLoader draws during training and testing of the Autoencoder.
- Parameters:
features (torch.Tensor) – The features to be used in the Autoencoder. It should be a torch.Tensor of shape (n_samples, n_features).
- __getitem__(idx)[source]¶
Return the features and the target for the given index. The DataLoader calls this method to fetch samples from the dataset.
- Parameters:
idx (int) – The index of the sample to be returned.
- Returns:
The features and the target for the given index.
- Return type:
tuple
- __init__(features)[source]¶
Constructor for the AutoencoderDataset class.
- Parameters:
features (torch.Tensor) – The features to be used in the Autoencoder. It should be a torch.Tensor of shape (n_samples, n_features).
- Return type:
None
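The behaviour described above can be illustrated with a minimal sketch. The class name SketchAutoencoderDataset is hypothetical and this is not the OCDocker source; it also assumes that the target returned for each index is the input row itself, as is usual when training an autoencoder to reconstruct its input.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SketchAutoencoderDataset(Dataset):
    """Hypothetical stand-in for AutoencoderDataset (not the OCDocker source)."""

    def __init__(self, features: torch.Tensor) -> None:
        # features is expected to have shape (n_samples, n_features)
        self.features = features

    def __len__(self) -> int:
        return self.features.shape[0]

    def __getitem__(self, idx: int) -> tuple[torch.Tensor, torch.Tensor]:
        # Assumption: the reconstruction target is the input row itself
        row = self.features[idx]
        return row, row


features = torch.arange(12, dtype=torch.float32).reshape(4, 3)
dataset = SketchAutoencoderDataset(features)

# The DataLoader batches the (features, target) pairs returned by __getitem__
loader = DataLoader(dataset, batch_size=2, shuffle=False)
first_inputs, first_targets = next(iter(loader))
```

Each batch drawn from the loader is then a pair of tensors of shape (batch_size, n_features).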
- class OCDocker.OCScore.Dimensionality.AutoencoderOptimizer.Autoencoder(*args, **kwargs)[source]¶
Bases: Module
Autoencoder model, implemented as a subclass of nn.Module.
- Parameters:
input_size (int) – The size of the input. It should be a positive integer.
encoding_dim (list) – The size of the encoding, given as a list of integers.
encoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the encoder. It should be a list of tuples, each holding an activation function class and a dict of its parameters.
decoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the decoder. It should be a list of tuples, each holding an activation function class and a dict of its parameters.
decoding_dim (list) – The size of the decoding, given as a list of integers.
device (torch.device, optional) – The device to be used. Default is torch.device("cpu").
- __init__(input_size, encoding_dim, encoder_activation_fn, decoder_activation_fn, decoding_dim, device=torch.device)[source]¶
Constructor for the Autoencoder class. It builds the Autoencoder model.
- Parameters:
input_size (int) – The size of the input. It should be a positive integer.
encoding_dim (list) – The size of the encoding, given as a list of integers.
encoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the encoder. It should be a list of tuples, each holding an activation function class and a dict of its parameters.
decoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the decoder. It should be a list of tuples, each holding an activation function class and a dict of its parameters.
decoding_dim (list) – The size of the decoding, given as a list of integers.
device (torch.device, optional) – The device to be used. Default is torch.device("cpu").
- Return type:
None
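As an illustration of how a list of layer sizes and a list of (activation class, parameters) tuples can be combined, here is a minimal sketch. The helper build_stack is hypothetical and is not part of OCDocker. The sketch assumes one Linear layer per entry in the size list, each followed by the matching activation, and it assumes the last decoder size equals the input size; the actual Autoencoder may assemble its layers differently.

```python
from typing import Any

import torch
from torch import nn


def build_stack(
    input_size: int,
    dims: list[int],
    activations: list[tuple[type[nn.Module], dict[str, Any]]],
) -> nn.Sequential:
    """Hypothetical helper: one Linear layer per entry in dims, each followed by its activation."""
    layers: list[nn.Module] = []
    previous = input_size
    for width, (activation_cls, activation_kwargs) in zip(dims, activations):
        layers.append(nn.Linear(previous, width))
        # Instantiate the activation class with its keyword arguments
        layers.append(activation_cls(**activation_kwargs))
        previous = width
    return nn.Sequential(*layers)


# Encoder compresses 8 features to 3; decoder expands them back to 8
encoder = build_stack(8, [6, 3], [(nn.ReLU, {}), (nn.LeakyReLU, {"negative_slope": 0.1})])
decoder = build_stack(3, [6, 8], [(nn.ReLU, {}), (nn.Identity, {})])

x = torch.randn(5, 8)
encoded = encoder(x)
reconstruction = decoder(encoded)
```

Passing the activation as a class plus a dict of parameters, rather than as an instance, lets each layer receive its own freshly constructed module.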
- forward(x)[source]¶
Forward pass of the Autoencoder. It passes the input through the encoder and then the decoder.
- Parameters:
x (torch.Tensor) – The input to be passed through the Autoencoder. It should be a torch.Tensor of shape (n_samples, n_features).
- Returns:
The output of the Autoencoder, a torch.Tensor of shape (n_samples, n_features).
- Return type:
torch.Tensor
- get_decoder()[source]¶
Get the decoder of the Autoencoder.
- Returns:
The decoder of the Autoencoder.
- Return type:
nn.Module
- get_decoder_topology()[source]¶
Get the topology of the decoder, that is, the layers of the decoder.
- Returns:
The topology of the decoder.
- Return type:
list
- class OCDocker.OCScore.Dimensionality.AutoencoderOptimizer.AutoencoderOptimizer(X_train, X_test, X_validation=None, encoding_dims=(16, 256), storage='sqlite:///autoencoder.db', models_folder='./models/Autoencoder/', random_seed=42, use_gpu=True, verbose=False)[source]¶
Bases: object
Class that optimizes the Autoencoder using Optuna.
- Parameters:
X_train (Union[np.ndarray, pd.DataFrame, pd.Series]) – The training data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_test (Union[np.ndarray, pd.DataFrame, pd.Series]) – The testing data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_validation (Union[None, Union[np.ndarray, pd.DataFrame, pd.Series]], optional) – The validation data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series. Default is None.
encoding_dims (tuple, optional) – The dimensions of the encoding. It should be a tuple of two integers. Default is (16, 256).
storage (str, optional) – The storage string for the study. Default is "sqlite:///autoencoder.db".
models_folder (str, optional) – The folder where the models will be saved. Default is "./models/Autoencoder/".
random_seed (int, optional) – The random seed to be used in the Autoencoder. It should be a positive integer. Default is 42.
use_gpu (bool, optional) – If True, the Autoencoder will use the GPU. Default is True.
verbose (bool, optional) – If True, the training and testing information will be printed. Default is False.
- device: torch.device¶
- __init__(X_train, X_test, X_validation=None, encoding_dims=(16, 256), storage='sqlite:///autoencoder.db', models_folder='./models/Autoencoder/', random_seed=42, use_gpu=True, verbose=False)[source]¶
Constructor for the AutoencoderOptimizer class.
- Parameters:
X_train (Union[np.ndarray, pd.DataFrame, pd.Series]) – The training data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_test (Union[np.ndarray, pd.DataFrame, pd.Series]) – The testing data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_validation (Union[None, Union[np.ndarray, pd.DataFrame, pd.Series]], optional) – The validation data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series. Default is None.
encoding_dims (tuple, optional) – The dimensions of the encoding. It should be a tuple of two integers. Default is (16, 256).
storage (str, optional) – The storage string for the study. Default is "sqlite:///autoencoder.db".
models_folder (str, optional) – The folder where the models will be saved. Default is "./models/Autoencoder/".
random_seed (int, optional) – The random seed to be used in the Autoencoder. It should be a positive integer. Default is 42.
use_gpu (bool, optional) – If True, the Autoencoder will use the GPU. Default is True.
verbose (bool, optional) – If True, the training and testing information will be printed. Default is False.
- Return type:
None
- X_train: torch.Tensor¶
- train_loader: torch.utils.data.DataLoader | None¶
- X_test: torch.Tensor¶
- test_loader: torch.utils.data.DataLoader | None¶
- X_validation: torch.Tensor | None¶
- validation_loader: torch.utils.data.DataLoader | None¶
- evaluate_autoencoder(model, criterion, loader=None)[source]¶
Evaluate the Autoencoder with the given loss function and return its RMSE.
- Parameters:
model (nn.Module) – The Autoencoder model to be evaluated.
criterion (nn.Module) – The loss function to be used in the evaluation.
loader (Union[None, DataLoader], optional) – The DataLoader that provides the samples for the evaluation. Default is None.
- Returns:
The RMSE of the Autoencoder.
- Return type:
float
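One way to obtain an RMSE from a model, a criterion and a DataLoader is sketched below. The function evaluate_rmse is hypothetical and is not the OCDocker implementation. It assumes the criterion is a mean squared error loss, that the loader yields (inputs, targets) pairs, and that the RMSE is the square root of the sample-weighted mean loss. The documented loader=None default is not modelled here.

```python
import math

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate_rmse(model: nn.Module, criterion: nn.Module, loader: DataLoader) -> float:
    """Hypothetical evaluation loop: sample-weighted mean of the criterion, then sqrt."""
    model.eval()
    total_loss = 0.0
    total_samples = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, targets in loader:
            batch_loss = criterion(model(inputs), targets)
            # Weight each batch by its size so a short final batch is not over-counted
            total_loss += batch_loss.item() * inputs.shape[0]
            total_samples += inputs.shape[0]
    return math.sqrt(total_loss / total_samples)


data = torch.ones(6, 4)

# An identity "model" reconstructs its input exactly, so the RMSE is zero
loader = DataLoader(TensorDataset(data, data), batch_size=4)
rmse_identity = evaluate_rmse(nn.Identity(), nn.MSELoss(), loader)

# Targets offset from the inputs by exactly 2.0 give an RMSE of 2.0
shifted_loader = DataLoader(TensorDataset(data, data + 2.0), batch_size=4)
rmse_shifted = evaluate_rmse(nn.Identity(), nn.MSELoss(), shifted_loader)
```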
- objective(trial)[source]¶
Objective function for the Optuna optimization. Optuna calls it once per trial to score an Autoencoder.
- Parameters:
trial (optuna.Trial) – The Optuna trial to be used in the Autoencoder. It should be an optuna.Trial.
- Returns:
The RMSE of the Autoencoder.
- Return type:
float
- optimize(direction='maximize', n_trials=10, study_name='NN_Optimization', load_if_exists=True, sampler=optuna.samplers.TPESampler, n_jobs=1)[source]¶
Optimize the Autoencoder by running an Optuna study.
- Parameters:
direction (str, optional) – The direction of the optimization. Default is "maximize".
n_trials (int, optional) – The number of trials to run. It should be a positive integer. Default is 10.
study_name (str, optional) – The name of the study. Default is "NN_Optimization".
load_if_exists (bool, optional) – If True, the study will be loaded if it exists. Default is True.
sampler (optuna.samplers.BaseSampler, optional) – The sampler to be used in the study. It should be an optuna.samplers.BaseSampler. Default is TPESampler().
n_jobs (int, optional) – The number of jobs to run. It should be a positive integer. Default is 1.
- Returns:
The Optuna study.
- Return type:
optuna.study.Study
- set_random_seed()[source]¶
Set the random seed for the Autoencoder.
- Return type:
None
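A common way to seed a PyTorch workflow is sketched below. The function seed_everything is hypothetical; which random number generators set_random_seed actually seeds is not stated in this documentation, so the list here is an assumption.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed the Python, NumPy and PyTorch generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        # Seed every visible GPU as well
        torch.cuda.manual_seed_all(seed)


# Seeding twice with the same value reproduces the same random draws
seed_everything(42)
first_draw = torch.rand(3)
seed_everything(42)
second_draw = torch.rand(3)
```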
- train_autoencoder(model, optimizer, criterion, clip_grad, epochs, trial)[source]¶
Train the Autoencoder for the given number of epochs.
- Parameters:
model (nn.Module) – The Autoencoder model to be trained.
optimizer (optim.Optimizer) – The optimizer to be used in the training. It should be an optim.Optimizer.
criterion (nn.Module) – The loss function to be used in the training.
clip_grad (float) – The gradient clipping value to be used in the training.
epochs (int) – The number of epochs to train for. It should be a positive integer.
trial (optuna.Trial) – The Optuna trial to be used in the training. It should be an optuna.Trial.
- Returns:
The best validation and training RMSE.
- Return type:
tuple
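A training loop with gradient clipping can be sketched as follows. The function train_sketch is hypothetical and simplified: it omits the Optuna trial and the validation pass, returns the per-epoch training RMSE rather than the documented tuple, and assumes clip_grad is a maximum gradient norm (clipping by value is another possibility the documentation does not rule out).

```python
import math

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset


def train_sketch(model, optimizer, criterion, clip_grad, epochs, loader):
    """Hypothetical loop: returns the training RMSE of each epoch."""
    history = []
    for _ in range(epochs):
        model.train()
        total_loss, total_samples = 0.0, 0
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            # Assumption: clip_grad bounds the total gradient norm
            nn.utils.clip_grad_norm_(model.parameters(), max_norm=clip_grad)
            optimizer.step()
            total_loss += loss.item() * inputs.shape[0]
            total_samples += inputs.shape[0]
        history.append(math.sqrt(total_loss / total_samples))
    return history


torch.manual_seed(0)
data = torch.randn(32, 4)
loader = DataLoader(TensorDataset(data, data), batch_size=8)

# A tiny autoencoder: 4 features compressed to 2 and expanded back to 4
model = nn.Sequential(nn.Linear(4, 2), nn.ReLU(), nn.Linear(2, 4))
optimizer = optim.Adam(model.parameters(), lr=1e-2)
history = train_sketch(model, optimizer, nn.MSELoss(), clip_grad=1.0, epochs=5, loader=loader)
```

Clipping is applied after backward() and before step(), so the optimizer only ever sees the bounded gradients.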