OCDocker.OCScore.Dimensionality.AutoencoderOptimizer module

Module to perform the optimization of the Autoencoder.

It is imported as:

from OCDocker.OCScore.Dimensionality.AutoencoderOptimizer import AutoencoderOptimizer

class OCDocker.OCScore.Dimensionality.AutoencoderOptimizer.AutoencoderDataset(*args, **kwargs)

Bases: Dataset

Dataset class for the Autoencoder. It wraps the feature tensor so that a DataLoader can serve samples for training and testing the Autoencoder.

Parameters: features (torch.Tensor) – The features to be used in the Autoencoder, as a torch.Tensor of shape (n_samples, n_features).

__getitem__(idx)

Returns the features and the target for the given index. The DataLoader calls this method to fetch samples from the dataset.

Parameters: idx (int) – The index of the sample to be returned.
Returns: The features and the target for the given index.
Return type: tuple

__init__(features)

Constructor for the AutoencoderDataset class.

Parameters: features (torch.Tensor) – The features to be used in the Autoencoder, as a torch.Tensor of shape (n_samples, n_features).
Return type: None

__len__()

Returns the length of the dataset. The DataLoader uses it to know how many samples the dataset holds.

Returns: The number of samples in the dataset.
Return type: int
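
The three methods above follow the standard torch.utils.data.Dataset protocol. The block below is a minimal sketch of that protocol, not the OCDocker source: the class name ReconstructionDataset is hypothetical, and it assumes that the target returned for each index is the input row itself, which is the usual choice when training an autoencoder to reconstruct its input.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ReconstructionDataset(Dataset):
    """Hypothetical stand-in for AutoencoderDataset (illustration only)."""

    def __init__(self, features: torch.Tensor) -> None:
        # features is expected to have shape (n_samples, n_features).
        self.features = features

    def __len__(self) -> int:
        # Number of samples, i.e. the size of the first dimension.
        return self.features.shape[0]

    def __getitem__(self, idx: int) -> tuple[torch.Tensor, torch.Tensor]:
        # Assumption: the reconstruction target equals the input row.
        row = self.features[idx]
        return row, row


features = torch.arange(12, dtype=torch.float32).reshape(4, 3)
dataset = ReconstructionDataset(features)
loader = DataLoader(dataset, batch_size=2, shuffle=False)

# Each batch is a (features, target) pair of shape (batch_size, n_features).
inputs, targets = next(iter(loader))
```

A DataLoader built this way is the kind of object that the train_loader, test_loader and validation_loader attributes documented further down would hold.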

class OCDocker.OCScore.Dimensionality.AutoencoderOptimizer.Autoencoder(*args, **kwargs)

Bases: Module

Autoencoder model, implemented as a subclass of nn.Module.

Parameters:

input_size (int) – The size of the input. It should be a positive integer.
encoding_dim (list) – The size of the encoding, given as a list of integers.
encoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the encoder, given as a list of tuples where each tuple holds an activation function and its parameters.
decoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the decoder, given as a list of tuples where each tuple holds an activation function and its parameters.
decoding_dim (list) – The size of the decoding, given as a list of integers.
device (torch.device, optional) – The device to be used. Default is torch.device("cpu").

__init__(input_size, encoding_dim, encoder_activation_fn, decoder_activation_fn, decoding_dim, device=torch.device)

Constructor for the Autoencoder class.

Parameters:

input_size (int) – The size of the input. It should be a positive integer.
encoding_dim (list) – The size of the encoding, given as a list of integers.
encoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the encoder, given as a list of tuples where each tuple holds an activation function and its parameters.
decoder_activation_fn (list[tuple[type[nn.Module], dict[str, Any]]]) – The activation functions to be used in the decoder, given as a list of tuples where each tuple holds an activation function and its parameters.
decoding_dim (list) – The size of the decoding, given as a list of integers.
device (torch.device, optional) – The device to be used. Default is torch.device("cpu").

Return type:

None

forward(x)

Forward pass of the Autoencoder. The input is passed through the encoder and then the decoder.

Parameters: x (torch.Tensor) – The input to be passed through the Autoencoder, as a torch.Tensor of shape (n_samples, n_features).
Returns: The output of the Autoencoder, a torch.Tensor of shape (n_samples, n_features).
Return type: torch.Tensor

get_decoder()

Get the decoder of the Autoencoder.

Returns: The decoder of the Autoencoder.
Return type: nn.Module

get_decoder_topology()

Get the topology of the decoder, that is, its layers.

Returns: The topology of the decoder.
Return type: list

get_encoder()

Get the encoder of the Autoencoder.

Returns: The encoder of the Autoencoder.
Return type: nn.Module

get_encoder_topology()

Get the topology of the encoder, that is, its layers.

Returns: The topology of the encoder.
Return type: list
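
The list[tuple[type[nn.Module], dict[str, Any]]] format used for encoder_activation_fn and decoder_activation_fn can be illustrated with a small sketch. The helper build_stack below is hypothetical and is not taken from OCDocker; in particular, pairing one Linear layer with one activation per entry is an assumption about how the dimension lists and activation lists line up.

```python
from typing import Any

import torch
from torch import nn

# One entry per layer: the activation class and the kwargs to build it with.
ActivationSpec = tuple[type[nn.Module], dict[str, Any]]


def build_stack(
    in_features: int,
    dims: list[int],
    activations: list[ActivationSpec],
) -> nn.Sequential:
    """Hypothetical helper: chain Linear layers, each followed by its activation."""
    layers: list[nn.Module] = []
    previous = in_features
    for width, (activation_cls, activation_kwargs) in zip(dims, activations):
        layers.append(nn.Linear(previous, width))
        # Instantiate the activation from its class and keyword arguments.
        layers.append(activation_cls(**activation_kwargs))
        previous = width
    return nn.Sequential(*layers)


# Encoder compresses 8 features to 3; decoder expands back to 8.
encoder = build_stack(8, [6, 3], [(nn.ReLU, {}), (nn.LeakyReLU, {"negative_slope": 0.1})])
decoder = build_stack(3, [6, 8], [(nn.ReLU, {}), (nn.Identity, {})])

x = torch.randn(5, 8)
reconstruction = decoder(encoder(x))  # same shape as the input
```

Passing the class and its keyword arguments separately, rather than an already built module, lets a search procedure choose both the activation type and its settings before any layer is constructed.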

class OCDocker.OCScore.Dimensionality.AutoencoderOptimizer.AutoencoderOptimizer(X_train, X_test, X_validation=None, encoding_dims=(16, 256), storage='sqlite:///autoencoder.db', models_folder='./models/Autoencoder/', random_seed=42, use_gpu=True, verbose=False)

Bases: object

AutoencoderOptimizer class. It optimizes the Autoencoder using Optuna.

Parameters:

X_train (Union[np.ndarray, pd.DataFrame, pd.Series]) – The training data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_test (Union[np.ndarray, pd.DataFrame, pd.Series]) – The testing data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_validation (Union[None, Union[np.ndarray, pd.DataFrame, pd.Series]], optional) – The validation data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series. Default is None.
encoding_dims (tuple, optional) – The dimensions of the encoding. It should be a tuple of two integers. Default is (16, 256).
storage (str, optional) – The storage string for the study. Default is "sqlite:///autoencoder.db".
models_folder (str, optional) – The folder where the models will be saved. Default is "./models/Autoencoder/".
random_seed (int, optional) – The random seed to be used in the Autoencoder. It should be a positive integer. Default is 42.
use_gpu (bool, optional) – If True, the Autoencoder will use the GPU. Default is True.
verbose (bool, optional) – If True, the Autoencoder will print the training and testing information. Default is False.

device: torch.device

__init__(X_train, X_test, X_validation=None, encoding_dims=(16, 256), storage='sqlite:///autoencoder.db', models_folder='./models/Autoencoder/', random_seed=42, use_gpu=True, verbose=False)

Constructor for the AutoencoderOptimizer class.

Parameters:

X_train (Union[np.ndarray, pd.DataFrame, pd.Series]) – The training data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_test (Union[np.ndarray, pd.DataFrame, pd.Series]) – The testing data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series.
X_validation (Union[None, Union[np.ndarray, pd.DataFrame, pd.Series]], optional) – The validation data to be used in the Autoencoder. It should be a numpy array, pandas DataFrame or pandas Series. Default is None.
encoding_dims (tuple, optional) – The dimensions of the encoding. It should be a tuple of two integers. Default is (16, 256).
storage (str, optional) – The storage string for the study. Default is "sqlite:///autoencoder.db".
models_folder (str, optional) – The folder where the models will be saved. Default is "./models/Autoencoder/".
random_seed (int, optional) – The random seed to be used in the Autoencoder. It should be a positive integer. Default is 42.
use_gpu (bool, optional) – If True, the Autoencoder will use the GPU. Default is True.
verbose (bool, optional) – If True, the Autoencoder will print the training and testing information. Default is False.

Return type:

None

X_train: torch.Tensor

train_loader: torch.utils.data.DataLoader | None

X_test: torch.Tensor

test_loader: torch.utils.data.DataLoader | None

X_validation: torch.Tensor | None

validation_loader: torch.utils.data.DataLoader | None

evaluate_autoencoder(model, criterion, loader=None)

Evaluate the Autoencoder and return its RMSE.

Parameters:

model (nn.Module) – The Autoencoder model to be evaluated.
criterion (nn.Module) – The loss function used for the evaluation.
loader (Union[None, DataLoader], optional) – The DataLoader that provides the evaluation data. Default is None.

Returns:

The RMSE of the Autoencoder.

Return type:

float
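
One way to compute an RMSE over a DataLoader is sketched below. The function rmse_over_loader is hypothetical and does not reproduce the OCDocker implementation; it assumes a criterion that returns the summed squared error (nn.MSELoss with reduction="sum") so that batches of different sizes are weighted correctly.

```python
import math

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def rmse_over_loader(model: nn.Module, criterion: nn.Module, loader: DataLoader) -> float:
    """Hypothetical sketch: root mean squared error across all batches."""
    model.eval()  # disable training-only behaviour such as dropout
    total_squared_error = 0.0
    total_elements = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, targets in loader:
            outputs = model(inputs)
            # Assumption: criterion returns the SUM of squared errors.
            total_squared_error += criterion(outputs, targets).item()
            total_elements += targets.numel()
    return math.sqrt(total_squared_error / total_elements)


# Toy check: an identity "model" whose targets are offset by exactly 1.
data = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
loader = DataLoader(TensorDataset(data, data + 1.0), batch_size=2)
rmse = rmse_over_loader(nn.Identity(), nn.MSELoss(reduction="sum"), loader)
```

Every element is off by exactly 1 in this toy setup, so the resulting RMSE is 1.0 even though the last batch is smaller than the others.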

objective(trial)

Objective function for the Optuna optimization of the Autoencoder.

Parameters: trial (optuna.Trial) – The Optuna trial for which the Autoencoder is built and evaluated.
Returns: The RMSE of the Autoencoder.
Return type: float

optimize(direction='maximize', n_trials=10, study_name='NN_Optimization', load_if_exists=True, sampler=optuna.samplers.TPESampler, n_jobs=1)

Optimize the Autoencoder by running an Optuna study.

Parameters:

direction (str, optional) – The direction of the optimization. Default is "maximize".
n_trials (int, optional) – The number of trials to run. It should be a positive integer. Default is 10.
study_name (str, optional) – The name of the study. Default is "NN_Optimization".
load_if_exists (bool, optional) – If True, the study will be loaded if it exists. Default is True.
sampler (optuna.samplers.BaseSampler, optional) – The sampler to be used in the study. Default is TPESampler().
n_jobs (int, optional) – The number of jobs to use. It should be a positive integer. Default is 1.

Returns:

The Optuna study.

Return type:

optuna.study.Study

set_random_seed()

Set the random seed for the Autoencoder.

Return type: None
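
A common way to make PyTorch experiments repeatable is to seed every random number generator involved. The function seed_everything below is a hypothetical sketch of that pattern; which generators set_random_seed actually seeds is not stated here, so the selection is an assumption.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Hypothetical sketch: seed the Python, NumPy and PyTorch generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        # Seed all visible GPUs as well when CUDA is present.
        torch.cuda.manual_seed_all(seed)


# Re-seeding with the same value reproduces the same random draws.
seed_everything(42)
first = torch.rand(3)
seed_everything(42)
second = torch.rand(3)
```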

train_autoencoder(model, optimizer, criterion, clip_grad, epochs, trial)

Train the Autoencoder.

Parameters:

model (nn.Module) – The Autoencoder model to be trained.
optimizer (optim.Optimizer) – The optimizer used to update the model.
criterion (nn.Module) – The loss function used for training.
clip_grad (float) – The gradient clipping value.
epochs (int) – The number of training epochs. It should be a positive integer.
trial (optuna.Trial) – The Optuna trial associated with this training run.

Returns:

The best validation and training RMSE.

Return type:

tuple