OCDocker.OCScore.Transformer.TransOptimizer module

Module for optimizing the hyperparameters of the Transformer model with Optuna.

It is imported as:

from OCDocker.OCScore.Transformer.TransOptimizer import TransOptimizer

class OCDocker.OCScore.Transformer.TransOptimizer.CustomDataset(*args, **kwargs)

Bases: Dataset

Create a custom dataset for the PyTorch DataLoader.

Parameters:

features (Any) – The features.
target (Any) – The target.

__getitem__(idx)

Get the item at the index.

Parameters: idx (int) – The index.
Returns: The features and the target.
Return type: tuple

__init__(features, target)

Initialize the dataset.

Parameters:

features (list) – The features.
target (list) – The target.

Return type:

None

__len__()

Get the length of the dataset.

Returns: The length of the dataset.
Return type: int
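
The three methods above make up the map-style dataset protocol. The following is a minimal, dependency-free sketch of that protocol, not the library's implementation: the class name PairedDataset and the length check are assumptions made for illustration, and the documented CustomDataset additionally subclasses PyTorch's Dataset.

```python
from typing import Any, Sequence, Tuple


class PairedDataset:
    """Hypothetical stand-in mirroring the documented CustomDataset methods."""

    def __init__(self, features: Sequence[Any], target: Sequence[Any]) -> None:
        # Assumption: features and target are index-aligned, so their
        # lengths must agree. The documented class may not check this.
        if len(features) != len(target):
            raise ValueError("features and target must have the same length")
        self.features = features
        self.target = target

    def __len__(self) -> int:
        # One sample per feature row.
        return len(self.features)

    def __getitem__(self, idx: int) -> Tuple[Any, Any]:
        # Return the (features, target) pair for a single sample.
        return self.features[idx], self.target[idx]


dataset = PairedDataset([[0.1, 0.2], [0.3, 0.4]], [1.0, 2.0])
sample_features, sample_target = dataset[1]
```

A data loader only needs the length and indexed access, which is why these two methods are enough for batching.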

class OCDocker.OCScore.Transformer.TransOptimizer.TransformerModel(*args, **kwargs)

Bases: Module

Transformer-based neural network model with configurable initialization and structure.

Parameters:

input_dim (int) – The input dimension.
d_model (int) – The dimension of the model.
output_dim (int) – The output dimension.
nhead (int) – The number of heads in the multihead attention.
num_encoder_layers (int) – The number of encoder layers.
dim_feedforward (int) – The dimension of the feedforward network model.
dropout (float, optional) – The dropout value (default is 0.1).
init_type (str, optional) – The type of initialization (default is ‘zeros’).
init_params (dict, optional) – The parameters for the initialization function (default is None, treated as an empty dict).
random_seed (int, optional) – The random seed for reproducibility (default is 42).
device (torch.device, optional) – The device to use (default is torch.device(‘cuda’)).
verbose (bool, optional) – If True, print the model summary (default is False).
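
The d_model and nhead parameters are not independent. Assuming the model is built on PyTorch's multi-head attention (this page does not say so explicitly), d_model must be divisible by nhead. The helper below is a hypothetical pre-check, not part of OCDocker, that could be run before constructing the model.

```python
def check_attention_shape(d_model: int, nhead: int) -> int:
    """Return the per-head dimension, or raise if the pair is incompatible.

    Hypothetical helper: it only encodes the divisibility rule of
    multi-head attention and does not touch any model.
    """
    if d_model <= 0 or nhead <= 0:
        raise ValueError("d_model and nhead must be positive")
    if d_model % nhead != 0:
        raise ValueError(
            f"d_model={d_model} is not divisible by nhead={nhead}"
        )
    # Each attention head works on an equal slice of the model dimension.
    return d_model // nhead


head_dim = check_attention_shape(d_model=64, nhead=8)
```

Running a check like this when sampling hyperparameters avoids spending a trial on a combination that cannot be instantiated.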

__init__(input_dim, d_model, output_dim, nhead, num_encoder_layers, dim_feedforward, dropout=0.1, init_type='zeros', init_params=None, random_seed=42, device=torch.device, verbose=False)

Constructor for the TransformerModel class.

Parameters:

input_dim (int) – The input dimension.
d_model (int) – The dimension of the model.
output_dim (int) – The output dimension.
nhead (int) – The number of heads in the multihead attention.
num_encoder_layers (int) – The number of encoder layers.
dim_feedforward (int) – The dimension of the feedforward network model.
dropout (float, optional) – The dropout value (default is 0.1).
init_type (str, optional) – The type of initialization (default is ‘zeros’).
init_params (dict, optional) – The parameters for the initialization function (default is None, treated as an empty dict).
random_seed (int, optional) – The random seed for reproducibility (default is 42).
device (torch.device, optional) – The device to use (default is torch.device(‘cuda’)).
verbose (bool, optional) – If True, print the model summary (default is False).

Return type:

None

forward(src)

Forward pass through the model.

Parameters: src (torch.Tensor) – The input tensor.
Return type: torch.Tensor

initialize_weights()

Initialize the weights of the model.

Return type: None

set_random_seed()

Set the random seed for reproducibility.

Return type: torch.Generator

class OCDocker.OCScore.Transformer.TransOptimizer.Transformer(*args, **kwargs)

Bases: Module

Transformer-based neural network model with configurable initialization and structure.

Parameters:

input_size (int) – The input dimension.
output_size (int) – The output dimension.
trans_params (dict) – The parameters for the transformer model.
random_seed (int, optional) – The random seed for reproducibility (default is 42).
use_gpu (bool, optional) – If True, use GPU (default is True).
verbose (bool, optional) – If True, print the model summary (default is False).

__init__(input_size, output_size, trans_params, random_seed=42, use_gpu=True, verbose=False)

Constructor for the Transformer class.

Parameters:

input_size (int) – The input dimension.
output_size (int) – The output dimension.
trans_params (dict) – The parameters for the transformer model.
random_seed (int, optional) – The random seed for reproducibility (default is 42).
use_gpu (bool, optional) – If True, use GPU (default is True).
verbose (bool, optional) – If True, print the model summary (default is False).

Return type:

None

get_model()

Get the model.

Returns: The model.
Return type: nn.Module

set_random_seed()

Set the random seed for reproducibility.

Return type: None

train_model(X_train, y_train, X_test, y_test, X_validation=None, y_validation=None, criterion=torch.nn.MSELoss)

Train the model.

Parameters:

X_train (Union[np.ndarray, pd.DataFrame, list]) – The training features.
y_train (Union[np.ndarray, pd.DataFrame, list]) – The training labels.
X_test (Union[np.ndarray, pd.DataFrame, list]) – The test features.
y_test (Union[np.ndarray, pd.DataFrame, list]) – The test labels.
X_validation (Union[np.ndarray, pd.DataFrame, list, None], optional) – The validation features (default is None).
y_validation (Union[np.ndarray, pd.DataFrame, list, None], optional) – The validation labels (default is None).
criterion (nn.Module, optional) – The loss function (default is nn.MSELoss()).

Returns:

True if the model was trained successfully, False otherwise.

Return type:

bool

class OCDocker.OCScore.Transformer.TransOptimizer.TransOptimizer(X_train, y_train, X_test, y_test, X_validation=None, y_validation=None, storage='sqlite:///Transoptimization.db', output_size=1, random_seed=42, use_gpu=True, verbose=False)

Bases: object

Class to optimize the Transformer model using Optuna.

Parameters:

X_train (Union[np.ndarray, pd.DataFrame, list]) – The training features.
y_train (Union[np.ndarray, pd.DataFrame, list]) – The training labels.
X_test (Union[np.ndarray, pd.DataFrame, list]) – The test features.
y_test (Union[np.ndarray, pd.DataFrame, list]) – The test labels.
X_validation (Union[np.ndarray, pd.DataFrame, list, None], optional) – The validation features (default is None).
y_validation (Union[np.ndarray, pd.DataFrame, list, None], optional) – The validation labels (default is None).
storage (str, optional) – The storage for the Optuna study (default is ‘sqlite:///Transoptimization.db’).
output_size (int, optional) – The output size (default is 1).
random_seed (int, optional) – The random seed for reproducibility (default is 42).
use_gpu (bool, optional) – If True, use GPU (default is True).
verbose (bool, optional) – If True, print the model summary (default is False).
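
The default storage value is a database URL. In the SQLAlchemy URL convention that Optuna's RDB storage follows, three slashes after sqlite: denote a path relative to the working directory and four denote an absolute path. The sketch below uses only the standard library to show which file a given URL points to; sqlite_path_from_url is a hypothetical helper and is not part of OCDocker or Optuna.

```python
from urllib.parse import urlsplit


def sqlite_path_from_url(storage: str) -> str:
    """Extract the database file path from a sqlite storage URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(storage)
    if parts.scheme != "sqlite":
        raise ValueError(f"not a sqlite URL: {storage!r}")
    # The URL path always carries one leading slash as a separator;
    # whatever follows it is the file path itself.
    return parts.path[1:]


relative_db = sqlite_path_from_url("sqlite:///Transoptimization.db")
absolute_db = sqlite_path_from_url("sqlite:////tmp/study.db")
```

With the default value, the study database is therefore created in the directory the process is started from.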

__init__(X_train, y_train, X_test, y_test, X_validation=None, y_validation=None, storage='sqlite:///Transoptimization.db', output_size=1, random_seed=42, use_gpu=True, verbose=False)

Constructor for the TransOptimizer class.

Parameters:

X_train (Union[np.ndarray, pd.DataFrame, list]) – The training features.
y_train (Union[np.ndarray, pd.DataFrame, list]) – The training labels.
X_test (Union[np.ndarray, pd.DataFrame, list]) – The test features.
y_test (Union[np.ndarray, pd.DataFrame, list]) – The test labels.
X_validation (Union[np.ndarray, pd.DataFrame, list, None], optional) – The validation features (default is None).
y_validation (Union[np.ndarray, pd.DataFrame, list, None], optional) – The validation labels (default is None).
storage (str, optional) – The storage for the Optuna study (default is ‘sqlite:///Transoptimization.db’).
output_size (int, optional) – The output size (default is 1).
random_seed (int, optional) – The random seed for reproducibility (default is 42).
use_gpu (bool, optional) – If True, use GPU (default is True).
verbose (bool, optional) – If True, print the model summary (default is False).

Return type:

None

objective(trial)

Objective function for the Optuna study.

Parameters: trial (optuna.Trial) – The Optuna trial object.
Returns: The RMSE of the model on the test set.
Return type: float
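
For reference, the root-mean-square error is the square root of the mean squared difference between targets and predictions. The function below is a plain-Python sketch of that formula; the name rmse and the input validation are assumptions, and the library presumably computes the value on tensors.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between two equally long sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared residuals, then the square root.
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Only the last prediction is off (by 2), so the RMSE is sqrt(4 / 3).
score = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Lower values indicate a better fit, and a perfect prediction gives 0.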

optimize(direction='maximize', n_trials=10, study_name='NN_Optimization', load_if_exists=True, sampler=optuna.samplers.TPESampler, n_jobs=1)

Optimize the model using Optuna.

Parameters:

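direction (str, optional) – The direction of the optimization (default is “maximize”). Because objective returns an RMSE, for which lower is better, “minimize” is normally the appropriate value.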
n_trials (int, optional) – The number of trials to run (default is 10).
study_name (str, optional) – The name of the study (default is “NN_Optimization”).
load_if_exists (bool, optional) – If True, load the study if it exists (default is True).
sampler (optuna.samplers.BaseSampler, optional) – The sampler to use (default is TPESampler()).
n_jobs (int, optional) – The number of jobs to run in parallel (default is 1).

Returns:

The best hyperparameters found by Optuna.

Return type:

dict
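
To make the role of direction and n_trials concrete, here is a small stand-in for a hyperparameter search. It is plain random search written with the standard library, not Optuna and not the TPE sampler, and every name in it (random_search, toy_error, the dropout range) is hypothetical.

```python
import random
from typing import Callable, Dict, Tuple


def random_search(
    objective: Callable[[Dict[str, float]], float],
    direction: str = "minimize",
    n_trials: int = 10,
    seed: int = 42,
) -> Tuple[Dict[str, float], float]:
    """Sample n_trials parameter sets and keep the best one."""
    if direction not in ("minimize", "maximize"):
        raise ValueError("direction must be 'minimize' or 'maximize'")
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    rng = random.Random(seed)
    best_params: Dict[str, float] = {}
    best_value = float("inf") if direction == "minimize" else float("-inf")
    for _ in range(n_trials):
        # Assumed search space: a single dropout rate in [0, 0.5].
        params = {"dropout": rng.uniform(0.0, 0.5)}
        value = objective(params)
        if direction == "minimize":
            improved = value < best_value
        else:
            improved = value > best_value
        if improved:
            best_params, best_value = params, value
    return best_params, best_value


def toy_error(params: Dict[str, float]) -> float:
    # Toy error surface with its minimum at dropout = 0.1.
    return (params["dropout"] - 0.1) ** 2


best_min, err_min = random_search(toy_error, direction="minimize")
best_max, err_max = random_search(toy_error, direction="maximize")
```

Both calls see the same samples because they share a seed, so the only difference is which trial is kept: with an error-like objective, "maximize" returns the worst configuration.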

set_random_seed()

Set the random seed for reproducibility.

Return type: None

train_test_model(model, train_loader, test_loader, optimizer, criterion, clip_grad, trial, batch_size, epochs=100)

Train and test the model.

Parameters:

model (nn.Module) – The model to train and test.
train_loader (DataLoader) – The training data loader.
test_loader (DataLoader) – The test data loader.
optimizer (optim.Optimizer) – The optimizer to use.
criterion (nn.Module) – The loss function to use.
clip_grad (float) – The gradient clipping value.
trial (optuna.Trial) – The Optuna trial object.
batch_size (int) – The batch size to use.
epochs (int, optional) – The number of epochs to train for (default is 100).

Returns:

The RMSE of the model on the test set.

Return type:

float
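
The clip_grad parameter bounds the size of the gradient update. This page does not say whether the bound applies to the gradient norm or to each element, so the sketch below assumes norm-based clipping and illustrates it on a flat list of floats. clip_by_global_norm is a hypothetical name and is not the library's code.

```python
import math
from typing import List


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale grads so that their L2 norm does not exceed max_norm.

    Assumption: norm-based clipping. Element-wise clipping would instead
    clamp each entry to the interval [-max_norm, max_norm].
    """
    if max_norm <= 0:
        raise ValueError("max_norm must be positive")
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the bound: return an unchanged copy.
        return list(grads)
    # Shrink every component by the same factor to preserve direction.
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# A gradient of norm 5 is rescaled to norm 1, keeping its direction.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for g in clipped))
```

Clipping of this kind keeps a single large batch gradient from destabilizing training, which matters when a search explores aggressive learning rates.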