OCDocker.OCScore.Utils.Workers module

Parallel worker utilities for Optuna-based model optimization.

Usage:

import OCDocker.OCScore.Utils.Workers as ocscoreworkers

OCDocker.OCScore.Utils.Workers.AEworker(pid, id, X_train, X_test, X_val, encoding_dims, storage, models_folder, random_seed=42, use_gpu=True, verbose=False, direction='minimize', n_trials=250, load_if_exists=True, n_jobs=1, study_name='Autoencoder_Optimization')[source]

Autoencoder optimization worker function.

Runs the optimization of an autoencoder model. It is intended to be executed in a separate process so that several optimizations can run in parallel.

Parameters:
  • pid (int) – Process ID.

  • id (int) – Instance ID.

  • X_train (np.ndarray) – Training data.

  • X_test (np.ndarray) – Testing data.

  • X_val (np.ndarray) – Validation data.

  • encoding_dims (tuple) – Tuple with the encoding dimensions.

  • storage (str) – Storage string.

  • models_folder (str) – Folder to save the models.

  • random_seed (int, optional) – Random seed. The default is 42.

  • use_gpu (bool, optional) – Use GPU. The default is True.

  • verbose (bool, optional) – Verbose. The default is False.

  • direction (str, optional) – Optimization direction. The default is “minimize”.

  • n_trials (int, optional) – Number of trials. The default is 250.

  • load_if_exists (bool, optional) – Load if exists. The default is True.

  • n_jobs (int, optional) – Number of jobs. The default is 1.

  • study_name (str, optional) – Study name. The default is “Autoencoder_Optimization”.

Returns:

study – Study object.

Return type:

optuna.study.Study
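A minimal sketch of preparing one argument set per worker process. It assumes a stand-in function, ae_worker_stub, that only mirrors the parameter names of the AEworker signature above and performs no optimization. build_worker_kwargs is a hypothetical helper, and the storage string, folder name, and data values are placeholders. Binding the arguments against the signature catches a missing or misspelled parameter before any process is started.

```python
import inspect


def ae_worker_stub(pid, id, X_train, X_test, X_val, encoding_dims, storage,
                   models_folder, random_seed=42, use_gpu=True, verbose=False,
                   direction="minimize", n_trials=250, load_if_exists=True,
                   n_jobs=1, study_name="Autoencoder_Optimization"):
    """Stand-in with the documented parameter names; does no optimization."""
    return (pid, id, study_name)


def build_worker_kwargs(n_workers, shared):
    """Return one kwargs dict per worker, each with its own pid and id.

    Assumption: pid and id are simple counters chosen by the caller.
    """
    signature = inspect.signature(ae_worker_stub)
    all_kwargs = []
    for index in range(n_workers):
        kwargs = dict(shared, pid=index, id=index)
        # Raises TypeError on a missing or misspelled argument.
        signature.bind(**kwargs)
        all_kwargs.append(kwargs)
    return all_kwargs


# Placeholder values; real calls would pass np.ndarray data.
shared_args = {
    "X_train": [[0.0, 1.0]],
    "X_test": [[1.0, 0.0]],
    "X_val": [[0.5, 0.5]],
    "encoding_dims": (16, 8),
    "storage": "sqlite:///AE.db",
    "models_folder": "models",
    "n_trials": 50,
}
worker_kwargs = build_worker_kwargs(3, shared_args)
```

Each dict in worker_kwargs can then be unpacked into the worker call with `**`.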

OCDocker.OCScore.Utils.Workers.GAWorker(pid, id, X_train, y_train, X_test, y_test, X_validation=None, y_validation=None, storage='sqlite:///GA.db', best_params=None, n_trials=100, study_name='GA_Feature_Selection', random_state=42, use_gpu=True, verbose=False, n_jobs=1)[source]

Feature selection worker function using Genetic Algorithms.

Runs the optimization of a feature selection model. It is intended to be executed in a separate process so that several optimizations can run in parallel.

Parameters:
  • pid (int) – Process ID.

  • id (int) – Instance ID.

  • X_train (np.ndarray) – Training data.

  • y_train (np.ndarray) – Training labels.

  • X_test (np.ndarray) – Testing data.

  • y_test (np.ndarray) – Testing labels.

  • X_validation (Union[np.ndarray, None], optional) – Validation data. The default is None.

  • y_validation (Union[np.ndarray, None], optional) – Validation labels. The default is None.

  • storage (str, optional) – Storage string. The default is “sqlite:///GA.db”.

  • best_params (dict, optional) – Best parameters. The default is None (treated as an empty dict).

  • algorithm (str, optional) – Algorithm. The default is “ga”. This parameter does not appear in the signature shown above.

  • n_trials (int, optional) – Number of trials. The default is 100.

  • study_name (str, optional) – Study name. The default is “GA_Feature_Selection”.

  • random_state (int, optional) – Random state. The default is 42.

  • use_gpu (bool, optional) – Use GPU. The default is True.

  • verbose (bool, optional) – Verbose. The default is False.

  • n_jobs (int, optional) – Number of jobs. The default is 1.

Returns:

  • study (optuna.study.Study) – Study object.

  • best_features (list) – Best features.

  • best_score (float) – Best score.

Return type:

tuple[optuna.study.Study, dict, float]
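The default storage value, “sqlite:///GA.db”, has the form of an SQLAlchemy-style database URL. A minimal sketch of building such a string from a file path follows, assuming the storage argument is handed on as a database URL (the default suggests this, but the source does not confirm it). sqlite_storage_url is a hypothetical helper, not part of OCDocker.

```python
from pathlib import PurePosixPath


def sqlite_storage_url(db_path):
    """Build an SQLAlchemy-style SQLite URL from a POSIX-style path.

    Relative paths give three slashes (sqlite:///GA.db); absolute paths
    give four (sqlite:////data/studies/GA.db).
    """
    path = PurePosixPath(db_path)
    return "sqlite:///" + str(path)


relative_url = sqlite_storage_url("GA.db")
absolute_url = sqlite_storage_url("/data/studies/GA.db")
```

An absolute path avoids surprises when worker processes start in different working directories.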

OCDocker.OCScore.Utils.Workers.NNAblationworker(pid, id, X_train, y_train, X_test, y_test, X_val, y_val, mask, storage, network_params, encoder_params=None, output_size=1, random_seed=42, use_gpu=True, verbose=False, load_if_exists=True, n_jobs=1, study_name='NN_Ablation_Optimization')[source]

Neural network ablation worker function.

Runs the ablation optimization of a neural network model using the given feature mask. It is intended to be executed in a separate process so that several optimizations can run in parallel.

Parameters:
  • pid (int) – Process ID.

  • id (int) – Instance ID.

  • X_train (np.ndarray) – Training data.

  • y_train (np.ndarray) – Training labels.

  • X_test (np.ndarray) – Testing data.

  • y_test (np.ndarray) – Testing labels.

  • X_val (np.ndarray) – Validation data.

  • y_val (np.ndarray) – Validation labels.

  • mask (list[Union[int, bool]]) – Mask list.

  • storage (str) – Storage string.

  • network_params (dict[str, Any]) – Network parameters.

  • encoder_params (Union[dict, None], optional) – Encoder parameters. The default is None.

  • output_size (int, optional) – Output size. The default is 1.

  • random_seed (int, optional) – Random seed. The default is 42.

  • use_gpu (bool, optional) – Use GPU. The default is True.

  • verbose (bool, optional) – Verbose. The default is False.

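  • load_if_exists (bool, optional) – Load if exists. The default is True.

  • n_jobs (int, optional) – Number of jobs. The default is 1.

  • study_name (str, optional) – Study name. The default is “NN_Ablation_Optimization”.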

Return type:

None
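The mask parameter is documented only as a list of int or bool values. A minimal sketch of one way to build and apply such masks follows, assuming a truthy entry means "keep this feature column" and that masked columns are dropped. Both points are assumptions, and leave_one_out_masks and apply_mask are hypothetical helpers rather than OCDocker functions.

```python
def leave_one_out_masks(n_features):
    """Yield one mask per feature, each disabling exactly that feature."""
    for dropped in range(n_features):
        yield [0 if index == dropped else 1 for index in range(n_features)]


def apply_mask(rows, mask):
    """Keep only the columns whose mask entry is truthy (assumed meaning)."""
    return [[value for value, keep in zip(row, mask) if keep] for row in rows]


rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
masks = list(leave_one_out_masks(3))
# Drop the second feature column from every row.
reduced = apply_mask(rows, masks[1])
```

One worker call per generated mask would then cover a full leave-one-out ablation.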

OCDocker.OCScore.Utils.Workers.NNSeedAblationworker(pid, id, X_train, y_train, X_test, y_test, X_val, y_val, mask, storage, network_params, random_seeds, encoder_params=None, output_size=1, use_gpu=True, verbose=False, load_if_exists=True, n_jobs=1, study_name='NN_Seed_Ablation_Optimization')[source]

Neural network seed ablation worker function.

Runs the ablation optimization of a neural network model over the given random seeds. It is intended to be executed in a separate process so that several optimizations can run in parallel.

Parameters:
  • pid (int) – Process ID.

  • id (int) – Instance ID.

  • X_train (np.ndarray) – Training data.

  • y_train (np.ndarray) – Training labels.

  • X_test (np.ndarray) – Testing data.

  • y_test (np.ndarray) – Testing labels.

  • X_val (np.ndarray) – Validation data.

  • y_val (np.ndarray) – Validation labels.

  • mask (list[Union[int, bool]]) – Mask list.

  • storage (str) – Storage string.

  • network_params (dict[str, Any]) – Network parameters.

  • random_seeds (list[int] | int) – Random seed list to ablate.

  • encoder_params (Union[dict, None], optional) – Encoder parameters. The default is None.

  • output_size (int, optional) – Output size. The default is 1.

  • use_gpu (bool, optional) – Use GPU. The default is True.

  • verbose (bool, optional) – Verbose. The default is False.

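  • load_if_exists (bool, optional) – Load if exists. The default is True.

  • n_jobs (int, optional) – Number of jobs. The default is 1.

  • study_name (str, optional) – Study name. The default is “NN_Seed_Ablation_Optimization”.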

Return type:

None
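random_seeds accepts either a list of ints or a single int. A minimal sketch of normalizing the value before the call follows, assuming a bare int means one seed rather than a number of seeds (the source does not say which). normalize_seeds is a hypothetical helper.

```python
def normalize_seeds(random_seeds):
    """Return random_seeds as a list of ints.

    Assumption: a bare int means a single seed, not a seed count.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(random_seeds, bool):
        raise TypeError("random_seeds must be an int or a list of ints")
    if isinstance(random_seeds, int):
        return [random_seeds]
    seeds = list(random_seeds)
    for seed in seeds:
        if isinstance(seed, bool) or not isinstance(seed, int):
            raise TypeError("every seed must be an int")
    return seeds


single = normalize_seeds(42)
several = normalize_seeds([0, 1, 2])
```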

OCDocker.OCScore.Utils.Workers.NNworker(pid, id, X_train, y_train, X_test, y_test, X_val, y_val, storage, encoder_params=None, output_size=1, random_seed=42, use_gpu=True, verbose=False, direction='minimize', n_trials=250, load_if_exists=True, n_jobs=1, study_name='NN_Optimization')[source]

Neural network optimization worker function.

Runs the optimization of a neural network model. It is intended to be executed in a separate process so that several optimizations can run in parallel.

Parameters:
  • pid (int) – Process ID.

  • id (int) – Instance ID.

  • X_train (np.ndarray) – Training data.

  • y_train (np.ndarray) – Training labels.

  • X_test (np.ndarray) – Testing data.

  • y_test (np.ndarray) – Testing labels.

  • X_val (np.ndarray) – Validation data.

  • y_val (np.ndarray) – Validation labels.

  • storage (str) – Storage string.

  • encoder_params (Union[dict, None], optional) – Encoder parameters. The default is None.

  • output_size (int, optional) – Output size. The default is 1.

  • random_seed (int, optional) – Random seed. The default is 42.

  • use_gpu (bool, optional) – Use GPU. The default is True.

  • verbose (bool, optional) – Verbose. The default is False.

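  • direction (str, optional) – Optimization direction. The default is “minimize”.

  • n_trials (int, optional) – Number of trials. The default is 250.

  • load_if_exists (bool, optional) – Load if exists. The default is True.

  • n_jobs (int, optional) – Number of jobs. The default is 1.

  • study_name (str, optional) – Study name. The default is “NN_Optimization”.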

Return type:

None
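A minimal sketch of dispatching several worker calls at once. nn_worker_stub is a stand-in with a reduced, hypothetical parameter list, not the real NNworker, and run_workers is a hypothetical helper. The sketch defaults to ThreadPoolExecutor so that it runs anywhere; passing concurrent.futures.ProcessPoolExecutor instead gives the separate processes described above, provided the worker is a picklable module-level function.

```python
from concurrent.futures import ThreadPoolExecutor


def nn_worker_stub(pid, id, storage, study_name="NN_Optimization", n_trials=250):
    """Stand-in for a worker; only reports what it was asked to do."""
    return f"worker {pid} -> {study_name} at {storage} ({n_trials} trials)"


def run_workers(worker, kwargs_list, executor_cls=ThreadPoolExecutor):
    """Submit one call per kwargs dict and return results in input order."""
    with executor_cls(max_workers=max(1, len(kwargs_list))) as executor:
        futures = [executor.submit(worker, **kwargs) for kwargs in kwargs_list]
        # result() re-raises any exception raised inside a worker.
        return [future.result() for future in futures]


# Placeholder storage string; both workers share it and the default study name.
jobs = [
    {"pid": index, "id": index, "storage": "sqlite:///NN.db", "n_trials": 10}
    for index in range(2)
]
reports = run_workers(nn_worker_stub, jobs)
```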

OCDocker.OCScore.Utils.Workers.Transworker(pid, id, X_train, y_train, X_test, y_test, X_val, y_val, storage, output_size=1, random_seed=42, use_gpu=True, verbose=False, direction='minimize', n_trials=250, load_if_exists=True, n_jobs=1, study_name='Trans_Optimization')[source]

Transformer optimization worker function.

Runs the optimization of a transformer model. It is intended to be executed in a separate process so that several optimizations can run in parallel.

Parameters:
  • pid (int) – Process ID.

  • id (int) – Instance ID.

  • X_train (np.ndarray) – Training data.

  • y_train (np.ndarray) – Training labels.

  • X_test (np.ndarray) – Testing data.

  • y_test (np.ndarray) – Testing labels.

  • X_val (np.ndarray) – Validation data.

  • y_val (np.ndarray) – Validation labels.

  • storage (str) – Storage string.

  • output_size (int, optional) – Output size. The default is 1.

  • random_seed (int, optional) – Random seed. The default is 42.

  • use_gpu (bool, optional) – Use GPU. The default is True.

  • verbose (bool, optional) – Verbose. The default is False.

  • direction (str, optional) – Optimization direction. The default is “minimize”.

  • n_trials (int, optional) – Number of trials. The default is 250.

  • load_if_exists (bool, optional) – Load if exists. The default is True.

  • n_jobs (int, optional) – Number of jobs. The default is 1.

  • study_name (str, optional) – Study name. The default is “Trans_Optimization”.

Return type:

None
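When several workers contribute to one study, the total trial budget has to be divided among them. A minimal sketch follows, assuming n_trials is a per-call budget (the source does not state whether it is per worker or per study). split_trials is a hypothetical helper.

```python
def split_trials(total_trials, n_workers):
    """Split total_trials into n_workers near-equal, non-negative shares."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    base, remainder = divmod(total_trials, n_workers)
    # The first `remainder` workers take one extra trial each.
    return [base + 1 if index < remainder else base for index in range(n_workers)]


shares = split_trials(250, 4)
```

Each share would be passed as n_trials to one worker call.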

OCDocker.OCScore.Utils.Workers.XGBworker(pid, id, X_train, X_test, X_val, y_train, y_test, y_val, storage, random_seed=42, use_gpu=True, verbose=False, n_trials=250, load_if_exists=True, n_jobs=10, study_name='XGB_Optimization', early_stopping_rounds=50, params=None)[source]

XGBoost optimization worker function.

Runs the optimization of an XGBoost model. It is intended to be executed in a separate process so that several optimizations can run in parallel.

Parameters:
  • pid (int) – Process ID.

  • id (int) – Instance ID.

  • X_train (np.ndarray) – Training data.

  • X_test (np.ndarray) – Testing data.

  • X_val (np.ndarray) – Validation data.

  • y_train (np.ndarray) – Training labels.

  • y_test (np.ndarray) – Testing labels.

  • y_val (np.ndarray) – Validation labels.

  • storage (str) – Storage string.

  • random_seed (int, optional) – Random seed. The default is 42.

  • use_gpu (bool, optional) – Use GPU. The default is True.

  • verbose (bool, optional) – Verbose. The default is False.

  • n_trials (int, optional) – Number of trials. The default is 250.

  • load_if_exists (bool, optional) – Load if exists. The default is True.

  • n_jobs (int, optional) – Number of jobs. The default is 10.

  • study_name (str, optional) – Study name. The default is “XGB_Optimization”.

  • early_stopping_rounds (int, optional) – Early stopping rounds. The default is 50.

  • params (dict, optional) – Parameters. The default is None (treated as an empty dict).

Returns:

study_pre – Study object.

Return type:

optuna.study.Study
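The params argument defaults to None and is treated as an empty dict. A minimal sketch of that idiom follows. resolve_params is a hypothetical helper and does not reproduce the XGBworker source; the keys max_depth and eta are illustrative only, since the accepted keys are not documented here.

```python
def resolve_params(params=None, **overrides):
    """Return a fresh dict; None is treated as an empty dict."""
    # Copy so that the caller's dict is never mutated.
    resolved = {} if params is None else dict(params)
    resolved.update(overrides)
    return resolved


defaults = resolve_params()
caller_params = {"max_depth": 6}
tuned = resolve_params(caller_params, eta=0.1)
```

Using None as the default, rather than a literal `{}`, avoids sharing one mutable dict across calls.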