OCDocker.OCScore.Dimensionality.future.AETrainer module

Trainer for the future Autoencoder pipeline (denoising + multi-task).

class OCDocker.OCScore.Dimensionality.future.AETrainer.EarlyStopping(patience=20, min_delta=0.0)

Bases: object

Simple early stopping helper.

Parameters:

patience (int, optional) – Number of epochs without improvement to wait, by default 20.
min_delta (float, optional) – Minimum improvement to reset patience, by default 0.0.

__init__(patience=20, min_delta=0.0)

Initialize early stopping state.

Parameters:

patience (int, optional) – Number of epochs without improvement to wait, by default 20.
min_delta (float, optional) – Minimum improvement to reset patience, by default 0.0.

Return type:

None

step(value)

Update early stopping state.

Parameters:

value (float) – Current monitored value.

Returns:

True if training should stop.

Return type:

bool
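
The patience logic described above can be sketched as follows. This is a minimal illustration, not the OCDocker implementation: the class name `EarlyStoppingSketch` and the helper `epochs_until_stop` are hypothetical, and the sketch assumes the monitored value is a loss (lower is better), which the documentation does not state.

```python
class EarlyStoppingSketch:
    """Illustrative stand-in for an early stopping helper (hypothetical)."""

    def __init__(self, patience: int = 20, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")  # assumption: lower monitored value is better
        self.bad_epochs = 0

    def step(self, value: float) -> bool:
        # An epoch counts as an improvement only if it beats the best
        # value seen so far by more than min_delta.
        if value < self.best - self.min_delta:
            self.best = value
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Stop once `patience` consecutive epochs brought no improvement.
        return self.bad_epochs >= self.patience


def epochs_until_stop(losses, patience, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None."""
    stopper = EarlyStoppingSketch(patience, min_delta)
    for epoch, loss in enumerate(losses, start=1):
        if stopper.step(loss):
            return epoch
    return None
```

In a training loop, `step` would be called once per epoch with the validation loss, and the loop would break as soon as it returns True.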

class OCDocker.OCScore.Dimensionality.future.AETrainer.AETrainer(model, config, device, verbose=False, models_folder=None, run_name='autoencoder_future')

Bases: object

Trainer for the future Autoencoder.

Notes

This trainer supports two stages with separate configs:

- stage1: denoising reconstruction plus optional energy supervision (enabled by default).
- stage2: optional fine-tuning stage with different weights/noise settings.

Data Flow

Each batch provides (features, energies, energy_mask).
energy_mask marks which samples have valid energy labels; energy loss is only computed on those samples.
Unlabeled samples can be added via X_unlabeled; they are used only for reconstruction, so their energy_mask entries are False.
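
The role of energy_mask can be sketched with NumPy. This is an illustration under stated assumptions rather than the trainer's code: the helper names `build_energy_mask` and `masked_energy_mse` are hypothetical, mean squared error is assumed as the energy loss, and unlabeled rows are given placeholder targets of zero.

```python
import numpy as np


def build_energy_mask(n_labeled: int, n_unlabeled: int) -> np.ndarray:
    """True for samples with an energy label, False for unlabeled ones."""
    return np.concatenate(
        [np.ones(n_labeled, dtype=bool), np.zeros(n_unlabeled, dtype=bool)]
    )


def masked_energy_mse(pred: np.ndarray, target: np.ndarray, mask: np.ndarray) -> float:
    """Mean squared error over the samples where mask is True (assumed loss)."""
    if not mask.any():
        return 0.0  # no labeled samples: the energy term contributes nothing
    diff = pred[mask] - target[mask]
    return float(np.mean(diff ** 2))


# Three labeled samples followed by two unlabeled ones.
energy_mask = build_energy_mask(3, 2)
pred = np.array([1.0, 2.0, 3.0, 10.0, 20.0])
target = np.array([1.0, 2.0, 5.0, 0.0, 0.0])  # placeholder targets for unlabeled rows
loss = masked_energy_mse(pred, target, energy_mask)
```

Because the last two rows are masked out, their large prediction errors do not affect `loss`; only the three labeled rows are averaged.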

Example

>>> trainer = AETrainer(model, config, device)
>>> metrics = trainer.fit(X_train, X_val, y_train, y_val)

__init__(model, config, device, verbose=False, models_folder=None, run_name='autoencoder_future')

Initialize the trainer.

Parameters:

model (Autoencoder) – Autoencoder model.
config (dict) – Training configuration.
device (torch.device) – Execution device.
verbose (bool, optional) – Verbose mode, by default False.
models_folder (str | None, optional) – Folder to save checkpoints, by default None.
run_name (str, optional) – Checkpoint base name, by default “autoencoder_future”.

Return type:

None

evaluate(X, y=None, feature_mask=None, stage='stage1')

Evaluate reconstruction/energy metrics on a dataset.

Parameters:

X (np.ndarray) – Feature matrix.
y (np.ndarray | None, optional) – Energy targets, by default None.
feature_mask (np.ndarray | None, optional) – Feature mask to apply, by default None.
stage (str, optional) – Stage configuration name, by default “stage1”.

Returns:

Dictionary of reconstruction/energy metrics.

Return type:

Dict[str, float]

fit(X_train, X_val=None, y_train=None, y_val=None, feature_mask=None, X_unlabeled=None)

Train the autoencoder on the provided data.

Parameters:

X_train (np.ndarray) – Training feature matrix.
X_val (np.ndarray | None, optional) – Validation feature matrix, by default None.
y_train (np.ndarray | None, optional) – Training energy targets, by default None.
y_val (np.ndarray | None, optional) – Validation energy targets, by default None.
feature_mask (np.ndarray | None, optional) – Feature mask to apply, by default None.
X_unlabeled (np.ndarray | None, optional) – Additional unlabeled data for reconstruction, by default None.

Returns:

Training metrics and embedding statistics.

Return type:

Dict[str, object]

Notes

Stage semantics are defined by the configuration:

- stage1 focuses on denoising reconstruction and (if available) energy supervision.
- stage2 is optional and can reweight losses or change noise to refine the latent space.

If y_train/y_val are None, the energy head is ignored and only reconstruction loss is optimized (energy_mask is all False).
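
One way to picture the stage semantics is a per-stage config that sets the input noise level and the loss weights. The sketch below is hypothetical throughout: the key names (`noise_std`, `recon_weight`, `energy_weight`), the values, and the choice of Gaussian noise are assumptions made for illustration and are not taken from the OCDocker configuration schema.

```python
import numpy as np

# Hypothetical stage configs; key names and values are illustrative only.
STAGES = {
    "stage1": {"noise_std": 0.1, "recon_weight": 1.0, "energy_weight": 1.0},
    "stage2": {"noise_std": 0.02, "recon_weight": 0.5, "energy_weight": 2.0},
}


def corrupt(x, noise_std, rng):
    """Add Gaussian noise to the inputs; the clean x stays the reconstruction target."""
    return x + rng.normal(0.0, noise_std, size=x.shape)


def total_loss(recon_loss, energy_loss, stage_cfg, has_labels):
    """Weighted sum of both terms; the energy term is dropped without labels."""
    loss = stage_cfg["recon_weight"] * recon_loss
    if has_labels:
        loss += stage_cfg["energy_weight"] * energy_loss
    return loss
```

With these illustrative values, stage2 applies less noise and shifts weight toward the energy term, which is one plausible reading of "reweight losses or change noise"; when no energy targets are supplied, only the reconstruction term remains.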
