OCDocker.OCScore.Dimensionality.future.AETrainer module
Trainer for the future Autoencoder pipeline (denoising + multi-task).
- class OCDocker.OCScore.Dimensionality.future.AETrainer.EarlyStopping(patience=20, min_delta=0.0)
Bases: object
Simple early stopping helper.
- Parameters:
patience (int, optional) – Number of epochs without improvement to wait, by default 20.
min_delta (float, optional) – Minimum improvement to reset patience, by default 0.0.
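The roles of patience and min_delta can be illustrated with a minimal sketch. This is an illustrative stand-in, not OCDocker's EarlyStopping implementation: the class name EarlyStoppingSketch, the step method, and the assumption that a lower validation loss is better are all hypothetical, since the reference above documents only the constructor arguments.

```python
class EarlyStoppingSketch:
    """Illustrative stand-in; NOT the OCDocker EarlyStopping class."""

    def __init__(self, patience=20, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum drop that counts as improvement
        self.best = float("inf")      # assumes lower validation loss is better
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop.

        `step` is a hypothetical method name used only for this sketch.
        """
        if val_loss < self.best - self.min_delta:
            # Improvement large enough: remember it and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStoppingSketch(patience=2, min_delta=0.01)
losses = [1.00, 0.90, 0.895, 0.899, 0.80]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In this run the third and fourth losses improve on the best value by less than min_delta, so they count against patience and the loop stops before the final, better loss is seen. A larger patience trades longer training for tolerance of such plateaus.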
- class OCDocker.OCScore.Dimensionality.future.AETrainer.AETrainer(model, config, device, verbose=False, models_folder=None, run_name='autoencoder_future')
Bases: object
Trainer for the future Autoencoder.
Notes
This trainer supports two stages with separate configs:
- stage1: denoising reconstruction plus optional energy supervision (enabled by default).
- stage2: optional fine-tuning stage with different weights/noise settings.
Data Flow
Each batch provides (features, energies, energy_mask).
energy_mask marks which samples have valid energy labels; energy loss is only computed on those samples.
Unlabeled samples can be added via X_unlabeled; they are used only for reconstruction, so their energy_mask entries are False.
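The masking rule described above can be sketched in plain Python. The helper names build_energy_mask and masked_energy_mse are hypothetical and exist only for this illustration. The choice of mean squared error is also an assumption, since the reference does not state which energy loss the trainer uses.

```python
def build_energy_mask(n_labeled, n_unlabeled):
    """Hypothetical helper: True for labeled samples, False for unlabeled ones."""
    return [True] * n_labeled + [False] * n_unlabeled


def masked_energy_mse(pred, target, energy_mask):
    """Mean squared error over samples whose mask entry is True.

    MSE is an assumption for this sketch; the actual loss may differ.
    Returns 0.0 when no sample carries a valid energy label.
    """
    errors = [(p - t) ** 2 for p, t, m in zip(pred, target, energy_mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


# Two labeled samples followed by two unlabeled ones (placeholder targets).
pred = [1.0, 2.0, 5.0, 7.0]
target = [1.0, 4.0, 0.0, 0.0]
mask = build_energy_mask(n_labeled=2, n_unlabeled=2)
loss = masked_energy_mse(pred, target, mask)
```

Only the first two samples contribute to the energy term here. The placeholder targets of the unlabeled samples never enter the loss, which is why unlabeled data can be mixed into a batch safely.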
Example
>>> trainer = AETrainer(model, config, device)
>>> metrics = trainer.fit(X_train, X_val, y_train, y_val)
- __init__(model, config, device, verbose=False, models_folder=None, run_name='autoencoder_future')
Initialize the trainer.
- Parameters:
model (Autoencoder) – Autoencoder model.
config (dict) – Training configuration.
device (torch.device) – Execution device.
verbose (bool, optional) – Verbose mode, by default False.
models_folder (str | None, optional) – Folder to save checkpoints, by default None.
run_name (str, optional) – Checkpoint base name, by default “autoencoder_future”.
- Return type:
None
- evaluate(X, y=None, feature_mask=None, stage='stage1')
Evaluate reconstruction/energy metrics on a dataset.
- Parameters:
X (np.ndarray) – Feature matrix.
y (np.ndarray | None, optional) – Energy targets, by default None.
feature_mask (np.ndarray | None, optional) – Feature mask to apply, by default None.
stage (str, optional) – Stage configuration name, by default “stage1”.
- Returns:
Dictionary of reconstruction/energy metrics.
- Return type:
Dict[str, float]
- fit(X_train, X_val=None, y_train=None, y_val=None, feature_mask=None, X_unlabeled=None)
Train the autoencoder on the provided data.
- Parameters:
X_train (np.ndarray) – Training feature matrix.
X_val (np.ndarray | None, optional) – Validation feature matrix, by default None.
y_train (np.ndarray | None, optional) – Training energy targets, by default None.
y_val (np.ndarray | None, optional) – Validation energy targets, by default None.
feature_mask (np.ndarray | None, optional) – Feature mask to apply, by default None.
X_unlabeled (np.ndarray | None, optional) – Additional unlabeled data for reconstruction, by default None.
- Returns:
Training metrics and embedding statistics.
- Return type:
Dict[str, object]
Notes
Stage semantics are defined by the configuration:
- stage1 focuses on denoising reconstruction and (if available) energy supervision.
- stage2 is optional and can reweight losses or change noise to refine the latent space.
If y_train/y_val are None, the energy head is ignored and only reconstruction loss is optimized (energy_mask is all False).
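As a rough sketch of how per-stage settings could reweight the losses and change the noise level, consider the example below. The configuration keys (noise_std, recon_weight, energy_weight), the Gaussian noise model, and both helper functions are assumptions made for illustration. The actual schema of config is not documented on this page and may differ.

```python
import random

# Hypothetical stage settings; these key names are NOT taken from OCDocker.
stages = {
    "stage1": {"noise_std": 0.10, "recon_weight": 1.0, "energy_weight": 1.0},
    "stage2": {"noise_std": 0.02, "recon_weight": 0.5, "energy_weight": 2.0},
}


def add_noise(features, noise_std, rng):
    """Corrupt the input with Gaussian noise (assumed denoising scheme)."""
    return [x + rng.gauss(0.0, noise_std) for x in features]


def total_loss(recon_loss, energy_loss, stage_cfg, has_energy_labels):
    """Weighted sum of the two terms; the energy term is dropped without labels."""
    loss = stage_cfg["recon_weight"] * recon_loss
    if has_energy_labels:
        loss += stage_cfg["energy_weight"] * energy_loss
    return loss


rng = random.Random(0)  # seeded so the sketch is reproducible
noisy = add_noise([0.0, 1.0, 2.0], stages["stage1"]["noise_std"], rng)

stage1_loss = total_loss(0.4, 0.2, stages["stage1"], has_energy_labels=True)
stage2_loss = total_loss(0.4, 0.2, stages["stage2"], has_energy_labels=True)
recon_only = total_loss(0.4, 0.2, stages["stage2"], has_energy_labels=False)
```

The last call mirrors the case where y_train and y_val are None: the energy term is skipped and only the weighted reconstruction loss remains.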