OCDocker.OCScore.Dimensionality.future.Autoencoder module

Autoencoder models for the future Dimensionality pipeline.

class OCDocker.OCScore.Dimensionality.future.Autoencoder.ForwardOutput[source]

Bases: TypedDict

reconstruction: torch.Tensor
latent: torch.Tensor
mu: torch.Tensor
logvar: torch.Tensor
energy: torch.Tensor | None
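A minimal standalone sketch of the ForwardOutput contract (field names taken from the listing above; the constructed values are illustrative only):

```python
from typing import Optional, TypedDict

import torch


class ForwardOutput(TypedDict):
    """Typed forward-pass result: reconstruction plus auxiliary outputs."""
    reconstruction: torch.Tensor
    latent: torch.Tensor
    mu: torch.Tensor
    logvar: torch.Tensor
    energy: Optional[torch.Tensor]


# Building one by hand; a non-VAE model would fill mu/logvar with zeros
# and leave energy as None when the energy head is disabled.
out: ForwardOutput = {
    "reconstruction": torch.zeros(8, 256),
    "latent": torch.zeros(8, 64),
    "mu": torch.zeros(8, 64),
    "logvar": torch.zeros(8, 64),
    "energy": None,
}
```

Because TypedDict is a static-typing construct, the dict behaves like a plain `dict` at runtime; type checkers enforce the key set and value types.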
class OCDocker.OCScore.Dimensionality.future.Autoencoder.MLP(*args, **kwargs)[source]

Bases: Module

Simple MLP with normalization and dropout.

Parameters:
  • input_size (int) – Input feature dimension.

  • layer_sizes (list[int]) – Layer sizes for the MLP (last size is output).

  • activations (str | list[tuple[str, dict]]) – Activation name or per-layer activation config.

  • dropout (float, optional) – Dropout probability, by default 0.0.

  • norm (str, optional) – Normalization type: ‘batch’, ‘layer’, or ‘none’. Default ‘batch’.

  • output_activation (str, optional) – Activation for the last layer (if provided). Default ‘Identity’.

Example

>>> mlp = MLP(input_size=128, layer_sizes=[256, 64, 10], activations="ReLU", dropout=0.1)
>>> out = mlp(torch.randn(32, 128))
__init__(input_size, layer_sizes, activations='GELU', dropout=0.0, norm='batch', output_activation='Identity')[source]

Initialize MLP.

Parameters:
  • input_size (int) – Input feature dimension.

  • layer_sizes (list[int]) – Layer sizes for the MLP.

  • activations (str | list[tuple[str, dict]], optional) – Activation configuration, by default “GELU”.

  • dropout (float, optional) – Dropout probability, by default 0.0.

  • norm (str, optional) – Normalization type, by default “batch”.

  • output_activation (str, optional) – Output activation name, by default “Identity”.

Return type:

None

forward(x)[source]

Forward pass through the MLP.

Parameters:

x (torch.Tensor) – Input tensor.

Returns:

Output tensor.

Return type:

torch.Tensor

class OCDocker.OCScore.Dimensionality.future.Autoencoder.EncoderModule(*args, **kwargs)[source]

Bases: Module

Encoder module with optional VAE heads.

Parameters:
  • input_size (int) – Input feature dimension.

  • encoder_hidden_sizes (list[int]) – Hidden layer sizes for the encoder (excluding latent).

  • latent_dim (int) – Latent embedding dimension.

  • activation (str, optional) – Activation for encoder hidden layers, by default ‘GELU’.

  • latent_activation (str, optional) – Activation for latent layer, by default ‘Identity’.

  • dropout (float, optional) – Dropout probability, by default 0.0.

  • latent_dropout (float, optional) – Dropout applied to latent embedding, by default 0.0.

  • norm (str, optional) – Normalization type: ‘batch’, ‘layer’, or ‘none’. Default ‘batch’.

  • use_vae (bool, optional) – If True, use VAE reparameterization, by default False.

Example

>>> encoder = EncoderModule(input_size=256, encoder_hidden_sizes=[512, 256], latent_dim=64)
>>> z = encoder(torch.randn(8, 256))
__init__(input_size, encoder_hidden_sizes, latent_dim, activation='GELU', latent_activation='Identity', dropout=0.0, latent_dropout=0.0, norm='batch', use_vae=False)[source]

Initialize encoder module.

Parameters:
  • input_size (int) – Input feature dimension.

  • encoder_hidden_sizes (list[int]) – Encoder hidden layer sizes.

  • latent_dim (int) – Latent dimension.

  • activation (str, optional) – Encoder activation, by default “GELU”.

  • latent_activation (str, optional) – Latent activation, by default “Identity”.

  • dropout (float, optional) – Dropout probability, by default 0.0.

  • latent_dropout (float, optional) – Latent dropout probability, by default 0.0.

  • norm (str, optional) – Normalization type, by default “batch”.

  • use_vae (bool, optional) – Enable VAE heads, by default False.

Return type:

None

forward(x: torch.Tensor, sample: bool = False, return_stats: Literal[False] = False) → torch.Tensor[source]
forward(x: torch.Tensor, sample: bool = False, return_stats: Literal[True] = True) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

Forward pass for the encoder.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • sample (bool, optional) – If True, sample from posterior when VAE is enabled, by default False.

  • return_stats (bool, optional) – If True, return (z, mu, logvar), by default False.

Returns:

Latent tensor (and optional stats).

Return type:

torch.Tensor | tuple[torch.Tensor, torch.Tensor, torch.Tensor]

reparameterize(mu, logvar)[source]

Reparameterization trick for VAE.

Parameters:
  • mu (torch.Tensor) – Latent mean tensor.

  • logvar (torch.Tensor) – Latent log-variance tensor.

Returns:

Sampled latent tensor.

Return type:

torch.Tensor
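The reparameterization trick can be sketched standalone (this is the standard formulation, not necessarily the library's exact code): given mean `mu` and log-variance `logvar`, a differentiable sample is `z = mu + sigma * eps` with `eps ~ N(0, I)`.

```python
import torch


def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Sample z ~ N(mu, sigma^2) via z = mu + sigma * eps, eps ~ N(0, I).

    Moving the randomness into eps keeps z differentiable w.r.t. mu and logvar.
    """
    std = torch.exp(0.5 * logvar)  # logvar = log(sigma^2)  ->  sigma
    eps = torch.randn_like(std)
    return mu + eps * std


torch.manual_seed(0)
mu = torch.ones(4, 8)
z = reparameterize(mu, torch.zeros(4, 8))  # unit variance around mu
# Driving logvar toward -inf collapses sigma to ~0: the sample equals the mean.
z_det = reparameterize(mu, torch.full((4, 8), -100.0))
```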

class OCDocker.OCScore.Dimensionality.future.Autoencoder.Autoencoder(*args, **kwargs)[source]

Bases: Module

Denoising autoencoder with optional energy head and VAE support.

Parameters:
  • input_size (int) – Input feature dimension.

  • encoder_hidden_sizes (list[int]) – Hidden layer sizes for the encoder (excluding latent).

  • latent_dim (int) – Latent embedding dimension.

  • decoder_sizes (list[int] | None, optional) – Decoder sizes (including output). If None, mirror encoder. Default None.

  • activation (str, optional) – Activation for encoder/decoder hidden layers, by default ‘GELU’.

  • latent_activation (str, optional) – Activation for latent layer, by default ‘Identity’.

  • decoder_output_activation (str, optional) – Activation for decoder output, by default ‘Identity’.

  • dropout (float, optional) – Dropout probability, by default 0.0.

  • latent_dropout (float, optional) – Dropout applied to latent embedding, by default 0.0.

  • norm (str, optional) – Normalization type: ‘batch’, ‘layer’, or ‘none’. Default ‘batch’.

  • use_vae (bool, optional) – If True, use VAE reparameterization, by default False.

  • energy_head_sizes (list[int] | None, optional) – Hidden sizes for energy head. If None, energy head is disabled.

  • device (torch.device, optional) – Device to place the module on.

Example

>>> model = Autoencoder(input_size=256, encoder_hidden_sizes=[512, 256], latent_dim=64)
>>> z = model.encode(torch.randn(8, 256))

Notes

Forward returns a dictionary with:
  • reconstruction – decoded input

  • latent – latent embedding

  • mu/logvar – VAE statistics (zeros when use_vae=False)

  • energy – optional energy head output (None if disabled)

__init__(input_size, encoder_hidden_sizes, latent_dim, decoder_sizes=None, activation='GELU', latent_activation='Identity', decoder_output_activation='Identity', dropout=0.0, latent_dropout=0.0, norm='batch', use_vae=False, energy_head_sizes=None, device=torch.device('cpu'))[source]

Initialize autoencoder.

Parameters:
  • input_size (int) – Input feature dimension.

  • encoder_hidden_sizes (list[int]) – Encoder hidden layer sizes.

  • latent_dim (int) – Latent dimension.

  • decoder_sizes (list[int] | None, optional) – Decoder layer sizes, by default None.

  • activation (str, optional) – Encoder/decoder activation, by default “GELU”.

  • latent_activation (str, optional) – Latent activation, by default “Identity”.

  • decoder_output_activation (str, optional) – Output activation for decoder, by default “Identity”.

  • dropout (float, optional) – Dropout probability, by default 0.0.

  • latent_dropout (float, optional) – Latent dropout probability, by default 0.0.

  • norm (str, optional) – Normalization type, by default “batch”.

  • use_vae (bool, optional) – Enable VAE mode, by default False.

  • energy_head_sizes (list[int] | None, optional) – Energy head hidden sizes, by default None.

  • device (torch.device, optional) – Device to place the module on, by default CPU.

Return type:

None

encode(x: torch.Tensor, sample: bool = False, return_stats: Literal[False] = False) → torch.Tensor[source]
encode(x: torch.Tensor, sample: bool = False, return_stats: Literal[True] = True) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

Encode inputs into latent embeddings.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • sample (bool, optional) – If True and VAE enabled, sample from posterior, by default False.

  • return_stats (bool, optional) – If True, return (z, mu, logvar), by default False.

Returns:

Latent tensor (and optional stats).

Return type:

torch.Tensor | tuple[torch.Tensor, torch.Tensor, torch.Tensor]

forward(x, sample=True)[source]

Forward pass returning reconstruction and auxiliary outputs.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • sample (bool, optional) – If True and VAE enabled, sample from posterior, by default True.

Returns:

Dictionary with reconstruction, latent, and auxiliary outputs.

Return type:

Dict[str, torch.Tensor]

Notes

When use_vae is False, mu/logvar are zero tensors for API consistency. When energy_head is disabled, energy is returned as None.
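The output contract described in the notes can be sketched with a minimal stand-in model (`TinyAE` is hypothetical; the real Autoencoder adds denoising, VAE, and energy-head logic):

```python
import torch
import torch.nn as nn


class TinyAE(nn.Module):
    """Minimal stand-in that mimics the documented forward() contract."""

    def __init__(self, input_size: int, latent_dim: int) -> None:
        super().__init__()
        self.encoder = nn.Linear(input_size, latent_dim)
        self.decoder = nn.Linear(latent_dim, input_size)

    def forward(self, x: torch.Tensor) -> dict:
        z = self.encoder(x)
        return {
            "reconstruction": self.decoder(z),
            "latent": z,
            # Non-VAE mode: zero stats keep the output schema uniform.
            "mu": torch.zeros_like(z),
            "logvar": torch.zeros_like(z),
            "energy": None,  # no energy head configured
        }


out = TinyAE(16, 4)(torch.randn(2, 16))
```

Keeping the schema fixed lets downstream loss code index `out["mu"]` unconditionally instead of branching on the model configuration.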

get_decoder()[source]

Return the decoder module.

Returns:

Decoder module.

Return type:

nn.Module

get_decoder_topology()[source]

Return decoder topology description.

Returns:

Decoder topology tokens.

Return type:

list[str]

get_encoder()[source]

Return the encoder module.

Returns:

Encoder module.

Return type:

nn.Module

get_encoder_topology()[source]

Return encoder topology description.

Returns:

Encoder topology tokens.

Return type:

list[str]

load_encoder(path, map_location=None)[source]

Load encoder weights from a file.

Parameters:
  • path (str) – Path to encoder state dict.

  • map_location (str | None, optional) – Torch map location override, by default None.

Return type:

None

reconstruct(x, sample=False)[source]

Reconstruct inputs through the autoencoder.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • sample (bool, optional) – If True and VAE enabled, sample from posterior, by default False.

Returns:

Reconstructed tensor.

Return type:

torch.Tensor

sanity_check(batch_size=4)[source]

Run a lightweight sanity check on shapes and reconstruction.

Parameters:

batch_size (int, optional) – Batch size for the synthetic check, by default 4.

Returns:

Dictionary with shapes and reconstruction RMSE.

Return type:

Dict[str, object]
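The reported reconstruction RMSE can be computed as follows (a sketch of the metric only; the synthetic-batch and shape bookkeeping inside sanity_check may differ):

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 16)                 # synthetic input batch
recon = x + 0.1 * torch.randn(4, 16)   # stand-in reconstruction with small error
# Root-mean-square error over all elements of the batch.
rmse = torch.sqrt(torch.mean((x - recon) ** 2)).item()
report = {
    "input_shape": tuple(x.shape),
    "reconstruction_shape": tuple(recon.shape),
    "rmse": rmse,
}
```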

save_encoder(path)[source]

Save encoder weights to a file.

Parameters:

path (str) – Output path for encoder state dict.

Return type:

None
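The save/load pair likely wraps the usual state-dict round-trip; a standalone sketch with an `nn.Linear` standing in for the real encoder:

```python
import tempfile
from pathlib import Path

import torch
import torch.nn as nn

encoder = nn.Linear(16, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "encoder.pt"
    torch.save(encoder.state_dict(), path)  # what save_encoder would persist

    fresh = nn.Linear(16, 4)  # same topology, different random weights
    # map_location lets a GPU-trained checkpoint load on CPU-only machines.
    fresh.load_state_dict(torch.load(path, map_location="cpu"))

weights_match = torch.equal(encoder.weight, fresh.weight)
```

Note that load_state_dict requires the destination module to have the same topology as the saved one; only the parameter values are transferred.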