OCDocker.OCScore.Utils.IO module
Set of functions to manage I/O operations in OCDocker in the context of scoring functions.
Usage:
import OCDocker.OCScore.Utils.IO as ocscoreio
- OCDocker.OCScore.Utils.IO.get_models_dir()
Get the path to the OCScore models directory.
This directory is used to store models and masks that are shipped with the code. The directory is located at the project root level (same level as ODDT_models), separate from the code folder. The directory is created if it doesn’t exist.
- Returns:
Path to the models directory.
- Return type:
str
- OCDocker.OCScore.Utils.IO.load_data(file_name, exclude_column='experimental')
Loads a CSV file into a DataFrame, removes rows with NaNs (except in a specified column), and notifies the user.
- Parameters:
file_name (str) – Name of the CSV file to load.
exclude_column (str) – Column to exclude from the NaN removal process.
- Returns:
DataFrame containing the data from the CSV file.
- Return type:
pd.DataFrame
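The filtering rule described for load_data can be sketched with plain pandas. This is a minimal illustration and not the module's actual implementation: the helper name load_without_nans, the sample CSV, and the wording of the printed message are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a real file on disk.
csv_text = """name,score_a,score_b,experimental
lig1,1.0,2.0,5.1
lig2,,3.0,4.2
lig3,0.5,1.5,
"""


def load_without_nans(source, exclude_column="experimental"):
    """Read a CSV and drop rows that have NaNs outside `exclude_column`."""
    df = pd.read_csv(source)
    # Check every column for NaNs except the excluded one.
    checked = [col for col in df.columns if col != exclude_column]
    cleaned = df.dropna(subset=checked)
    removed = len(df) - len(cleaned)
    if removed:
        # Assumed notification format; the real message may differ.
        print(f"Removed {removed} row(s) containing NaN values.")
    return cleaned


data = load_without_nans(io.StringIO(csv_text))
```

In this sample, the lig2 row is dropped because score_a is missing, while lig3 is kept because its only missing value is in the excluded experimental column.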
- OCDocker.OCScore.Utils.IO.load_mask(name, models_dir=None)
Load a mask from a file in the models directory.
- Parameters:
name (str) – Name of the mask file (without extension). The function will look for ‘{name}_mask.pkl’ in the models directory.
models_dir (str, optional) – Custom directory to load the mask from. If None, uses the default OCScore models directory. Default is None.
- Returns:
The loaded mask array.
- Return type:
np.ndarray
- Raises:
FileNotFoundError – If the mask file is not found.
- OCDocker.OCScore.Utils.IO.load_object(file_name, serialization_method='auto', trusted=False)
Load an object from a file using pickle, joblib, or torch.
Security
Only load serialized files from trusted sources. Pickle/joblib deserialization can execute arbitrary code if the file is malicious or untrusted.
- Parameters:
file_name (str) – The name of the file from which to load the object.
serialization_method (str) – The serialization method used to save the object. Options are: “auto” (automatically detect from the file extension: .pt/.pth -> torch, .pkl -> joblib/pickle), “joblib” (use joblib to load), “pickle” (use pickle to load), “torch” (use torch.load to load, for PyTorch models).
trusted (bool, optional) – Explicit opt-in declaring that the serialized input is trusted. If False, loading is blocked unless OCDOCKER_ALLOW_UNSAFE_DESERIALIZATION=1 is set. Default is False.
- Returns:
The loaded object.
- Return type:
Any
- Raises:
ValueError – If the serialization method is not recognized.
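The “auto” detection and the trust gate described for load_object might look roughly like the sketch below. The helper names detect_method and loading_allowed are hypothetical, and several details are assumptions: that .pkl resolves to joblib, that an unknown extension raises ValueError, and that the environment variable is compared against the literal string "1".

```python
import os

ENV_FLAG = "OCDOCKER_ALLOW_UNSAFE_DESERIALIZATION"


def detect_method(file_name):
    """Map a file extension to a serialization method name."""
    ext = os.path.splitext(file_name)[1]
    if ext in (".pt", ".pth"):
        return "torch"
    if ext == ".pkl":
        # Assumption: prefer joblib for .pkl files.
        return "joblib"
    # Assumption: unknown extensions are rejected.
    raise ValueError(f"cannot infer serialization method from {ext!r}")


def loading_allowed(trusted=False, environ=None):
    """Allow loading only on explicit opt-in or when the env flag is '1'."""
    env = os.environ if environ is None else environ
    return bool(trusted) or env.get(ENV_FLAG) == "1"
```

Passing the environment as an argument keeps the gate easy to test without mutating os.environ. In real use, a caller would either pass trusted=True for files it produced itself or export the environment variable for the whole session.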
- OCDocker.OCScore.Utils.IO.save_mask(mask, name, models_dir=None)
Save a mask to a file in the models directory.
- Parameters:
mask (list | np.ndarray) – The mask array of 0s and 1s to save.
name (str) – Name for the mask file (without extension). The file will be saved as ‘{name}_mask.pkl’ in the models directory.
models_dir (str, optional) – Custom directory to save the mask. If None, uses the default OCScore models directory. Default is None.
- Returns:
Path to the saved mask file.
- Return type:
str
- Raises:
ValueError – If the mask is not a valid array of 0s and 1s.
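The save_mask and load_mask conventions (the ‘{name}_mask.pkl’ file name, the 0/1 validation, and the FileNotFoundError on a missing file) can be sketched with the standard library alone. This is an illustration under stated assumptions and not the module's code: the function names save_mask_sketch and load_mask_sketch are hypothetical, plain pickle is assumed as the on-disk format, and a Python list stands in for the np.ndarray that the documented load_mask returns.

```python
import os
import pickle
import tempfile


def save_mask_sketch(mask, name, models_dir):
    """Validate a 0/1 mask and pickle it as '{name}_mask.pkl'."""
    values = list(mask)
    if any(v not in (0, 1) for v in values):
        raise ValueError("mask must contain only 0s and 1s")
    os.makedirs(models_dir, exist_ok=True)
    path = os.path.join(models_dir, f"{name}_mask.pkl")
    with open(path, "wb") as handle:
        pickle.dump(values, handle)
    return path


def load_mask_sketch(name, models_dir):
    """Load '{name}_mask.pkl' from models_dir, failing clearly if absent."""
    path = os.path.join(models_dir, f"{name}_mask.pkl")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"mask file not found: {path}")
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Round trip in a throwaway directory (stands in for a custom models_dir).
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_mask_sketch([1, 0, 1, 1], "demo", tmp)
    restored = load_mask_sketch("demo", tmp)
```

Because the name is given without an extension, the same string can be passed to both functions, which is the symmetry the two documented signatures suggest.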
- OCDocker.OCScore.Utils.IO.save_object(obj, filename, serialization_method='auto')
Save an object to a file using pickle, joblib, or torch.
- Parameters:
obj (Any) – The object to be saved.
filename (str) – The name of the file where the object will be stored.
serialization_method (str) – The serialization method to use. Options are: “auto” (automatically detect from the file extension: .pt/.pth -> torch, .pkl -> joblib), “joblib” (use joblib to save; recommended for sklearn models and XGBoost), “pickle” (use pickle to save), “torch” (use torch.save to save, for PyTorch models).
- Return type:
None