OCDocker.OCScore.Analysis.SHAP.Runner module

Run the end-to-end SHAP analysis workflow.

Usage:

from OCDocker.OCScore.Analysis.SHAP.Runner import run_shap_analysis

class OCDocker.OCScore.Analysis.SHAP.Runner.OutputPaths(out_dir, feature_importance_png, beeswarm_png, shap_values_npy, shap_values_csv=None)

Bases: object

Container for SHAP analysis output file paths.

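Parameters:
  • out_dir (str) – Base output directory.

  • feature_importance_png (str) – Path to the feature importance bar plot PNG file.

  • beeswarm_png (str) – Path to the SHAP beeswarm plot PNG file.

  • shap_values_npy (str) – Path to the SHAP values NumPy array file.

  • shap_values_csv (str | None) – Path to the SHAP values CSV file, or None if no CSV was saved. Default is None.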

out_dir

Base output directory.

Type:

str

feature_importance_png

Path to feature importance bar plot PNG file.

Type:

str

beeswarm_png

Path to SHAP beeswarm plot PNG file.

Type:

str

shap_values_npy

Path to SHAP values NumPy array file.

Type:

str

shap_values_csv

Path to SHAP values CSV file. None if CSV was not saved. Default is None.

Type:

Optional[str], optional

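The fields above can be mirrored in a small local stand-in, which is handy when writing code that consumes the result without importing OCDocker. This is a minimal sketch, assuming OutputPaths behaves like a plain dataclass with the documented fields; the file names and the `existing_outputs` helper are hypothetical and not part of the module.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class OutputPaths:
    """Local stand-in mirroring the documented fields (not the OCDocker class)."""

    out_dir: str
    feature_importance_png: str
    beeswarm_png: str
    shap_values_npy: str
    shap_values_csv: Optional[str] = None  # None when no CSV was saved


def existing_outputs(paths: OutputPaths) -> List[str]:
    """Return the recorded output files that exist on disk (hypothetical helper)."""
    candidates = [
        paths.feature_importance_png,
        paths.beeswarm_png,
        paths.shap_values_npy,
        paths.shap_values_csv,
    ]
    # Skip the optional CSV entry when it is None, then keep only real files.
    return [p for p in candidates if p is not None and Path(p).is_file()]


# Placeholder file names; the names the module actually writes may differ.
paths = OutputPaths(
    out_dir="shap_out_example",
    feature_importance_png="shap_out_example/feature_importance.png",
    beeswarm_png="shap_out_example/beeswarm.png",
    shap_values_npy="shap_out_example/shap_values.npy",
)
print(paths.shap_values_csv)  # the CSV path defaults to None
```

Because shap_values_csv may be None, any code that iterates over the recorded paths should filter that entry out before touching the file system, as the helper does.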
OCDocker.OCScore.Analysis.SHAP.Runner.run_shap_analysis(studies, df_path, base_models_folder, study_number, out_dir, background_size=None, eval_size=None, explainer='deep', stratify_by=None, seed=0, save_csv=True)

Run the complete SHAP analysis workflow.

Parameters:
  • studies (StudyHandles) – Handles to Optuna studies for selecting best model parameters.

  • df_path (str) – Path to the main dataframe file.

  • base_models_folder (str) – Base path to the models folder.

  • study_number (int) – Study number identifier.

  • out_dir (str) – Output directory for SHAP results.

  • background_size (Optional[int], optional) – Number of samples to use for SHAP background. If None, uses all training data. Default is None.

  • eval_size (Optional[int], optional) – Number of samples to evaluate SHAP values for. If None, uses all test data. Default is None.

  • explainer (str, optional) – SHAP explainer type: "deep" or "kernel". Default is "deep".

  • stratify_by (Optional[List[str]], optional) – Column names to stratify sampling by. Default is None.

  • seed (int, optional) – Random seed for reproducibility. Default is 0.

  • save_csv (bool, optional) – Whether to save SHAP values as CSV file. Default is True.
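The background_size, eval_size, stratify_by, and seed parameters together describe seeded, optionally stratified subsampling. This page does not show how run_shap_analysis implements that step, so the following standard-library sketch illustrates one plausible reading only. The `stratified_subsample` helper, the row format (a list of dicts), and the proportional allocation rule are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_subsample(rows, size=None, stratify_by=None, seed=0):
    """Pick up to `size` rows; hypothetical helper, not OCDocker's own code."""
    # A size of None (or one at least as large as the data) means "use everything".
    if size is None or size >= len(rows):
        return list(rows)

    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    if not stratify_by:
        return rng.sample(rows, size)

    # Group rows by the tuple of values found in the stratification columns.
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in stratify_by)].append(row)

    # Draw from each group roughly in proportion to its share of the data.
    picked = []
    for key in sorted(groups):
        members = groups[key]
        quota = max(1, round(size * len(members) / len(rows)))
        picked.extend(rng.sample(members, min(quota, len(members))))

    # Rounding can overshoot the requested size, so trim the result.
    return picked[:size]


# Toy data: 80 rows of class "A" and 20 rows of class "B".
rows = [{"target": "A" if i < 80 else "B", "x": i} for i in range(100)]
sample = stratified_subsample(rows, size=10, stratify_by=["target"], seed=0)
```

With proportional allocation, the sample keeps the 80/20 class balance of the toy data, and repeating the call with the same seed returns the same rows.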

Returns:

Container with paths to all generated output files.

Return type:

OutputPaths
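A call using the documented signature might look like the sketch below. Every path and number is a placeholder chosen for illustration, and `studies` is taken as an argument because this page does not describe how a StudyHandles object is constructed. The import sits inside the function so the snippet can be loaded without OCDocker installed.

```python
# Keyword arguments mirroring the documented signature.
# All values are placeholders, not recommended settings.
shap_kwargs = {
    "df_path": "data/example_dataset.csv",  # hypothetical dataframe file
    "base_models_folder": "models",         # hypothetical models folder
    "study_number": 1,
    "out_dir": "shap_out",
    "background_size": 200,  # None would use all training data
    "eval_size": 500,        # None would use all test data
    "explainer": "deep",     # documented choices: "deep" or "kernel"
    "stratify_by": None,
    "seed": 0,
    "save_csv": False,
}


def run(studies):
    """Run the SHAP workflow with the settings above and report the outputs."""
    # Imported lazily so this sketch loads even when OCDocker is not installed.
    from OCDocker.OCScore.Analysis.SHAP.Runner import run_shap_analysis

    paths = run_shap_analysis(studies=studies, **shap_kwargs)
    print("Feature importance plot:", paths.feature_importance_png)
    print("Beeswarm plot:", paths.beeswarm_png)
    print("SHAP values (NumPy):", paths.shap_values_npy)
    # The CSV path is None when no CSV was saved.
    if paths.shap_values_csv is not None:
        print("SHAP values (CSV):", paths.shap_values_csv)
    return paths
```

Keeping the settings in one dictionary makes it easy to log the exact configuration next to the generated files.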