OCDocker.OCScore.Analysis.Plotting package

The Plotting package exports the commonly used Analysis plotting utilities.

Usage:

import OCDocker.OCScore.Analysis.Plotting as ocstatplot

Modules

  • Colouring: Color palette helpers.

  • Core: Matplotlib styling helpers.

  • ImpactPlots: Feature impact plotting utilities.

  • MetricsPlots: ROC/PR/enrichment plotting utilities.

  • Stats: Statistical summary plots.

OCDocker.OCScore.Analysis.Plotting.plot_combined_metric_scatter(df, n_trials, colour_mapping, output_dir, alpha=0.9)[source]

Generate a detailed scatter plot showing RMSE vs AUC across methods with shading and symbol cues.

Parameters:
  • df (pd.DataFrame) – DataFrame with RMSE, AUC, and Methodology columns.

  • n_trials (int) – Number of top trials considered.

  • colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.

  • output_dir (str) – Directory to save the scatter plot image.

  • alpha (float, optional) – Transparency for the markers. Default is 0.9.

Return type:

None

OCDocker.OCScore.Analysis.Plotting.plot_boxplots(df, n_trials, colour_mapping, output_dir, show_simple_consensus=False)[source]

Generate enhanced boxplots of RMSE and AUC across methodologies, with group shading and mean lines.

Parameters:
  • df (pd.DataFrame) – Data containing ‘RMSE’, ‘AUC’, and ‘Methodology’.

  • n_trials (int) – Number of trials used for title and filenames.

  • colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.

  • output_dir (str) – Directory to save the boxplot images.

  • show_simple_consensus (bool, optional) – Whether to include consensus methodologies (any label ending with “consensus”). Default is False.

Return type:

None
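The show_simple_consensus flag is described as applying to any label ending with “consensus”. A minimal sketch of that kind of suffix filter follows. The helper name filter_methodologies and the example labels are hypothetical (not part of OCDocker), and the case-sensitive match is an assumption.

```python
def filter_methodologies(labels, show_simple_consensus=False):
    """Hypothetical helper: drop labels ending in 'consensus' unless requested."""
    if show_simple_consensus:
        return list(labels)
    # Assumption: the suffix match is case-sensitive.
    return [label for label in labels if not label.endswith("consensus")]


# Illustrative labels only; real methodology names come from the DataFrame.
labels = ["NN", "XGBoost", "mean consensus", "median consensus"]
kept = filter_methodologies(labels)
everything = filter_methodologies(labels, show_simple_consensus=True)
```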

OCDocker.OCScore.Analysis.Plotting.plot_barplots(df, n_trials, colour_mapping, output_dir)[source]

Generate sorted barplots of mean RMSE and AUC across methodologies with annotations.

Parameters:
  • df (pd.DataFrame) – Data containing ‘RMSE’, ‘AUC’, and ‘Methodology’.

  • n_trials (int) – Number of trials, used for the title and output naming.

  • colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.

  • output_dir (str) – Directory to save the barplot images.

Return type:

None
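The aggregation behind a sorted bar plot of means can be sketched with the standard library alone. The rows, the helper name sorted_means, and the sort directions (ascending for RMSE, descending for AUC) are assumptions made for illustration; the source does not state which order plot_barplots uses.

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows standing in for a DataFrame with these three columns.
rows = [
    {"Methodology": "A", "RMSE": 1.2, "AUC": 0.80},
    {"Methodology": "A", "RMSE": 1.0, "AUC": 0.84},
    {"Methodology": "B", "RMSE": 0.9, "AUC": 0.70},
    {"Methodology": "B", "RMSE": 0.7, "AUC": 0.74},
]


def sorted_means(rows, metric, descending=False):
    """Hypothetical helper: mean of `metric` per methodology, sorted by value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Methodology"]].append(row[metric])
    means = {name: mean(values) for name, values in groups.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=descending)


# Assumption: lower RMSE is better, higher AUC is better.
rmse_order = sorted_means(rows, "RMSE")
auc_order = sorted_means(rows, "AUC", descending=True)
```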

OCDocker.OCScore.Analysis.Plotting.plot_scatterplot(df_rmse, df_auc, df_all, n_trials, colour_mapping, output_dir, orientation='horizontal', alpha=0.9)[source]

Create scatter plots of RMSE vs AUC for all methods and filtered subsets.

Create a 1x3 panel of scatter plots (RMSE vs AUC):

  • All filtered points

  • RMSE-filtered subset

  • AUC-filtered subset

Parameters:
  • df_all (pd.DataFrame) – DataFrame with all filtered points.

  • df_rmse (pd.DataFrame) – DataFrame filtered by RMSE threshold.

  • df_auc (pd.DataFrame) – DataFrame filtered by AUC threshold.

  • n_trials (int) – Number of top trials considered.

  • colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.

  • output_dir (str) – Directory to save the scatter plot image.

  • orientation (str, optional) – Orientation of the scatter plot. Default is ‘horizontal’. Options: ‘horizontal’, ‘vertical’.

  • alpha (float, optional) – Transparency for the markers. Default is 0.9.

Raises:

ValueError – If the orientation parameter is not ‘horizontal’ or ‘vertical’.

Return type:

None
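The orientation check described above can be illustrated with a small validation sketch. The helper name panel_grid is hypothetical, and mapping ‘vertical’ to a 3x1 grid is an assumption; the source only states the 1x3 layout and the ValueError.

```python
def panel_grid(orientation="horizontal"):
    """Hypothetical helper: return (rows, columns) for the three panels."""
    if orientation == "horizontal":
        return (1, 3)
    if orientation == "vertical":
        # Assumption: the vertical layout stacks the same three panels.
        return (3, 1)
    raise ValueError(
        f"orientation must be 'horizontal' or 'vertical', got {orientation!r}"
    )


grid = panel_grid("horizontal")
```

Validating the argument before any figure is created keeps a typo such as 'horizontl' from producing a partially drawn plot.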

OCDocker.OCScore.Analysis.Plotting.plot_bar_with_significance(gh_df, metric, y_col='diff', colour_mapping=None, output_dir='plots', top_n=30)[source]

Plot Games-Howell pairwise differences as a horizontal bar chart.

Parameters:
  • gh_df (pd.DataFrame) – Output of pingouin.pairwise_gameshowell (expects columns ‘A’, ‘B’, ‘diff’, ‘pval’).

  • metric (str) – Metric label for titling (‘AUC’ or ‘RMSE’).

  • y_col (str) – Which column from gh_df to plot as bar length (default ‘diff’).

  • colour_mapping (dict | None, optional) – Unused here, accepted for API compatibility. Default: None.

  • output_dir (str) – Where to save the plot image. Default: ‘plots’.

  • top_n (int | None, optional) – If given, keep the top-N pairs by smallest p-value. Default: 30.

Return type:

None
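The top_n selection (“keep the top-N pairs by smallest p-value”) can be sketched on plain dictionaries. The rows below are made-up values with the documented column names, and top_pairs is a hypothetical helper, not the library's implementation.

```python
# Made-up pairwise results using the documented column names.
pairs = [
    {"A": "M1", "B": "M2", "diff": 0.10, "pval": 0.200},
    {"A": "M1", "B": "M3", "diff": -0.35, "pval": 0.001},
    {"A": "M2", "B": "M3", "diff": -0.45, "pval": 0.030},
]


def top_pairs(rows, top_n=30):
    """Hypothetical helper: keep the top_n rows with the smallest p-values."""
    ordered = sorted(rows, key=lambda row: row["pval"])
    # top_n=None keeps every pair, mirroring the 'int | None' annotation.
    return ordered if top_n is None else ordered[:top_n]


selected = top_pairs(pairs, top_n=2)
```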

OCDocker.OCScore.Analysis.Plotting.plot_heatmap(gh_df, title, metric, output_dir='plots')[source]

Heatmap of Games-Howell p-values across methodology pairs.

Parameters:
  • gh_df (pd.DataFrame) – Output of pingouin.pairwise_gameshowell (expects columns ‘A’, ‘B’, ‘diff’, ‘pval’).

  • title (str) – Title for the heatmap.

  • metric (str) – Metric label for titling (‘AUC’ or ‘RMSE’).

  • output_dir (str) – Where to save the plot image. Default: ‘plots’.

Return type:

None
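A heatmap over methodology pairs needs the long-format pairwise rows turned into a square matrix. The sketch below shows one way to do that with nested dictionaries. The helper name pvalue_matrix, the mirrored (symmetric) fill, and the empty diagonal are assumptions, not a description of how plot_heatmap works internally.

```python
def pvalue_matrix(rows):
    """Hypothetical helper: build a symmetric matrix of p-values from pair rows."""
    names = sorted({row["A"] for row in rows} | {row["B"] for row in rows})
    matrix = {a: {b: None for b in names} for a in names}
    for row in rows:
        # Assumption: each p-value applies to both orderings of the pair.
        matrix[row["A"]][row["B"]] = row["pval"]
        matrix[row["B"]][row["A"]] = row["pval"]
    return names, matrix


# Made-up rows using the documented 'A', 'B', and 'pval' columns.
rows = [
    {"A": "M1", "B": "M2", "pval": 0.200},
    {"A": "M1", "B": "M3", "pval": 0.001},
    {"A": "M2", "B": "M3", "pval": 0.030},
]
names, matrix = pvalue_matrix(rows)
```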

OCDocker.OCScore.Analysis.Plotting.plot_normality_and_variance_diagnostics(df, metric, n_trials, output_dir='plots')[source]

Perform and plot normality and variance diagnostics across methodologies.

Quick diagnostics across groups:

  • Shapiro-Wilk p-values per methodology (bar of -log10 p)

  • Group variances (bar) and Levene’s p-value annotated

Parameters:
  • df (pd.DataFrame) – Data containing ‘Methodology’ and the specified metric.

  • metric (str) – Metric column to analyze (e.g., ‘AUC’ or ‘RMSE’).

  • n_trials (int) – Number of trials for title and output naming.

  • output_dir (str) – Directory to save the diagnostics plot. Default: ‘plots’.

Return type:

None
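The “bar of -log10 p” transform turns small p-values into tall bars, so a threshold of 0.05 becomes a line at roughly 1.3. A minimal sketch of the transform follows. The p-values are made up, and the helper neg_log10, its floor argument, and the 0.05 threshold are assumptions for illustration.

```python
import math


def neg_log10(p_value, floor=1e-300):
    """Hypothetical helper: -log10(p), with a floor to avoid log of zero."""
    return -math.log10(max(p_value, floor))


ALPHA = 0.05  # Assumption: conventional significance level.
threshold = neg_log10(ALPHA)

# Made-up Shapiro-Wilk p-values per methodology.
p_values = {"A": 0.5, "B": 0.001}
# A bar above the threshold suggests a departure from normality.
flags = {name: neg_log10(p) > threshold for name, p in p_values.items()}
```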

OCDocker.OCScore.Analysis.Plotting.plot_pca_importance_barplot(importance_df, pca_type, n_features, n_trials, output_dir='plots')[source]

Barplot of top-N PCA feature importances.

Parameters:
  • importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.

  • pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).

  • n_features (int) – Number of top features to display.

  • n_trials (int) – Number of trials for title and output naming.

  • output_dir (str) – Directory to save the barplot image. Default: ‘plots’.

Return type:

None

OCDocker.OCScore.Analysis.Plotting.plot_pca_importance_histogram(importance_df, pca_type, n_trials, output_dir='plots')[source]

Histogram of PCA feature importances.

Parameters:
  • importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.

  • pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).

  • n_trials (int) – Number of trials for title and output naming.

  • output_dir (str) – Directory to save the histogram image. Default: ‘plots’.

Return type:

None

OCDocker.OCScore.Analysis.Plotting.save_pca_importance_groups(importance_df, pca_type, n_trials, output_dir='plots')[source]

Assign coarse groups by quantiles and save as CSV.

Parameters:
  • importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.

  • pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).

  • n_trials (int) – Number of trials for title and output naming.

  • output_dir (str) – Directory to save the CSV file. Default: ‘plots’.

Return type:

None

OCDocker.OCScore.Analysis.Plotting.save_pca_importance_bins(importance_df, pca_type, n_trials, output_dir='plots', n_bins=10)[source]

Assign quantile bins (qcut) and save as CSV.

Parameters:
  • importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.

  • pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).

  • n_trials (int) – Number of trials for title and output naming.

  • output_dir (str) – Directory to save the CSV file. Default: ‘plots’.

  • n_bins (int) – Number of quantile bins to create. Default: 10.

Return type:

None
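Quantile binning assigns each feature to one of n_bins groups of roughly equal size, ordered by importance. The sketch below is a rank-based, standard-library approximation of that idea. It is not pandas.qcut: ties are split by position rather than by shared bin edges. The helper name quantile_bins and the values are illustrative.

```python
def quantile_bins(values, n_bins=10):
    """Hypothetical helper: equal-frequency bin index (0-based) per value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        # Rank r out of n values falls in bin floor(r * n_bins / n).
        bins[index] = rank * n_bins // len(values)
    return bins


# Made-up importances for six features, split into three bins.
importances = [0.05, 0.40, 0.10, 0.30, 0.20, 0.15]
bins = quantile_bins(importances, n_bins=3)
```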

OCDocker.OCScore.Analysis.Plotting.set_color_mapping(df, palette_colour='glasbey')[source]

Set the color palette for plotting based on the unique methodologies in the DataFrame.

Parameters:
  • df (pd.DataFrame) – DataFrame containing a ‘Methodology’ column with unique methodologies.

  • palette_colour (str, optional) – Name of the color palette to use. Default is “glasbey”. Options: “glasbey”, “Set2”, “Set3”, “tab10”, “tab20”, “colorblind”, “pastel”, “bright”, “dark”, “deep”, “muted”, “viridis”.

Returns:

color_mapping – Dictionary mapping each methodology to a color in RGB format.

Return type:

dict[str, tuple[float, float, float]]

Raises:

ValueError – If an unsupported palette is provided.
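The returned structure, a dictionary from methodology name to an RGB tuple of floats, can be illustrated without any plotting library. The sketch below uses evenly spaced hues from colorsys as a stand-in palette. It does not reproduce “glasbey” or any of the listed palettes, and the helper name sketch_colour_mapping and its palette name are hypothetical.

```python
import colorsys


def sketch_colour_mapping(methodologies, palette="evenly_spaced_hues"):
    """Hypothetical stand-in: map each unique methodology to an RGB tuple."""
    if palette != "evenly_spaced_hues":
        raise ValueError(f"Unsupported palette: {palette!r}")
    # Assumption: unique labels are sorted so the mapping is reproducible.
    unique = sorted(set(methodologies))
    return {
        name: colorsys.hsv_to_rgb(i / len(unique), 0.65, 0.9)
        for i, name in enumerate(unique)
    }


# Illustrative labels standing in for the 'Methodology' column.
mapping = sketch_colour_mapping(["NN", "XGBoost", "NN", "Transformer"])
```

A mapping of this shape matches the colour_mapping annotation, dict[str, tuple[float, float, float]], used by the plotting functions on this page.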

Submodules