OCDocker.OCScore.Analysis.Plotting package
The Plotting package exports commonly used Analysis plotting utilities.
Usage:
import OCDocker.OCScore.Analysis.Plotting as ocstatplot
Modules
Colouring: Color palette helpers.
Core: Matplotlib styling helpers.
ImpactPlots: Feature impact plotting utilities.
MetricsPlots: ROC/PR/enrichment plotting utilities.
Stats: Statistical summary plots.
- OCDocker.OCScore.Analysis.Plotting.plot_combined_metric_scatter(df, n_trials, colour_mapping, output_dir, alpha=0.9)
Generate a detailed scatter plot showing RMSE vs AUC across methods with shading and symbol cues.
- Parameters:
df (pd.DataFrame) – DataFrame with RMSE, AUC, and Methodology columns.
n_trials (int) – Number of top trials considered.
colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.
output_dir (str) – Directory to save the scatter plot image.
alpha (float, optional) – Transparency for the markers. Default is 0.9.
- Return type:
None
- OCDocker.OCScore.Analysis.Plotting.plot_boxplots(df, n_trials, colour_mapping, output_dir, show_simple_consensus=False)
Generate enhanced boxplots of RMSE and AUC across methodologies, with group shading and mean lines.
- Parameters:
df (pd.DataFrame) – Data containing ‘RMSE’, ‘AUC’, and ‘Methodology’.
n_trials (int) – Number of trials used for title and filenames.
colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.
output_dir (str) – Directory to save the boxplot images.
show_simple_consensus (bool, optional) – Whether to include consensus methodologies (any label ending with “consensus”). Default is False.
- Return type:
None
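The show_simple_consensus rule (labels ending with “consensus”) can be sketched in plain Python. This is only an illustration of the rule as documented, not the package's implementation; the helper name filter_consensus, the sample methodology labels, and the case-insensitive match are all assumptions.

```python
def filter_consensus(rows, show_simple_consensus=False):
    """Drop rows whose Methodology label ends with 'consensus' unless requested."""
    if show_simple_consensus:
        return list(rows)
    return [
        row
        for row in rows
        # Case-insensitive matching is an assumption of this sketch.
        if not row["Methodology"].lower().endswith("consensus")
    ]


# Hypothetical records with the three columns the plotting functions expect.
rows = [
    {"Methodology": "NN", "RMSE": 1.20, "AUC": 0.81},
    {"Methodology": "XGB", "RMSE": 1.35, "AUC": 0.78},
    {"Methodology": "Mean consensus", "RMSE": 1.50, "AUC": 0.74},
]

kept = filter_consensus(rows)
print([row["Methodology"] for row in kept])
```

Records shaped like these can be turned into the expected DataFrame with pd.DataFrame(rows).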
- OCDocker.OCScore.Analysis.Plotting.plot_barplots(df, n_trials, colour_mapping, output_dir)
Generate sorted barplots of mean RMSE and AUC across methodologies with annotations.
- Parameters:
df (pd.DataFrame) – Data containing ‘RMSE’, ‘AUC’, and ‘Methodology’.
n_trials (int) – Number of trials, used for the title and output naming.
colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.
output_dir (str) – Directory to save the barplot images.
- Return type:
None
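To make “sorted barplots of mean RMSE and AUC” concrete, the aggregation step might look like the sketch below. It uses only the standard library. The helper names and the sort directions (lowest RMSE first, highest AUC first) are assumptions, not taken from the package source.

```python
from collections import defaultdict
from statistics import mean


def mean_by_methodology(rows, metric):
    """Average one metric column per Methodology label."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Methodology"]].append(row[metric])
    return {name: mean(values) for name, values in grouped.items()}


def sorted_means(rows, metric, descending=False):
    """Return (methodology, mean) pairs ordered by the mean value."""
    means = mean_by_methodology(rows, metric)
    return sorted(means.items(), key=lambda item: item[1], reverse=descending)


# Hypothetical trial results.
rows = [
    {"Methodology": "A", "RMSE": 1.0, "AUC": 0.80},
    {"Methodology": "A", "RMSE": 2.0, "AUC": 0.90},
    {"Methodology": "B", "RMSE": 1.0, "AUC": 0.70},
]

rmse_order = sorted_means(rows, "RMSE")                  # assumed: lowest error first
auc_order = sorted_means(rows, "AUC", descending=True)   # assumed: highest AUC first
print(rmse_order)
print(auc_order)
```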
- OCDocker.OCScore.Analysis.Plotting.plot_scatterplot(df_rmse, df_auc, df_all, n_trials, colour_mapping, output_dir, orientation='horizontal', alpha=0.9)
Create scatter plots of RMSE vs AUC for all methods and filtered subsets.
Create a 1x3 panel of scatter plots (RMSE vs AUC): all filtered points, the RMSE-filtered subset, and the AUC-filtered subset.
- Parameters:
df_rmse (pd.DataFrame) – DataFrame filtered by RMSE threshold.
df_auc (pd.DataFrame) – DataFrame filtered by AUC threshold.
df_all (pd.DataFrame) – DataFrame with all filtered points.
n_trials (int) – Number of top trials considered.
colour_mapping (dict[str, tuple[float, float, float]]) – Dictionary mapping methodologies to colors.
output_dir (str) – Directory to save the scatter plot image.
orientation (str, optional) – Orientation of the scatter plot. Default is ‘horizontal’. Options: ‘horizontal’, ‘vertical’.
alpha (float, optional) – Transparency for the markers. Default is 0.9.
- Raises:
ValueError – If the orientation parameter is not ‘horizontal’ or ‘vertical’.
- Return type:
None
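The orientation check described under Raises can be sketched as follows. The 1x3 layout for ‘horizontal’ comes from the description above. The 3x1 layout for ‘vertical’ and the helper name panel_grid are assumptions made for illustration.

```python
def panel_grid(orientation="horizontal"):
    """Map an orientation string to a (rows, columns) subplot grid."""
    if orientation == "horizontal":
        return (1, 3)
    if orientation == "vertical":
        # Assumption: the vertical layout stacks the same three panels.
        return (3, 1)
    raise ValueError(
        f"orientation must be 'horizontal' or 'vertical', got {orientation!r}"
    )


print(panel_grid("horizontal"))
print(panel_grid("vertical"))
```

Because the signature takes df_rmse and df_auc before df_all, passing the three DataFrames by keyword avoids mixing them up.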
- OCDocker.OCScore.Analysis.Plotting.plot_bar_with_significance(gh_df, metric, y_col='diff', colour_mapping=None, output_dir='plots', top_n=30)
Plot Games-Howell pairwise differences as a horizontal bar chart.
- Parameters:
gh_df (pd.DataFrame) – Output of pingouin.pairwise_gameshowell (expects columns ‘A’, ‘B’, ‘diff’, ‘pval’).
metric (str) – Metric label for titling (‘AUC’ or ‘RMSE’).
y_col (str) – Which column from gh_df to plot as bar length (default ‘diff’).
colour_mapping (dict | None, optional) – Unused here, accepted for API compatibility. Default: None.
output_dir (str) – Where to save the plot image. Default: ‘plots’.
top_n (int | None, optional) – If given, keep the top-N pairs by smallest p-value. Default: 30.
- Return type:
None
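The top_n behaviour (keep the pairs with the smallest p-values, or all pairs when top_n is None) can be illustrated with plain dictionaries that mirror the ‘A’, ‘B’, ‘diff’, ‘pval’ columns. The helper name top_pairs and the sample values are hypothetical, and the function itself may select rows differently.

```python
def top_pairs(pairs, top_n=30):
    """Rank pairwise comparisons by ascending p-value and keep the first top_n."""
    ranked = sorted(pairs, key=lambda pair: pair["pval"])
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical Games-Howell rows.
pairs = [
    {"A": "NN", "B": "XGB", "diff": 0.02, "pval": 0.40},
    {"A": "NN", "B": "Vina", "diff": 0.15, "pval": 0.001},
    {"A": "XGB", "B": "Vina", "diff": 0.13, "pval": 0.01},
]

strongest = top_pairs(pairs, top_n=2)
print([(pair["A"], pair["B"]) for pair in strongest])
```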
- OCDocker.OCScore.Analysis.Plotting.plot_heatmap(gh_df, title, metric, output_dir='plots')
Heatmap of Games-Howell p-values across methodology pairs.
- Parameters:
gh_df (pd.DataFrame) – Output of pingouin.pairwise_gameshowell (expects columns ‘A’, ‘B’, ‘diff’, ‘pval’).
title (str) – Title for the heatmap.
metric (str) – Metric label for titling (‘AUC’ or ‘RMSE’).
output_dir (str) – Where to save the plot image. Default: ‘plots’.
- Return type:
None
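A heatmap over methodology pairs needs the long-format pairwise table reshaped into a square matrix. The sketch below shows one way to do that with nested dictionaries. Mirroring each p-value across the diagonal and leaving the diagonal empty are assumptions, and pvalue_matrix is a hypothetical helper rather than part of the package.

```python
def pvalue_matrix(pairs):
    """Build a symmetric methodology-by-methodology table of p-values."""
    labels = sorted({pair["A"] for pair in pairs} | {pair["B"] for pair in pairs})
    # Cells start empty; the diagonal stays None because a method is not compared to itself.
    matrix = {row: {col: None for col in labels} for row in labels}
    for pair in pairs:
        matrix[pair["A"]][pair["B"]] = pair["pval"]
        matrix[pair["B"]][pair["A"]] = pair["pval"]
    return labels, matrix


# Hypothetical Games-Howell rows.
pairs = [
    {"A": "A", "B": "B", "diff": 0.10, "pval": 0.01},
    {"A": "A", "B": "C", "diff": 0.02, "pval": 0.50},
    {"A": "B", "B": "C", "diff": -0.08, "pval": 0.03},
]

labels, matrix = pvalue_matrix(pairs)
for row in labels:
    print(row, [matrix[row][col] for col in labels])
```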
- OCDocker.OCScore.Analysis.Plotting.plot_normality_and_variance_diagnostics(df, metric, n_trials, output_dir='plots')
Perform and plot normality and variance diagnostics across methodologies.
Quick diagnostics across groups: Shapiro-Wilk p-values per methodology (bar chart of -log10 p), and group variances (bar chart) annotated with Levene’s p-value.
- Parameters:
df (pd.DataFrame) – Data containing ‘Methodology’ and the specified metric.
metric (str) – Metric column to analyze (e.g., ‘AUC’ or ‘RMSE’).
n_trials (int) – Number of trials for title and output naming.
output_dir (str) – Directory to save the diagnostics plot. Default: ‘plots’.
- Return type:
None
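The two quantities plotted here are easy to reproduce by hand. The sketch below converts p-values to -log10 p and computes per-group variances. The p-values are hypothetical inputs (in practice they would come from a Shapiro-Wilk test such as scipy.stats.shapiro), the helper names are made up, and the use of the sample variance with an n - 1 denominator is an assumption.

```python
import math
from statistics import variance


def neg_log10(p_value, floor=1e-300):
    """Return -log10(p), clamping p away from zero so the logarithm stays finite."""
    return -math.log10(max(p_value, floor))


def group_variances(groups):
    """Sample variance (n - 1 denominator) of the metric for each methodology."""
    return {name: variance(values) for name, values in groups.items()}


# Hypothetical Shapiro-Wilk p-values per methodology.
shapiro_p = {"A": 0.05, "B": 0.001}
bar_heights = {name: neg_log10(p) for name, p in shapiro_p.items()}

# Hypothetical metric values per methodology.
groups = {"A": [1.0, 2.0, 3.0], "B": [2.0, 2.0, 2.0]}
variances = group_variances(groups)

print(bar_heights)
print(variances)
```

On this scale taller bars mean stronger evidence against normality, and the conventional 0.05 threshold sits at about 1.30.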
- OCDocker.OCScore.Analysis.Plotting.plot_pca_importance_barplot(importance_df, pca_type, n_features, n_trials, output_dir='plots')
Barplot of top-N PCA feature importances.
- Parameters:
importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.
pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).
n_features (int) – Number of top features to display.
n_trials (int) – Number of trials for title and output naming.
output_dir (str) – Directory to save the barplot image. Default: ‘plots’.
- Return type:
None
- OCDocker.OCScore.Analysis.Plotting.plot_pca_importance_histogram(importance_df, pca_type, n_trials, output_dir='plots')
Histogram of PCA feature importances.
- Parameters:
importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.
pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).
n_trials (int) – Number of trials for title and output naming.
output_dir (str) – Directory to save the histogram image. Default: ‘plots’.
- Return type:
None
- OCDocker.OCScore.Analysis.Plotting.save_pca_importance_groups(importance_df, pca_type, n_trials, output_dir='plots')
Assign coarse groups by quantiles and save as CSV.
- Parameters:
importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.
pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).
n_trials (int) – Number of trials for title and output naming.
output_dir (str) – Directory to save the CSV file. Default: ‘plots’.
- Return type:
None
- OCDocker.OCScore.Analysis.Plotting.save_pca_importance_bins(importance_df, pca_type, n_trials, output_dir='plots', n_bins=10)
Assign quantile bins (qcut) and save as CSV.
- Parameters:
importance_df (pd.DataFrame) – DataFrame with ‘Feature’ and ‘Importance’ columns.
pca_type (str) – PCA type label for titling (e.g., ‘1’, ‘2’).
n_trials (int) – Number of trials for title and output naming.
output_dir (str) – Directory to save the CSV file. Default: ‘plots’.
n_bins (int) – Number of quantile bins to create. Default: 10.
- Return type:
None
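Quantile binning in the spirit of qcut can be approximated with the standard library, which may help when checking the saved CSV by hand. This is a rough stand-in rather than the function's actual code: the helper quantile_bins, the ‘Bin’ column name, and the hypothetical feature names are assumptions, and ties at bin edges may be handled differently by pandas.

```python
import csv
import io
from bisect import bisect_left
from statistics import quantiles


def quantile_bins(values, n_bins=10):
    """Assign each value a 0-based quantile bin with right-inclusive edges."""
    # n_bins - 1 interior cut points, interpolated within the observed range.
    cuts = quantiles(values, n=n_bins, method="inclusive")
    return [bisect_left(cuts, value) for value in values]


# Hypothetical feature importances.
features = [f"feat_{i}" for i in range(1, 11)]
importances = [float(i) for i in range(1, 11)]
bins = quantile_bins(importances, n_bins=5)

# Write the table as CSV into memory; a real run would write to a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Feature", "Importance", "Bin"])
writer.writerows(zip(features, importances, bins))
print(buffer.getvalue())
```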
- OCDocker.OCScore.Analysis.Plotting.set_color_mapping(df, palette_colour='glasbey')
Set the color palette for plotting based on the unique methodologies in the DataFrame.
- Parameters:
df (pd.DataFrame) – DataFrame containing a ‘Methodology’ column with unique methodologies.
palette_colour (str) – Name of the color palette to use. Default is “glasbey”. Options: “glasbey”, “Set2”, “Set3”, “tab10”, “tab20”, “colorblind”, “pastel”, “bright”, “dark”, “deep”, “muted”, “viridis”.
- Returns:
color_mapping – Dictionary mapping each methodology to a color in RGB format.
- Return type:
dict[str, tuple[float, float, float]]
- Raises:
ValueError – If an unsupported palette is provided.
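The returned mapping has the shape dict[str, tuple[float, float, float]], with RGB components between 0 and 1. The sketch below builds a dictionary of that shape by hand, which can be useful for passing a custom colour_mapping to the plotting functions. It does not use any of the named palettes above: spacing hues evenly with colorsys is a substitute technique, and evenly_spaced_colours is a hypothetical helper.

```python
import colorsys


def evenly_spaced_colours(methodologies):
    """Map each unique methodology to an RGB tuple with components in [0, 1]."""
    unique = sorted(set(methodologies))
    count = len(unique)
    return {
        # Spread hues evenly around the colour wheel; saturation and value are fixed.
        name: colorsys.hsv_to_rgb(index / count, 0.65, 0.9)
        for index, name in enumerate(unique)
    }


# Hypothetical 'Methodology' column values, with repeats.
mapping = evenly_spaced_colours(["A", "B", "A", "C"])
for name, rgb in mapping.items():
    print(name, tuple(round(component, 3) for component in rgb))
```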