Cross-Sectional Comparison Report
This page documents the multi-method comparison workflow used to evaluate raw and harmonised datasets side by side.
The comparison report expects a dictionary of equally shaped data matrices that share the same batch labels and optional covariates. It runs the same diagnostic suite on each method, then combines the outputs into a scorecard and a short recommendation summary.
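As a minimal sketch of the expected inputs, the snippet below builds two equally shaped matrices and one shared batch vector. It assumes samples are stored in rows and uses NumPy; the method names (`raw`, `batch_centred`), the site labels, and the per-batch mean centring are illustrative stand-ins, not part of the library.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 12, 5

# One batch label per sample (row); the same vector is shared by every method.
batch = np.repeat(["site_a", "site_b", "site_c"], 4)

# Illustrative raw data with an artificial per-batch offset.
offsets = {"site_a": 0.0, "site_b": 1.5, "site_c": -1.0}
raw = rng.normal(size=(n_samples, n_features))
raw += np.array([offsets[label] for label in batch])[:, None]

# Stand-in "harmonised" matrix: subtract each batch's feature means.
centred = raw.copy()
for label in np.unique(batch):
    mask = batch == label
    centred[mask] -= centred[mask].mean(axis=0)

# Mapping of method name to data matrix, all with identical shape.
datasets = {"raw": raw, "batch_centred": centred}
same_shape = len({matrix.shape for matrix in datasets.values()}) == 1
```

Because every matrix shares the same sample order, a single `batch` vector (and, optionally, a single covariate matrix) describes all methods at once.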
Public entry point
Create a comparative diagnostic report for multiple harmonisation methods.
The comparison report runs the same diagnostic suite on each candidate dataset, then aggregates the resulting metrics into a method scorecard and a short recommendation summary. It is intended for side-by-side evaluation of raw and harmonised outputs that share the same sample order, batch labels, and optional covariates.
The report reuses the same per-method diagnostic pipeline as the single cross-sectional workflow through the following helpers:
- `validate_comparison_datasets`: checks that all methods are compatible.
- `_run_single_method_diagnostics`: runs the full diagnostic suite.
- `summarise_method_performance`: builds the comparison scorecard.
- `generate_comparison_advice`: turns the scorecard into a recommendation.
- `_save_comparison_results`: exports per-method CSV artifacts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `datasets` | `dict[str, ndarray]` | Mapping of method name to data matrix | required |
| `batch` | | Batch vector of length | required |
| `covariates` | | Optional covariate matrix | `None` |
| `covariate_names` | | Optional covariate names. | `None` |
| `save_data` | `bool` | Whether to save per-method per-test CSV outputs. | `True` |
| `save_data_name` | `str \| None` | Optional prefix to include in per-method saved CSV names. | `None` |
| `feature_names` | | Optional feature names. | `None` |
| `save_dir` | `str \| PathLike \| None` | Directory for report and CSV outputs. | `None` |
| `report_name` | `str \| None` | HTML report name. | `None` |
| `scoring_config` | `dict \| None` | Optional scoring configuration. | `None` |
| `rep` | | Optional existing | `None` |
| `SaveArtifacts` | `bool` | Whether to save report artifacts. | `False` |
| `show` | `bool` | Whether to show plots interactively. | `False` |
| `timestamped_reports` | `bool` | Whether to timestamp the report filename. | `True` |
| `covariate_types` | | Optional covariate type codes. | `None` |
| `ratio_type` | `str` | Variance-ratio mode. | `'rest'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `StatsReporter` | `StatsReporter` | Report object containing method-wise diagnostics, side-by-side plots, scorecard, and advice. |
Supporting helpers
The report uses a small set of helper utilities to validate inputs, run the diagnostics, and aggregate the results:
`validate_comparison_datasets`: validate and normalise the datasets used by the comparison report.
The function enforces a non-empty mapping of method name to 2D data array, checks that every method has the same shape, and validates that batch, covariate, and feature-name dimensions are compatible with the data.
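The checks described above can be sketched as follows. This is a hypothetical re-implementation for illustration only: the function name `check_comparison_inputs`, the error messages, and the assumption that samples are rows and features are columns are not taken from the library.

```python
import numpy as np


def check_comparison_inputs(datasets, batch, covariates=None, feature_names=None):
    """Illustrative validation of comparison inputs (not the library's code)."""
    # Require a non-empty mapping of method name to data.
    if not isinstance(datasets, dict) or not datasets:
        raise ValueError("datasets must be a non-empty mapping")

    arrays = {name: np.asarray(data) for name, data in datasets.items()}
    shapes = {name: array.shape for name, array in arrays.items()}

    # Every method must supply a 2D array.
    for name, shape in shapes.items():
        if len(shape) != 2:
            raise ValueError(f"{name!r} must be 2D, got shape {shape}")

    # Every method must have the same shape.
    if len(set(shapes.values())) != 1:
        raise ValueError(f"all methods must share one shape, got {shapes}")

    # Assumed layout: rows are samples, columns are features.
    n_samples, n_features = next(iter(shapes.values()))
    if len(batch) != n_samples:
        raise ValueError("batch length must match the number of samples")
    if covariates is not None and len(covariates) != n_samples:
        raise ValueError("covariates must have one row per sample")
    if feature_names is not None and len(feature_names) != n_features:
        raise ValueError("feature_names must have one entry per feature")
    return arrays
```

Failing early with a single `ValueError` per problem keeps a shape mismatch from surfacing later as an opaque error inside one of the diagnostics.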
`_run_single_method_diagnostics`: execute the full diagnostic suite for one harmonisation method.
The function runs the same core tests used by the single-dataset report, captures any failures as strings in the result object, and returns the collected outputs for downstream comparison, plotting, and export.
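The pattern of capturing failures as strings, rather than letting one failing test abort the whole run, might look like the sketch below. The names `run_diagnostics`, `mean_of`, and `always_fails`, and the `results`/`errors` layout of the returned dictionary, are assumptions made for illustration; the actual result object may be structured differently.

```python
def run_diagnostics(data, tests):
    """Run each test on data, recording failures as strings instead of raising."""
    results, errors = {}, {}
    for name, test in tests.items():
        try:
            results[name] = test(data)
        except Exception as exc:
            # Store a readable message so later steps can still report it.
            errors[name] = f"{type(exc).__name__}: {exc}"
    return {"results": results, "errors": errors}


def mean_of(values):
    return sum(values) / len(values)


def always_fails(values):
    raise ValueError("not enough batches")


outcome = run_diagnostics(
    [1.0, 2.0, 3.0],
    {"mean": mean_of, "broken": always_fails},
)
```

Here `outcome["results"]` holds the value from `mean_of`, while `outcome["errors"]` records the message from `always_fails`, so the remaining tests and the comparison step can still proceed.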
`summarise_method_performance`: turn per-method diagnostics into a comparable scorecard.
The summary combines the extracted metrics into category-level scores for additive, multiplicative, linear-modelling, distributional, and PCA behaviour. Optional scoring configuration can reweight the metrics or mark specific metrics as higher-is-better.
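One way such reweighting and direction handling could work is sketched below. This is not the library's scoring scheme: the function `build_scorecard`, the min-max scaling, the `weights` and `higher_is_better` arguments, and the metric names are all hypothetical, and the sketch skips the category-level grouping for brevity. It also assumes every method reports every metric.

```python
def build_scorecard(metrics, weights=None, higher_is_better=()):
    """Combine per-method metrics into one score in [0, 1] per method."""
    weights = weights or {}
    metric_names = sorted({m for values in metrics.values() for m in values})
    total_weight = sum(weights.get(m, 1.0) for m in metric_names)
    if total_weight <= 0:
        raise ValueError("metric weights must sum to a positive value")

    scores = {method: 0.0 for method in metrics}
    for metric in metric_names:
        column = {method: values[metric] for method, values in metrics.items()}
        low, high = min(column.values()), max(column.values())
        span = high - low
        for method, value in column.items():
            # Min-max scale across methods; ties get a neutral 0.5.
            scaled = 0.5 if span == 0 else (value - low) / span
            # By default lower is better, so flip unless marked otherwise.
            if metric not in higher_is_better:
                scaled = 1.0 - scaled
            scores[method] += weights.get(metric, 1.0) * scaled / total_weight
    return scores


metrics = {
    "raw": {"batch_variance_ratio": 4.0, "covariate_r2": 0.30},
    "harmonised": {"batch_variance_ratio": 1.0, "covariate_r2": 0.28},
}
scores = build_scorecard(
    metrics,
    weights={"batch_variance_ratio": 2.0},
    higher_is_better={"covariate_r2"},
)
```

With the batch metric weighted twice as heavily as the covariate metric, the `harmonised` entry scores higher than `raw` in this toy example even though it loses slightly on `covariate_r2`.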