Cross-Sectional Comparison Report

This page documents the multi-method comparison workflow used to evaluate raw and harmonised datasets side by side.

The comparison report expects a dictionary of equally shaped data matrices that share the same batch labels and optional covariates. It runs the same diagnostic suite on each method, then combines the outputs into a scorecard and a short recommendation summary.
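As a minimal sketch of the input layout described above, the snippet below builds a dictionary of equally shaped matrices with a shared batch vector and covariate matrix. The method names ("raw", "harmonised"), the site labels, and the mean-centring transform are illustrative assumptions, not part of the package.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 6, 4

# Hypothetical raw data and a stand-in "harmonised" version of it.
raw = rng.normal(size=(n_samples, n_features))
harmonised = raw - raw.mean(axis=0)  # placeholder transform, not a real harmonisation

# Method name -> data matrix; every matrix has the same shape and sample order.
datasets = {"raw": raw, "harmonised": harmonised}

# One batch label per sample, shared by every method.
batch = np.array(["site_a", "site_a", "site_a", "site_b", "site_b", "site_b"])

# Optional covariates, one row per sample (here: age and a binary code).
covariates = np.column_stack(
    [rng.uniform(20, 80, n_samples), rng.integers(0, 2, n_samples)]
)
covariate_names = ["age", "sex"]
```

Because the batch labels and covariates are shared, row i must refer to the same sample in every matrix.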

Public entry point

Create a comparative diagnostic report for multiple harmonisation methods.

The comparison report runs the same diagnostic suite on each candidate dataset, then aggregates the resulting metrics into a method scorecard and a short recommendation summary. It is intended for side-by-side evaluation of raw and harmonised outputs that share the same sample order, batch labels, and optional covariates.

The report reuses the same per-method diagnostic pipeline as the single cross-sectional workflow through the following helpers:

  • validate_comparison_datasets: checks that all methods are compatible.
  • _run_single_method_diagnostics: runs the full diagnostic suite.
  • summarise_method_performance: builds the comparison scorecard.
  • generate_comparison_advice: turns the scorecard into a recommendation.
  • _save_comparison_results: exports per-method CSV artifacts.

Parameters:

  • datasets (dict[str, ndarray], required): Mapping of method name to data matrix (n_samples, n_features).
  • batch (required): Batch vector of length n_samples.
  • covariates (default None): Optional covariate matrix (n_samples, n_covariates).
  • covariate_names (default None): Optional covariate names.
  • save_data (bool, default True): Whether to save per-method per-test CSV outputs.
  • save_data_name (str | None, default None): Optional prefix to include in per-method saved CSV names.
  • feature_names (default None): Optional feature names.
  • save_dir (str | PathLike | None, default None): Directory for report and CSV outputs.
  • report_name (str | None, default None): HTML report name.
  • scoring_config (dict | None, default None): Optional scoring configuration.
  • rep (default None): Optional existing StatsReporter instance.
  • SaveArtifacts (bool, default False): Whether to save report artifacts.
  • show (bool, default False): Whether to show plots interactively.
  • timestamped_reports (bool, default True): Whether to timestamp the report filename.
  • covariate_types (default None): Optional covariate type codes.
  • ratio_type (str, default 'rest'): Variance-ratio mode.

Returns:

  • StatsReporter: Report object containing method-wise diagnostics, side-by-side plots, scorecard, and advice.

Supporting helpers

The report uses a small set of helper utilities to validate inputs, run the diagnostics, and aggregate the results:

validate_comparison_datasets

Validate and normalise the datasets used by the comparison report.

The function enforces a non-empty mapping of method name to 2D data array, checks that every method has the same shape, and validates that batch, covariate, and feature-name dimensions are compatible with the data.
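The checks listed above can be sketched as follows. This is a hypothetical stand-in written for illustration; the function name check_comparison_inputs, its return value, and its error messages are assumptions and do not reproduce the package's actual helper.

```python
import numpy as np

def check_comparison_inputs(datasets, batch, covariates=None, feature_names=None):
    """Hypothetical sketch of the validation rules described above."""
    if not isinstance(datasets, dict) or not datasets:
        raise ValueError("datasets must be a non-empty mapping of name to array")

    arrays = {name: np.asarray(data) for name, data in datasets.items()}
    shapes = {name: arr.shape for name, arr in arrays.items()}

    # Every method must supply a 2D matrix, and all matrices must match.
    for name, shape in shapes.items():
        if len(shape) != 2:
            raise ValueError(f"{name!r} must be 2D, got shape {shape}")
    if len(set(shapes.values())) != 1:
        raise ValueError(f"all methods must share one shape, got {shapes}")

    n_samples, n_features = next(iter(shapes.values()))

    # Batch, covariates and feature names must agree with the data dimensions.
    batch = np.asarray(batch)
    if batch.shape != (n_samples,):
        raise ValueError(f"batch must have length {n_samples}")
    if covariates is not None:
        covariates = np.asarray(covariates)
        if covariates.ndim != 2 or covariates.shape[0] != n_samples:
            raise ValueError(f"covariates must have shape ({n_samples}, n_covariates)")
    if feature_names is not None and len(feature_names) != n_features:
        raise ValueError(f"feature_names must have length {n_features}")

    return arrays, batch, covariates
```

Failing early on a shape mismatch keeps a misaligned dataset from producing diagnostics that look valid but compare different samples.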

_run_single_method_diagnostics

Execute the full diagnostic suite for one harmonisation method.

The function runs the same core tests used by the single-dataset report, captures any failures as strings in the result object, and returns the collected outputs for downstream comparison, plotting, and export.
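One way to capture failures as strings, so that a single broken test does not abort the rest of the suite, is sketched below. The runner, the two toy tests, and the "FAILED:" message format are all hypothetical and only illustrate the pattern.

```python
def run_diagnostics(data, batch, tests):
    """Run each named test, storing either its output or an error string."""
    results = {}
    for name, test in tests.items():
        try:
            results[name] = test(data, batch)
        except Exception as exc:  # keep going so one failure does not stop the suite
            results[name] = f"FAILED: {type(exc).__name__}: {exc}"
    return results

def mean_by_batch(data, batch):
    """Toy diagnostic: average of the per-sample means within each batch."""
    groups = {}
    for row, label in zip(data, batch):
        groups.setdefault(label, []).append(sum(row) / len(row))
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

def always_fails(data, batch):
    """Toy diagnostic that simulates a failing test."""
    raise RuntimeError("not enough samples")

results = run_diagnostics(
    data=[[1.0, 3.0], [2.0, 4.0], [10.0, 12.0]],
    batch=["a", "a", "b"],
    tests={"mean_by_batch": mean_by_batch, "broken_test": always_fails},
)
```

Downstream steps can then tell a failed test from a successful one by checking whether the stored value is a string.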

summarise_method_performance

Turn per-method diagnostics into a comparable scorecard.

The summary combines the extracted metrics into category-level scores for additive, multiplicative, linear-modelling, distributional, and PCA behaviour. Optional scoring configuration can reweight the metrics or mark specific metrics as higher-is-better.
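The sketch below shows one plausible way to turn metrics into category scores with optional reweighting and higher-is-better flags. The metric names, the config keys "weights" and "higher_is_better", and the min-max scaling are assumptions made for illustration; the package's actual scoring rules may differ.

```python
def build_scorecard(metrics, categories, config=None):
    """Hypothetical scorecard: weighted mean of min-max scaled metrics (1 = best)."""
    config = config or {}
    weights = config.get("weights", {})
    higher_is_better = set(config.get("higher_is_better", ()))
    methods = list(metrics)
    scorecard = {method: {} for method in methods}

    for category, names in categories.items():
        totals = {method: 0.0 for method in methods}
        weight_sum = 0.0
        for name in names:
            values = [metrics[method][name] for method in methods]
            lo, hi = min(values), max(values)
            weight = weights.get(name, 1.0)
            weight_sum += weight
            for method in methods:
                # Scale to [0, 1]; ties across all methods get a neutral 0.5.
                scaled = 0.5 if hi == lo else (metrics[method][name] - lo) / (hi - lo)
                if name not in higher_is_better:
                    scaled = 1.0 - scaled  # lower raw values are better by default
                totals[method] += weight * scaled
        for method in methods:
            scorecard[method][category] = totals[method] / weight_sum if weight_sum else 0.0
    return scorecard

# Illustrative metric values only; these are not real results.
metrics = {
    "raw": {"batch_effect_size": 0.8, "pc_batch_correlation": 0.6, "covariate_signal": 0.9},
    "harmonised": {"batch_effect_size": 0.2, "pc_batch_correlation": 0.1, "covariate_signal": 0.7},
}
categories = {
    "additive": ["batch_effect_size"],
    "pca": ["pc_batch_correlation", "covariate_signal"],
}
config = {"weights": {"pc_batch_correlation": 3.0}, "higher_is_better": ["covariate_signal"]}
scorecard = build_scorecard(metrics, categories, config)
```

Scaling each metric before weighting keeps metrics on different scales from dominating a category score.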

generate_comparison_advice

Generate a short natural-language recommendation from the scorecard.

The advice selects the best overall method, identifies the strongest method for each diagnostic theme, and adds a short note when the diagnostics favour different methods in different domains.
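A minimal sketch of this selection logic is shown below, assuming a scorecard shaped as method name to category scores where higher is better. The function name write_advice, the unweighted mean used for the overall score, and the wording of the output are hypothetical.

```python
def write_advice(scorecard):
    """Hypothetical advice text built from a {method: {category: score}} scorecard."""
    methods = list(scorecard)
    categories = list(next(iter(scorecard.values())))

    # Overall score: plain mean of the category scores (an assumption).
    overall = {m: sum(scorecard[m].values()) / len(categories) for m in methods}
    best_overall = max(overall, key=overall.get)

    # Strongest method within each diagnostic category.
    best_by_category = {
        c: max(methods, key=lambda m: scorecard[m][c]) for c in categories
    }

    lines = [f"Best overall method: {best_overall}."]
    for category, method in best_by_category.items():
        lines.append(f"Strongest for {category}: {method}.")
    if len(set(best_by_category.values())) > 1:
        lines.append("Note: different methods lead in different categories.")
    return "\n".join(lines)

# Illustrative scores only; these are not real results.
advice = write_advice({
    "raw": {"additive": 0.1, "pca": 0.6},
    "harmonised": {"additive": 0.9, "pca": 0.4},
})
```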

_save_comparison_results

Write all available per-method diagnostics to disk.

The helper mirrors the standard single-report export behaviour, but prefixes files with the method name so the raw outputs from each harmonisation strategy stay separate inside the comparison report output directory.
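The prefixing idea can be sketched with the standard library as follows. The file-name pattern (method, optional prefix, test name), the function name save_method_tables, and the table contents are assumptions for illustration and may not match the names the package actually writes.

```python
import csv
import tempfile
from pathlib import Path

def save_method_tables(results, save_dir, prefix=None):
    """Write one CSV per method and test, with the method name leading the file name."""
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for method, tables in results.items():
        for test_name, rows in tables.items():
            if not rows:
                continue  # nothing to write for an empty table
            parts = [method, prefix, test_name] if prefix else [method, test_name]
            path = save_dir / ("_".join(parts) + ".csv")
            with path.open("w", newline="") as handle:
                writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
                writer.writeheader()
                writer.writerows(rows)
            written.append(path)
    return written

# Illustrative values only; these are not real results.
results = {
    "raw": {"variance": [{"feature": "f1", "ratio": 1.8}]},
    "harmonised": {"variance": [{"feature": "f1", "ratio": 1.1}]},
}
with tempfile.TemporaryDirectory() as tmp:
    written = save_method_tables(results, tmp, prefix="demo")
    names = sorted(path.name for path in written)
    first_file_text = written[0].read_text()
```

Leading with the method name means that sorting the output directory groups each method's files together.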