Cross-Sectional Comparison Report

This page documents the multi-method comparison workflow used to evaluate raw and harmonised datasets side by side.

The comparison report expects a dictionary of equally shaped data matrices that share the same batch labels and optional covariates. It runs the same diagnostic suite on each method, then combines the outputs into a scorecard and a short recommendation summary.

Public entry point

Create a comparative diagnostic report for multiple harmonisation methods.

The comparison report runs the same diagnostic suite on each candidate dataset, then aggregates the resulting metrics into a method scorecard and a short recommendation summary. It is intended for side-by-side evaluation of raw and harmonised outputs that share the same sample order, batch labels, and optional covariates.

The report reuses the same per-method diagnostic pipeline as the single cross-sectional workflow through the following helpers:

validate_comparison_datasets: checks that all methods are compatible.
_run_single_method_diagnostics: runs the full diagnostic suite.
summarise_method_performance: builds the comparison scorecard.
generate_comparison_advice: turns the scorecard into a recommendation.
_save_comparison_results: exports per-method CSV artifacts.

Parameters:

Name	Type	Description	Default
`datasets`	`dict[str, ndarray]`	Mapping of method name to data matrix `(n_samples, n_features)`.	required
`batch`		Batch vector of length `n_samples`.	required
`covariates`		Optional covariate matrix `(n_samples, n_covariates)`.	`None`
`covariate_names`		Optional covariate names.	`None`
`save_data`	`bool`	Whether to save per-method per-test CSV outputs.	`True`
`save_data_name`	`str | None`	Optional prefix to include in per-method saved CSV names.	`None`
`feature_names`		Optional feature names.	`None`
`save_dir`	`str | PathLike | None`	Directory for report and CSV outputs.	`None`
`report_name`	`str | None`	HTML report name.	`None`
`scoring_config`	`dict | None`	Optional scoring configuration.	`None`
`rep`		Optional existing `StatsReporter` instance.	`None`
`SaveArtifacts`	`bool`	Whether to save report artifacts.	`False`
`show`	`bool`	Whether to show plots interactively.	`False`
`timestamped_reports`	`bool`	Whether to timestamp the report filename.	`True`
`covariate_types`		Optional covariate type codes.	`None`
`ratio_type`	`str`	Variance-ratio mode.	`'rest'`

Returns:

Name	Type	Description
`StatsReporter`	`StatsReporter`	Report object containing method-wise diagnostics, side-by-side plots, scorecard, and advice.
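
As a minimal sketch of inputs that satisfy the parameter table above, the snippet below builds two equally shaped matrices, a batch vector, and optional covariates, and collects them under the documented parameter names. The data, the site labels, and the covariate and feature names are invented for illustration. The call itself is left out because this page does not show the entry point's name; the dictionary would be unpacked into that function.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 6, 4

# Two candidate datasets with identical shape and sample order.
raw = rng.normal(size=(n_samples, n_features))
harmonised = raw - raw.mean(axis=0)  # stand-in for a real harmonised output

# Keyword arguments named after the parameter table (values are illustrative).
report_kwargs = {
    "datasets": {"raw": raw, "harmonised": harmonised},
    "batch": np.array(["site_a"] * 3 + ["site_b"] * 3),
    "covariates": rng.normal(size=(n_samples, 2)),
    "covariate_names": ["age", "motion"],
    "feature_names": [f"feature_{i}" for i in range(n_features)],
    "save_data": False,
    "show": False,
}

# Every matrix must agree with the batch vector on n_samples.
shapes = {name: data.shape for name, data in report_kwargs["datasets"].items()}
print(shapes)
```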

Supporting helpers

The report uses a small set of helper utilities to validate inputs, run the diagnostics, and aggregate the results:

validate_comparison_datasets

Validate and normalise the datasets used by the comparison report.

The function enforces a non-empty mapping of method name to 2D data array, checks that every method has the same shape, and validates that batch, covariate, and feature-name dimensions are compatible with the data.
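
The checks described above can be sketched roughly as follows. `check_comparison_inputs` is a hypothetical stand-in written for this page, not the library's implementation, and its error messages and return value are assumptions.

```python
import numpy as np


def check_comparison_inputs(datasets, batch, covariates=None, feature_names=None):
    """Hypothetical re-implementation of the checks described above."""
    if not isinstance(datasets, dict) or not datasets:
        raise ValueError("datasets must be a non-empty dict of method name -> array")

    arrays = {name: np.asarray(data) for name, data in datasets.items()}
    shapes = {name: arr.shape for name, arr in arrays.items()}
    if any(arr.ndim != 2 for arr in arrays.values()):
        raise ValueError(f"every dataset must be 2D, got shapes {shapes}")
    if len(set(shapes.values())) != 1:
        raise ValueError(f"all datasets must share one shape, got {shapes}")

    n_samples, n_features = next(iter(shapes.values()))
    if len(batch) != n_samples:
        raise ValueError("batch length must equal n_samples")
    if covariates is not None and np.asarray(covariates).shape[0] != n_samples:
        raise ValueError("covariates must have n_samples rows")
    if feature_names is not None and len(feature_names) != n_features:
        raise ValueError("feature_names length must equal n_features")
    return arrays, (n_samples, n_features)


# Two compatible methods pass; a mismatched shape would raise ValueError.
arrays, shape = check_comparison_inputs(
    {"raw": np.zeros((4, 3)), "harmonised": np.ones((4, 3))},
    batch=["a", "a", "b", "b"],
)
print(shape)
```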

_run_single_method_diagnostics

Execute the full diagnostic suite for one harmonisation method.

The function runs the same core tests used by the single-dataset report, captures any failures as strings in the result object, and returns the collected outputs for downstream comparison, plotting, and export.
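
The "capture failures as strings" behaviour can be illustrated with a small sketch. The runner, the toy diagnostic, and the layout of the result dictionary are all assumptions made for this example; the real suite and result object are not shown on this page.

```python
import numpy as np


def run_tests_capturing_errors(data, batch, tests):
    """Run each named test; store its output or its error message as a string."""
    results, errors = {}, {}
    for name, test in tests.items():
        try:
            results[name] = test(data, batch)
        except Exception as exc:  # keep going so one failure does not abort the suite
            errors[name] = f"{type(exc).__name__}: {exc}"
    return {"results": results, "errors": errors}


def batch_mean_gap(data, batch):
    """Toy diagnostic: largest gap between per-batch feature means."""
    batch = np.asarray(batch)
    means = np.stack([data[batch == b].mean(axis=0) for b in np.unique(batch)])
    return float((means.max(axis=0) - means.min(axis=0)).max())


def always_fails(data, batch):
    """Toy diagnostic that always raises, to show error capture."""
    raise RuntimeError("diagnostic not applicable")


data = np.array([[0.0, 1.0], [0.0, 1.0], [2.0, 1.0], [2.0, 1.0]])
outcome = run_tests_capturing_errors(
    data, ["a", "a", "b", "b"], {"mean_gap": batch_mean_gap, "broken": always_fails}
)
print(outcome)
```

Because the failure is stored rather than raised, the remaining diagnostics still run and the comparison can report which tests were unavailable for a given method.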

summarise_method_performance

Turn per-method diagnostics into a comparable scorecard.

The summary combines the extracted metrics into category-level scores for additive, multiplicative, linear-modelling, distributional, and PCA behaviour. Optional scoring configuration can reweight the metrics or mark specific metrics as higher-is-better.
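
One way such a scorecard could be built is sketched below. This is an illustration only: the min-max rescaling, the metric names, and the `scoring_config` keys (`weights`, `higher_is_better`) are assumptions, since the real configuration schema is not documented on this page.

```python
def build_scorecard(metrics, categories, scoring_config=None):
    """metrics: {method: {metric: value}}; categories: {category: [metric, ...]}.

    Returns {method: {category: score}} with scores in [0, 1], 1 = best.
    """
    config = scoring_config or {}
    weights = config.get("weights", {})  # assumed key
    higher_is_better = set(config.get("higher_is_better", []))  # assumed key

    methods = list(metrics)
    scorecard = {method: {} for method in methods}
    for category, names in categories.items():
        totals = {method: 0.0 for method in methods}
        weight_sum = 0.0
        for name in names:
            values = [metrics[m][name] for m in methods]
            lo, span = min(values), max(values) - min(values)
            weight = weights.get(name, 1.0)
            weight_sum += weight
            for m in methods:
                if span == 0:
                    scaled = 1.0  # all methods tie on this metric
                else:
                    scaled = (metrics[m][name] - lo) / span
                    if name not in higher_is_better:
                        scaled = 1.0 - scaled  # lower raw value is better
                totals[m] += weight * scaled
        for m in methods:
            scorecard[m][category] = totals[m] / weight_sum if weight_sum else 0.0
    return scorecard


metrics = {
    "raw": {"batch_effect_size": 0.8, "variance_ratio_dev": 0.5, "covariate_r2": 0.30},
    "harmonised": {"batch_effect_size": 0.2, "variance_ratio_dev": 0.1, "covariate_r2": 0.28},
}
categories = {
    "additive": ["batch_effect_size"],
    "multiplicative": ["variance_ratio_dev"],
    "linear_modelling": ["covariate_r2"],
}
scorecard = build_scorecard(metrics, categories, {"higher_is_better": ["covariate_r2"]})
print(scorecard)
```

In this toy input the harmonised data wins the additive and multiplicative categories, while the raw data retains slightly more covariate signal and so wins the linear-modelling category.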

generate_comparison_advice

Generate a short natural-language recommendation from the scorecard.

The advice selects the best overall method, identifies the strongest method for each diagnostic theme, and adds a short note when the diagnostics favour different methods in different domains.
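
The selection logic described above might look like the following sketch. `draft_advice`, the unweighted mean used for the overall ranking, and the wording of the output are assumptions for illustration, not the library's actual text.

```python
def draft_advice(scorecard):
    """scorecard: {method: {category: score}}, where a higher score is better."""
    # Overall ranking: plain mean of the category scores (an assumption).
    overall = {m: sum(s.values()) / len(s) for m, s in scorecard.items()}
    best_overall = max(overall, key=overall.get)

    categories = list(next(iter(scorecard.values())))
    best_by_category = {
        c: max(scorecard, key=lambda m: scorecard[m][c]) for c in categories
    }

    lines = [
        f"Best overall method: {best_overall} "
        f"(mean score {overall[best_overall]:.2f})."
    ]
    for category, method in best_by_category.items():
        lines.append(f"Strongest for {category}: {method}.")
    if len(set(best_by_category.values())) > 1:
        lines.append("Note: different methods lead in different categories.")
    return best_overall, best_by_category, "\n".join(lines)


example_scorecard = {
    "raw": {"additive": 0.0, "multiplicative": 0.0, "linear_modelling": 1.0},
    "harmonised": {"additive": 1.0, "multiplicative": 1.0, "linear_modelling": 0.0},
}
best, by_category, text = draft_advice(example_scorecard)
print(text)
```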

_save_comparison_results

Write all available per-method diagnostics to disk.

The helper mirrors the standard single-report export behaviour, but prefixes files with the method name so the raw outputs from each harmonisation strategy stay separate inside the comparison report output directory.
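
A minimal sketch of method-prefixed export follows. The file-name pattern `<method>_<save_data_name>_<test>.csv` and both helper functions are assumptions chosen for illustration; the actual naming scheme is not specified here.

```python
import csv
import tempfile
from pathlib import Path


def method_csv_path(save_dir, method, test_name, save_data_name=None):
    """Build a per-method CSV path; the optional prefix is skipped when empty."""
    stem = "_".join(part for part in (method, save_data_name, test_name) if part)
    return Path(save_dir) / f"{stem}.csv"


def save_method_tables(save_dir, method, tables, save_data_name=None):
    """tables: {test_name: rows}, where the first row is the header."""
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for test_name, rows in tables.items():
        path = method_csv_path(save_dir, method, test_name, save_data_name)
        with path.open("w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    tables = {"variance_ratio": [["feature", "ratio"], ["feature_0", 1.1]]}
    names = sorted(
        path.name
        for method in ("raw", "harmonised")
        for path in save_method_tables(tmp, method, tables, save_data_name="study1")
    )
print(names)
```

Putting the method name first keeps each method's files grouped together when the output directory is listed alphabetically.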