DiagnosticReport
These are the main functions of the library. They produce an end-to-end report covering each result of the analysis, with an explanation of how to interpret it.
covariate_to_numeric(covariates)
Convert categorical covariates to numeric codes for downstream analyses.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| covariates | ndarray or DataFrame | Covariate matrix with categorical variables. | required |
Returns:
| Type | Description |
|---|---|
| ndarray or None | Covariates converted to a numeric array, or None if no covariates were provided. |
Notes
Each categorical column is factorized independently, whether covariates is a DataFrame or a NumPy array.
Numeric columns are left unchanged.
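The behaviour described in these notes can be illustrated with a short NumPy sketch. It is not the library's implementation: `factorize_columns_sketch` is a hypothetical stand-in, and the choice of `np.unique` (which assigns codes in sorted label order) is an assumption made only for illustration.

```python
import numpy as np

def factorize_columns_sketch(covariates):
    """Hypothetical stand-in for the documented behaviour, not the library code."""
    if covariates is None:
        return None                      # nothing to convert
    arr = np.asarray(covariates, dtype=object)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)         # treat a 1-D input as a single column
    out = np.empty(arr.shape, dtype=float)
    for j in range(arr.shape[1]):
        column = arr[:, j]
        try:
            # Numeric columns are left unchanged.
            out[:, j] = column.astype(float)
        except (TypeError, ValueError):
            # Categorical column: one integer code per distinct label,
            # computed independently of every other column.
            _, codes = np.unique(column.astype(str), return_inverse=True)
            out[:, j] = codes
    return out

covariates = [["F", 30.0], ["M", 41.5], ["F", 30.0]]
numeric = factorize_columns_sketch(covariates)
```

In this sketch the first column becomes integer codes and the second is passed through as floats. A column of numeric-looking strings would also be treated as numeric, which may or may not match what the library does.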
CrossSectionalReportMin(data, batch, covariates=None, covariate_names=None, save_data=True, save_data_name=None, save_dir=None, feature_names=None, report_name=None, SaveArtifacts=False, rep=None, show=False, timestamped_reports=True, covariate_types=None, ratio_type='rest')
Create a minimal cross-sectional diagnostic report for quick checks.
This version keeps the report lightweight by running a reduced subset of
diagnostics and visualizations. For a more comprehensive analysis, use
CrossSectionalReport.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | ndarray | Data matrix (samples x features). | required |
| batch | list or ndarray | Batch labels for each sample. | required |
| covariates | ndarray | Covariate matrix (samples x covariates). | None |
| covariate_names | list of str | Names of covariates. | None |
| save_data | bool | Whether to save input data and results. | True |
| save_data_name | str | Filename for saved data. | None |
| save_dir | str or PathLike | Directory to save report and data. | None |
| feature_names | list | Names of features. | None |
| report_name | str | Name of the report file. | None |
| SaveArtifacts | bool | Whether to save intermediate artifacts. | False |
| rep | StatsReporter | Existing report object to use. | None |
| show | bool | Whether to display plots interactively. | False |
| timestamped_reports | bool | Whether to append a timestamp to the report filename. | True |
| covariate_types | list | Types of covariates (e.g., 'categorical', 'numeric'). | None |
| ratio_type | str | Variance-ratio comparison mode passed to | 'rest' |
Returns:
| Name | Type | Description |
|---|---|---|
| StatsReporter | StatsReporter | The report object containing the generated figures, text, and saved artifact references. |
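As a rough illustration of the shapes these parameters expect, the sketch below builds synthetic inputs and checks that they line up before a report is requested. `check_report_inputs` is a hypothetical helper, not part of the documented API, and the data are random placeholders. The module that exports CrossSectionalReportMin is not named in this reference, so the call itself is left as a comment.

```python
import numpy as np

def check_report_inputs(data, batch, covariates=None,
                        covariate_names=None, feature_names=None):
    """Hypothetical pre-flight check; not part of the documented API."""
    data = np.asarray(data)
    if data.ndim != 2:
        raise ValueError("data must be 2-D: samples x features")
    n_samples, n_features = data.shape
    if len(batch) != n_samples:
        raise ValueError("batch needs one label per sample")
    if covariates is not None:
        covariates = np.asarray(covariates)
        if covariates.ndim != 2 or covariates.shape[0] != n_samples:
            raise ValueError("covariates must be samples x covariates")
        if covariate_names is not None and len(covariate_names) != covariates.shape[1]:
            raise ValueError("covariate_names needs one name per covariate column")
    if feature_names is not None and len(feature_names) != n_features:
        raise ValueError("feature_names needs one name per feature column")
    return n_samples, n_features

rng = np.random.default_rng(0)
data = rng.normal(size=(12, 4))                        # 12 samples x 4 features
batch = np.repeat(["site_A", "site_B", "site_C"], 4)   # one label per sample
covariates = rng.uniform(20, 80, size=(12, 1))         # a single numeric covariate
covariate_names = ["age"]
feature_names = [f"feature_{i}" for i in range(4)]

shape = check_report_inputs(data, batch, covariates, covariate_names, feature_names)

# Keyword arguments named after the documented parameters.
options = dict(
    covariates=covariates,
    covariate_names=covariate_names,
    feature_names=feature_names,
    save_data=False,
    show=False,
)
# The import path is not given in this reference, so the call stays commented:
# rep = CrossSectionalReportMin(data, batch, **options)
```

Setting save_data=False here only keeps the example free of side effects; the documented default is True.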
CrossSectionalReport(data, batch, covariates=None, covariate_names=None, save_data=True, save_data_name=None, save_dir=None, feature_names=None, report_name=None, SaveArtifacts=False, rep=None, show=False, timestamped_reports=True, covariate_types=None, ratio_type='rest', UMAP_embedding=True, UMAP_tuning='auto')
Create a full cross-sectional diagnostic report for batch effects.
The report combines summary text, statistical tests, and visualizations for mean, variance, covariance, clustering, and distributional differences across batches.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | ndarray | Data matrix (samples x features). | required |
| batch | list or ndarray | Batch labels for each sample. | required |
| covariates | ndarray | Covariate matrix (samples x covariates). | None |
| covariate_names | list of str | Names of covariates. | None |
| save_data | bool | Whether to save input data and results. | True |
| save_data_name | str | Filename for saved data. | None |
| save_dir | str or PathLike | Directory to save report and data. | None |
| feature_names | list | Names of features. | None |
| report_name | str | Name of the report file. | None |
| SaveArtifacts | bool | Whether to save intermediate artifacts. | False |
| rep | StatsReporter | Existing report object to use. | None |
| show | bool | Whether to display plots interactively. | False |
| timestamped_reports | bool | Whether to append a timestamp to the report filename. | True |
| covariate_types | list | Types of covariates used by the report's numeric and categorical workflows. | None |
| ratio_type | str | Variance-ratio comparison mode passed to | 'rest' |
Returns:
| Name | Type | Description |
|---|---|---|
| StatsReporter | StatsReporter | The report object containing the generated narrative, figures, and saved outputs. |
Notes
covariate_types should align with covariate_names so the report can
decide when to factorize categorical covariates and when to keep numeric
covariates unchanged.
If covariate_types is not provided, the function infers categorical
versus numeric handling from the supplied data.
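To make the alignment between covariate_names and covariate_types concrete, here is a small sketch. Both helpers are hypothetical, and the inference rule (a column is numeric if every value casts to float) is an assumption chosen for illustration; the library's own inference may differ.

```python
import numpy as np

def infer_covariate_types_sketch(covariates):
    """Hypothetical inference rule: numeric if the whole column casts to float."""
    arr = np.asarray(covariates, dtype=object)
    types = []
    for j in range(arr.shape[1]):
        try:
            arr[:, j].astype(float)
            types.append("numeric")
        except (TypeError, ValueError):
            types.append("categorical")
    return types

def pair_names_with_types(covariate_names, covariate_types):
    """Hypothetical check that the two lists describe the same columns, in order."""
    if len(covariate_names) != len(covariate_types):
        raise ValueError("covariate_types needs one entry per covariate name")
    return dict(zip(covariate_names, covariate_types))

covariates = [[34.0, "F", 0], [51.5, "M", 1], [47.0, "F", 1]]
covariate_names = ["age", "sex", "smoker"]

inferred = infer_covariate_types_sketch(covariates)
explicit = ["numeric", "categorical", "categorical"]
declared = pair_names_with_types(covariate_names, explicit)
```

Under this rule the 0/1-coded smoker column is inferred as numeric, which shows why an integer-coded categorical covariate is a case where supplying covariate_types explicitly can matter.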
LongitudinalReport(data, batch, subject_ids, timepoints, covariates=None, covariate_names=None, features=None, save_data=False, save_data_name=None, save_dir=None, report_name=None, SaveArtifacts=False, rep=None, show=False, timestamped_reports=True)
Create a diagnostic report for dataset differences across batches in longitudinal data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | ndarray | Data matrix (samples x features). | required |
| batch | list or ndarray | Batch labels for each sample. | required |
| subject_ids | list or ndarray | Subject IDs for each sample. | required |
| covariates | ndarray | Covariate matrix (samples x covariates). | None |
| covariate_names | list of str | Names of covariates. | None |
| save_data | bool | Whether to save input data and results. | False |
| save_data_name | str | Filename for saved data. | None |
| save_dir | str or PathLike | Directory to save report and data. | None |
| report_name | str | Name of the report file. | None |
| SaveArtifacts | bool | Whether to save intermediate artifacts. | False |
| rep | StatsReporter | Existing report object to use. | None |
| show | bool | Whether to display plots interactively. | False |
Outputs
Generates an HTML report with diagnostic plots and statistics for longitudinal data.
If save_data is True, also returns a dictionary and saves a CSV file with the input data and results.
If SaveArtifacts is True, saves intermediate plots to save_dir.
Note: This function is designed for repeated-measures data in which no longitudinal trend over time is expected. If the need arises, it will be complemented by an additional function that explicitly tests for an expected longitudinal trend.
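The repeated-measures layout this function expects can be sketched as follows. `summarize_repeats` is a hypothetical helper, not part of the documented API. The parameter table above does not describe timepoints, so the sketch assumes it holds one entry per sample, like subject_ids and batch.

```python
from collections import Counter

import numpy as np

def summarize_repeats(subject_ids, timepoints, batch):
    """Hypothetical sanity check for repeated-measures inputs."""
    subject_ids = np.asarray(subject_ids)
    timepoints = np.asarray(timepoints)
    batch = np.asarray(batch)
    if not (len(subject_ids) == len(timepoints) == len(batch)):
        raise ValueError("subject_ids, timepoints and batch need one entry per sample")
    pairs = list(zip(subject_ids.tolist(), timepoints.tolist()))
    if len(set(pairs)) != len(pairs):
        raise ValueError("each (subject, timepoint) pair should appear only once")
    # Number of samples contributed by each subject.
    visits = Counter(subject_ids.tolist())
    # Number of distinct batches in which each subject was measured.
    batches_per_subject = {
        s: len(set(batch[subject_ids == s].tolist())) for s in visits
    }
    return dict(visits), batches_per_subject

rng = np.random.default_rng(0)
subject_ids = np.repeat(["s01", "s02", "s03"], 2)   # each subject measured twice
timepoints = np.tile([0, 1], 3)                     # visit index per sample
batch = np.tile(["scanner_A", "scanner_B"], 3)      # batch label per sample
data = rng.normal(size=(6, 3))                      # 6 samples x 3 features

visits, batches_per_subject = summarize_repeats(subject_ids, timepoints, batch)
```

Here every subject appears in two batches, the kind of repeated measurement where differences between a subject's samples can be compared across batches without assuming a trend over time.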