This function estimates the number of significant shared components in a list of data matrices. It uses a streamlined Parallel Analysis approach: the singular value spectrum of the real data is compared against the spectrum of permuted (noise) data.

estimate_joint_rank(
  mat_list,
  n_permutations = 20,
  pre_scaling = c("frobenius", "none"),
  plot_scree = TRUE
)

Arguments

mat_list

A list of numeric matrices [subjects x features].

n_permutations

The number of permutations used to build the null distribution. More permutations give a more stable null distribution but take longer to compute. Defaults to 20.

pre_scaling

Method used to scale the matrices before combining them. `"frobenius"` (default) scales each matrix to have a total variance of 1, so that every matrix contributes equally regardless of its size or measurement scale. `"none"` performs no scaling.
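
As an illustration of what `"frobenius"` scaling does, the sketch below divides each matrix by its Frobenius norm so that its sum of squared entries becomes 1. This is a minimal sketch under assumptions: `scale_frobenius` is a hypothetical helper, not part of the function's interface, and whether the function also centers the columns first is not stated here.

```r
# Hypothetical helper for illustration; not the function's internal code.
scale_frobenius <- function(mat) {
  fro <- norm(mat, type = "F")  # square root of the sum of squared entries
  if (fro == 0) return(mat)     # leave an all-zero matrix untouched
  mat / fro
}

set.seed(1)
mat_list <- list(
  a = matrix(rnorm(50 * 10), nrow = 50),            # small block, unit scale
  b = matrix(rnorm(50 * 200, sd = 100), nrow = 50)  # large block, large scale
)
scaled <- lapply(mat_list, scale_frobenius)

# Sum of squared entries per block after scaling
sapply(scaled, function(m) sum(m^2))
```

Without this step, block `b` would dominate the combined spectrum simply because it has more columns and larger values.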

plot_scree

Logical. If TRUE (the default), generates a scree plot comparing the real and permuted singular values for visual diagnosis.

Value

A list containing:

optimal_k

The estimated number of significant components.

results

A tibble showing the real vs. permuted eigenvalues for each component.

plot

A ggplot object of the scree plot.

Details

The function first combines all matrices in `mat_list` (the modalities) into a single matrix. It then computes the SVD of this real data and takes the squared singular values as its eigenvalues. The same computation is repeated `n_permutations` times on shuffled versions of the data to generate a "null" (chance) eigenvalue distribution. The optimal k is the last component whose real eigenvalue exceeds the mean of the permuted eigenvalues.
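
The procedure can be sketched in base R as follows. This is an illustrative sketch, not the function's actual implementation: `sketch_joint_rank` is a hypothetical name, and three details are assumptions because the description does not specify them. The matrices are assumed to share rows and to be combined column-wise, the shuffling is assumed to permute each column independently, and "last component" is read as the last one before the real eigenvalue first drops to or below the null mean.

```r
# Hypothetical sketch of the parallel analysis steps described above.
sketch_joint_rank <- function(mat_list, n_permutations = 20) {
  # Scale each block to unit Frobenius norm, then bind blocks column-wise
  scaled <- lapply(mat_list, function(m) m / norm(m, type = "F"))
  combined <- do.call(cbind, scaled)

  # Eigenvalues of the real data: squared singular values
  real_eig <- svd(combined, nu = 0, nv = 0)$d^2

  # Null eigenvalues: shuffle every column independently, which breaks the
  # correlation between columns but keeps each column's own values
  perm_eig <- replicate(n_permutations, {
    shuffled <- apply(combined, 2, sample)
    svd(shuffled, nu = 0, nv = 0)$d^2
  })
  null_mean <- rowMeans(perm_eig)  # one mean per component

  # Count leading components that beat the null mean, stopping at the first
  # component that does not
  above <- real_eig > null_mean
  optimal_k <- if (all(above)) length(above) else which(!above)[1] - 1
  list(optimal_k = optimal_k, real = real_eig, null_mean = null_mean)
}

# Simulated example: two blocks driven by the same 3 latent components
set.seed(42)
n <- 100
scores <- matrix(rnorm(n * 3), nrow = n)
make_block <- function(p) {
  loadings <- matrix(rnorm(3 * p), nrow = 3)
  scores %*% loadings + matrix(rnorm(n * p, sd = 0.5), nrow = n)
}
fit <- sketch_joint_rank(list(make_block(40), make_block(60)))
fit$optimal_k
```

Because permuting spreads the total variance evenly across components, the null eigenvalues sit well below the real eigenvalues of the shared components and above those of the remaining noise components, which is what makes the crossing point a usable rank estimate.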