sparseDistanceMatrixXY.Rd
Uses k-nearest neighbor algorithms to estimate a sparse matrix measuring the distance, correlation, or covariance between two matched datasets. The validity of this function rests on the basic mathematical relationships between Euclidean distance and correlation, and between correlation and covariance. For applications of such matrices, see relevant publications by Mauro Maggioni and other authors.
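The relationships mentioned in the description can be checked numerically in base R. The sketch below is illustrative only: it assumes standardization with scale() (mean 0, sample standard deviation 1), which is not necessarily the normalization the function applies internally, and the variable names are made up for the example.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 0.6 * x + rnorm(n)

# Standardize each vector: mean 0 and sample sd 1, so sum(z^2) equals n - 1.
zx <- as.numeric(scale(x))
zy <- as.numeric(scale(y))

r <- cor(x, y)

# Squared Euclidean distance between standardized vectors:
# ||zx - zy||^2 = 2 * (n - 1) * (1 - r)
d2 <- sum((zx - zy)^2)
stopifnot(isTRUE(all.equal(d2, 2 * (n - 1) * (1 - r))))

# Covariance is correlation rescaled by the two standard deviations.
stopifnot(isTRUE(all.equal(cov(x, y), r * sd(x) * sd(y))))
```

Because the distance is a decreasing function of the correlation, the nearest neighbors under Euclidean distance on standardized data are also the most correlated ones.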
sparseDistanceMatrixXY(
x,
y,
k = 3,
r = Inf,
sigma = NA,
kmetric = c("euclidean", "correlation", "covariance", "gaussian"),
eps = 1e-06,
kPackage = "RcppHNSW",
ncores = NA,
verbose = FALSE
)
input matrix, should be n (samples) by p (measurements)
input matrix second view, should be n (samples) by q (measurements)
number of neighbors
radius of epsilon-ball
parameter for kernel PCA.
similarity or distance metric determining k nearest neighbors
epsilon error for rapid knn
name of package to use for knn. FNN is reproducible, but RcppHNSW is much faster for larger problems (with the number of threads controlled by the environment variable ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS). For large problems, compute the regularization once and save it to disk; load it for repeatability.
number of cores to use
verbose output
sparse p by q matrix with p by k nonzero entries
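To make the shape of the result concrete, here is a brute-force dense reference written with base R and the Matrix package. It is a sketch under assumptions, not the package's implementation: denseKnnXY is a hypothetical helper, it ranks neighbors by raw correlation between columns, and it keeps k entries per column of x. The real function uses an approximate or exact knn backend and may differ in ranking and values.

```r
library(Matrix)  # recommended package shipped with R

# Hypothetical brute-force reference, not the package's implementation.
denseKnnXY <- function(x, y, k = 3) {
  # p by q correlations between columns of x and columns of y
  cc <- cor(x, y)
  p <- nrow(cc)
  # k by p matrix: for each column of x, the k most correlated columns of y
  idx <- matrix(
    apply(cc, 1, function(row) order(row, decreasing = TRUE)[seq_len(k)]),
    nrow = k
  )
  i <- rep(seq_len(p), each = k)  # row index repeated once per neighbor
  j <- as.vector(idx)             # matching column indices
  sparseMatrix(i = i, j = j, x = cc[cbind(i, j)], dims = dim(cc))
}

set.seed(120)
mat <- matrix(rnorm(60), nrow = 6)    # 6 samples by 10 measurements
mat2 <- matrix(rnorm(120), nrow = 6)  # 6 samples by 20 measurements
ref <- denseKnnXY(mat, mat2, k = 3)
dim(ref)        # 10 20
length(ref@x)   # 30 stored entries, k per row
```

The dense version costs memory proportional to p times q, which is the cost the knn-based approach is meant to avoid for large inputs.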
if (FALSE) { # \dontrun{
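set.seed(120)
# two matched views: 6 samples (rows) by 10 and 20 measurements (columns)
mat <- matrix(rnorm(60), nrow = 6)
mat2 <- matrix(rnorm(120), nrow = 6)
# sparse matrices with k = 3 neighbors, computed in both directions
smat <- sparseDistanceMatrixXY(mat, mat2, 3)
smat2 <- sparseDistanceMatrixXY(mat2, mat, 3)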
testthat::expect_is(smat, "Matrix")
testthat::expect_is(smat, "dgCMatrix")
testthat::expect_is(smat2, "Matrix")
testthat::expect_is(smat2, "dgCMatrix")
testthat::expect_equal(sum(smat), 154.628961265087)
testthat::expect_equal(sum(smat2), 63.7344262003899)
} # }
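The kPackage note recommends computing the matrix once, saving it to disk, and reloading it for repeatability. One minimal way to do that with base R is sketched below. cachedCompute is a hypothetical helper, and the random matrix is a stand-in for the real call so that the snippet runs without the package installed.

```r
# Hypothetical helper: compute a result once, then reload it from disk.
cachedCompute <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))  # reuse the stored result
  }
  result <- compute()
  saveRDS(result, path)    # store for later runs
  result
}

cacheFile <- tempfile(fileext = ".rds")

# Stand-in computation; in practice this would be something like
# function() sparseDistanceMatrixXY(mat, mat2, k = 3)
compute <- function() matrix(rnorm(4), nrow = 2)

smat <- cachedCompute(cacheFile, compute)       # computed and saved
smatAgain <- cachedCompute(cacheFile, compute)  # loaded from disk
stopifnot(identical(smat, smatAgain))
```

The second call returns the stored object rather than recomputing, so downstream results stay the same even when the knn backend is not deterministic.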