Uses k-nearest-neighbor algorithms to estimate a sparse similarity matrix. The validity of this function rests on the basic mathematical relationships between Euclidean distance and correlation, and between correlation and covariance. For applications of such matrices, see relevant publications by Mauro Maggioni and other authors.
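
The relationships mentioned above can be checked numerically. The following is a minimal base R sketch, not code from this package: the vectors x and y and the helper unit_center are illustrative assumptions. It shows that, once two vectors are centered and scaled to unit norm, their squared Euclidean distance equals 2 * (1 - correlation), and that covariance is the correlation rescaled by the two standard deviations.

```r
# Base R only; x and y are arbitrary illustrative vectors.
set.seed(1)
x <- rnorm(50)
y <- 0.6 * x + rnorm(50)

# Hypothetical helper: center a vector and scale it to unit Euclidean norm.
unit_center <- function(v) {
  v <- v - mean(v)
  v / sqrt(sum(v^2))
}
xs <- unit_center(x)
ys <- unit_center(y)

# Squared Euclidean distance between the standardized vectors ...
d2 <- sum((xs - ys)^2)
# ... equals 2 * (1 - Pearson correlation) of the original vectors.
r <- cor(x, y)
stopifnot(isTRUE(all.equal(d2, 2 * (1 - r))))

# Covariance is the correlation rescaled by the two standard deviations.
stopifnot(isTRUE(all.equal(cov(x, y), r * sd(x) * sd(y))))
```

Because the distance is a decreasing function of the correlation for standardized data, the nearest neighbors under Euclidean distance are also the most highly correlated ones, which is presumably why a fast Euclidean knn search can serve the correlation and covariance metrics.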

sparseDistanceMatrix(
  x,
  k = 3,
  r = Inf,
  sigma = NA,
  kmetric = c("euclidean", "correlation", "covariance", "gaussian"),
  eps = 1e-06,
  ncores = NA,
  sinkhorn = FALSE,
  kPackage = "RcppHNSW",
  verbose = FALSE
)

Arguments

x

input matrix, should be n (samples) by p (measurements)

k

number of neighbors

r

radius of epsilon-ball

sigma

parameter for kernel PCA.

kmetric

similarity or distance metric determining k nearest neighbors

eps

epsilon error for rapid knn

ncores

number of cores to use

sinkhorn

boolean; whether to apply Sinkhorn normalization to the result

kPackage

name of package to use for knn. FNN is reproducible, but RcppHNSW is much faster for larger problems (with nthreads controlled by the environment variable ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS). For large problems, compute the regularization once and save it to disk; load it for repeatability.

verbose

verbose output

Value

A sparse p by p matrix with p times k nonzero entries.
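
To illustrate the shape of the output, here is a brute-force sketch using only base R and the Matrix package. It is not the implementation used by this function: knn_sparse_dist is a hypothetical helper, it handles only Euclidean distance, it computes a dense distance matrix first (so it suits small p only), and it assumes that self-matches are excluded, which may differ from the package's behavior.

```r
library(Matrix)

# Hypothetical helper, not part of ANTsR: brute-force k-nearest-neighbor
# sparse Euclidean distance matrix between the columns of x.
knn_sparse_dist <- function(x, k) {
  p <- ncol(x)
  d <- as.matrix(dist(t(x)))          # dense p by p distances between columns
  diag(d) <- Inf                      # assumption: exclude self-matches
  # For each column j, the row indices of its k nearest neighbors.
  nn <- apply(d, 2, function(col) order(col)[seq_len(k)])
  nn <- matrix(nn, nrow = k)          # keep a k by p matrix even when k = 1
  i <- as.vector(nn)
  j <- rep(seq_len(p), each = k)
  sparseMatrix(i = i, j = j, x = d[cbind(i, j)], dims = c(p, p))
}

set.seed(120)
x <- matrix(rnorm(60), ncol = 10)     # 6 samples by 10 measurements
s <- knn_sparse_dist(x, k = 2)
dim(s)                                # 10 10
length(s@x)                           # 20 stored entries, i.e. p * k
```

Each column of the result holds the distances to that measurement's k nearest neighbors, so the number of stored entries grows as p * k rather than p * p.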

Author

Avants BB

Examples

if (FALSE) { # \dontrun{
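set.seed(120)
# small random example: 6 samples by 10 measurements, 2 neighbors each
mat <- matrix(rnorm(60), ncol = 10)
smat <- sparseDistanceMatrix(mat, 2)
# image example: read a 2D image and build a mask
r16 <- antsImageRead(getANTsRData("r16"))
mask <- getMask(r16)
# radius 0 neighborhoods; mat$indices holds the coordinates of masked voxels
mat <- getNeighborhoodInMask(
  image = r16, mask = mask, radius = c(0, 0),
  physical.coordinates = TRUE, spatial.info = TRUE
)
# treat each voxel as a measurement and find its 10 spatially closest voxels
smat <- sparseDistanceMatrix(t(mat$indices), 10) # close points
testthat::expect_is(smat, "Matrix")
testthat::expect_is(smat, "dgCMatrix")
testthat::expect_equal(sum(smat), 18017)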
} # }