Multiple resolution neighborhood random forest regression

Represents feature images as neighborhoods across scales and builds a random forest predictor from an image population. One use case is predicting cognition from multiple image features, e.g. from the voxelwise FA of the corpus callosum and, in parallel, voxelwise measurements of the volume of the inferior frontal gyrus.

multiResRandomForestRegression(
  y,
  x,
  labelmasks,
  rad = NA,
  nsamples = 10,
  multiResSchedule = c(0),
  ntrees = 500
)

Arguments

y: vector of scalar values or labels. If a factor, classification is performed; otherwise regression.
x: a list of lists, one inner list per subject, where each inner list contains that subject's feature images.
labelmasks: a list of masks where each mask defines the image space for the corresponding list in x and the number of parallel predictors; more labels means more predictors. Alternatively, a separate mask may be used for each feature, in which case this should be a list of lists. See Examples.
rad: vector of length d (the image dimensionality) defining the neighborhood radius.
nsamples: number of samples per subject to enter training.
multiResSchedule: an integer vector defining the multi-resolution levels.
ntrees: number of trees for the random forest model.
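
The nesting of x and labelmasks described above can be sketched in base R. This is only an illustration under stated assumptions: plain matrices stand in for antsImage objects, and check_nesting() is a hypothetical helper that is not part of ANTsR.

```r
# Plain matrices stand in for antsImage objects in this sketch.
n_subjects <- 3
make_img <- function() matrix(0, nrow = 4, ncol = 4)

# x: one inner list per subject, each holding that subject's feature images
x <- lapply(seq_len(n_subjects), function(i) list(make_img(), make_img()))

# labelmasks, form 1: one mask per subject, shared by all of its features
masks_shared <- lapply(seq_len(n_subjects), function(i) make_img())

# labelmasks, form 2: one mask per feature, i.e. a list of lists mirroring x
masks_per_feature <- lapply(x, function(feats) {
  lapply(feats, function(f) make_img())
})

# Hypothetical helper (not an ANTsR function): check that the nesting lines up.
check_nesting <- function(x, labelmasks) {
  if (length(x) != length(labelmasks)) {
    return(FALSE)
  }
  all(mapply(function(feats, m) {
    # a single shared mask is always fine; a list must match the feature count
    !is.list(m) || length(m) == length(feats)
  }, x, labelmasks))
}

ok_shared <- check_nesting(x, masks_shared)
ok_per_feature <- check_nesting(x, masks_per_feature)
```

Both forms pass the check here. A per-feature list whose length differs from the number of features for that subject would not.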

Value

A list with a random forest model, a vector identifying which rows correspond to which subjects, and a prediction vector.

References

Pustina, D, et al. Automated segmentation of chronic stroke lesions using LINDA: Lesion Identification with Neighborhood Data Analysis, Human Brain Mapping, 2016. (related work, not identical)

Author

Avants BB, Tustison NJ

Examples


mask <- makeImage(c(100, 100), 0)
mask[30:60, 30:60] <- 1
mask[35:45, 50:60] <- 2
ilist <- list()
masklist <- list()
inds <- 1:8
yvec <- rep(0, length(inds))
scl <- 0.33 # a noise parameter
for (i in inds) {
  img <- antsImageClone(mask)
  imgb <- antsImageClone(mask)
  limg <- antsImageClone(mask)
  img[3:6, 3:6] <- rnorm(16, 1) * scl * (i) + scl * mean(rnorm(1))
  imgb[3:6, 3:6] <- rnorm(16, 1) * scl * (i) + scl * mean(rnorm(1))
  ilist[[i]] <- list(img, imgb) # two features
  yvec[i] <- i^2.0 # a real outcome
  masklist[[i]] <- antsImageClone(mask)
}
r <- c(1, 1)
mr <- c(2, 0)
featMat <- getMultiResFeatureMatrix(ilist[[1]], masklist[[1]],
  rad = r, multiResSchedule = mr
)
rfm <- multiResRandomForestRegression(
  yvec, ilist, masklist,
  rad = r, multiResSchedule = mr
)
preds <- predict(rfm, newdata = featMat)
if (FALSE) { # \dontrun{
# data: https://github.com/stnava/ANTsTutorial/tree/master/phantomData
fns <- Sys.glob("phantom*wmgm.jpg")
ilist <- imageFileNames2ImageList(fns)
masklist <- list()
flist <- list()
for (i in seq_along(fns)) # safe even if no files match
{
  # 2 labels means 2 sets of side by side predictors and features at each scale
  locseg <- kmeansSegmentation(ilist[[i]], 2)$segmentation
  masklist[[i]] <- list(locseg, locseg %>% thresholdImage(2, 2), locseg)
  flist[[i]] <- list(
    ilist[[i]], ilist[[i]] %>% iMath("Laplacian", 1),
    ilist[[i]] %>% iMath("Grad", 1)
  )
}
yvec <- factor(rep(c(1, 2), each = 4)) # classification
r <- c(1, 1)
mr <- c(2, 1, 0)
ns <- 50
trn <- c(1:3, 6:8)
ytrain <- yvec[trn]
ftrain <- flist[trn]
mtrain <- masklist[trn]
mrrfr <- multiResRandomForestRegression(ytrain, ftrain, mtrain,
  rad = c(1, 1),
  nsamples = ns, multiResSchedule = mr
)
mypreds <- rep(NA, length(fns))
mymode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
for (i in 4:5) # test set
{
  fmat <- getMultiResFeatureMatrix(flist[[i]], masklist[[i]],
    rad = r, multiResSchedule = mr, nsamples = ns
  )
  myp <- predict(mrrfr, newdata = fmat)
  mypreds[i] <- mymode(myp) # get the most frequent observation
  # use median or mean for continuous predictions
}
print("predicted")
print(mypreds[-trn])
print("ground truth")
print(yvec[-trn])
} # }
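
Because several samples are drawn per subject, predictions come back per sample and need to be reduced to one value per subject, as the test loop above does with mymode(). The following base R sketch shows that reduction for both the classification and the regression case. The inputs subject_id, class_pred and value_pred are made-up illustrative values, not output from the function.

```r
# Hypothetical per-sample predictions and the subject each row came from
subject_id <- rep(1:3, each = 4)
class_pred <- factor(c("a", "a", "b", "a", "b", "b", "b", "a", "a", "b", "b", "b"))
value_pred <- c(1.0, 1.2, 0.8, 1.0, 4.1, 3.9, 4.0, 4.0, 9.5, 8.5, 9.0, 9.0)

# Most frequent value, same definition as mymode() in the example above
mymode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Classification: majority vote over the samples of each subject
class_by_subject <- tapply(as.character(class_pred), subject_id, mymode)

# Regression: mean over the samples of each subject (median is an alternative)
value_by_subject <- tapply(value_pred, subject_id, mean)

print(class_by_subject)
print(value_by_subject)
```

Converting the factor to character before voting keeps the class labels rather than their integer codes in the result.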
