This function matches rows from a smaller dataframe to rows in a larger dataframe based on specified numerical variables. The matching is performed using a greedy algorithm that maximizes uniqueness in the matched subset.
match_data_frames(df1, df2, match_vars)
df1: A dataframe that is the smaller set to match from.
df2: A dataframe that is the larger set to match to.
match_vars: A character vector of column names to match on. These columns should be numerical or convertible to numerical.
A dataframe containing the rows from df2 that are matched to each row in df1.
The returned dataframe will contain only the columns that are common to both input dataframes.
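if (FALSE) {
# df1 is the smaller set (3 rows); df2 is the larger set (4 rows).
df1 <- data.frame(age_BL = c(30, 40, 50), commonSex = c("M", "F", "M"), MOCA = c(25, 28, 27))
df2 <- data.frame(age_BL = c(31, 39, 51, 60), commonSex = c("M", "F", "F", "M"), MOCA = c(26, 27, 29, 28))
# Match each row of df1 to a row of df2 on the listed variables.
# Note: commonSex holds "M"/"F" labels, so it relies on the conversion to numerical described for match_vars.
matched_df <- match_data_frames(df1, df2, c("age_BL", "commonSex", "MOCA"))
}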
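The greedy matching described above can be illustrated with a minimal sketch. This is not the implementation behind match_data_frames, which is not shown here. The helper name greedy_match_sketch is hypothetical, and the sketch assumes numeric match variables, a Euclidean distance scaled by the standard deviations in df2, and that each row of df2 may be used at most once.

```r
# Hypothetical sketch only; the real match_data_frames() may differ.
# Assumptions: numeric match variables, scaled Euclidean distance,
# and matching without replacement (each df2 row used at most once).
greedy_match_sketch <- function(df1, df2, match_vars) {
  stopifnot(nrow(df1) <= nrow(df2))
  m1 <- as.matrix(df1[, match_vars, drop = FALSE])
  m2 <- as.matrix(df2[, match_vars, drop = FALSE])
  stopifnot(is.numeric(m1), is.numeric(m2))

  # Put the variables on a comparable scale using df2's standard deviations.
  s <- apply(m2, 2, sd)
  s[is.na(s) | s == 0] <- 1
  m1 <- sweep(m1, 2, s, "/")
  m2 <- sweep(m2, 2, s, "/")

  available <- rep(TRUE, nrow(m2))   # df2 rows not yet taken
  picked <- integer(nrow(m1))        # chosen df2 row for each df1 row
  for (i in seq_len(nrow(m1))) {
    # Squared distance from row i of df1 to every row of df2.
    d <- rowSums(sweep(m2, 2, m1[i, ], "-")^2)
    d[!available] <- Inf
    picked[i] <- which.min(d)        # greedy choice: closest free row
    available[picked[i]] <- FALSE
  }

  # Keep only the columns shared by both inputs.
  common <- intersect(names(df1), names(df2))
  df2[picked, common, drop = FALSE]
}

df1 <- data.frame(age_BL = c(30, 40, 50), commonSex = c("M", "F", "M"), MOCA = c(25, 28, 27))
df2 <- data.frame(age_BL = c(31, 39, 51, 60), commonSex = c("M", "F", "F", "M"), MOCA = c(26, 27, 29, 28))
matched <- greedy_match_sketch(df1, df2, c("age_BL", "MOCA"))
```

Because the choice is greedy, the result depends on the order of the rows in df1: an early row can take a df2 row that would have been a closer match for a later one. The sketch matches on the two numeric columns only and leaves commonSex as a carried-along column.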