Impute missing data using generalized linear models (GLMs)

glm_impute(
  dataframe,
  columns_to_impute,
  predictor_columns,
  family = "gaussian"
)

Arguments

dataframe

A data frame containing the data to impute.

columns_to_impute

A character vector naming the columns whose missing values should be imputed.

predictor_columns

A character vector naming the columns to use as predictors.

family

A string naming the GLM family. Defaults to "gaussian".
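The snippet below illustrates how a family given as a string behaves in base R. It assumes, without confirmation from this page, that glm_impute() forwards family to stats::glm(); glm() itself accepts either a family name as a string or a family object. The built-in mtcars data is used only for illustration.

```r
# Assumption: glm_impute() passes `family` on to stats::glm().
# glm() resolves a string such as "gaussian" to the family function of that name.
fit_string <- glm(mpg ~ wt, data = mtcars, family = "gaussian")
fit_object <- glm(mpg ~ wt, data = mtcars, family = gaussian())

# Both calls fit the same model, so the coefficients agree.
same_fit <- isTRUE(all.equal(coef(fit_string), coef(fit_object)))
print(same_fit)
```

Under that assumption, other family names known to stats, such as "binomial" or "poisson", would be the natural candidates for non-continuous columns.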

Value

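A data frame in which the missing values in columns_to_impute are replaced by model predictions. Other columns are not imputed: in the example below, the NA values in income and education are left as they are.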
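The general technique can be sketched in base R as follows. This is not the source of glm_impute(); the helper name impute_with_glm_sketch is hypothetical, and fitting on complete rows and predicting on the response scale are assumptions about how such a function might work.

```r
# Hypothetical helper: a sketch of GLM-based imputation, not glm_impute() itself.
impute_with_glm_sketch <- function(dataframe, columns_to_impute,
                                   predictor_columns, family = "gaussian") {
  for (target in columns_to_impute) {
    missing_rows <- is.na(dataframe[[target]])
    if (!any(missing_rows)) next

    # Build e.g. age ~ income + education from the column names.
    model_formula <- reformulate(predictor_columns, response = target)

    # With the default na.action (na.omit), glm() fits on complete rows only.
    fit <- glm(model_formula, data = dataframe, family = family)

    # Predict on the response scale for rows where the target is missing.
    # Rows whose predictors are also missing get NA and remain unimputed.
    dataframe[missing_rows, target] <- predict(
      fit,
      newdata = dataframe[missing_rows, , drop = FALSE],
      type = "response"
    )
  }
  dataframe
}
```

Applied to the data frame in the Examples section, this sketch fits age on the five rows where age, income, and education are all observed, and yields an imputed age of about 27.25 for row 4.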

Examples

set.seed(123)
df <- data.frame(
  age = c(25, 30, 35, NA, 45, 50, NA, 40, 35, NA),
  income = c(50000, 60000, 70000, 80000, 90000, 100000, 110000, NA, 120000, 130000),
  education = c(12, 16, 14, 12, NA, 18, 20, 16, 14, 12)
)
columns_to_impute <- c("age")
predictor_columns <- c("income", "education")
imputed_data <- glm_impute(df, columns_to_impute, predictor_columns, family = "gaussian")
print(imputed_data)
#>         age income education
#> 1  25.00000  50000        12
#> 2  30.00000  60000        16
#> 3  35.00000  70000        14
#> 4  27.25296  80000        12
#> 5  45.00000  90000        NA
#> 6  50.00000 100000        18
#> 7  53.12253 110000        20
#> 8  40.00000     NA        16
#> 9  35.00000 120000        14
#> 10 33.47826 130000        12