Given a dataframe, this function performs the following steps:

  • Removal of variables whose fraction of missing values exceeds the chosen threshold within each group (threshold_within).

  • Removal of variables whose fraction of missing values exceeds the chosen threshold across the entire dataframe (threshold_overall).

  • Imputation of missing values in the remaining variables.
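
The three steps can be illustrated with a minimal base R sketch. This is not the function's actual implementation, which this page does not show. The helper cols_over_threshold, the toy data, and the threshold values are hypothetical. The sketch assumes thresholds are expressed as fractions between 0 and 1, assumes a variable is dropped when it exceeds the within-group threshold in at least one group, and uses median imputation as a simple stand-in for the imputation step.

```r
# Hypothetical helper: names of the columns of `df` whose fraction of
# missing values is strictly greater than `threshold`.
cols_over_threshold <- function(df, threshold) {
  frac_missing <- colMeans(is.na(df))
  names(frac_missing)[frac_missing > threshold]
}

# Toy data, made up for illustration.
dat <- data.frame(
  id    = 1:8,
  group = rep(c("a", "b"), each = 4),
  x     = c(NA, NA, NA, 4, 5, 6, 7, 8),   # 3/4 missing in group "a"
  y     = c(1, NA, 3, 4, 5, 6, 7, 8),     # 1/8 missing overall
  z     = c(NA, NA, 3, 4, NA, NA, 7, 8)   # 2/4 per group, 4/8 overall
)
vars <- c("x", "y", "z")

# Step 1: drop variables exceeding the threshold in at least one group
# (assumed interpretation of "within each group").
drop_within <- unique(unlist(
  lapply(split(dat[vars], dat$group), cols_over_threshold, threshold = 0.6),
  use.names = FALSE
))

# Step 2: among the remaining variables, drop those exceeding the
# threshold over the whole dataframe.
remaining    <- setdiff(vars, drop_within)
drop_overall <- cols_over_threshold(dat[remaining], threshold = 0.4)

# Step 3: impute what is left (median imputation as a stand-in).
kept        <- setdiff(remaining, drop_overall)
dat_imputed <- dat[c("id", "group", kept)]
for (v in kept) {
  is_missing <- is.na(dat_imputed[[v]])
  dat_imputed[[v]][is_missing] <- median(dat_imputed[[v]], na.rm = TRUE)
}
```

In this toy example x is removed in the first step, z in the second, and only y is imputed.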

handle_missing_values(
  dat,
  covariates,
  use_additional_covariates,
  selected_covariates,
  id_var,
  by_var,
  threshold_within,
  threshold_overall,
  method_imputation,
  k,
  path_save_res
)

Arguments

dat

A dataframe containing the variables of interest.

covariates

A dataframe containing additional variables.

id_var

The variable name to be used to identify subjects. A string.

by_var

The variable name to group by. A string.

threshold_within

The missing value threshold within each group. An integer.

threshold_overall

The overall missing value threshold. An integer.

k

Number of nearest neighbors used for kNN.
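
To make the role of k concrete, the following base R sketch imputes one numeric column from its k nearest rows. It is an illustration only: the helper knn_impute_column and the toy data are hypothetical, the predictor columns are assumed to be complete, and the function documented here may rely on a different kNN implementation.

```r
# Hypothetical helper: fill missing entries of `target` with the mean of
# `target` over the k nearest donor rows, using Euclidean distance on the
# `predictors` columns (assumed to contain no missing values).
knn_impute_column <- function(df, target, predictors, k) {
  donors <- which(!is.na(df[[target]]))
  x <- as.matrix(df[predictors])
  for (i in which(is.na(df[[target]]))) {
    # Distance from row i to every donor row.
    d <- sqrt(rowSums(sweep(x[donors, , drop = FALSE], 2, x[i, ])^2))
    nearest <- donors[order(d)[seq_len(min(k, length(donors)))]]
    df[[target]][i] <- mean(df[[target]][nearest])
  }
  df
}

# Toy data, made up for illustration.
toy <- data.frame(
  a = c(1, 2, 3, 10, 11),
  b = c(10, 20, NA, 100, 110)
)
res <- knn_impute_column(toy, target = "b", predictors = "a", k = 2)
```

With k = 2, the missing value in row 3 is replaced by the mean of b in rows 1 and 2, the two rows closest to it on a. A larger k averages over more neighbors and gives smoother but less local estimates.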

path_save_res

Value

A named list containing the results of the steps described above. The imputed dataframe is named dat_imputed.