Estimate weights to reduce bias due to selection
estimate_selection_weights.Rd
This is the main function to estimate balancing weights for the reduction
of bias due to selection (e.g., loss to follow-up). It is essentially a
wrapper around the weightit function. Please refer to its
documentation for further details.
The function takes as input a dataset and a formula describing how the
selection process might depend on the chosen covariates. It then
estimates weights such that the distribution of the covariates is similar
among selected and non-selected subjects. These weights can then be
passed to most model-fitting functions (e.g., the weights
parameter in the glm function).
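The idea can be sketched with base R alone. The example below is not this function's implementation and does not call weightit. It simulates data with hypothetical columns (id, age, sex, y), fits a logistic selection model with glm, and uses inverse probability of selection weights among the selected subjects as one simple instance of such weights.

```r
# Illustrative sketch only: simulated data and hypothetical column names.
set.seed(1)
n <- 500
dat <- data.frame(
  id  = seq_len(n),
  age = rnorm(n, mean = 50, sd = 10),
  sex = rbinom(n, size = 1, prob = 0.5)
)
dat$y <- 1 + 0.05 * dat$age + rnorm(n)

# Selection (e.g., remaining in follow-up) depends on the covariates.
p_true <- plogis(-2 + 0.04 * dat$age + 0.5 * dat$sex)
ids_not_censored <- dat$id[rbinom(n, size = 1, prob = p_true) == 1]

# Selection indicator derived from the identifiers.
dat$selected <- as.integer(dat$id %in% ids_not_censored)

# Model the probability of being selected given the covariates.
fit_sel <- glm(selected ~ age + sex, data = dat,
               family = binomial(link = "logit"))
p_hat <- fitted(fit_sel)

# Inverse probability of selection weights for the selected subjects.
dat_sel <- dat[dat$selected == 1, ]
dat_sel$w <- 1 / p_hat[dat$selected == 1]

# Pass the weights to a regression model through the weights argument.
fit_out <- glm(y ~ sex, data = dat_sel, weights = w)
coef(fit_out)
```

Subjects with a low estimated probability of selection receive larger weights, so that the selected sample resembles the full sample with respect to the modeled covariates.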
If the formula is not passed as input, the function will simply create
a main-effects-only formula (i.e., no interactions).
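As a rough illustration of what a main-effects-only formula looks like, the sketch below builds one from the column names of a small, hypothetical data frame. That every column other than the identifier is used as a covariate is an assumption made for the example, not a statement about how the function builds its default.

```r
# Hypothetical data; id_str names the identifier column.
dat <- data.frame(id = 1:4, age = c(30, 41, 52, 63), sex = c(0, 1, 1, 0))
id_str <- "id"

# Assumption: every column except the identifier is a covariate.
covariates <- setdiff(names(dat), id_str)

# Main-effects-only right-hand side, in the "covariate_1 + ..." form.
formula_str <- paste(covariates, collapse = " + ")
formula_str

# For comparison, a two-sided formula with a hypothetical selection indicator.
f <- as.formula(paste("selected ~", formula_str))
```

Interactions or transformations would have to be written out explicitly in the string, since a main-effects formula contains only the individual covariates.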
More information can be found in specialized texts, like
Hernán, M.A. and Robins, J.M., 2010. Causal inference.
Usage
estimate_selection_weights(
dat,
id_str,
ids_not_censored,
formula,
method_estimation,
link_function,
stabilized,
winsorization,
estimate_by,
sampling_weights,
moments,
interactions,
library_sl,
cv_control_sl,
discrete_sl
)
Arguments
- dat
The dataset containing a column for the identifier, and all the covariates to be used when estimating the weights. A dataframe.
- id_str
A string indicating the name of the column in
dat
that contains the identifiers. A string.- ids_not_censored
A vector of identifiers corresponding to the subjects not censored, that is, the subjects that were selected (e.g., for the follow-up). It must be a subset of the identifiers contained in the id_str column of dat. A vector.
- formula
A string representing the formula to be passed to weightit. It must be of the form "covariate_1 + ...". It can include interactions and functions of the individual covariates (e.g., a cubic spline). A string.
- method_estimation
The type of model to fit to estimate the balancing weights. It must be supported by the WeightIt R package. The list of specific estimation methods can be found in the WeightIt documentation. A string.
- link_function
The link used in the generalized linear model for the propensity scores. If glm is used to estimate the balancing weights, this is simply the model link function (e.g., logit). A string.
- stabilized
For the methods that estimate propensity scores, whether to stabilize the weights or not. That is, whether to multiply the individual weights by the proportion of subjects in their treatment group. A boolean.
- discrete_sl