rk_weight.Rd
Produces fractional weights using the iterative proportional fitting algorithm.
rk_weight(cons, inds, vars = NULL, iterations = 10)
Argument | Description |
---|---|
cons | A data frame containing all the constraints. This should be in the format of one row per zone, one column per constraint category. The first column should be a zone code; all other columns must be numeric counts. |
inds | A data frame containing individual-level (survey) data. This should be in the format of one row per individual, one column per constraint. The first column should be an individual ID. |
vars | A character vector of variables that constrain the simulation (i.e. independent variables). |
iterations | The number of iterations the algorithm should complete. Defaults to 10. |
A data frame of fractional weights for each individual in each zone with zone codes recorded in column names and individual id recorded in row names.
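To make the idea of iterative proportional fitting concrete, the following is a minimal base R sketch for a single zone. It is an illustration under stated assumptions, not the package's internal code: the helper name ipf_zone(), the starting weight of 1 for every individual, and the order in which the constraints are visited are all choices made here for the example.

```r
# Illustrative sketch only: ipf_zone() is a hypothetical helper,
# not a function exported by the package.
ipf_zone <- function(targets, inds, vars, iterations = 10) {
  w <- rep(1, nrow(inds))  # assumed starting point: every weight is 1
  for (i in seq_len(iterations)) {
    for (v in vars) {
      # Current weighted total for each category of constraint v
      current <- vapply(split(w, inds[[v]]), sum, numeric(1))
      # Scale factor that moves each category total onto its target
      ratio <- targets[names(current)] / current
      # Rescale each individual's weight by their category's factor
      w <- w * unname(ratio[inds[[v]]])
    }
  }
  w
}

inds <- data.frame(
  id  = LETTERS[1:5],
  age = c("age_gt_50", "age_gt_50", "age_0_49", "age_gt_50", "age_0_49"),
  sex = c("sex_m", "sex_m", "sex_m", "sex_f", "sex_f"),
  stringsAsFactors = FALSE
)

# Constraint counts for one zone, named by constraint category
targets <- c(age_0_49 = 8, age_gt_50 = 4, sex_f = 6, sex_m = 6)

w <- ipf_zone(targets, inds, vars = c("age", "sex"))

# Weighted category totals, for comparison with the targets
fitted <- c(
  vapply(split(w, inds$age), sum, numeric(1)),
  vapply(split(w, inds$sex), sum, numeric(1))
)
print(round(fitted, 3))
```

Each pass rescales the weights so that one constraint is met exactly, which slightly disturbs the other; repeating the passes shrinks that disturbance, which is why a modest number of iterations is usually enough for small examples like this one.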
rk_weight() requires three arguments:
A data frame of constraints (e.g. census tables)
A data frame of individual data (e.g. a survey)
A character vector of constraint variable names
The first column of each data frame should be an ID. The first column of cons should contain the zone codes. The first column of inds should contain each individual's unique identifier.
Both data frames should only contain:
an ID column (zone ID in cons or individual ID in inds).
constraints (in inds) or constraint categories (in cons).
inds can optionally contain additional dependent variables that do not influence the weighting process.
No other columns should be present (the user can merge these back in later).
It is essential that the levels in each inds constraint (i.e. column) match exactly with the column names in cons. In the example below see how the column names in cons ('age_0_49', 'sex_f', ...) match exactly the levels in the appropriate inds variables.
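The matching requirement can be checked before weighting. The snippet below is a hypothetical pre-flight check written in base R, not something the package is known to provide; the variable names not_in_cons and not_in_inds are made up for this example.

```r
cons <- data.frame(
  zone      = letters[1:3],
  age_0_49  = c(8, 2, 7),
  age_gt_50 = c(4, 8, 4),
  sex_f     = c(6, 6, 8),
  sex_m     = c(6, 4, 3),
  stringsAsFactors = FALSE
)
inds <- data.frame(
  id  = LETTERS[1:5],
  age = c("age_gt_50", "age_gt_50", "age_0_49", "age_gt_50", "age_0_49"),
  sex = c("sex_m", "sex_m", "sex_m", "sex_f", "sex_f"),
  stringsAsFactors = FALSE
)
vars <- c("age", "sex")

# Every distinct level found in the constraint columns of inds
inds_levels <- unique(unlist(inds[vars], use.names = FALSE))
# Every constraint category in cons (all columns except the zone code)
cons_categories <- colnames(cons)[-1]

# Levels with no matching column in cons, and columns with no matching level
not_in_cons <- setdiff(inds_levels, cons_categories)
not_in_inds <- setdiff(cons_categories, inds_levels)

if (length(not_in_cons) > 0) {
  stop("Levels in inds without a column in cons: ",
       paste(not_in_cons, collapse = ", "))
}
if (length(not_in_inds) > 0) {
  message("Columns in cons without a level in inds: ",
          paste(not_in_inds, collapse = ", "))
}
```

Because the comparison is on exact strings, differences in case or stray whitespace (for example 'Sex_F' versus 'sex_f') would be reported as mismatches.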
The columns in cons must be arranged in alphabetical order because these are created alphabetically when they are 'spread' in the individual-level data.
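If the constraint table arrives with its columns in some other order, they can be rearranged beforehand. This is a minimal sketch in base R; it assumes that sort() in the current locale gives the same alphabetical order the function expects, which holds for simple lower-case names like these but is not guaranteed in general.

```r
# Constraint table with category columns deliberately out of order
cons <- data.frame(
  zone      = letters[1:3],
  sex_m     = c(6, 4, 3),
  age_gt_50 = c(4, 8, 4),
  sex_f     = c(6, 6, 8),
  age_0_49  = c(8, 2, 7),
  stringsAsFactors = FALSE
)

# Keep the zone code first, then the category columns alphabetically
category_cols <- sort(colnames(cons)[-1])
cons <- cons[, c(colnames(cons)[1], category_cols)]

print(colnames(cons))
```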
# SimpleWorld
cons <- data.frame(
  "zone"      = letters[1:3],
  "age_0_49"  = c(8, 2, 7),
  "age_gt_50" = c(4, 8, 4),
  "sex_f"     = c(6, 6, 8),
  "sex_m"     = c(6, 4, 3),
  stringsAsFactors = FALSE
)
inds <- data.frame(
  "id"     = LETTERS[1:5],
  "age"    = c(
    "age_gt_50", "age_gt_50", "age_0_49", "age_gt_50", "age_0_49"
  ),
  "sex"    = c("sex_m", "sex_m", "sex_m", "sex_f", "sex_f"),
  "income" = c(2868, 2474, 2231, 3152, 2473),
  stringsAsFactors = FALSE
)

# Set variables to constrain over
vars <- c("age", "sex")

weights <- rk_weight(cons = cons, inds = inds, vars = vars)
print(weights)
#>          a         b         c
#> A 1.227998 1.7250828 0.7250828
#> B 1.227998 1.7250828 0.7250828
#> C 3.544004 0.5498344 1.5498344
#> D 1.544004 4.5498344 2.5498344
#> E 4.455996 1.4501656 5.4501656
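One way to read the result is that each column of weights says how many people in that zone each survey individual represents. The sketch below re-enters the printed weights by hand so that it runs without the package, checks that each column sums to the zone population implied by cons, and then computes a weighted mean income per zone. The names zone_pop and mean_income are illustrative, and the weighted mean is just one possible downstream use.

```r
# Weights copied from the printed output above
weights <- data.frame(
  a = c(1.227998, 1.227998, 3.544004, 1.544004, 4.455996),
  b = c(1.7250828, 1.7250828, 0.5498344, 4.5498344, 1.4501656),
  c = c(0.7250828, 0.7250828, 1.5498344, 2.5498344, 5.4501656),
  row.names = LETTERS[1:5]
)
income <- c(2868, 2474, 2231, 3152, 2473)

# Zone populations implied by cons (age_0_49 + age_gt_50 for each zone)
zone_pop <- c(a = 12, b = 10, c = 11)

w <- as.matrix(weights)

# Each column of weights should sum to its zone's population
stopifnot(isTRUE(all.equal(colSums(w), zone_pop, tolerance = 1e-6)))

# Weighted mean income per zone: w * income multiplies each row of w
# by that individual's income, because the vector recycles down columns
mean_income <- colSums(w * income) / colSums(w)
print(round(mean_income))
```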