Produces fractional weights using the iterative proportional fitting algorithm.

rk_weight(cons, inds, vars = NULL, iterations = 10)

Arguments

cons

A data frame containing all the constraints. This should be in the format of one row per zone, one column per constraint category. The first column should be a zone code; all other columns must be numeric counts.

inds

A data frame containing individual-level (survey) data. This should be in the format of one row per individual, one column per constraint. The first column should be an individual ID.

vars

A character vector naming the variables that constrain the simulation (i.e. the independent variables).

iterations

The number of iterations the algorithm should complete. Defaults to 10.

Value

A data frame of fractional weights for each individual in each zone with zone codes recorded in column names and individual id recorded in row names.
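As a rough illustration of how iterative proportional fitting arrives at such weights, the following base R sketch rescales the weights one constraint at a time until the weighted totals match the zone counts. It is only a sketch under stated assumptions: `ipf_one_zone()` is a hypothetical helper written for this illustration, not the package's implementation, and it assumes every constraint level occurs at least once in the survey. It returns a matrix rather than a data frame.

```r
# SimpleWorld data (same as in the Examples section)
cons <- data.frame(
  zone      = letters[1:3],
  age_0_49  = c(8, 2, 7),
  age_gt_50 = c(4, 8, 4),
  sex_f     = c(6, 6, 8),
  sex_m     = c(6, 4, 3),
  stringsAsFactors = FALSE
)
inds <- data.frame(
  id  = LETTERS[1:5],
  age = c("age_gt_50", "age_gt_50", "age_0_49", "age_gt_50", "age_0_49"),
  sex = c("sex_m", "sex_m", "sex_m", "sex_f", "sex_f"),
  stringsAsFactors = FALSE
)
vars <- c("age", "sex")

# Hypothetical helper (not part of the package): IPF for a single zone.
# `targets` is a named numeric vector of counts for that zone.
ipf_one_zone <- function(targets, inds, vars, iterations = 10) {
  w <- rep(1, nrow(inds))                    # every individual starts at weight 1
  for (i in seq_len(iterations)) {
    for (v in vars) {
      cat_v   <- as.character(inds[[v]])     # each individual's category for v
      current <- tapply(w, cat_v, sum)       # current weighted total per category
      # rescale so the weighted totals equal the zone's target counts
      w <- w * unname(targets[cat_v]) / as.vector(current[cat_v])
    }
  }
  stats::setNames(w, inds[[1]])
}

# Apply the helper to each zone (row of cons) in turn
weights <- sapply(seq_len(nrow(cons)), function(z) {
  ipf_one_zone(unlist(cons[z, -1]), inds, vars)
})
colnames(weights) <- cons[[1]]
```

In this sketch each column of `weights` sums to the population of the corresponding zone (12, 10 and 11 for zones a, b and c), because the final rescaling step matches the sex totals exactly.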

Details

rk_weight() requires three arguments:

  • A data frame of constraints (e.g. census tables)

  • A data frame of individual data (e.g. a survey)

  • A character vector of constraint variable names

The first column of each data frame should be an ID. The first column of cons should contain the zone codes. The first column of inds should contain each individual's unique identifier.

Both data frames should only contain:

  • an ID column (the zone ID in cons, or the individual ID in inds).

  • the constraint variables (in inds) or the constraint categories (in cons).

  • inds can optionally contain additional dependent variables that do not influence the weighting process.

No other columns should be present (the user can merge these back in later).

It is essential that the levels in each inds constraint (i.e. column) match the column names in cons exactly. In the example below, note how the column names in cons ('age_0_49', 'sex_f', ...) exactly match the levels of the corresponding inds variables.

The columns in cons must be arranged in alphabetical order because these are created alphabetically when they are 'spread' in the individual-level data.
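Both requirements (matching levels and alphabetical column order) can be checked before weighting. The following is a minimal base R sketch, assuming the SimpleWorld data from the Examples section. The names `ind_levels`, `cons_cols`, `levels_match` and `alphabetical` are illustrative only, and the function may perform its own checks.

```r
cons <- data.frame(
  zone      = letters[1:3],
  age_0_49  = c(8, 2, 7),
  age_gt_50 = c(4, 8, 4),
  sex_f     = c(6, 6, 8),
  sex_m     = c(6, 4, 3),
  stringsAsFactors = FALSE
)
inds <- data.frame(
  id  = LETTERS[1:5],
  age = c("age_gt_50", "age_gt_50", "age_0_49", "age_gt_50", "age_0_49"),
  sex = c("sex_m", "sex_m", "sex_m", "sex_f", "sex_f"),
  stringsAsFactors = FALSE
)
vars <- c("age", "sex")

# All levels that occur across the constraint variables in inds
ind_levels <- sort(unique(unlist(lapply(inds[vars], as.character))))

# Constraint category columns in cons (everything except the zone code)
cons_cols <- colnames(cons)[-1]

# Do the levels in inds match the column names in cons?
levels_match <- setequal(ind_levels, cons_cols)

# Are the constraint columns already in alphabetical order?
alphabetical <- identical(cons_cols, sort(cons_cols))

# If not, reorder them, keeping the zone code as the first column
cons <- cons[, c(colnames(cons)[1], sort(cons_cols))]
```

Note that `sort()` on character vectors depends on the locale, so the ordering it produces may differ from the one used internally for names that mix cases or unusual characters.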

Examples

# SimpleWorld
cons <- data.frame(
  "zone"      = letters[1:3],
  "age_0_49"  = c(8, 2, 7),
  "age_gt_50" = c(4, 8, 4),
  "sex_f"     = c(6, 6, 8),
  "sex_m"     = c(6, 4, 3),
  stringsAsFactors = FALSE
)

inds <- data.frame(
  "id"     = LETTERS[1:5],
  "age"    = c(
    "age_gt_50", "age_gt_50", "age_0_49", "age_gt_50", "age_0_49"
  ),
  "sex"    = c("sex_m", "sex_m", "sex_m", "sex_f", "sex_f"),
  "income" = c(2868, 2474, 2231, 3152, 2473),
  stringsAsFactors = FALSE
)

# Set variables to constrain over
vars <- c("age", "sex")

weights <- rk_weight(cons = cons, inds = inds, vars = vars)
print(weights)
#>          a         b         c
#> A 1.227998 1.7250828 0.7250828
#> B 1.227998 1.7250828 0.7250828
#> C 3.544004 0.5498344 1.5498344
#> D 1.544004 4.5498344 2.5498344
#> E 4.455996 1.4501656 5.4501656