Multivariate optimal allocation for different domains in one- and two-stage stratified sample designs. R2BEAT extends the Neyman (1934) – Tschuprow (1923) allocation method to the case of several variables, adopting a generalization of Bethel's proposal (1989). Beyond this, it determines the sample allocation for multivariate, multi-domain estimates in two-stage stratified samples, and it supports the selection of both Primary Stage Units (PSUs) and Secondary Stage Units (SSUs).
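As a point of reference for the Neyman–Tschuprow method mentioned above, the following minimal sketch computes the classical univariate allocation, n_h = n · N_h · S_h / Σ N_h · S_h, for a fixed total sample size. It uses the stratum sizes (N) and standard deviations (S1) of the first six strata in the package's example data. The helper `neyman_alloc` and the total `n_total` are illustrative assumptions, not part of R2BEAT, and this is only the single-variable special case, not what `beat.2st` computes.

```r
# First six strata of the example data (values copied from head(strata))
N  <- c(2302353, 179562, 371802, 592303, 221687, 440613)   # stratum sizes
S1 <- c(0.1432975, 0.1500522, 0.1494093,
        0.1522878, 0.1764108, 0.2072232)                   # std. dev. of variable 1

# Hypothetical helper (not an R2BEAT function):
# Neyman allocation of a fixed total n for ONE variable, equal unit costs
neyman_alloc <- function(n, N, S) {
  n * N * S / sum(N * S)
}

n_total <- 1000   # illustrative total sample size (assumption)
n_h <- neyman_alloc(n_total, N, S1)

# Rounded stratum sample sizes
round(n_h)
```

Strata that are larger or more variable receive more sample. The multivariate case handled by R2BEAT instead searches for the minimum-cost allocation that satisfies precision constraints on several variables and domains at once.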

R2BEAT handles the complexity of optimal sample allocation in two-stage sampling designs and provides several outputs for evaluating the allocation. Its name stands for “R ‘to’ Bethel Extended Allocation for Two-stage”. It extends another open-source tool, Mauss-R (Multivariate Allocation of Units in Sampling Surveys), implemented by ISTAT researchers (https://www.istat.it/en/methods-and-tools/methods-and-it-tools/design/design-tools/mauss-r). Mauss-R determines the optimal sample allocation for multivariate, multi-domain estimation in one-stage stratified samples.

Two other tools developed by ISTAT complete the suite for stratified sample design: SamplingStrata (https://CRAN.R-project.org/package=SamplingStrata), which jointly optimizes the stratification of the sampling frame and the allocation in the multivariate, multi-domain case (one-stage designs only), and MultiWay.Sample.Allocation, which determines the optimal sample allocation for multi-way stratified sampling designs and incomplete stratified sampling designs (https://www.istat.it/en/methods-and-tools/methods-and-it-tools/design/design-tools/multiwaysampleallocation).

For a complete illustration of the methodology see the vignette “R2BEAT methodology and use” (https://barcaroli.github.io/R2BEAT/articles/R2BEAT_methodology.html).

A complete example is illustrated in the vignette “Two-stage sampling design workflow” (https://barcaroli.github.io/R2BEAT/articles/R2BEAT_workflow.html).

Installation

You can install the released version of R2BEAT from CRAN with:

install.packages("R2BEAT")

or the latest development version of R2BEAT from GitHub with:

install.packages("devtools")
devtools::install_github("barcaroli/R2BEAT")

Example

This basic example loads the example data, allocates a two-stage sample with beat.2st, and then selects the PSUs with StratSel:

library(R2BEAT)
# Load example data
data(beat.example)
head(strata)
#   STRATUM       N DOM1  DOM2         M1         M2 COST CENS        S1        S2
# 1       1 2302353 REG1 PROV1 0.02097409 0.01732549    1    0 0.1432975 0.1304811
# 2       2  179562 REG1 PROV2 0.02304681 0.01997698    1    0 0.1500522 0.1399210
# 3       3  371802 REG2 PROV3 0.02284502 0.01894097    1    0 0.1494093 0.1363166
# 4       4  592303 REG2 PROV3 0.02375591 0.01077033    1    0 0.1522878 0.1032198
# 5       5  221687 REG2 PROV3 0.03215470 0.01088398    1    0 0.1764108 0.1037570
# 6       6  440613 REG2 PROV3 0.04496313 0.01967690    1    0 0.2072232 0.1388874
head(errors)
#    DOM  CV1  CV2
# 1 DOM1 0.10 0.99
# 2 DOM2 0.99 0.99
head(design)
#   STRATUM STRAT_MOS DELTA MINIMUM
# 1       1   2302353     1      48
# 2       2    179562     1      48
# 3       3    371802     1      48
# 4       4    592303     1      48
# 5       5    221687     1      48
# 6       6    440613     1      48
head(PSU_strat)
#   STRATUM PSU_MOS PSU_ID
# 1       1    2591      1
# 2       1    3808      2
# 3       1     465      3
# 4       1    1778      4
# 5       1     713      5
# 6       1    6378      6
head(rho)
#   STRATUM RHO_AR1   RHO_NAR1 RHO_AR2     RHO_NAR2
# 1       1       1 0.07563272       1 0.0005484989
# 2       2       1 0.08327200       1 0.0157297282
# 3       3       1 0.03444641       1 0.0001944330
# 4       4       1 0.05433433       1 0.0021201253
# 5       5       1 0.03926152       1 0.0020177360
# 6       6       1 0.02452771       1 0.0293768074
# Allocate the sample
allocation <- beat.2st(stratif=strata,
                       errors=errors,
                       des_file=design,
                       psu_file=PSU_strat,
                       rho=rho)
#    iterations PSU_SR PSU NSR PSU Total   SSU
# 1           0      0       0         0  6858
# 2           1     16      79        95 17170
# 3           2     42     161       203 14736
# 4           3     33     153       186 15273
# 5           4     36     155       191 15147
# 6           5     33     156       189 15274
# 7           6     36     155       191 15146
# 8           7     33     156       189 15273
# 9           8     36     155       191 15147
# 10          9     33     156       189 15274
# 11         10     36     155       191 15146
# 12         11     33     156       189 15273
# 13         12     36     155       191 15147
# 14         13     33     156       189 15274
# 15         14     36     155       191 15146
# 16         15     33     156       189 15273
# 17         16     36     155       191 15147
# 18         17     33     156       189 15274
# 19         18     36     155       191 15146
# 20         19     33     156       189 15273
# 21         20     36     155       191 15147
str(allocation)
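# Selection of PSUs
# Drop the last row of the allocation table (the totals row) before selection
allocat <- allocation$alloc[-nrow(allocation$alloc),]
str(allocat)
# Multiply the stratum codes by 100 in both tables so that they still match
allocat$STRATUM <- as.character(as.numeric(allocat$STRATUM)*100)
unique(allocat$STRATUM)
PSU_strat$STRATUM <- PSU_strat$STRATUM*100
unique(PSU_strat$STRATUM)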
sampled_PSU <- StratSel(dataPop= PSU_strat,
                       idpsu= ~ PSU_ID,
                       dom= ~ STRATUM,
                       final_pop= ~ PSU_MOS,
                       size= ~ PSU_MOS,
                       PSUsamplestratum= 1,
                       min_sample= 12,
                       min_sample_index= FALSE,
                       dataAll=allocat,
                       domAll= ~ factor(STRATUM),
                       f_sample= ~ ALLOC,
                       planned_min_sample= NULL,
                       launch= FALSE)
sampled_PSU[[2]]
#    Domain SRdom nSRdom SRdom+nSRdom SR_PSU_final_sample_unit NSR_PSU_final_sample_unit
# 1     100   126     53          179                     8240                       660
# 2     200    20     15           35                      305                        99
# 3     300     2      8           10                       28                        37
# 4     400     6     16           22                      199                       288
# 5     500     1      6            7                       68                       203
# 6     600     6      8           14                       11                        81
# 7     700     1      6            7                      939                       381
# 8     800     4      5            9                      140                       473
# 9     900     2      8           10                      423                       498
# 10   1000     3      8           11                      751                       186
# 11   1100     1      3            4                       51                        97
# 12   1200    10     23           33                       88                       192
# 13   1300     3     17           20                       37                        70
# 14   1400     1      7            8                      113                       104
# 15   1500    26     30           56                       26                        70
# 16   1600     7     39           46                       98                        59
# 17   1700    25     39           64                       42                        95
# 18  Total   244    291          535                    11559                      3593
# 19   Mean                                                680                       211
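
The rho table in the example holds intra-class correlation coefficients per stratum. One common way to read such values is through the textbook design-effect approximation deff = 1 + rho · (b − 1), where b is the average number of final units interviewed per PSU. The sketch below applies it to the RHO_NAR1 values shown in head(rho). The helper `deff_approx` is hypothetical, and using MINIMUM = 48 from head(design) as b is an assumption made purely for illustration; whether this matches the package's internal computation should be checked against the methodology vignette.

```r
# RHO_NAR1 for the first six strata (values copied from head(rho))
rho_nar1 <- c(0.07563272, 0.08327200, 0.03444641,
              0.05433433, 0.03926152, 0.02452771)

# Assumption: b = average number of final units per PSU.
# MINIMUM = 48 from head(design) is reused here only as an illustrative value.
b <- 48

# Hypothetical helper (not an R2BEAT function):
# textbook design-effect approximation for cluster sampling
deff_approx <- function(rho, b) {
  1 + rho * (b - 1)
}

deff <- deff_approx(rho_nar1, b)

# Approximate variance inflation relative to simple random sampling
round(deff, 2)
```

Under this approximation, a larger intra-class correlation or more units per PSU increases the variance relative to a simple random sample of the same size, which is why two-stage designs generally need more final units to reach the same precision.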