Compute multivariate optimal allocation for different domains in a one-stage stratified sample design

beat.1st(stratif, errors, minnumstrat=2, maxiter=200, maxiter1=25, epsilon=10^(-11))

Arguments

stratif

Data frame of survey strata; for more details see, e.g., strata.

errors

Data frame of expected coefficients of variation (CV) for each domain; for more details see, e.g., errors.

minnumstrat

Minimum number of elementary units per stratum (default=2).

maxiter

Maximum number of iterations (default=200) of the general procedure. Iterating may be necessary because, whenever the number of units allocated to a stratum is greater than or equal to its population size, that stratum is set as a "census stratum" and the whole procedure is re-initialised.

maxiter1

Maximum number of iterations in Chromy algorithm (default=25).

epsilon

Tolerance for the maximum absolute difference, over all domains, between the expected CV and the CV realised with the allocation obtained in the last iteration. The default is 10^(-11).
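The role of census strata and of maxiter can be illustrated with a deliberately simplified, single-variable sketch in base R. It is not the code used by beat.1st, which handles several variables and domains at once; the function name allocate_with_census and the toy figures are hypothetical. For one variable, the smallest allocation that meets a target CV on the estimated total has a closed form, so no tolerance such as epsilon is needed here; it is the iterative multivariate solution that needs a stopping threshold.

```r
# Hypothetical helper: minimum-size allocation for ONE variable under a CV
# target, re-solving whenever a stratum has to be taken as a census.
allocate_with_census <- function(N, M, S, cv_target,
                                 minnumstrat = 2, maxiter = 200) {
  total  <- sum(N * M)                  # population total of the variable
  v_max  <- (cv_target * total)^2       # largest admissible variance
  census <- rep(FALSE, length(N))
  n      <- numeric(length(N))
  for (iter in seq_len(maxiter)) {
    free <- !census
    n[census] <- N[census]              # census strata add no sampling variance
    if (any(free)) {
      NS <- N[free] * S[free]
      # closed-form solution: n_h proportional to N_h * S_h
      n[free] <- NS * sum(NS) / (v_max + sum(N[free] * S[free]^2))
    }
    new_census <- free & n >= N         # allocation reaches the population size
    if (!any(new_census)) break         # nothing to fix: stop iterating
    census <- census | new_census       # otherwise re-initialise and re-solve
  }
  # round up, enforce the minimum per stratum, never exceed the population
  n <- pmin(N, pmax(minnumstrat, ceiling(n)))
  actual_cv <- sqrt(sum(N^2 * S^2 * (1 / n - 1 / N))) / total
  list(n = n, census = census, iterations = iter, actual_cv = actual_cv)
}

# Toy strata (made-up values): the third one is small and highly variable
res <- allocate_with_census(N = c(1000, 500, 20),
                            M = c(10, 50, 400),
                            S = c(4, 30, 300),
                            cv_target = 0.02)
res$n
res$census
```

With these toy figures the first pass asks for more units than the third stratum contains, so that stratum becomes a census stratum and the remaining two are re-allocated in a second pass.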

Details

The methodology is a generalization of the Bethel (1989) multivariate allocation, which extended the Neyman (1934) - Tschuprow (1923) allocation to multi-purpose and multi-domain surveys. The generalized Bethel algorithm determines the optimal sample size for each stratum in a stratified sample design. Both the overall sample size and its allocation among the strata are derived from the accuracy constraints imposed on the estimates of interest.
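A small base R sketch may help to show why a multivariate method is needed. It applies the classical single-variable Neyman allocation (n_h proportional to N_h * S_h for a fixed total) to two variables separately and then evaluates the CV of both estimated totals under each allocation. The helpers neyman and cv_total and all the figures are made up for illustration, rounding to integers is omitted, and nothing here reproduces the internals of beat.1st.

```r
# Toy strata: population sizes, plus means and standard deviations
# of two target variables (all values are invented)
N  <- c(2000, 1000, 500)
M1 <- c(10, 20, 40);  S1 <- c(2, 10, 30)
M2 <- c(5, 5, 5);     S2 <- c(4, 1, 0.5)

# Neyman allocation of a fixed total n for a single variable
neyman <- function(N, S, n) n * N * S / sum(N * S)

# CV of the estimated total under simple random sampling within strata
cv_total <- function(N, M, S, n) {
  sqrt(sum(N^2 * S^2 * (1 / n - 1 / N))) / sum(N * M)
}

n_v1 <- neyman(N, S1, 300)   # best for the first variable only
n_v2 <- neyman(N, S2, 300)   # best for the second variable only

# Rows: which variable the allocation was tuned for; columns: resulting CVs
cvs <- rbind(
  optimal_for_V1 = c(V1 = cv_total(N, M1, S1, n_v1),
                     V2 = cv_total(N, M2, S2, n_v1)),
  optimal_for_V2 = c(V1 = cv_total(N, M1, S1, n_v2),
                     V2 = cv_total(N, M2, S2, n_v2))
)
cvs
```

Each allocation gives the lowest CV for the variable it was tuned for and a worse one for the other, which is the conflict that a multivariate allocation resolves by finding one allocation satisfying all the CV constraints simultaneously.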

Value

Object of class list, containing four components:

n

Vector with the optimal sample size for each stratum.

file_strata

Data frame corresponding to the input data.frame stratif, with an added column n holding the optimal sample size.

alloc

Data frame with optimal (ALLOC), proportional (PROP), equal (EQUAL) sample size allocation.

sensitivity

Data frame with a summary of expected coefficients of variation (Planned CV), realized coefficients of variation with the given optimal allocation (Actual CV) and the sensitivity at 10% for each domain and each variable. Sensitivity can be a useful tool to help in finding the best allocation, because it provides a hint of the expected sample size variation for a 10% change in planned CVs.
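The proportional and equal allocations listed in alloc are the usual benchmarks against which an optimal allocation is compared. The sketch below shows, in base R, one way such benchmarks can be derived from the stratum population sizes. It assumes that both benchmarks share the overall size of the optimal allocation and that they are rounded to the nearest integer; these are assumptions for illustration and may differ from what beat.1st does, and the figures are invented.

```r
N     <- c(2000, 1000, 500)   # stratum population sizes (toy values)
n_opt <- c(60, 110, 130)      # stand-in for an optimal allocation (component n)
n_tot <- sum(n_opt)           # overall sample size shared by the benchmarks

alloc_sketch <- data.frame(
  STRATUM = seq_along(N),
  ALLOC   = n_opt,
  # proportional: each stratum gets the same sampling fraction
  PROP    = round(n_tot * N / sum(N)),
  # equal: the same number of units in every stratum
  EQUAL   = round(rep(n_tot / length(N), length(N)))
)
alloc_sketch
```

Comparing the three columns stratum by stratum shows where the accuracy constraints pull the sample away from a purely size-driven or uniform design.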

References

Bethel, J. (1989) Sample allocation in multivariate surveys. Survey Methodology, 15(1): 47-57.

Cochran, W. (1977) Sampling Techniques. John Wiley & Sons, Inc., New York.

Neyman, J. (1934) On the two different aspects of the representative method: the method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4): 558-625.

Tschuprow, A. A. (1923) On the mathematical expectation of the moments of frequency distributions in the case of correlated observation. (Chapters 4-6). Metron, 2: 646-683.

Author

Developed by Stefano Falorsi, Andrea Fasulo, Alessio Guandalini, Daniela Pagliuca, Marco D. Terribili.

Examples

# Load example data
data(beat.example)

## Example 1
# Allocate the sample
allocation_1 <- beat.1st(stratif=strata, errors=errors)

# The total sample size is
sum(allocation_1$n)
#> [1] 6858

## Example 2
# Assume 5700 units is the maximum sample size to stick to our budget.
# Looking at allocation_1$sensitivity we can see that most of the 
# sensitivity is in DOM1 for REG1 and REG2 due to V1.
allocation_1$sensitivity
#>    Type Dom Var Planned CV Actual CV Sensitivity 10%
#> 2  DOM1   1  V1       0.10    0.1000             925
#> 3  DOM1   1  V2       0.99    0.1100               1
#> 4  DOM1   2  V1       0.10    0.0999             446
#> 5  DOM1   2  V2       0.99    0.1718               1
#> 6  DOM2   1  V1       0.99    0.1044               1
#> 7  DOM2   1  V2       0.99    0.1151               1
#> 8  DOM2   2  V1       0.99    0.3477               1
#> 9  DOM2   2  V2       0.99    0.3740               1
#> 10 DOM2   3  V1       0.99    0.3253               1
#> 11 DOM2   3  V2       0.99    0.4718               1
#> 12 DOM2   4  V1       0.99    0.4599               1
#> 13 DOM2   4  V2       0.99    0.8582               1
#> 14 DOM2   5  V1       0.99    0.2024               1
#> 15 DOM2   5  V2       0.99    0.3717               1
#> 16 DOM2   6  V1       0.99    0.1268               1
#> 17 DOM2   6  V2       0.99    0.2186               1
# We can relax the constraints by increasing the expected coefficient
# of variation for V1 in DOM1 by 10%
errors1 <- errors 
errors1[1,2] <- errors[1,2]+errors[1,2]*0.1

# Try the new allocation 
allocation_2 <- beat.1st(stratif=strata, errors=errors1)
sum(allocation_2$n)
#> [1] 5671

## Example 3
# Conversely, we can tighten the constraints by decreasing the expected
# coefficient of variation for V1 in DOM1 by 10%
errors2 <- errors 
errors2[1,2] <- errors[1,2]-errors[1,2]*0.1

# The new allocation leads to a larger sample than the first example 
allocation_3 <- beat.1st(stratif=strata, errors=errors2)
sum(allocation_3$n)
#> [1] 8463