beth1.Rd
Compute the multivariate optimal allocation for different domains in a one-stage stratified sample design
beat.1st(stratif, errors, minnumstrat=2, maxiter=200, maxiter1=25, epsilon=10^(-11))
Data frame of survey strata; for more details see, e.g., strata.
Data frame of expected coefficients of variation (CV) for each domain; for more details see, e.g., errors.
Minimum number of elementary units per stratum (default=2).
Maximum number of iterations (default=200) of the general procedure. Iteration may be needed because, when the number of units allocated to a stratum is greater than or equal to its population, that stratum is set as a "census stratum" and the whole procedure is re-initialised.
Maximum number of iterations in the Chromy algorithm (default=25).
Tolerance for the maximum absolute difference between the expected CV and the realised CV with the allocation obtained in the last iteration, for all domains. The default is 10^(-11).
The methodology is a generalization of the Bethel (1989) multivariate allocation, which extended the Neyman (1934) - Tschuprow (1923) allocation to multi-purpose and multi-domain surveys. The generalized Bethel algorithm determines the optimal sample size for each stratum in a stratified sample design. The overall sample size and its allocation among the strata are derived from the accuracy constraints imposed on the survey estimates of interest.
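To make the starting point of this generalization concrete, the sketch below implements the classical univariate Neyman allocation, in which a fixed total sample size n is split in proportion to N_h * S_h (stratum population times stratum standard deviation). It also mimics the "census stratum" rule: a stratum whose allocation reaches its population is taken in full and the remainder is re-allocated. This is not the code of beat.1st, which instead works from CV constraints on several variables and domains; the function name `neyman_alloc` and all numbers are hypothetical.

```r
# Univariate Neyman allocation with take-all ("census") strata.
# Hypothetical illustration only; beat.1st() solves a more general,
# multivariate and multi-domain problem driven by CV constraints.
neyman_alloc <- function(N, S, n, maxiter = 200) {
  census <- rep(FALSE, length(N))   # strata currently taken in full
  alloc  <- numeric(length(N))
  for (iter in seq_len(maxiter)) {
    n_left <- n - sum(N[census])    # units left for non-census strata
    w <- N[!census] * S[!census]    # Neyman weights N_h * S_h
    alloc[census]  <- N[census]
    alloc[!census] <- n_left * w / sum(w)
    # A stratum whose allocation reaches its population becomes a
    # census stratum, and the allocation is recomputed from scratch.
    over <- (!census) & (alloc >= N)
    if (!any(over)) break
    census <- census | over
  }
  alloc
}

N <- c(1000, 500, 40)   # hypothetical stratum populations
S <- c(2, 5, 60)        # hypothetical stratum standard deviations
alloc <- neyman_alloc(N, S, n = 200)
round(alloc, 2)
```

With these numbers the first pass would assign about 70 units to the third stratum, which has only 40, so that stratum is enumerated completely and the remaining 160 units are shared between the other two.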
Object of class list. The list contains 4 objects:
Vector with the optimal sample size for each stratum.
Data frame corresponding to the input data.frame stratif with the n optimal sample size column added.
Data frame with optimal (ALLOC), proportional (PROP), equal (EQUAL) sample size allocation.
Data frame with a summary of expected coefficients of variation (Planned CV), realized coefficients of variation with the given optimal allocation (Actual CV) and the sensitivity at 10% for each domain and each variable. Sensitivity can be a useful tool to help in finding the best allocation, because it provides a hint of the expected sample size variation for a 10% change in planned CVs.
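For reference, the proportional and equal allocations that the optimal one is compared against can be sketched as follows under their textbook definitions. How beat.1st rounds these values or treats small strata is not stated here, so the snippet is only an illustration with hypothetical numbers.

```r
# Textbook proportional and equal allocations of a fixed total n.
# N and n are hypothetical; rounding rules in beat.1st() may differ.
N <- c(1000, 500, 40)   # stratum population sizes
n <- 200                # total sample size to distribute

prop  <- n * N / sum(N)                    # proportional to stratum size
equal <- rep(n / length(N), length(N))     # same size in every stratum

comparison <- data.frame(N = N, PROP = prop, EQUAL = equal)
comparison
```

Note that the equal allocation asks for more units than the third stratum contains, which is one reason these simple rules serve as benchmarks rather than as final designs.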
Bethel, J. (1989). Sample allocation in multivariate surveys. Survey Methodology, 15(1): 47-57.
Cochran, W. (1977). Sampling Techniques. John Wiley & Sons, Inc., New York.
Neyman, J. (1934). On the two different aspects of the representative method: the method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4): 558-625.
Tschuprow, A. A. (1923). On the mathematical expectation of the moments of frequency distributions in the case of correlated observation. (Chapters 4-6). Metron, 2: 646-683.
# Load example data
data(beat.example)
## Example 1
# Allocate the sample
allocation_1 <- beat.1st(stratif=strata, errors=errors)
# The total sample size is
sum(allocation_1$n)
#> [1] 6858
## Example 2
# Assume 5700 units is the maximum sample size to stick to our budget.
# Looking at allocation_1$sensitivity we can see that most of the
# sensitivity is in DOM1 for REG1 and REG2 due to V1.
allocation_1$sensitivity
#>    Type Dom Var Planned CV Actual CV Sensitivity 10%
#> 2  DOM1   1  V1       0.10    0.1000             925
#> 3  DOM1   1  V2       0.99    0.1100               1
#> 4  DOM1   2  V1       0.10    0.0999             446
#> 5  DOM1   2  V2       0.99    0.1718               1
#> 6  DOM2   1  V1       0.99    0.1044               1
#> 7  DOM2   1  V2       0.99    0.1151               1
#> 8  DOM2   2  V1       0.99    0.3477               1
#> 9  DOM2   2  V2       0.99    0.3740               1
#> 10 DOM2   3  V1       0.99    0.3253               1
#> 11 DOM2   3  V2       0.99    0.4718               1
#> 12 DOM2   4  V1       0.99    0.4599               1
#> 13 DOM2   4  V2       0.99    0.8582               1
#> 14 DOM2   5  V1       0.99    0.2024               1
#> 15 DOM2   5  V2       0.99    0.3717               1
#> 16 DOM2   6  V1       0.99    0.1268               1
#> 17 DOM2   6  V2       0.99    0.2186               1
# We can relax the constraints by increasing the expected coefficient of variation
# for V1 in DOM1 by 10%
errors1 <- errors
errors1[1,2] <- errors[1,2]+errors[1,2]*0.1
# Try the new allocation
allocation_2 <- beat.1st(stratif=strata, errors=errors1)
sum(allocation_2$n)
#> [1] 5671
## Example 3
# Conversely, we can tighten the constraints by decreasing the expected coefficient
# of variation for V1 in DOM1 by 10%
errors2 <- errors
errors2[1,2] <- errors[1,2]-errors[1,2]*0.1
# The new allocation leads to a larger sample than the first example
allocation_3 <- beat.1st(stratif=strata, errors=errors2)
sum(allocation_3$n)
#> [1] 8463
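The three totals reported above can be compared directly. The short snippet below only re-uses the printed sample sizes (6858, 5671 and 8463) and the 5700-unit budget from Example 2; it does not call beat.1st, and the vector names are hypothetical.

```r
# Relative change in total sample size with respect to Example 1,
# using the totals printed in the examples above.
totals <- c(baseline = 6858, relaxed = 5671, tightened = 8463)

pct_change <- round(100 * (totals / totals[["baseline"]] - 1), 1)
pct_change   # percentage change versus the baseline allocation

# Which allocations fit the 5700-unit budget assumed in Example 2?
within_budget <- totals <= 5700
within_budget
```

Relaxing the planned CV by 10% cuts the total by about 17%, while tightening it by 10% raises the total by about 23%, so the response of the sample size to the constraint is not symmetric.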