Initial solution obtained by applying k-means clustering to atomic strata

To speed up convergence towards the optimal solution, an initial solution can be passed as "suggestions" to the "optimizeStrata" function. The function "KmeansSolution" produces this initial solution with the k-means algorithm, clustering atomic strata on the basis of the means of the target variables in them. If the parameter "nstrata" is not indicated, the optimal number of clusters is determined inside each domain, and the overall solution is obtained by concatenating the optimal clusters obtained in the domains. The result is a dataframe with two columns: the first ("suggestions") indicates the cluster assigned to each atomic stratum, the second ("domainvalue") the domain.
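The clustering idea described above can be illustrated with a minimal sketch in base R. This is not the package's implementation: the toy data, the column names DOM1, M1 and M2, the helper cluster_domain, the standardisation step and the fixed number of clusters per domain are all assumptions made for illustration. It only shows how atomic strata might be clustered on their means within each domain and how the per-domain results are concatenated.

```r
# Toy atomic strata: two domains, two target-variable means per stratum.
# Column names (DOM1, M1, M2) are assumed for this illustration.
set.seed(1)
strata <- data.frame(
  DOM1 = rep(c(1, 2), each = 6),
  M1   = c(1.0, 1.2, 0.9, 10.1, 9.8, 10.3, 5.0, 5.2, 4.9, 20.0, 19.7, 20.4),
  M2   = c(2.0, 2.1, 1.9, 8.0, 8.2, 7.9, 1.0, 1.1, 0.9, 6.0, 6.1, 5.8)
)

# Hypothetical helper (not part of SamplingStrata): cluster the atomic
# strata of one domain on their standardised means with stats::kmeans.
cluster_domain <- function(d, k) {
  x  <- scale(d[, c("M1", "M2")])           # standardising is an assumption
  km <- kmeans(x, centers = k, nstart = 10)
  data.frame(suggestions = km$cluster, domainvalue = d$DOM1)
}

# Fixed number of clusters per domain (assumed; no optimisation here).
k_per_domain <- 2

# Cluster each domain separately, then concatenate the results.
pieces <- lapply(split(strata, strata$DOM1), cluster_domain, k = k_per_domain)
toy_solution <- do.call(rbind, pieces)
rownames(toy_solution) <- NULL
```

The resulting toy_solution has one row per atomic stratum, with the cluster label in the first column and the domain in the second. Cluster labels are arbitrary integers and are only meaningful within a domain.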

KmeansSolution(strata, 
               errors, 
               nstrata=NA, 
               minnumstrat=2,
               maxclusters = NA,
               showPlot=TRUE)

Arguments

strata: The (mandatory) dataframe containing the information related to atomic strata.
errors: The (mandatory) dataframe containing the precision constraints on target variables.
nstrata: Number of aggregate strata (if NA, it is optimized by varying the number of clusters from 2 to half the number of atomic strata). Default is NA.
minnumstrat: Minimum number of units to be selected in each stratum. Default is 2.
maxclusters: Maximum number of clusters to be considered in the execution of the k-means algorithm. If not indicated, it is set equal to the number of atomic strata divided by 2.
showPlot: If TRUE, plots the sample size obtained for each number of aggregate strata. Default is TRUE.

Value

A dataframe containing the solution

Author

Giulio Barcaroli

Examples

if (FALSE) {
library(SamplingStrata)
data(swisserrors)
data(swissstrata)

# suggestion
solutionKmean <- KmeansSolution(strata=swissstrata,
                errors=swisserrors,
                nstrata=NA,
                showPlot=TRUE)

# number of strata to be obtained in each domain    
nstrat <- tapply(solutionKmean$suggestions,
                 solutionKmean$domainvalue,
                 FUN=function(x) length(unique(x)))

# optimisation of sampling strata
solution <- optimStrata ( 
  method = "atomic",
  errors = swisserrors, 
  strata = swissstrata,
  nStrata = nstrat,
  suggestions = solutionKmean
)
}