optimizeStrataSpatial.Rd
This function runs a set of other functions to optimise the stratification of a sampling frame, only when stratification variables are of the continuous type, and if there is also a component of spatial autocorrelation in frame units.
optimizeStrataSpatial( errors, framesamp, framecens = NULL, strcens = FALSE, alldomains = TRUE, dom = NULL, nStrata = c(5), fitting=c(1), range=c(0), kappa=3, minnumstr = 2, iter = 50, pops = 20, mut_chance = NA, elitism_rate = 0.2, highvalue = 1e+08, suggestions = NULL, realAllocation = TRUE, writeFiles = FALSE, showPlot = TRUE, parallel = TRUE, cores )
errors | This is the (mandatory) dataframe containing the precision levels expressed in terms of maximum expected value of the Coefficients of Variation related to target variables of the survey. |
---|---|
framesamp | This is the (mandatory) dataframe containing the information related to the sampling frame. |
framecens | This the (optional) dataframe containing the units to be selected in any case. It has same structure than "frame" dataframe. |
strcens | Flag (TRUE/FALSE) to indicate if takeall strata do exist or not. Default is FALSE. |
alldomains | Flag (TRUE/FALSE) to indicate if the optimization must be carried out on all domains (default is TRUE). If it is set to FALSE, then a value must be given to parameter 'dom'. |
dom | Indicates the domain on which the optimization must be carried. It is an integer value that has to be internal to the interval (1 <--> number of domains). If 'alldomains' is set to TRUE, it is ignored. |
nStrata | Indicates the number of strata for each variable. |
fitting | Fitting of the model(s). Default is 1. |
range | Maximum range for spatial autocorrelation. It is a vector with as many elements as the number of target variables Y. |
kappa | Factor used in evaluating spatial autocorrelation. Default is 3. |
minnumstr | Indicates the minimum number of units that must be allocated in each stratum. Default is 2. |
iter | Indicated the maximum number of iterations (= generations) of the genetic algorithm. Default is 50. |
pops | The dimension of each generations in terms of individuals. Default is 20. |
mut_chance | Mutation chance: for each new individual, the probability to change each single chromosome, i.e. one bit of the solution vector. High values of this parameter allow a deeper exploration of the solution space, but a slower convergence, while low values permit a faster convergence, but the final solution can be distant from the optimal one. Default is NA, in correspondence of which it is computed as 1/(vars+1) where vars is the length of elements in the solution. |
elitism_rate | This parameter indicates the rate of better solutions that must be preserved from one generation to another. Default is 0.2 (20 |
highvalue | Parameter for genetic algorithm. In should not be changed |
suggestions | Optional parameter for genetic algorithm that indicates a suggested solution to be introduced in the initial population. The most convenient is the one found by the function "KmeanSolution". Default is NULL. |
realAllocation | If FALSE, the allocation is based on INTEGER values; if TRUE, the allocation is based on REAL values. Default is TRUE. |
writeFiles | Indicates if the various dataframes and plots produced during the execution have to be written in the working directory. Default is FALSE. |
showPlot | Indicates if the plot showing the trend in the value of the objective function has to be shown or not. In parallel = TRUE, this defaults to FALSE Default is TRUE. |
parallel | Should the analysis be run in parallel. Default is TRUE. |
cores | If the analysis is run in parallel, how many cores should be used. If not specified n-1 of total available cores are used OR if number of domains < (n-1) cores, then number of cores equal to number of domains are used. |
A list containing (1) the vector of the solution, (2) the optimal aggregated strata, (3) the total sampling frame with the label of aggregated strata
Giulio Barcaroli
if (FALSE) { ############################# # Example of "spatial" method ############################# library(sp) data("meuse") data("meuse.grid") meuse.grid$id <- c(1:nrow(meuse.grid)) coordinates(meuse) <- c('x','y') coordinates(meuse.grid) <- c('x','y') library(gstat) library(automap) v <- variogram(lead ~ dist + soil, data = meuse) fit.vgm.lead <- autofitVariogram(lead ~ dist + soil, meuse, model = "Exp") plot(v, fit.vgm.lead$var_model) lead.kr <- krige(lead ~ dist + soil, meuse, meuse.grid, model = fit.vgm.lead$var_model) lead.pred <- ifelse(lead.kr[1]$var1.pred < 0,0, lead.kr[1]$var1.pred) lead.var <- ifelse(lead.kr[2]$var1.var < 0, 0, lead.kr[2]$var1.var) df <- as.data.frame(list(dom = rep(1,nrow(meuse.grid)), lead.pred = lead.pred, lead.var = lead.var, lon = meuse.grid$x, lat = meuse.grid$y, id = c(1:nrow(meuse.grid)))) frame <-buildFrameSpatial(df = df, id = "id", X = c("lead.pred"), Y = c("lead.pred"), variance = c("lead.var"), lon = "lon", lat = "lat", domainvalue = "dom") cv <- as.data.frame(list(DOM = rep("DOM1",1), CV1 = rep(0.05,1), domainvalue = c(1:1) )) solution <- optimizeStrataSpatial(errors = cv, framesamp = frame, iter = 25, pops = 10, nStrata = 5, fitting = 1, kappa = 1, range = fit.vgm.lead$var_model$range[2]) framenew <- solution$framenew outstrata <- solution$aggr_strata frameres <- SpatialPixelsDataFrame(points = framenew[c("LON","LAT")], data = framenew) frameres$LABEL <- as.factor(frameres$LABEL) spplot(frameres,c("LABEL"), col.regions=bpy.colors(5)) s <- selectSample(framenew,outstrata) }