Chapter 3: Further Use of Auxiliary Information


Chapter 3 of the textbook Practical Methods for Design and Analysis of Complex Surveys demonstrates further uses of auxiliary information. Auxiliary information can improve the efficiency of estimation when it is built into the sampling design, as in stratified sampling (Section 3.1 of the textbook). It can also improve the efficiency of estimation for a given sample through the model-assisted estimation techniques discussed in Section 3.3, in which the auxiliary data are incorporated into estimation by means of statistical models. Poststratification assumes a linear analysis of variance (ANOVA) model, and the auxiliary data consist of population cell and marginal frequencies of one or several categorical variables. Ratio estimation uses a linear regression model without an intercept, and the auxiliary data consist of the population totals of one or several continuous variables, which can come from a source such as official statistics. Regression estimation uses a standard linear regression model to incorporate the auxiliary data in the estimation procedure. These model-assisted methods are special cases of generalized regression (GREG) estimators. In all of the methods above, estimation can be more efficient than under simple random sampling (SRS) alone if the study variable and the auxiliary variable are related, for example through a strong correlation.
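For orientation, the three model-assisted estimators of a population total can be written in their usual textbook forms for a simple random sample. The notation below is an assumption made for this summary and may differ from the notation used in the textbook: s is the sample, n its size, N the population size, N_g the known size of poststratum g, T_x the known population total of the auxiliary variable x, and bars denote sample means (or the population mean, for \bar{X}).

```latex
% Poststratified estimator: known cell counts N_g times sample cell means
\hat{t}_{\mathrm{ps}} = \sum_{g=1}^{G} N_g \, \bar{y}_g

% Ratio estimator: known total of x times the sample ratio of y to x
\hat{t}_{\mathrm{rat}} = T_x \, \frac{\sum_{k \in s} y_k}{\sum_{k \in s} x_k}

% Regression estimator: sample mean adjusted by the estimated slope
\hat{t}_{\mathrm{reg}} = N \left[ \bar{y} + \hat{b} \, (\bar{X} - \bar{x}) \right],
\qquad
\hat{b} = \frac{\sum_{k \in s} (x_k - \bar{x})(y_k - \bar{y})}
               {\sum_{k \in s} (x_k - \bar{x})^2},
\qquad
\bar{X} = T_x / N

% General GREG form: Horvitz-Thompson estimator plus a regression adjustment
\hat{t}_{\mathrm{greg}} = \hat{t}_{\mathrm{HT}}
  + \left( \mathbf{T}_x - \hat{\mathbf{t}}_{x,\mathrm{HT}} \right)^{\top} \hat{\mathbf{B}}
```

In each case the adjustment vanishes when the sample already reproduces the known auxiliary totals, which is why the gain depends on how strongly y is related to x.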


In Training Key 63, stratified sampling is demonstrated by first calculating the design effect (DEFF) for proportional allocation, reproducing the results of Example 3.1. Then, the various allocation schemes are examined and the results of Example 3.2 are reproduced.
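The design-effect calculation can be sketched in a few lines of Python. This is a minimal illustration, not the Training Key's own implementation: the stratum sizes, standard deviations, and sample size are hypothetical numbers chosen for the example (they are not the data of Examples 3.1 or 3.2), and the function names are made up for this sketch. Allocations are left unrounded so that the formulas stay exact.

```python
def proportional_allocation(N_h, n):
    """Stratum sample sizes proportional to stratum sizes: n_h = n * N_h / N."""
    N = sum(N_h)
    return [n * Nh / N for Nh in N_h]


def neyman_allocation(N_h, S_h, n):
    """Neyman allocation: n_h proportional to N_h * S_h."""
    weights = [Nh * Sh for Nh, Sh in zip(N_h, S_h)]
    total = sum(weights)
    return [n * w / total for w in weights]


def stratified_variance(N_h, S_h, n_h):
    """Variance of the stratified mean with SRSWOR inside each stratum."""
    N = sum(N_h)
    return sum((Nh / N) ** 2 * (1 - nh / Nh) * Sh ** 2 / nh
               for Nh, Sh, nh in zip(N_h, S_h, n_h))


def srswor_variance(N, S, n):
    """Variance of the sample mean under SRSWOR of size n."""
    return (1 - n / N) * S ** 2 / n


def deff(N_h, S_h, n_h, S):
    """Design effect: stratified variance relative to SRSWOR of the same size."""
    return stratified_variance(N_h, S_h, n_h) / srswor_variance(sum(N_h), S, sum(n_h))


# Hypothetical population: three strata and an overall standard deviation S.
N_h = [600, 300, 100]
S_h = [4.0, 8.0, 20.0]
S = 12.0
n = 100

n_prop = proportional_allocation(N_h, n)
n_neyman = neyman_allocation(N_h, S_h, n)
deff_prop = deff(N_h, S_h, n_prop, S)
deff_neyman = deff(N_h, S_h, n_neyman, S)
print("proportional:", n_prop, "DEFF =", round(deff_prop, 4))
print("Neyman:      ", [round(v, 2) for v in n_neyman], "DEFF =", round(deff_neyman, 4))
```

A DEFF below 1 indicates that the stratified design is more efficient than SRSWOR with the same total sample size. Under proportional allocation the ratio reduces to the weighted within-stratum variance divided by the overall variance.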


In Training Key 101, regression estimation is demonstrated by first reproducing the results of Example 3.13. Then, regression estimation is extended to samples of different sizes. Finally, the performance of estimators under simple random sampling without replacement (SRSWOR) is examined by using Monte Carlo simulation methods. A Horvitz-Thompson estimator for a sample drawn with probability proportional to size (PPS) is compared with regression estimation for an SRSWOR sample, using the same auxiliary information in both cases.
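A Monte Carlo comparison of this kind might look like the following sketch. Everything here is an assumption made for illustration: the synthetic population, the sample size, the number of replicates, and the use of systematic PPS sampling (the source does not say which PPS scheme the Training Key uses). The function names are hypothetical.

```python
import bisect
import math
import random


def regression_total(y, x, x_total, N):
    """Regression estimator of the total of y from an SRSWOR sample."""
    n = len(y)
    ybar, xbar = sum(y) / n, sum(x) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return N * (ybar + (sxy / sxx) * (x_total / N - xbar))


def pps_systematic(x, n, rng):
    """Systematic PPS sample of unit indices; pi_i = n * x_i / sum(x).

    Assumes n * x_i <= sum(x) for every unit, so no unit is drawn twice.
    """
    step = sum(x) / n
    start = rng.uniform(0, step)
    cum, running = [], 0.0
    for xi in x:
        running += xi
        cum.append(running)
    return [min(bisect.bisect_right(cum, start + k * step), len(x) - 1)
            for k in range(n)]


def ht_total_pps(y, x, x_total, n):
    """Horvitz-Thompson total: sum of y_i / pi_i with pi_i = n * x_i / x_total."""
    return sum(yi * x_total / (n * xi) for yi, xi in zip(y, x))


def simulate(pop_y, pop_x, n, reps, seed=0):
    """Return the root mean squared error of each estimator over `reps` samples."""
    rng = random.Random(seed)
    N, x_total, y_total = len(pop_y), sum(pop_x), sum(pop_y)
    sq = {"srswor_expansion": 0.0, "srswor_regression": 0.0, "pps_ht": 0.0}
    for _ in range(reps):
        s = rng.sample(range(N), n)  # SRSWOR
        ys, xs = [pop_y[i] for i in s], [pop_x[i] for i in s]
        sq["srswor_expansion"] += (N * sum(ys) / n - y_total) ** 2
        sq["srswor_regression"] += (regression_total(ys, xs, x_total, N) - y_total) ** 2
        p = pps_systematic(pop_x, n, rng)  # PPS on the same auxiliary variable
        yp, xp = [pop_y[i] for i in p], [pop_x[i] for i in p]
        sq["pps_ht"] += (ht_total_pps(yp, xp, x_total, n) - y_total) ** 2
    return {name: math.sqrt(v / reps) for name, v in sq.items()}


# Synthetic population (hypothetical): y is roughly linear in x.
pop_rng = random.Random(42)
pop_x = [pop_rng.uniform(10, 100) for _ in range(500)]
pop_y = [5 + 2 * xi + pop_rng.gauss(0, 5) for xi in pop_x]

rmse = simulate(pop_y, pop_x, n=30, reps=200, seed=1)
for name, value in rmse.items():
    print(name, round(value, 1))
```

Both the regression estimator and the PPS design use the same auxiliary variable x, one at the estimation stage and the other at the design stage. Which of the two performs better depends on the population, for instance on whether y is closer to proportional to x or to linear in x with a nonzero intercept.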


In Training Key 104, the calibration technique is demonstrated for an SRSWOR sample for three cases: poststratification, ratio estimation, and regression estimation.
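The three cases can be seen as one linear calibration problem with different auxiliary vectors. The sketch below is an illustration under stated assumptions, not the Training Key's procedure: it uses the standard linear calibration weights w_i = d_i (1 + q_i x_i' lambda), a tiny hypothetical sample of 6 units from a population of 60, and made-up population totals. The helper names `solve_linear` and `calibrate` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * k
    for r in range(k - 1, -1, -1):
        z[r] = (M[r][k] - sum(M[r][c] * z[c] for c in range(r + 1, k))) / M[r][r]
    return z


def calibrate(d, X, totals, q=None):
    """Linear calibration: weights w_i = d_i * (1 + q_i * x_i' lam) such that
    sum_i w_i * x_i equals the known population totals."""
    n, k = len(d), len(totals)
    if q is None:
        q = [1.0] * n
    A = [[sum(d[i] * q[i] * X[i][a] * X[i][b] for i in range(n))
          for b in range(k)] for a in range(k)]
    gap = [totals[a] - sum(d[i] * X[i][a] for i in range(n)) for a in range(k)]
    lam = solve_linear(A, gap)
    return [d[i] * (1 + q[i] * sum(X[i][a] * lam[a] for a in range(k)))
            for i in range(n)]


# Hypothetical SRSWOR sample: n = 6 from N = 60, so every design weight is 10.
d = [10.0] * 6
x = [2.0, 4.0, 1.0, 3.0, 5.0, 5.0]   # continuous auxiliary variable
group = [0, 0, 1, 1, 1, 1]           # poststratum membership
N_g = [30.0, 30.0]                   # assumed known poststratum sizes
T_x = 210.0                          # assumed known population total of x

# Poststratification: x_i is the vector of poststratum indicators.
w_ps = calibrate(d, [[1.0 if g == h else 0.0 for h in range(2)] for g in group], N_g)

# Ratio estimation: scalar x_i with q_i = 1 / x_i.
w_ratio = calibrate(d, [[xi] for xi in x], [T_x], q=[1.0 / xi for xi in x])

# Regression estimation: x_i = (1, x_i), calibrating on N and T_x together.
w_reg = calibrate(d, [[1.0, xi] for xi in x], [60.0, T_x])

print(w_ps, w_ratio, w_reg, sep="\n")
```

In the poststratification case the calibrated weights reduce to N_g / n_g, and in the ratio case to d_i T_x divided by the Horvitz-Thompson estimate of the total of x. Applying any of these weight sets to a study variable y gives the corresponding estimator of its total as the weighted sum of the sample values.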



NOTE: Instructions for the use of Training Keys are given in the Instructions section.