# TRAINING KEY 54: The use of auxiliary information in PPS sampling

*Practical Methods for Design and Analysis of Complex Surveys.*

Risto Lehtonen and Erkki Pahkinen

#### INTRODUCTION

In point A we show how to perform estimation under __systematic PPS sampling (PPSSYS)__. Point A refers to Example 2.6 (page 54). In point B we demonstrate how the choice of __auxiliary information__ affects the estimate of the total of UE91 (the number of unemployed in the province) and the corresponding variance estimate. The aim is to become familiar with the selection and use of auxiliary information in __PPS sampling__. Point C is an option for interactive analysis. We use the Province'91 data set as the frame population.

**A) REFERENCE EXAMPLE 2.6: Estimation under systematic PPS sampling**

A systematic PPS sample of size n = 8 is drawn from the __Province'91 population__, with the number of households (HOU85) used as the size measure z. The PPSSYS estimate of the total and the corresponding standard error estimate are calculated from the selected sample. The results are then compared with estimation under __SRSWOR sampling__. Further instructions are given once you start.
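The selection and estimation steps described above can be sketched in Python as follows. This is only a minimal illustration, not the SAS code of the training key: the size measures `z` and study values `y` are hypothetical toy numbers (not the Province'91 data), the function names are made up for this sketch, and the standard error uses the with-replacement approximation, which is one common choice for PPSSYS and is assumed here rather than taken from the text.

```python
import random


def ppssys_sample(sizes, n, rng):
    """Return the indices of n units drawn by systematic PPS sampling."""
    total = sum(sizes)
    interval = total / n  # sampling interval on the cumulated size scale
    if max(sizes) >= interval:
        # Such a unit could be selected more than once; not handled in this sketch.
        raise ValueError("a size measure reaches the sampling interval")
    start = rng.random() * interval  # random start in [0, interval)
    sample, cumulated, j = [], 0.0, 0
    for k, z in enumerate(sizes):
        cumulated += z
        # Unit k is selected if the next selection point falls in its size range.
        if j < n and start + j * interval < cumulated:
            sample.append(k)
            j += 1
    return sample


def pps_total_estimate(y, sizes, sample):
    """Estimate the total of y and its standard error from a PPS sample."""
    n = len(sample)
    total_z = sum(sizes)
    # y_k / p_k with single-draw selection probability p_k = z_k / Z
    ratios = [y[k] / (sizes[k] / total_z) for k in sample]
    t_hat = sum(ratios) / n
    # With-replacement approximation of the variance (an assumption of this sketch)
    var_hat = sum((r - t_hat) ** 2 for r in ratios) / (n * (n - 1))
    return t_hat, var_hat ** 0.5


# Hypothetical toy population of 12 units: z plays the role of a size measure
# such as HOU85, y the role of a study variable such as UE91.
z = [120, 80, 300, 50, 210, 95, 160, 40, 270, 75, 130, 60]
y = [14, 7, 41, 5, 22, 12, 15, 6, 30, 8, 17, 5]

rng = random.Random(2024)
sample = ppssys_sample(z, 4, rng)
t_hat, se = pps_total_estimate(y, z, sample)
print(sample, round(t_hat, 1), round(se, 1))
```

Because the inclusion probability of unit k is n·z_k/Z, the estimator reproduces the true total exactly whenever y is exactly proportional to z, whatever sample happens to be drawn.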

**B) THE USE OF ALTERNATIVE AUXILIARY INFORMATION IN PPSWOR SAMPLING**

We use three different auxiliary variables in __PPSWOR__ sampling. The first is HOU85 (the number of households in the municipality). The other two are artificial variables, x and z, constructed for pedagogical purposes: they demonstrate the role of the correlation between the study variable and the auxiliary variable, and show that a strong correlation alone does not guarantee good efficiency under PPSWOR. Further instructions are given once you start.
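The point that correlation alone is not enough can be illustrated with a small Python sketch. All numbers below are hypothetical and are not the artificial x and z of the training key. For simplicity the sketch uses the design variance of the PPS with-replacement estimator as a stand-in for PPSWOR, and compares it with equal-probability sampling with replacement. Both size measures are perfectly correlated with y, but only the first is proportional to it.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / (s_aa * s_bb) ** 0.5


def pps_wr_variance(y, sizes, n):
    """Design variance of the PPS with-replacement estimator of the total of y."""
    total_y, total_z = sum(y), sum(sizes)
    return sum(
        (zk / total_z) * (yk / (zk / total_z) - total_y) ** 2
        for yk, zk in zip(y, sizes)
    ) / n


# Hypothetical study variable and two hypothetical size measures.
y = [10, 20, 30, 40, 50, 60]
z_prop = [3 * v for v in y]    # proportional to y: ratio y/z is constant
z_shift = [v - 9 for v in y]   # linear in y, but with a non-zero intercept
equal = [1] * len(y)           # equal sizes give equal-probability sampling

n = 2
v_srs = pps_wr_variance(y, equal, n)
for name, sizes in [("z_prop", z_prop), ("z_shift", z_shift)]:
    corr = pearson(y, sizes)
    ratio = pps_wr_variance(y, sizes, n) / v_srs  # variance relative to equal probabilities
    print(name, round(corr, 3), round(ratio, 3))
```

What drives the efficiency of PPS sampling is how nearly constant the ratio y/z is across units. A size measure that is linearly related to y but has a non-zero intercept can therefore do worse than equal-probability sampling, despite a correlation of one.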

**C) INTERACTIVE SAS USE**

Please download the SAS code for your own further training. Further instructions are given once you download the code.

NOTE! You need to have access to SAS on your computer.