5 Estimators

This section describes the estimator used for computing averages or proportions. The estimates may refer to the total survey population or to any sub-population, including those of the three main regions. Because the subject is highly technical, mathematical formulas are used extensively.

The variable of interest is denoted x. If x is defined at the individual level (for example, level of education or age), x(s,k,c,h,d,i) is the x-value of population unit (individual) (s,k,c,h,d,i). The sum of the x-values of all (population) members of household (s,k,c,h,d) is denoted x(s,k,c,h,d). The latter notation is also used when x is defined at the household level.

We introduce a random variable Y(s,k,c,h,d,i) taking the value 1 if individual (s,k,c,h,d,i) is included in the sample, and 0 otherwise. For the sample of households the variable is denoted Y(s,k,c,h,d). The variable Y is a sample indicator that incorporates all the information about the sample design. In fact, Y is the only random element involved in the survey design; it separates the sample units (Y=1) from the non-sample units (Y=0). The probability that Y equals 1 is the overall inclusion probability P. The sample indicator may be extended by attaching a subscript T indicating a subset of the population. Thus YT(s,k,c,h,d,i)=1 if unit (s,k,c,h,d,i) is both included in the sample and a member of sub-population T, and 0 otherwise (and similarly for the sample of households). T may also be the total survey population itself.

The survey observations may be expressed through the composite variable YT(s,k,c,h,d,i)·x(s,k,c,h,d,i) if x is defined on the basis of individuals, or YT(s,k,c,h,d)·x(s,k,c,h,d) if x is defined on the basis of households.
For variables on individuals, the sum of x-values for all members of the household is estimated by:

where the summation runs through all population members of the household, and the denominator is the conditional inclusion probability of the i-th member of the household (provided the household has been selected at the previous stage). In the present design, only one individual has been selected from each sample household, all members (15 years or more) of the household having the same probability of being included.
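Following the description above, the household-level estimator can be sketched in LaTeX as shown here. The hat notation for the estimate, the symbol P_5 for the conditional inclusion probability of the individual, and M(s,k,c,h,d) for the number of eligible household members are assumed names introduced for illustration only.

```latex
\hat{x}_T(s,k,c,h,d) \;=\; \sum_{i} \frac{Y_T(s,k,c,h,d,i)\, x(s,k,c,h,d,i)}{P_5(i \mid s,k,c,h,d)},
\qquad
P_5(i \mid s,k,c,h,d) \;=\; \frac{1}{M(s,k,c,h,d)}
```

Under this assumption, with one individual drawn with equal probability, the household estimate reduces to M(s,k,c,h,d) times the x-value of the selected individual, or to 0 if that individual does not belong to T.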

If the variable is defined according to households, the "x-aggregate" of the household is of course the observation itself:

By aggregating step by step through the various sampling stages, we obtain estimates of the respective sampling-unit sums by summing weighted observations, the weights being the inverses of the inclusion probabilities at each stage (the Horvitz-Thompson estimator; see footnote 8):
Estimated x-total for housing unit (s,k,c,h):

where the 4th stage inclusion probability is

for all households, d, of the housing unit, d=1,...,D(s,k,c,h).
Estimated x-total for cell (s,k,c):

where the 3rd stage inclusion probability is

for all housing units of the cell, h=1,...,H(s,k,c).
Estimated PSU (s,k) x-total:

where the 2nd stage inclusion probability is

for all cells of the PSU, c=1,...,B(s,k).
For the s-th stratum the x-total estimate is:

where the 1st stage inclusion probabilities have been defined in previous sections, and k=1,...,K(s).
Finally we have the estimate for the aggregate total for all strata:

This estimator is unbiased.
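The stage-by-stage aggregation described above can be sketched in Python as a recursion over sampled units. This is a minimal illustration, not the survey's actual processing code: the `SampledUnit` structure, the function name and all probabilities below are hypothetical, and the stratum is given probability 1 on the assumption that every stratum is covered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SampledUnit:
    """A sampled unit at any stage (hypothetical structure, for illustration only)."""
    cond_prob: float  # conditional inclusion probability, given that the parent unit was selected
    value: float = 0.0  # observed x-value; only used for final-stage units
    children: List["SampledUnit"] = field(default_factory=list)


def estimated_total(unit: SampledUnit) -> float:
    """Estimated x-total of a unit: its own observation at the final stage,
    otherwise the sum of its sampled sub-units' estimates, each divided by
    that sub-unit's conditional inclusion probability."""
    if not unit.children:
        return unit.value
    return sum(estimated_total(child) / child.cond_prob for child in unit.children)


# Made-up probabilities (assumptions): one individual out of two eligible
# members, the only household of its housing unit, and so on up to the stratum.
individual = SampledUnit(cond_prob=0.5, value=1.0)
household = SampledUnit(cond_prob=1.0, children=[individual])
housing_unit = SampledUnit(cond_prob=0.125, children=[household])
cell = SampledUnit(cond_prob=0.25, children=[housing_unit])
psu = SampledUnit(cond_prob=0.5, children=[cell])
stratum = SampledUnit(cond_prob=1.0, children=[psu])

stratum_total = estimated_total(stratum)
print(stratum_total)
```

In this toy case the single observation is divided by the product of all the conditional probabilities on its path, which is the same as weighting it by the inverse of its overall inclusion probability.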
By successively inserting the various components it is easily seen that the estimator (5.1) may be written thus:

where the denominator is the overall inclusion probability of the unit (in this case individual (s,k,c,h,d,i)). The inverse of this probability may thus be regarded as an individual weight to be attached to the respective observations in order to obtain unbiased estimates.
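In LaTeX, the weighted form just described can be sketched as follows. The stage symbols P_1 to P_5 and the hat notation are assumed names; the report's own notation for the stage probabilities is defined in earlier sections and may differ.

```latex
\hat{x}_T \;=\; \sum_{s}\sum_{k}\sum_{c}\sum_{h}\sum_{d}\sum_{i}
  \frac{Y_T(s,k,c,h,d,i)\, x(s,k,c,h,d,i)}{P(s,k,c,h,d,i)},
\qquad
P(s,k,c,h,d,i) \;=\; P_1(s,k)\, P_2(c \mid s,k)\, P_3(h \mid s,k,c)\, P_4(d \mid s,k,c,h)\, P_5(i \mid s,k,c,h,d)

% Unbiasedness: Y_T has expectation P for units in T and is identically 0 otherwise, so
E\!\left[\hat{x}_T\right] \;=\; \sum_{(s,k,c,h,d,i) \in T} \frac{P(s,k,c,h,d,i)\, x(s,k,c,h,d,i)}{P(s,k,c,h,d,i)}
  \;=\; \sum_{(s,k,c,h,d,i) \in T} x(s,k,c,h,d,i)
```

The second line spells out why the inverse inclusion probability acts as an unbiased weight: each unit's contribution is scaled up by exactly the factor by which sampling scales down its chance of being observed.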

The size of sub-population T itself is estimated in the same way, by setting x(s,k,c,h,d,i) or x(s,k,c,h,d) equal to 1 for all units of the population. For the sake of clarity we will use the following notations for these estimates:

Our estimator for the x-mean of sub-population T (or proportion if x is an attribute variable) is:
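As a hedged illustration, the mean estimator can be sketched in Python as the ratio of the estimated x-total of T to the estimated size of T, both computed with inverse-probability weights. The record layout, the function names and the sample values below are assumptions made for this example and do not come from the survey.

```python
from typing import Sequence, Tuple

# One record per sample unit: (x-value, overall inclusion probability P, belongs to T).
Record = Tuple[float, float, bool]


def estimate_total(sample: Sequence[Record]) -> float:
    """Estimated x-total of T: sum of x / P over sample units belonging to T."""
    return sum(x / p for x, p, in_t in sample if in_t)


def estimate_size(sample: Sequence[Record]) -> float:
    """Estimated size of T: the same estimator with x set to 1 for every unit."""
    return sum(1.0 / p for _, p, in_t in sample if in_t)


def estimate_mean(sample: Sequence[Record]) -> float:
    """Estimated x-mean of T (a proportion when x is a 0/1 attribute)."""
    size = estimate_size(sample)
    if size == 0:
        raise ValueError("no sample units belong to the sub-population")
    return estimate_total(sample) / size


# Made-up sample: x is a 0/1 attribute; the third unit lies outside T.
sample = [(1.0, 0.5, True), (0.0, 0.25, True), (1.0, 0.125, False)]
total = estimate_total(sample)
size = estimate_size(sample)
mean = estimate_mean(sample)
print(total, size, mean)
```

Units outside T contribute to neither the numerator nor the denominator, which mirrors the role of the indicator YT in the text.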


al@mashriq                       960428/960710