The sample of the FALUP study
By Jon Pedersen
The sample design of the FALUP study follows closely that of the FALCOT 92 study. The interested reader should therefore also consult the appendix on sampling strategy in the report "Palestinian Society" (Heiberg & Øvensen 1993). Here we will only outline the sampling strategy and point out the main differences in design between the two surveys.
The first, and main, difference between the FALUP 93 and the FALCOT 92 sampling strategies lies in the coverage of the sample. In FALCOT 92, the sampling plan covered the whole Palestinian population in the Occupied Territories, whereas FALUP is based on samples drawn from two domains only, namely Gaza and the refugee camps on the West Bank. The main reason for the FALUP approach was the assumption that Gaza and the West Bank camps would be most affected by the border closure, as they were the areas found to be worst off in the FALCOT 92 survey. An important corollary of the design of the present study is that pooled statistics for the whole of the Occupied Territories cannot be given.
A second difference between the FALUP and FALCOT sampling lies in the final stages of the sample. In FALCOT, a household was chosen, and the household head was interviewed about household variables. Then a person was selected at random within the household and asked questions relating to his or her personal experience. If the randomly selected person turned out to be a woman (which in fact was determined before the interview, to ease allocation of field workers), she was also asked questions from a separate questionnaire.
In FALUP a household was chosen, and the household head interviewed about household variables as well as selected activities of the household members. There was no randomly selected person, and no special questionnaire for women.
The Gaza sample totalled 960 households. The sampling strategy was the same as that of the FALCOT survey down to the household level. Thus, the sample had the following stages:
Selection of PSUs by simple random sampling within each stratum.
Each of the selected PSUs was divided into cells, using maps provided by the local statistical office. Cells were selected with equal probability within each PSU. The number of cells to be selected from each PSU was chosen so that the product of the selection probabilities of a PSU and its cells was constant across all cells and PSUs.
Within each cell, housing units were selected. A housing unit was defined as a group of households sharing a common entrance. The reason for this stage in the sampling was that households could not be directly identified, while housing units could easily be identified. The actual selection was made through specified "enumeration walks" in which every third housing unit was selected until a full subsample had been obtained. Random starting points were chosen, and each walk entailed four to six selected housing units. On average, ten housing units were selected from each cell, but the actual number for a given cell was determined by allocating a number of households proportionate to the total number of households in the cell.
Finally, a single household was selected by simple random sampling from the households constituting a housing unit. This was done by first constructing a list of households in the housing unit, and then making the selection by drawing from a list of random numbers prepared for the purpose.
Because of the last stage of the sample, i.e. the choice of households from housing units, the sample is not self-weighting, but the household weights and the individual weights are identical.
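The last two stages can be illustrated with a minimal Python sketch. It is not the procedure used in the field, where the walks followed specified routes and the draw was made from a prepared list of random numbers. The function names, the fixed cap of six units per walk, and the reading of the weight as proportional to the number of households in the housing unit are assumptions made for illustration, on the premise that all earlier stages give a constant selection probability.

```python
import random


def enumeration_walk(housing_units, start, step=3, max_units=6):
    """Select every `step`-th housing unit along a walk.

    `housing_units` is the ordered list of units met on the walk and
    `start` is the index of the randomly chosen starting point.
    The cap `max_units` is an illustrative assumption.
    """
    return housing_units[start::step][:max_units]


def select_household(households, rng):
    """Pick one household from a housing unit by simple random sampling.

    Returns the chosen household and a relative weight. The weight is the
    number of households in the unit, i.e. the inverse of the 1/k
    probability of selection at this stage (assumed interpretation).
    """
    if not households:
        raise ValueError("housing unit has no households")
    chosen = rng.choice(households)
    relative_weight = len(households)
    return chosen, relative_weight


rng = random.Random(1993)  # fixed seed so the sketch is repeatable
walk = enumeration_walk(list(range(30)), start=2)
household, weight = select_household(["hh1", "hh2", "hh3"], rng)
```

A household living alone behind its own entrance gets a relative weight of 1, while a household sharing an entrance with two others gets a weight of 3, which is why the sample is not self-weighting.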
The West Bank sample included households from all the refugee camps in the West Bank. The sample had the following stages:
In each camp, housing units were selected using the walk procedure described above.
From each housing unit, a household was selected, also as described above.
Because the samples have multi-stage designs, tests of significance or confidence intervals based on the assumption of a simple random sample design are not appropriate. The design of a sample influences the variance of statistical measures, such as percentages or means. The influence of the sample design on the variance is commonly measured by the "design effect", DEFF, which is the ratio between the actual variance of the measure and the variance the measure would have had with simple random sampling. Although experience suggests that the DEFF for the type of sample design used here will be around 1.5, one cannot assume that this will be the case for any specific measure, variable or table. Since the width of a confidence interval is proportional to the square root of the variance, the effect of a DEFF of 1.5 will be to widen the confidence interval around a percentage by about 20 per cent compared to simple random sampling.
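The arithmetic behind the figure of about 20 per cent can be checked with a short sketch. The normal approximation, the proportion of 0.5, and the use of the Gaza sample size of 960 as n are illustrative assumptions, not results from the survey.

```python
import math


def ci_half_width(p, n, deff=1.0, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion.

    The simple random sampling variance p(1-p)/n is multiplied by the
    design effect before taking the square root.
    """
    return z * math.sqrt(deff * p * (1.0 - p) / n)


srs = ci_half_width(0.5, 960)                  # simple random sampling
clustered = ci_half_width(0.5, 960, deff=1.5)  # same sample, DEFF of 1.5
ratio = clustered / srs                        # equals sqrt(1.5), roughly 1.22
```

The ratio does not depend on p or n, so the widening applies to any percentage for which the DEFF is 1.5.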
Estimating the variances or DEFFs of the present designs empirically is exceedingly complicated. Nevertheless, they may be approximated by the so-called "ultimate cluster" method (Hansen, Hurwitz & Madow 1953). With the aid of the computer program CENVAR (US Bureau of the Census 1994), we have thus calculated confidence intervals and design effects for some of the variables in the Gaza sample. As may be seen from the table, which details design effects of some individual level variables, the design effects are generally quite acceptable.
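The idea behind the ultimate cluster approach can be sketched as follows: all sampling below the PSU level is ignored, and the variance is estimated from the variation between PSU totals within each stratum. The code below is a simplified illustration for a weighted proportion, using a linearized ratio estimate and a with-replacement approximation. It is not CENVAR's implementation, and the record layout and function name are assumptions.

```python
from collections import defaultdict


def ultimate_cluster_proportion(records):
    """Estimate a weighted proportion, its variance and its DEFF.

    records: iterable of (stratum, psu, weight, y) tuples, with y 0 or 1.
    Strata with a single PSU contribute nothing to the variance here,
    which is a simplification of this sketch.
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    p = sum(w * y for _, _, w, y in records) / total_w

    # Linearized PSU totals: sum of w * (y - p) / total weight per PSU.
    z = defaultdict(float)
    for stratum, psu, w, y in records:
        z[(stratum, psu)] += w * (y - p) / total_w

    by_stratum = defaultdict(list)
    for (stratum, _), value in z.items():
        by_stratum[stratum].append(value)

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for values in by_stratum.values():
        n_h = len(values)
        if n_h < 2:
            continue
        mean = sum(values) / n_h
        variance += n_h / (n_h - 1) * sum((v - mean) ** 2 for v in values)

    # DEFF: design-based variance over the simple random sampling variance.
    srs_variance = p * (1.0 - p) / len(records)
    deff = variance / srs_variance if srs_variance > 0 else float("nan")
    return p, variance, deff


# Made-up data: one stratum, two PSUs, two respondents each.
example = [("s1", "a", 1.0, 1), ("s1", "a", 1.0, 1),
           ("s1", "b", 1.0, 0), ("s1", "b", 1.0, 0)]
p, variance, deff = ultimate_cluster_proportion(example)
```

In the made-up example, all respondents within a PSU give the same answer, so the clustering is as strong as it can be and the DEFF is well above 1. When PSUs have identical composition, the between-PSU variance and hence the DEFF drop to zero.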
Of the 960 households selected in Gaza, only 5 households were not interviewed, giving a response rate of 99.5%. In the West Bank camps 504 households were selected, and 498 interviewed, corresponding to a response rate of 98.8%. Hence, the selected households that were not interviewed cannot be said to bias the sample in any significant manner. In addition to omissions that occurred in the field work process, the match between households and individuals could not be achieved in the data files in 26 cases because of errors during entry of identification numbers. These errors affect only the analysis of the link between household activities and individual activities, and are far too few to influence the analysis.
The results of all surveys based on probability samples are subject to uncertainty resulting from the nature of the sampling process and from errors due to imperfections in the execution of the design. In the case of FALUP the uncertainties, or sampling errors, due to the sampling process do not pose any problems for the analyses reported here. Non-response is also insignificant as a source of error. For the discussion of non-sampling errors, i.e. errors due to the wording of questions, the conduct of the interviews, etc., the reader should consult the section on field work procedures and the discussions in the main text.