2. The post-enumeration survey

2.1 Overview

2.2 Scope

2.3 Sample design

2.4 Questionnaire design

2.5 Enumeration

2.6 First phase data capture

2.7 Matching process

2.7.1 Matching EAs

2.7.2 Matching households

2.7.3 Matching persons

2.8 Second phase data capture

 

2.1 Overview 

The post-enumeration survey (PES) was conducted as soon as possible after the completion of the census enumeration, from 15 to 24 November 1996. A sample of around 800 enumerator areas (EAs) was selected, as described in Section 2.3. Within these selected areas, an attempt was made to visit every household, using senior census fieldwork staff who had performed well in the census, under the direct supervision of head office professionals. The interviewer asked the householder a series of questions about all persons present in the household on the previous night, as well as all persons present on census night, 9-10 October. In addition to obtaining basic demographic information about all household members, the PES questionnaire included the question ‘Was the person counted in the census?’. The design of the questionnaire is covered in Section 2.4.

The data on the questionnaires was captured and used in elementary fashion in the calculation of a preliminary undercount rate. This was based on the yes/no responses to the question on whether each person was counted in the census. The method used for establishing the preliminary population estimates is explained further in Section 3.2.

Subsequently, in a much more protracted undertaking, a match of the PES questionnaires with the corresponding census questionnaires was sought to determine whether the people included on the PES questionnaire were enumerated in the census at the same address, as described in Section 2.7. The final calculation of undercount incorporated the results of the matching exercise, plus some intricate imputations for different kinds of non-match, providing a more reliable final undercount. This is explained in detail in Section 3.3.

The PES was designed to provide an independent check of census coverage. As such, it was important that the survey should be conducted as independently of the census as possible. Responsibility for the PES was assigned to a different section of Statistics South Africa (Stats SA) from the one responsible for the census. Although it was necessary for some aspects of the PES to utilise the infrastructure developed for the census, measures were taken to ensure that the processes were as independent as possible. Thus, while the enumeration area boundaries used were those prepared for the census, the listing of dwellings within these areas was redone for the PES. In addition, while most of the interviewers for the PES had previously worked on the census, they were allocated to areas different from those where they had worked in the census.

 

2.2 Scope 

Census ’96 was intended to cover every person present in South Africa on census night (except foreign diplomats and their families). However, for practical reasons, the coverage of the subsequent PES was limited to persons present in households in residential dwellings.

Difficulties in enumeration and matching meant that it was not possible to include prisons, hospitals and other institutions in the PES. An attempt was made to include a sample of hostels in the PES but matching proved to be impossible and the quality of the data obtained was not adequate to establish undercount. Thus the adjustments given to hostels had to be determined from the remainder of the sample for persons in households in residential dwellings. Homeless people were also beyond the scope of the PES although, as the process was finally applied, they would have received the same adjustment for undercount as people present in residential dwellings in urban formal areas.

 

2.3 Sample design 

A sample of around 800 enumerator areas (EAs) was drawn for the PES. This represented around 1% of EAs for most provinces, with the exception of Northern Cape. As this province has a very small population, its sample was doubled so that it was large enough for provincial estimates to be calculated.

The sample was stratified by province and EA type. EAs were classified as formal urban, informal urban, tribal, commercial farms or other non-urban (see Appendix E for precise definitions). EAs classified as institutions (hospitals and prisons) were beyond the scope of the PES. EAs classified as hostels were initially included in the scope of the PES and were sampled and enumerated. However, they were later excluded during the estimation process.

Within each province and EA type stratum, the EAs were sorted by magisterial district and an independent systematic sample was drawn. Empty EAs and nearly empty EAs (based on the estimates from census demarcation) were excluded from the sampling frame. This was done as empty EAs would not contribute to the sample estimate, and it was felt that the expense of enumerating nearly empty EAs could not be justified given their minimal contribution to the overall results.

No sampling of dwellings was undertaken within EAs. Instead, PES enumerators should have visited every dwelling in selected EAs. As a result, the sample was self-weighting within each stratum.
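The selection procedure described above can be illustrated with a minimal sketch. It assumes a sampling frame held as a list of dictionaries; the field names, the `MIN_DWELLINGS` cut-off for "nearly empty" EAs and the function names are hypothetical and are not taken from the PES documentation. Only the stratification by province and EA type, the sort by magisterial district, the sampling rate of around 1% and the doubled rate for Northern Cape come from the text.

```python
import random
from collections import defaultdict

MIN_DWELLINGS = 10  # hypothetical cut-off for "empty or nearly empty" EAs


def systematic_sample(units, fraction, rng):
    """Pick about `fraction` of `units` using a random start and a fixed interval."""
    if not units:
        return []
    interval = 1.0 / fraction
    position = rng.uniform(0, interval)  # random start within the first interval
    picks = []
    while position < len(units):
        picks.append(units[int(position)])
        position += interval
    return picks


def draw_pes_sample(frame, fractions, default_fraction=0.01, seed=0):
    """Draw an independent systematic sample within each province/EA-type stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for ea in frame:
        # Empty and nearly empty EAs are dropped from the frame before sampling.
        if ea["est_dwellings"] >= MIN_DWELLINGS:
            strata[(ea["province"], ea["ea_type"])].append(ea)

    sample = []
    for province, ea_type in sorted(strata):
        # Sorting by magisterial district spreads the sample geographically.
        units = sorted(strata[(province, ea_type)], key=lambda ea: ea["district"])
        fraction = fractions.get(province, default_fraction)
        sample.extend(systematic_sample(units, fraction, rng))
    return sample
```

Calling `draw_pes_sample(frame, {"Northern Cape": 0.02})` would apply the doubled rate to Northern Cape and the default rate elsewhere. Because every dwelling in a selected EA is visited and one rate applies within a stratum, each dwelling in that stratum has the same chance of selection, which is what makes the sample self-weighting within strata.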

The sample was drawn from a list compiled from administrative records created prior to census enumeration. The number and boundaries of EAs changed to some extent during census enumeration, in that some areas were still being demarcated as enumeration began, while other EAs were split or combined during enumeration. The areas demarcated during the census were not included in the original list and were therefore not covered in the PES. The other changes made during the census had implications for matching, as will be detailed in Section 2.7.1.

It is possible that some people may have been missed in areas not demarcated by the census. As the list of EAs for the PES sample selection was based on census records, any areas not demarcated for the census would not be in scope for the PES and would not be adjusted for in the undercount calculations. However, after the completion of Census ’96 fieldwork, Statistics South Africa was involved in a project to capture the 1996 census EAs on a geographical information system (GIS). This enabled a comprehensive review of the demarcation of EAs used for Census ’96. While it is possible that some areas were completely missed by the census, the project showed that most areas that were not demarcated were either unpopulated or were enumerated as a part of another EA.

 

2.4 Questionnaire design 

The PES questionnaire was very brief. Every person who spent the previous night in the household should have been included on the PES questionnaire. In addition, persons who were not present on the previous night but who spent census night in the household should also have been included. The questions covered the following issues:

  • Basic demographic information for each person present (age, sex, marital status, language, education).
  • Whether the household was visited and whether each person was counted in the census.
  • The opinions of the householder towards the census.
  • Whether the questionnaire could be matched back to a census questionnaire and, if so, whether each person was found on the questionnaire (these questions were marked ‘For office use only’).

A reproduction of the questionnaire is included in Appendix A. The results of the opinion questions were summarised in the preliminary report. The other questions were used in the matching process and in the calculation of the undercount estimate.

There was no opportunity to test the questionnaire or the methodology. Subsequent discussions have indicated that more thorough information on people who had moved since census night might have been useful in the estimation process. There was no question specifically addressing this issue. Related information was obtained from the questions on where the person was on census night and whether this dwelling was their usual residence, but this was not used in the estimation process.

In order to identify possible overcount, the last questions on the PES questionnaire attempted to obtain a complete list of all people who should have been included on the census questionnaire. This was done by asking about people who were absent at the time of the PES but present on census night. People should have been enumerated on a census questionnaire at the address where they spent census night, 9-10 October 1996. Thus, for households which had not moved since the census, anyone present on the census questionnaire but not on the PES questionnaire should, in theory, not have been entered onto the census questionnaire at that household as they were not present on census night. These people may have been overcounted if they were also entered on the census questionnaire at the address where they actually spent census night. However, examination of the PES showed that it was not possible to draw this conclusion as there were various causes of discrepancies between the census and PES listings. For example, as the census was conducted over a month, many people were enumerated at a location other than their census night address.

While this information could not be used in the manner originally intended, it had some consequences for the final PES data although, as the number of people involved is small, these are minor. It was not possible to distinguish people absent at the time of the PES but present on census night from those present at the time of the PES, due to the manner in which the data was captured. This reflects the fact that the way in which the data would be used was not considered at the time of data capture (see Section 2.6). The impact on final data was on the applicable population of the PES, which is now wider than the scope of the census, potentially distorting comparisons of counts of EAs and average household size between the PES and census. However, the slight increase in the PES sample should not have had an effect on the final undercount estimates (see Section 3.3.2 for the alternative method used to adjust for overcount for the final calculations).

 

2.5 Enumeration 

The PES was designed to provide an independent check of the census count and, accordingly, the team responsible for the PES was not directly involved in the census. For practical reasons, PES enumeration procedures were largely the same as those used in the census. However, measures were taken to ensure that the PES was conducted as independently of the census as possible and was thorough in identifying persons missed by the census.

PES fieldworkers were recruited from the chief enumerators and controllers who had been employed on the census. Although this compromised the independence of the PES to some extent, this had a number of advantages for the PES. The recruitment procedures were simplified and the staff recruited were drawn from those known to have performed well in the census. Using experienced interviewers familiar with the processes helped to ensure that PES enumeration was as complete as possible. At the same time, independence between the census and PES was obtained by assigning fieldworkers to areas other than where they had worked in the census.

Around 1 850 temporary staff were involved in the PES enumeration including approximately 1 600 interviewers (two per EA), 200 fieldwork supervisors and 50 regional managers. All temporary staff received two days of training. As the staff were already familiar with the census questionnaire and the PES questionnaire was comparatively short and simple, the emphasis of the training was on the concepts and procedures specifically related to the PES.

Census enumeration had consisted of two main phases. The first was demarcation, when the country was divided into enumerator areas consisting of a sufficient number of dwellings to form a workload for a census enumerator. For each EA, an enumerator’s summary book was prepared which usually contained a description of the boundaries of the area, a map or aerial photograph, and a listing of all visiting points within the area. The second phase was the actual census enumeration. During this phase, enumerators attempted to obtain completed questionnaires from all households in each visiting point listed in the summary book. They should also have added in any visiting points missed during the original demarcation.

The PES consisted of two similar phases. Firstly, the selected EAs were identified and copies of the maps and boundary descriptions were obtained from enumerators’ summary books prepared for the census. Although it was a feature of the design to use the same boundaries as prepared for the census, the visiting points were re-listed so any points not listed during the census should have been included in the PES. However, the PES could obviously not identify any households missed in the census when they were not included in the boundaries of an enumerator area originally demarcated for the census.

Secondly, each visiting point in the selected EAs was enumerated. Having two PES field workers per EA reduced the time taken to complete enumeration, which is important in the PES as it is essential to interview respondents as soon as possible after the census to ensure that they recall the details of census enumeration as accurately as possible. This required the two field workers to work closely together to ensure that every dwelling in the EA was enumerated and none were double-counted. PES enumeration took place from 15 to 24 November 1996.

Where it was impossible to obtain a completed questionnaire from a household, the enumerators were instructed to note this in their summary books. A completed questionnaire was obtained for almost 95% of households in the sample. The most common reason for non-completion of a questionnaire was non-contact (3,1%) with less than 1% of households refusing to participate (see Appendix C, Table C.1 for details by province).

 

2.6 First phase data capture 

Once the PES questionnaires were returned to head office, the data from the household and personal questions were captured by an external organisation for use in the calculation of the preliminary estimates.

Due to time pressures, the PES questionnaires were not checked before data entry and no editing of the data was performed before the preliminary estimates were produced. While this subsequently required investigations and corrections, as will be explained in Section 2.8, it had little impact on the very simple method used for the preliminary calculation of undercount, which was based on the yes/no responses to the question on whether each person was enumerated in the census. The establishing of the preliminary estimates is discussed further in Section 3.2.
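As a rough illustration of the raw input to that preliminary calculation, the sketch below tabulates the share of ‘no’ answers among the valid yes/no responses. This is an assumption about the simplest possible summary, not the estimator set out in Section 3.2; the function name and the response coding are hypothetical.

```python
def share_not_counted(responses):
    """Return the fraction of valid yes/no answers that are 'no'.

    Answers other than 'yes' or 'no' (blank, illegible) are ignored.
    Returns None when there are no valid answers.
    """
    valid = [r for r in responses if r in ("yes", "no")]
    if not valid:
        return None
    return valid.count("no") / len(valid)
```

Because such a tabulation uses only a single question per person, errors elsewhere on the unedited file would have little influence on it, which is consistent with the limited impact noted above.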

 

2.7 Matching process 

After establishing the preliminary estimates, the next stage required PES questionnaires to be matched against the corresponding census questionnaires for that address to check the completeness of the census enumeration.

Matching proved to be a challenging exercise. There were around 80 000 households and 342 500 persons in the PES and a match status had to be determined for all of them. It took a team of usually 30, but up to 60, people around nine months to complete the process.

In the run-up to Census ’96, Stats SA had planned that questionnaires would be anonymous, to try to minimise the extent to which enumerators would encounter refusals in politically tense parts of the country. Hence, the census questionnaire asked for only ‘first name or initials to make it easy to complete the questionnaire’. The lack of detailed name information made matching to the same address difficult at times. Other obstacles faced included the lack of detailed addresses in many parts of the country and the logistical difficulties encountered in locating census questionnaires at the provincial processing centres.

As the methodology was developed during the process, matching was actually done twice. In the first phase, the ‘For office use only’ sections on the PES questionnaires, simply recording whether each household and person was found, were completed. However, further investigations showed that this information was insufficient for the calculation of undercount and matching was repeated with more detailed information being recorded on a ‘matching sheet’, a reproduction of which is included in Appendix B. This matching sheet had two main purposes:

  • To record whether the household was matched and, if not, whether it was missed or unresolved and the reason why.
  • To record the identifying information for the census questionnaire against which the PES questionnaire was matched.

In both phases, matching involved several stages: matching EAs, matching households and matching individuals.

2.7.1 Matching EAs 

The first stage of matching an EA involved locating the census questionnaires corresponding to the PES EA. This was often not as straightforward as may have been expected. The process should have just involved obtaining the box of questionnaires for the census EA which had the same number as the PES EA. However, the PES EAs were based on the areas as determined prior to the census, and the changes made during census enumeration meant that EA numbers did not always correspond. For example, during the census some EAs were renumbered, split, combined or had boundaries altered. Thus, the box of census questionnaires for an EA with a different number, or for a combination of EAs, may have been required for matching to a PES EA.

Attempts were made in head office to resolve the problems but sometimes it was necessary for staff to visit the provincial offices to look at maps or other boxes of census questionnaires to find the corresponding EA or EAs. In some cases it was even necessary to visit the actual EA to try to determine what had happened. Where it was impossible to locate a corresponding EA, the PES EA was usually excluded from the PES sample used for calculating the undercount rate. This affected 23 of the PES EAs, and will be discussed further in Section 3.3 on the final estimation.

The problems with correspondence of boundaries between census and PES EAs, while making matching more difficult, should not have had an impact on the calculated undercount. The methodology was based on the match status of each household and person. If a household and occupants were enumerated as part of another EA in the census, the thorough searching procedures would have ensured that they were not erroneously treated as undercount. If there was any doubt about whether they were enumerated, codes of unresolved would have been assigned.

Sometimes, even when it seemed a corresponding EA had been found, other problems meant that a visit to the province was necessary. For example, there were cases when a different numbering system appeared to have been used, with stand numbers used in the census and street numbers used in the PES, and the relationship between the numbers was determined by visiting the EA.

2.7.2 Matching households  

Once the corresponding census questionnaires were located, the next task was to match at household level. This was done by comparing the address listings in the enumerators’ summary books for the census and PES to try to identify the corresponding households. Where this was inconclusive, the questionnaires were compared to see if a match could be found based on names and household structure.

Each household was classified as matched, missed or unresolved on the matching sheet. Where a household was classed as missed or unresolved, the reason for this was recorded if known. For example, the corresponding entry in the census enumerator’s summary book may have recorded the visiting point as a refusal or non-contact. In this case, it is clear that the dwelling was missed and the cause was identified.

In areas with formal addresses, matching was a relatively straightforward process and it was not difficult to identify a corresponding household or to confirm whether a household was missed in the census. However, in areas without formal addresses, matching was more complex. In these areas, visiting points were usually listed in the summary book using the names of householders rather than a street address and more reliance was placed on the comparison of questionnaires. Difficulties arose when the householders’ names did not match or were not unique or when the composition of households had changed. Sometimes it was impossible to confirm whether a particular household and its members were enumerated or not. In these cases, the household was classed as unresolved.

Sometimes a corresponding questionnaire was found but the household present at the time of the census was completely different. This is possible, as the original household may have left and a new household moved in between the census and the PES. A code reflecting this situation was allocated on the matching sheet and the household was treated as unresolved.

2.7.3 Matching persons

Once a corresponding household was identified, a match status was allocated for each person on the PES questionnaire. In most cases it was possible to identify a match on the basis of the name. However, this was more difficult in some cases where, for example, a different name or initials appeared to have been used. In these cases, a judgment on whether or not a person was matched was made based on a number of variables including age, marital status, population group and gender. These variables did not have to be exactly the same for a match to be made as often, particularly for age, the responses differed slightly.
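The kind of judgment described above can be sketched as a simple rule. In practice the matching was done clerically, so this is only an illustration: the field names, the `AGE_TOLERANCE` value and the score thresholds are all hypothetical, chosen to show a name match first, then a tolerant comparison of age, sex, marital status and population group, with an unresolved outcome when the evidence is mixed.

```python
AGE_TOLERANCE = 2  # hypothetical: reported ages may differ slightly between interviews
COMPARED_FIELDS = ("sex", "marital_status", "population_group")


def agreement_score(pes_person, census_person):
    """Count how many of the four comparison variables agree (age within tolerance)."""
    score = 0
    if abs(pes_person["age"] - census_person["age"]) <= AGE_TOLERANCE:
        score += 1
    for field in COMPARED_FIELDS:
        if pes_person[field] == census_person[field]:
            score += 1
    return score


def match_person(pes_person, census_household):
    """Return 'matched', 'unresolved' or 'missed' for one PES person."""
    pes_name = pes_person["name"].strip().lower()
    for census_person in census_household:
        if census_person["name"].strip().lower() == pes_name:
            return "matched"  # names agree, so no further comparison is needed

    # Names differ (for example initials on one form): fall back on other variables.
    best = max(
        (agreement_score(pes_person, c) for c in census_household), default=0
    )
    if best == 4:
        return "matched"
    if best >= 2:
        return "unresolved"  # some characteristics agree, others clearly do not
    return "missed"
```

A person recorded as ‘Thandi’, aged 35, in the PES would be classed as matched against a census entry ‘T’, aged 34, provided the other three variables agree, since the one-year age difference falls within the tolerance.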

Usually when a corresponding census questionnaire was identified, all members of the PES household were classified as enumerated or missed. However, there were still some situations where it was necessary to allocate a code of unresolved for a person on the PES questionnaire. For example, where some characteristics of a person on the census questionnaire were similar but others were very different, and the structure of the households did not assist in indicating whether the persons matched, the unresolved category was used.

No action was taken if a person was on the census questionnaire but not on the PES questionnaire. As mentioned in Section 2.4, initially the PES intended to identify these people as potential overcount if the household was the same in the census and PES. However, it was not possible to draw this conclusion and the estimation of undercount was based solely on the people selected in the PES, and not on the people who were included in the census but not in the PES.

2.8 Second phase data capture 

There were a number of steps involved in capturing the data. These reflected both the need for preliminary population estimates, as mentioned in Section 2.6, and the development of the methodology for the PES even after processing and estimation had commenced.

The first phase of data entry involved capturing the information recorded in the PES interviews which was used for the calculation of the preliminary estimates. After the first stage of matching was completed, the information from the ‘For office use only’ questions was captured. Next, the data from the matching sheet from the second stage of matching was captured. These three data sets were then merged together.
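A minimal sketch of such a merge is given below, assuming each data set is held as a dictionary keyed by a (questionnaire number, person number) pair. The key structure, field names and function name are hypothetical. The sketch also reports keys that do not appear in all three captures, since inconsistent questionnaire numbering is the kind of error that would surface at this step.

```python
def merge_captures(interview, office_use, matching_sheet):
    """Merge three captures keyed by (questionnaire_no, person_no).

    Returns the merged records for every key in the interview capture, plus
    the set of keys that are missing from at least one of the three captures.
    """
    merged = {}
    inconsistent_keys = set()
    later_captures = (office_use, matching_sheet)

    for key, record in interview.items():
        row = dict(record)  # copy so the input data set is left unchanged
        for capture in later_captures:
            if key in capture:
                row.update(capture[key])
            else:
                inconsistent_keys.add(key)
        merged[key] = row

    # Keys captured later that have no interview record point to numbering errors.
    for capture in later_captures:
        inconsistent_keys.update(set(capture) - set(interview))

    return merged, inconsistent_keys
```

Reviewing the inconsistent keys before estimation would be one way to detect numbering problems early, rather than discovering them through distorted household counts.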

As mentioned earlier, the initial phase of data entry did not involve any checking of the questionnaires or editing of the captured results. Later stages of data entry did involve some checks and edits. However, as final estimation commenced, a number of errors were discovered on the file, relating particularly to the numbering of questionnaires which resulted in distorted household counts, and to the manner in which the data was captured for some questions. As a result, a final stage of data checks and corrections took place before an adequate dataset was available for the calculation of the final undercount rate and the establishing of final population estimates.
