NOTE: Data from Rounds I through IV cannot be merged with data from Rounds V through XIX, because entirely different samples were used for RLMS Phase I (Rounds I through IV) and Phase II (Rounds V through XIX). Attempting to merge Phase I data sets with Phase II data sets will generate erroneous results.
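One way to guard against an accidental cross-phase merge is to check the phase of each round before combining files. The following is a minimal sketch only; the function names `rlms_phase` and `can_merge` are hypothetical helpers, not part of any RLMS tooling, and rounds are assumed to be passed as plain integers (1 for Round I, 19 for Round XIX).

```python
def rlms_phase(round_number: int) -> int:
    """Return 1 for Rounds I-IV (Phase I) and 2 for Rounds V-XIX (Phase II)."""
    if 1 <= round_number <= 4:
        return 1
    if 5 <= round_number <= 19:
        return 2
    raise ValueError(f"round {round_number} is outside Rounds I-XIX")


def can_merge(round_a: int, round_b: int) -> bool:
    """Two rounds may be merged only when both belong to the same phase."""
    return rlms_phase(round_a) == rlms_phase(round_b)
```

For example, `can_merge(4, 5)` is false because Round IV belongs to Phase I and Round V to Phase II.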
In each of the RLMS HSE data sets, the unit of analysis is a household, a household member, or a survey site, depending upon whether one is looking at household-, individual-, or community-level data. Note that Person 1 is not necessarily the head of the household.
Rounds I through IV
In Rounds I through IV, the variables SITE and FAMILY identify a unique household; SITE, FAMILY, and PERSON identify a unique individual. SITE is a geographic descriptor, and there are up to 360 families within each SITE. The numbers for FAMILY are repeated from site to site, and the numbers for PERSON are repeated from family to family. To prevent errors in merging data, use SITE as a primary sort key; FAMILY as a secondary sort key for household-level data; and SITE, FAMILY, and PERSON as sort keys for individual-level data.
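The sort and merge keys described above might be applied roughly as follows. This is an illustrative sketch: the records and the `hh_size` and `age` fields are made up, and only the names SITE, FAMILY, and PERSON come from the documentation. Note that FAMILY 1 appears in both sites, which is why SITE must be part of the key.

```python
# Hypothetical records; only SITE, FAMILY, and PERSON are documented variable names.
household_rows = [
    {"SITE": 2, "FAMILY": 1, "hh_size": 3},
    {"SITE": 1, "FAMILY": 1, "hh_size": 2},
]
individual_rows = [
    {"SITE": 2, "FAMILY": 1, "PERSON": 1, "age": 40},
    {"SITE": 1, "FAMILY": 1, "PERSON": 2, "age": 35},
    {"SITE": 1, "FAMILY": 1, "PERSON": 1, "age": 37},
]


def household_key(row):
    # SITE is the primary key, FAMILY the secondary key.
    return (row["SITE"], row["FAMILY"])


def individual_key(row):
    # Individuals need all three variables to be uniquely identified.
    return (row["SITE"], row["FAMILY"], row["PERSON"])


household_rows.sort(key=household_key)
individual_rows.sort(key=individual_key)

# Attach household-level fields to each individual via the composite key.
households = {household_key(row): row for row in household_rows}
merged = [
    {**households[household_key(person)], **person}
    for person in individual_rows
    if household_key(person) in households
]
```

Joining on FAMILY alone would wrongly pair the household in site 1 with the individual in site 2.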
In Rounds I through IV, as in all subsequent rounds, a code of "1" for gender indicates that the respondent is male, while a code of "2" indicates a female respondent. For birth years, a code of "00" means that the respondent was born in 1900. A code of "99" means that 1899 was the respondent's year of birth.
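The codes above could be decoded along these lines. This is a sketch under a stated assumption: the documentation only gives the meaning of "00" and "99", so treating codes 01 through 98 as 1901 through 1998 is a guess about the coding scheme, and `decode_birth_year` is a hypothetical helper name.

```python
# Documented gender codes.
GENDER_LABELS = {1: "male", 2: "female"}


def decode_birth_year(code: int) -> int:
    """Convert a two-digit birth-year code to a four-digit year.

    Documented: 0 -> 1900 and 99 -> 1899.
    Assumed (not stated in the documentation): codes 1-98 mean 1901-1998.
    """
    if not 0 <= code <= 99:
        raise ValueError(f"birth-year code {code} is not in the range 0-99")
    if code == 99:
        return 1899
    return 1900 + code
```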
Round V
In Round V, the variables SITE5 and FAMILY5 identify a unique household. SITE5, FAMILY5, and PERSON5 identify a unique individual. As in Round I, SITE5 is a geographic descriptor and the numbers for FAMILY5 and PERSON5 are repeated within their respective broader categories.
Rounds VI through XVII
Starting in Round VI, an additional variable was required for unique identification of a household or person. The variables SITE, CENSUSD, and FAMILY identify a unique household, while SITE, CENSUSD, FAMILY, and PERSON identify an individual. As in Round I, SITE is a geographic descriptor and the numbers for CENSUSD, FAMILY, and PERSON are repeated within their respective broader categories.
Rounds XVIII and Above
Starting in Round XVIII, the variables REGION and FAMILY identify a unique household, while REGION, FAMILY, and PERSON identify an individual. REGION is a geographic descriptor, and the numbers for FAMILY and PERSON are repeated within their respective broader categories.
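Because the identifying variables change between rounds, it can help to look them up by round number rather than hard-code them. The sketch below simply restates the rules given in the sections above; the function names are hypothetical, and rounds are assumed to be passed as integers (5 for Round V, 18 for Round XVIII).

```python
def household_key_variables(round_number: int) -> tuple:
    """Return the variables that together identify a unique household."""
    if 1 <= round_number <= 4:
        return ("SITE", "FAMILY")
    if round_number == 5:
        return ("SITE5", "FAMILY5")
    if 6 <= round_number <= 17:
        return ("SITE", "CENSUSD", "FAMILY")
    if round_number >= 18:
        return ("REGION", "FAMILY")
    raise ValueError(f"invalid round number: {round_number}")


def individual_key_variables(round_number: int) -> tuple:
    """Return the variables that together identify a unique individual."""
    person = "PERSON5" if round_number == 5 else "PERSON"
    return household_key_variables(round_number) + (person,)
```

For Round X, for instance, `individual_key_variables(10)` gives SITE, CENSUSD, FAMILY, and PERSON.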
Starting with Round V, each round has a unique individual or family identifier. In both the household-level and individual-level files, this identifier has the same name: AID for Round V, BID for Round VI, and so on. To merge Round V with a later round, use the merge variable AID (at either the household or individual level) from the data sets; to merge Round VI with a later round, use BID, and so on.
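The naming pattern and the merge itself might look like the following sketch. It rests on two assumptions: that the identifier letters continue alphabetically after AID and BID (the documentation only says "etc."), and that the later round's file carries the earlier round's identifier as a column. The helper names, sample rows, and income fields are hypothetical.

```python
def round_id_variable(round_number: int) -> str:
    """Return the round identifier name: 5 -> AID, 6 -> BID.

    Assumes the letters keep advancing alphabetically for later rounds.
    """
    offset = round_number - 5
    if not 0 <= offset < 26:
        raise ValueError("expected a round number from 5 to 30")
    return chr(ord("A") + offset) + "ID"


def merge_on(earlier_rows, later_rows, id_name):
    """Inner-join two lists of records on the given identifier."""
    later_by_id = {
        row[id_name]: row for row in later_rows if row.get(id_name) is not None
    }
    return [
        {**row, **later_by_id[row[id_name]]}
        for row in earlier_rows
        if row[id_name] in later_by_id
    ]


# Hypothetical Round V and Round VI records.
round5 = [{"AID": 101, "income5": 10}, {"AID": 102, "income5": 20}]
round6 = [
    {"BID": 201, "AID": 101, "income6": 12},
    {"BID": 202, "AID": None, "income6": 7},  # entered the survey in Round VI
]

merged = merge_on(round5, round6, round_id_variable(5))
```

Only the record with AID 101 appears in both rounds, so it is the only one in `merged`.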
Starting with Round XVI, the variable IDIND was added to the individual-level data files, which allows individuals to be linked across rounds. To use it with earlier rounds, merge it in as described above using AID, BID, and so on. IDIND is included in the file of Household and Individual Longitudinal Identifiers, which is available on the Data Downloads page.
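Attaching IDIND to earlier rounds could be sketched as below. The layout of the longitudinal identifier file is an assumption here (one row per person, with IDIND alongside that person's AID, BID, and so on), and the records, the `age` field, and the helper `attach_idind` are hypothetical.

```python
# Hypothetical rows standing in for the longitudinal identifier file.
crosswalk = [
    {"IDIND": 1, "AID": 101, "BID": 201},
    {"IDIND": 2, "AID": 102, "BID": None},  # not interviewed in Round VI
]

# Hypothetical individual-level records for two earlier rounds.
round5 = [{"AID": 101, "age": 30}, {"AID": 102, "age": 55}]
round6 = [{"BID": 201, "age": 31}]


def attach_idind(rows, id_name):
    """Add IDIND to each record by matching on the round identifier."""
    lookup = {
        entry[id_name]: entry["IDIND"]
        for entry in crosswalk
        if entry[id_name] is not None
    }
    return [
        {**row, "IDIND": lookup[row[id_name]]}
        for row in rows
        if row[id_name] in lookup
    ]


# Build a per-person panel keyed by IDIND.
panel = {}
for round_label, rows, id_name in (("V", round5, "AID"), ("VI", round6, "BID")):
    for row in attach_idind(rows, id_name):
        panel.setdefault(row["IDIND"], {})[round_label] = row["age"]
```

Once every round carries IDIND, records for the same person can be lined up without relying on the round-specific identifiers.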
Do NOT try to merge rounds using any other combination of variables. Doing so will generate erroneous results.