International migrants comprised 14% of the UK's population in 2020; however, their health is rarely studied at a population level using primary care electronic health records because they are difficult to identify. We developed a migration phenotype using country of birth, visa status, non-English main/first language and non-UK-origin codes and applied it to the Clinical Practice Research Datalink (CPRD) GOLD database of 16,071,111 primary care patients between 1997 and 2018. We compared the completeness and representativeness of the identified migrant population to Office for National Statistics (ONS) country-of-birth and 2011 census data by year, age, sex, geographic region of birth and ethnicity. Between 1997 and 2018, 403,768 migrants (2.51% of the CPRD GOLD population) were identified: 178,749 (1.11%) had foreign-country-of-birth or visa-status codes, 216,731 (1.35%) had non-English-main/first-language codes, and 8,288 (0.05%) had non-UK-origin codes. The cohort's distribution by sex and region of birth was similar to that in ONS data. Migration recording improved over time, and younger migrants were better represented than those aged ≥50. The validated phenotype identified a large migrant cohort for use in migration health research in CPRD GOLD to inform healthcare policy and practice. Because migration status was under-recorded in earlier years and at older ages, future studies in these groups should be interpreted cautiously.
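
The counts and percentages quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published figures, not the authors' phenotyping code; the variable and function names are illustrative. It assumes the three code groups are mutually exclusive, which the abstract does not state but which is consistent with the groups summing to the reported total.

```python
# Figures quoted in the abstract (CPRD GOLD, 1997 to 2018).
CPRD_GOLD_TOTAL = 16_071_111
MIGRANTS_TOTAL = 403_768

# Assumption: each migrant is counted in exactly one group.
subgroups = {
    "country of birth or visa status": 178_749,
    "non-English main/first language": 216_731,
    "non-UK origin": 8_288,
}


def pct(count, denominator=CPRD_GOLD_TOTAL):
    """Return count as a percentage of denominator, rounded to 2 decimals."""
    return round(100 * count / denominator, 2)


subgroup_sum = sum(subgroups.values())
overall = pct(MIGRANTS_TOTAL)
shares = {name: pct(n) for name, n in subgroups.items()}

print("Subgroups sum to total:", subgroup_sum == MIGRANTS_TOTAL)
print("Overall share of CPRD GOLD population (%):", overall)
for name, share in shares.items():
    print(f"  {name}: {share}%")
```

Under that assumption, the three groups add up exactly to the 403,768 migrants reported, and each rounded percentage matches the value given in the abstract.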

Journal article


Int J Environ Res Public Health

algorithm, clinical practice research datalink, migration, phenotype, primary care, validation, Aged, Data Management, Electronic Health Records, Humans, Middle Aged, Phenotype, Primary Health Care, United Kingdom