The genetic make-up of a population is shaped by its historical population dynamics, including drift and founder effects, selection (i.e. gene-environment interactions), assortative mating and consanguinity. These may be captured by genome-wide array genotyping and related to population health outcomes such as obesity, disease and mortality. China Kadoorie Biobank (CKB) spans 10 geographically distinct regions of China, offering a unique opportunity to compare groups with different population dynamics and environmental exposures. Genome-wide genotyping data are available for ~100,000 participants.

Over 200 putative regions of interest have been identified with signatures suggestive of recent natural selection. Some of these regions overlap with regions similarly identified in Europeans, but others appear to be unique to East Asians. Similarly, there is substantial evidence of consanguinity within the cohort, with over 300 regions of interest identified that may be significant for disease or disability.


The aim of the project is to:

  1. Construct an atlas of the signatures of drift, selection, assortative mating and consanguinity across the different geographical regions studied in CKB;
  2. Relate these signatures to population health, the risk of disease, morbidity over the life course, mortality, and longevity.

Depending on the student’s interests, the project will involve:

  • Identifying genetic signatures of population dynamics and comparing these across regions studied in CKB or to those in UK Biobank (UKB);
  • Determining the relation of these signatures to phenotypes and risk of disease over the life course in CKB and UKB, e.g. searching for genetic signatures related to the Great Famine and their impact on health;
  • Integrating other multi-omics data that are available in the public domain (e.g. expression and methylation) within CKB and UKB;
  • Developing new methods to investigate population dynamics, environmental exposures and health, involving machine learning.


The work will involve computer-based big data analyses and data integration. The candidate will work in the Nuffield Department of Population Health team of the Big Data Institute, making use of its unique facilities and excellent research community of data scientists and geneticists. The candidate will work in close collaboration with fellow scientists in the CKB group and will participate in consortia following up similar questions.

Prospective candidate

The candidate should have a higher degree, with basic training and an interest in: 1) statistics; 2) genetics; and 3) epidemiology or biology. The project will involve large-scale data and statistical analyses and therefore requires some previous training or experience in statistics and programming, as well as an aptitude for and interest in extending these skills.