Genome-wide association studies (GWAS) have identified hundreds of common and low-frequency variants associated with cardiometabolic diseases such as obesity and type 2 diabetes, but almost all such studies have been conducted in populations of European ancestry. Similar disparities exist in the large-scale sequencing studies used to identify rare coding variants with large effect sizes, which can directly implicate causal genes and molecular mechanisms.

The Mexico City Prospective Study (MCPS) is a large blood-based prospective study of 150,000 Mexican adults recruited between 1998 and 2004 and followed for 20 years. In addition to extensive questionnaire data and physical measurements, linkage to cause-specific mortality, and blood-based assays, the MCPS has genotyping and whole-exome sequencing (WES) data on the entire cohort. The population includes complex patterns of genetic relatedness and Mesoamerican admixture [1], presenting analytical challenges but also opportunities for advancing the discovery of disease-associated rare variants.

The aim of this DPhil will be to leverage genetic features of the MCPS cohort to improve the power of rare-variant analysis. The specific aims are subject to discussion and student interests but may include the following:

  • Perform admixture mapping to identify genomic regions where local ancestry is significantly associated with cardiometabolic disease. This may be done with case-control or case-only designs, using methods which make explicit use of the extensive Mesoamerican and European admixture present.  
  • Resolve genomic regions with pronounced identity-by-descent (IBD) sharing using methods requiring phased or unphased data. Perform an “IBDWas” to identify significant associations between IBD segments and complex disease.
  • Conduct collapsing rare-variant burden tests, sequence kernel association tests, or exome-wide association studies (ExWAS) of single variants to identify rare-variant associations with disease. Leverage results from admixture mapping and/or IBDWas to improve power to detect rare-variant associations (for example, by relaxing the significance threshold at sites with pronounced IBD sharing or differences in local ancestry).
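
To give a flavour of the collapsing burden test mentioned above, the following is a minimal sketch on toy data, not the project's actual pipeline. It assumes a simple carrier-based collapse (an individual counts as a carrier if they hold any allele below an illustrative frequency cutoff) followed by a one-sided Fisher exact test. The function names, the 1% cutoff, and the toy genotypes are all hypothetical, and a real analysis of a related, admixed cohort would also need to account for relatedness and ancestry.

```python
from math import comb

def carrier_status(genotypes, mafs, maf_cutoff=0.01):
    """Collapse a gene's variants: 1 if the individual carries any rare allele."""
    rare = [j for j, maf in enumerate(mafs) if maf < maf_cutoff]
    return [int(any(row[j] > 0 for j in rare)) for row in genotypes]

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for enrichment of carriers among cases.

    a = case carriers, b = case non-carriers,
    c = control carriers, d = control non-carriers.
    """
    n_cases, n_controls, n_carriers = a + b, c + d, a + c
    total = comb(n_cases + n_controls, n_carriers)
    upper = min(n_cases, n_carriers)
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    tail = sum(comb(n_cases, k) * comb(n_controls, n_carriers - k)
               for k in range(a, upper + 1))
    return tail / total

def burden_test(genotypes, mafs, is_case, maf_cutoff=0.01):
    """Carrier-collapsing burden test for one gene (hypothetical helper)."""
    carriers = carrier_status(genotypes, mafs, maf_cutoff)
    a = sum(1 for x, y in zip(carriers, is_case) if x and y)
    b = sum(1 for x, y in zip(carriers, is_case) if not x and y)
    c = sum(1 for x, y in zip(carriers, is_case) if x and not y)
    d = sum(1 for x, y in zip(carriers, is_case) if not x and not y)
    return fisher_one_sided(a, b, c, d)

# Toy data (invented): 8 individuals x 3 variants, allele counts 0/1/2.
toy_genotypes = [
    [1, 0, 0], [0, 1, 1], [0, 0, 1], [0, 2, 0],   # cases
    [0, 1, 0], [0, 0, 0], [0, 2, 0], [0, 0, 0],   # controls
]
toy_mafs = [0.005, 0.2, 0.002]     # the second variant is common and is ignored
toy_is_case = [True] * 4 + [False] * 4

p_value = burden_test(toy_genotypes, toy_mafs, toy_is_case)
print(f"burden test p-value: {p_value:.4f}")
```

Collapsing to a single carrier indicator per gene trades per-variant resolution for power when individual rare variants are too scarce to test on their own. Sequence kernel association tests relax the implicit assumption that all collapsed variants act in the same direction.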


This project will involve statistical analyses of large-scale genomic datasets on a high-performance computing cluster to elucidate genetic risk factors. The student will develop expertise in genetic epidemiology, computer programming, bioinformatics, and population genetics. The student will work within a team of scientists with diverse expertise in epidemiology, genetics, and clinical medicine, and will be expected to present their work regularly at internal and international meetings.


Training opportunities will focus on deepening understanding of key concepts in population genetics (e.g. linkage disequilibrium, admixture, population structure) and genetic epidemiology, and developing practical skills pertaining to programming, statistical learning methods, and parallel computing on a high-performance cluster.  


Candidates must have a strong background in mathematics, genetics, or biomedicine, and postgraduate training in epidemiology, statistics, population genetics, public health, or machine learning. They must have a deep interest in genetic analysis and will be strongly encouraged to think creatively and develop novel methods where appropriate.