  • 8 September 2025 to 2 December 2025
  • Project No: D26064
  • DPhil Project 2026

Background

Reproductive and women’s health factors (including pregnancy history, medical procedures, and conditions such as endometriosis and timing of menopause) are important determinants of long-term health. Yet in large population cohorts these factors are often incompletely characterised, because information is dispersed or inconsistently recorded across hospital data, primary care records, and self-reported questionnaires. UK Biobank, which includes ~500,000 participants with extensive health, biomarker, and genetic data, provides a unique opportunity to overcome these limitations. By linking and harmonising these data sources, it will be possible to develop validated reproductive and women’s health phenotypes, investigate their determinants, and examine their associations with major long-term outcomes, including cardiovascular disease, cancer, and other chronic conditions.

RESEARCH EXPERIENCE, RESEARCH METHODS AND SKILLS TRAINING

The student will:

  1. Develop reproducible algorithms to define and validate reproductive and women’s health phenotypes (including pregnancy history, procedures, and diagnoses) by integrating hospital (HES), primary care (GP), and UK Biobank self-report data.
  2. Assess completeness, concordance, and discrepancies across data sources, and apply methods to address missingness, misclassification, and coding heterogeneity.
  3. Apply epidemiological and statistical approaches to investigate the determinants of reproductive and women’s health phenotypes, and the association of these phenotypes with major long-term outcomes (e.g. cardiovascular disease, cancer, other chronic conditions).
  4. Produce validated, well-documented phenotyping code as an open research resource, and lead original publications on reproductive health, its determinants, and its long-term health consequences.

Training and supervision will include electronic health record (EHR) phenotyping, reproductive health, advanced epidemiology, reproducible programming, biostatistics, and scientific writing. The student will receive regular supervisory support, departmental training, and opportunities to present findings at internal seminars and external conferences.

FIELD WORK, SECONDMENTS, INDUSTRY PLACEMENTS AND TRAINING

The project includes placement and liaison opportunities with the UK Biobank Health Outcomes Enhancement team and collaborators, and access to workshops on EHR linkage and secure data handling. Training courses in epidemiology, biostatistics, and advanced programming will be available through the department.

PROSPECTIVE STUDENT

The ideal candidate will have a Bachelor’s or Master’s degree in a relevant discipline (e.g., epidemiology, statistics, public health, data science, clinical sciences) and demonstrable experience in large-scale data analysis. They should possess strong analytical skills, be comfortable with common programming languages, and have a keen interest in women’s health research, health inequalities, and interdisciplinary collaboration.