  • 8 September 2025 to 31 July 2026
  • Project No: D26072
  • DPhil Project 2026
  • Cancer Epidemiology Unit (CEU)

BACKGROUND

Understanding how biological pathways contribute to cancer development is a major goal in population health research. Advances in high-throughput proteomics now enable measurement of thousands of circulating proteins that reflect subtle biological processes and may reveal both early markers of disease and potential aetiological targets for future prevention trials. Because associations observed in proteomic studies cannot by themselves distinguish correlation from causation, integrating complementary genetic data can strengthen causal inference and help identify proteins with more robust evidence for an aetiological role in cancer development.

This project will focus on integrating complex proteomic and proteogenomic data from large population-based studies, including the European Prospective Investigation into Cancer and Nutrition (EPIC) and the UK Biobank, to identify key proteins and biological pathways involved in the development of common cancers.

The specific aims of the project can be adapted to the student’s interests and may include:

  1. Develop approaches to harmonise proteomic data across large cohorts to improve comparability and data quality.
  2. Develop integrative methods that combine proteomic and proteogenomic findings with biological knowledge from publicly available resources, to improve understanding of cancer aetiology and prioritise candidate protein targets.
  3. Generate robust, interpretable networks of proteins and pathways involved in cancer development by combining data from multiple cohorts.

RESEARCH EXPERIENCE, RESEARCH METHODS AND SKILLS TRAINING

The student will gain experience in analysing large, high-dimensional molecular and epidemiological datasets, developing practical skills in data harmonisation, advanced statistical modelling, and causal inference. The work will involve computational and quantitative approaches to integrate rich, multivariate data across studies and to assess associations between proteins, genetic variants, and cancer risk.

Training will be provided in molecular epidemiology, advanced statistical programming (e.g. R and Python), and modern causal inference methods. The project offers opportunities to contribute to international collaborations and to develop transferable skills in reproducible research, data visualisation, and scientific communication.

FIELD WORK, SECONDMENTS, INDUSTRY PLACEMENTS AND TRAINING

The student will be based at the Richard Doll Building, within a vibrant research environment that includes experts in epidemiology, statistics, bioinformatics, and molecular epidemiology. They will attend departmental seminars and courses, collaborate with multidisciplinary teams, and engage with partners involved in major cohort studies such as EPIC and UK Biobank.

PROSPECTIVE STUDENT

The ideal candidate will have an MSc in a quantitative discipline such as statistics, bioinformatics, or data science, with strong coding skills and an interest in applying them to population health and cancer research. Some background in molecular or genetic epidemiology would be an advantage.