Understanding the potential of linked electronic health records for neurodegenerative disease research in UK Biobank
- 8 September 2025 to 2 December 2025
- Project No: D26061
- DPhil Project 2026
BACKGROUND
Neurodegenerative disease research requires long follow-up periods to track exposure, premorbid and prodromal phases, and disease progression over time. Large-scale and population-representative longitudinal cohort studies are ideal for this research, yet funding, recruitment and retention challenges mean that very few such studies are sustainable in the long term. In their absence, many researchers are turning to electronic health records (EHR) as an alternative source of case ascertainment. When linked to extensive sociodemographic, lifestyle, health, biomarker, imaging, proteomic and genetic data, as is the case in UK Biobank, these records facilitate the investigation of a wide range of research questions regarding the determinants of neurodegenerative disease.
While EHR have the advantages of long-term and broad population coverage, they are ultimately collected for administrative rather than research purposes, which limits their completeness and accuracy. Diseases such as dementia can be particularly vulnerable to ascertainment bias, because diagnoses may go unrecorded in electronic records owing to stigma or a lack of treatment options. This project will evaluate the strengths and weaknesses of various EHR sources compared with other epidemiological data sources, with scope to tailor the project to the student’s interests and background.
RESEARCH EXPERIENCE, RESEARCH METHODS AND SKILLS TRAINING
The student will:
- Evaluate and compare EHR and other sources for their suitability for answering research questions in relation to one or more neurodegenerative diseases (e.g. Alzheimer’s disease or other dementias, Parkinson’s disease or other parkinsonisms)
- Investigate and quantify the additional value of new data sources (EHR and/or newly collected project data) for case ascertainment
- Assess the sensitivity, positive predictive value (PPV) and potential bias of EHR in comparison to other epidemiological data sources in UK Biobank and other studies
- Apply epidemiological and statistical techniques to investigate determinants of neurodegenerative disease
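As an illustration of the sensitivity and PPV assessment listed above, the following is a minimal Python sketch, not the project's actual method. It assumes that EHR-identified cases and cases from a reference standard (for example, clinically adjudicated diagnoses) are each available as a set of participant identifiers. The function name `ascertainment_metrics` and the identifiers in the example are hypothetical and the data are made up.

```python
import math


def ascertainment_metrics(ehr_cases, reference_cases):
    """Compare EHR-identified cases with a reference standard.

    Both arguments are iterables of participant identifiers (hypothetical
    here). Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP).
    """
    ehr_cases = set(ehr_cases)
    reference_cases = set(reference_cases)

    true_pos = len(ehr_cases & reference_cases)   # cases found by both sources
    false_pos = len(ehr_cases - reference_cases)  # EHR cases not confirmed
    false_neg = len(reference_cases - ehr_cases)  # reference cases missed by EHR

    # Return NaN rather than dividing by zero when a source has no cases.
    sensitivity = true_pos / (true_pos + false_neg) if reference_cases else math.nan
    ppv = true_pos / (true_pos + false_pos) if ehr_cases else math.nan

    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "sensitivity": sensitivity,
        "ppv": ppv,
    }


# Made-up identifiers, for illustration only.
ehr = {1, 2, 3, 4, 5}
reference = {2, 3, 4, 5, 6, 7, 8, 9}
metrics = ascertainment_metrics(ehr, reference)
print(metrics["sensitivity"], metrics["ppv"])  # 0.5 0.8
```

In this toy example the EHR source captures 4 of the 8 reference cases (sensitivity 0.5), and 4 of its 5 recorded cases are confirmed (PPV 0.8). Any real analysis would also need to account for how the reference standard itself was ascertained.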
FIELD WORK, SECONDMENTS, INDUSTRY PLACEMENTS AND TRAINING
The project includes placements/liaison opportunities with the UK Biobank team and collaborators, and access to workshops on EHR linkage and secure data handling. Training and supervision will include advanced epidemiology, reproducible programming, biostatistics, and scientific writing. The student will receive regular supervisory support, departmental training and opportunities to present findings at internal seminars and external national and international conferences.
PROSPECTIVE STUDENT
The ideal candidate will have a Bachelor’s or Master’s degree in a relevant discipline (e.g., epidemiology, statistics, public health, data science, biomedical science, clinical science) and demonstrable experience in large-scale data analysis. They should possess strong analytical skills, be comfortable with common programming languages, and have a keen interest in using big data in an interdisciplinary context to answer neurodegenerative disease research questions.
