AI for disease discovery using electronic health records
There is a growing body of evidence to suggest that complex diseases, such as heart attacks, asthma and chronic obstructive pulmonary disease (COPD), are composed of distinct subtypes with different risk factor and prognostic profiles. Artificial intelligence (AI) holds particular promise in identifying, describing and evaluating such novel disease subtypes. Identifying subtypes will improve our understanding of the causes of disease and enhance personalised treatments. This doctoral project will use large linked datasets of electronic health records (such as the Oxford Research Data Warehouse), together with AI methods, to identify novel subtypes of common diseases. A wide range of approaches will be used, including both unsupervised and supervised learning. We will explore recent machine learning methods, such as deep learning and auxiliary learning. The findings will then be applied to linked electronic health records in UK Biobank, a prospective study of 0.5 million participants, to improve understanding of the causes of these diseases. There is scope to tailor the project to the student’s interests and background, including engagement in international collaborations.
RESEARCH EXPERIENCE, RESEARCH METHODS AND TRAINING
The student will work within the rich academic environment of the Nuffield Department of Population Health and affiliated institutions, gaining research experience and skills training in epidemiology and statistics. The successful candidate will have access to several large datasets, including the Oxford Research Data Warehouse (a large dataset of electronic health records from secondary care in Oxfordshire) and UK Biobank, a prospective cohort study of 0.5 million middle-aged and older adults. The student will be supported through regular research meetings and will have the opportunity to participate in training and seminars offered by the unit.
FIELD WORK, SECONDMENTS, INDUSTRY PLACEMENTS AND TRAINING
By the end of the DPhil, it is expected that the candidate will be able to plan, undertake and interpret statistical analyses of large-scale epidemiological data, and to report their findings. The candidate will have acquired transferable skills, including drafting project proposals and presenting research findings at national and international meetings. The candidate will be encouraged to publish peer-reviewed papers as lead author.
Candidates should have an MSc degree in statistics/epidemiology/machine learning, or equivalent mathematical background.