Machine learning for treating infections in hospital
Dr Davide Morelli, Department of Engineering Science, University of Oxford
Students will develop advanced machine learning (ML) methods for analysing longitudinal data that are time-varying (or at least measured at multiple time points) and drawn from multiple data sources. The methodology can be applied to a wide range of clinical settings; this project will investigate time-to-normalisation of patient physiology and biomarkers in the context of treating infections in hospital.
The overall aim is to identify distinct phenotypes of recovery from infection (or from any other condition, e.g. gastrointestinal bleed or major surgery), which might help to plan therapy and discharge, and to identify individuals who are not recovering as expected.
One specific challenge (and also a strength) of the Infections data is that multiple events occur at different time points, including updated vital sign measurements, laboratory tests and events of interest, e.g. a bloodstream infection, a list of operations, an Intensive Care Unit (ICU) stay, and particular diagnostic/procedure codes.
Such events are typically analysed using time-to-event (also called "survival") analysis, which is widely used in clinical research and in which the Cox model remains popular. Recent advances in ML have opened up new research opportunities for extending survival analysis. For example, "dynamic time-to-event analysis" is well suited to the challenge posed by the Infections data, because it allows the model to update its predictions as events occur at multiple time points.
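As an illustration of how irregularly timed updates can be prepared for a time-to-event model with time-varying covariates, the following minimal Python sketch converts one patient's timestamped measurements into (start, stop] intervals, a layout commonly used for time-varying Cox models. The names `Interval` and `to_intervals`, the covariate names and all values are hypothetical and chosen only for illustration; they do not reflect the IORD schema or the method that will be developed in the project.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Interval:
    start: float                  # hours since admission (interval opens)
    stop: float                   # hours since admission (interval closes)
    covariates: Dict[str, float]  # latest known value of each covariate
    event: int                    # 1 if the outcome occurred at `stop`, else 0


def to_intervals(
    updates: List[Tuple[float, str, float]],
    end_time: float,
    event_observed: bool,
) -> List[Interval]:
    """Split follow-up into intervals over which covariates are constant.

    `updates` holds (time, covariate_name, value) records. Follow-up ends at
    `end_time`, either with the outcome (e.g. normalisation) or censored.
    """
    if end_time <= 0:
        raise ValueError("end_time must be positive")

    intervals: List[Interval] = []
    current: Dict[str, float] = {}
    start = 0.0
    for time, name, value in sorted(updates, key=lambda u: u[0]):
        if time >= end_time:
            break  # updates at or after the end of follow-up are ignored
        if time > start:
            # Close the interval that used the previous covariate values.
            intervals.append(Interval(start, time, dict(current), 0))
            start = time
        current[name] = value  # last observation carried forward
    # Only the final interval can carry the outcome indicator.
    intervals.append(Interval(start, end_time, dict(current), int(event_observed)))
    return intervals


# Hypothetical patient: made-up vital sign, lab test and ICU admission.
updates = [
    (0.0, "heart_rate", 110.0),
    (6.0, "crp", 180.0),
    (24.0, "heart_rate", 85.0),
    (24.0, "icu", 1.0),
]
rows = to_intervals(updates, end_time=48.0, event_observed=True)
for row in rows:
    print(row)
```

Each new measurement closes the current interval and opens another, so a model fitted on this layout always conditions on the most recent information available at each time point. Carrying the last observation forward is one simple assumption among several possible ways of handling values between measurements.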
RESEARCH EXPERIENCE, RESEARCH METHODS AND TRAINING
Candidates will acquire research skills through regular supervisory meetings and by attending relevant seminars, courses and workshops. They will learn how to handle and analyse large-scale epidemiological data, how to interpret emerging findings, and how to use those findings to explore potential methodologies.
Candidates will have access to the Infections in Oxfordshire Research Database (IORD) datasets, which cover approximately 600,000 people and provide comprehensive electronic healthcare data from patients attending hospitals in Oxfordshire.
FIELD WORK, SECONDMENTS, INDUSTRY PLACEMENTS AND TRAINING
There may be opportunities to work with external partners and/or on different datasets. For example, candidates can access publicly available datasets via the PhysioNet resource, if needed.
Candidates should have postgraduate training in statistics/mathematics/machine learning, or equivalent mathematical background.
Proficiency with the statistical language R and with machine learning tools (Python with TensorFlow or PyTorch) is required; strong programming skills are essential. Students are expected to document their ongoing findings clearly and to publish 2-3 articles as lead author in peer-reviewed scientific journals by the end of their DPhil.