  • 8 September 2025 to 2 December 2025
  • Project No: D26031
  • DPhil Project 2026
  • China Kadoorie Biobank (CKB)

Background

Electronic Health Records (EHR) combined with advanced analytic methods such as deep learning Artificial Intelligence (AI) can revolutionise medical research by predicting the risk of disease onset and co-morbidity and by improving the accuracy of disease phenotypes. CKB has collected over 4 million clinical diagnoses from electronic health records and over 200,000 digitalised medical notes on major chronic diseases such as stroke, ischaemic heart disease (IHD), chronic obstructive pulmonary disease (COPD), chronic kidney disease (CKD) and neurodegenerative diseases including Parkinson's disease and dementia.

RESEARCH EXPERIENCE, RESEARCH METHODS AND SKILLS TRAINING

The specific DPhil project will be developed in discussion with the student and, depending on their interests and aptitude, may include some of the following objectives:

  1. to develop novel machine learning approaches to phenotype stroke and neurodegenerative diseases, in particular dementia and Parkinson's disease
  2. to develop machine learning models to predict the risk of co-morbidity among major chronic diseases such as stroke, IHD, diabetes and COPD
  3. to use specific AI algorithms such as image processing and large language models to analyse EHR and digitalised medical information from multi-ethnic cohorts, e.g., UK Biobank (UKB) vs CKB
  4. to integrate demographic, lifestyle, proteomic and genetic information, using multi-modal deep learning, to predict more accurately the prognosis of major chronic diseases such as stroke, IHD, COPD and CKD.

The student will work within a multi-disciplinary team, receive in-house training in epidemiology, statistical programming and computational genetics, and attend relevant courses. Additional supervision and support in AI may be provided by Dr Qiang Zhang. By the end of the DPhil, the student will be competent to plan, undertake and interpret analyses of large datasets, and to report research findings in peer-reviewed journals and at conferences.

FIELD WORK, SECONDMENTS, INDUSTRY PLACEMENTS AND TRAINING

The project will be based within the CKB group in the Big Data Institute. There are excellent facilities and a world-class community of population health, data science, image processing, and genomic medicine researchers. There will be opportunities to work with external research institutions.

PROSPECTIVE STUDENT

The ideal candidate will have a good first degree (2.1) and an MSc in computer science, bioinformatics, or medicine. Candidates should have strong analytical skills in handling large-scale EHR data and experience of working with deep learning and machine learning.