Most machine learning methodologies require large data sets with high-quality labels to train and validate models efficiently. As medical imaging repositories used for population health studies, such as UK Biobank, become larger, the classic approach of manual annotation by experts becomes unfeasible. Alternatively, the labelling process can be automated by extracting annotations from the associated metadata or reports, e.g. extracting radiological findings from radiology reports to annotate X-ray imaging repositories. In such a scenario, automated labelling can introduce errors due to ambiguities in the reports, producing large but noisily labelled (or even mislabelled) data sets, which can lead to, e.g., poorer generalisation.


Recent developments in Deep Learning have been shown to yield results of comparable accuracy to human experts in various clinical applications, including grading of diabetic retinopathy from retinal fundus imaging, detection of pneumonia from chest X-rays, and classification of skin cancer from dermatological images. However, Machine Learning (in particular Active Learning) can also be used to learn which subset of a large imaging repository needs to be labelled to achieve state-of-the-art performance, while substantially reducing the number of annotations required. The objective of this project is therefore to investigate machine learning methods that can identify optimal annotation strategies for achieving high model accuracy.
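
As an illustration of the Active Learning idea described above, the following is a minimal sketch of one common selection rule, entropy-based uncertainty sampling. It is not the method the project will necessarily use: the function name `select_for_annotation`, the `budget` parameter and the toy probabilities are all hypothetical and chosen only for this example.

```python
import numpy as np


def select_for_annotation(probs, budget):
    """Return indices of the `budget` most uncertain unlabelled images.

    probs: array of shape (n_images, n_classes) holding class probabilities
    predicted by the current model (hypothetical input for this sketch).
    """
    probs = np.asarray(probs, dtype=float)
    # Predictive entropy is high when the model is unsure between classes.
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # Rank images by descending entropy and keep the top `budget`.
    return np.argsort(-entropy)[:budget]


# Toy predictions for four unlabelled images and two classes (made-up values).
probs = np.array([
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
])
picked = select_for_annotation(probs, budget=2)
print(picked)
```

In this toy case the two images with near-equal class probabilities are the ones sent to the expert, while the confidently classified images are left unlabelled. Other selection criteria are possible, and which works best for medical imaging is exactly the kind of question the project addresses.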

The project will seek optimal strategies for various types of annotation, including categorical labels (e.g. for classification problems) and semantic annotations (e.g. for segmentation or detection problems).


The student will also have the opportunity to attend research seminars offered at the NDPH, the BDI and the Institute of Biomedical Engineering (IBME), as the primary supervisor is a member of the Imaging Hub at the IBME. The student will be expected to attend relevant seminars within the department and in the wider University. Subject-specific training will be provided through our group's weekly supervision meetings. Students will also attend external scientific conferences, where they will be expected to present their research findings.


  • Degree in Computer Science, Statistics, Engineering or related discipline
  • Strong programming skills (preferably Python, or Matlab/C++ with a willingness to learn Python)
  • Experience or interest in Machine Learning (Deep Learning) and Medical Image Analysis
  • Experience of, or enthusiasm for, working on clinically relevant problems


  • Thomas Nichols

    Professor of Neuroimaging Statistics, Nuffield Department of Population Health

  • Bartek Papiez

    Research Fellow (Medical Image Analysis and Machine Learning)