Optimal strategies for annotation of large imaging data sets for population health studies

BACKGROUND

Most machine learning methodologies require large data sets with high-quality labels to train and validate models efficiently. As medical imaging repositories used for population health studies, such as UK Biobank, become larger, classic approaches based on manual annotation by experts become infeasible. Alternatively, the labelling process can be automated by extracting annotations from associated metadata or reports, e.g. extracting radiological findings from radiology reports to annotate X-ray imaging repositories. In this scenario, automated labelling can introduce errors due to ambiguities in the reports, producing large but noisily labelled (or even mislabelled) data sets, which in turn can lead to e.g. poorer generalisation.
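To illustrate how report-derived labels can become noisy, the following is a minimal sketch and not a description of any particular labelling tool. The function names, the negation cues and the example report are all hypothetical. It contrasts a naive keyword match with a slightly more careful rule that checks for a negation cue earlier in the same sentence.

```python
import re

# Hypothetical negation cues; a real system would need a far richer set.
NEGATION = re.compile(r"\b(no|without|negative for)\b")


def naive_label(report: str, finding: str) -> bool:
    """Label as positive whenever the finding keyword appears anywhere."""
    return finding in report.lower()


def negation_aware_label(report: str, finding: str) -> bool:
    """Label as positive only if the finding appears in some sentence
    without a negation cue preceding it in that sentence."""
    for sentence in re.split(r"[.;]\s*", report.lower()):
        idx = sentence.find(finding)
        if idx == -1:
            continue  # finding not mentioned in this sentence
        if not NEGATION.search(sentence[:idx]):
            return True
    return False


# Hypothetical report text, written for this example only.
report = "No evidence of pneumonia. Small left pleural effusion."

naive_pneumonia = naive_label(report, "pneumonia")            # keyword match only
aware_pneumonia = negation_aware_label(report, "pneumonia")   # respects "No ..."
aware_effusion = negation_aware_label(report, "effusion")
```

Here the naive rule marks the report as positive for pneumonia even though the finding is negated, which is one simple way a mislabelled example can enter an automatically annotated data set. Even the negation-aware rule remains a rough heuristic and would still miss hedged or ambiguous phrasing.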

RESEARCH EXPERIENCE, RESEARCH METHODS AND TRAINING

Recent developments in Deep Learning have been shown to yield results of comparable accuracy to human experts in various clinical applications, including grading of diabetic retinopathy from retinal fundus imaging, detection of pneumonia from chest X-rays, and classification of skin cancer from dermatology imaging. However, Machine Learning (in particular Active Learning) can also be used to learn which subset of a large imaging repository needs to be labelled to achieve state-of-the-art performance, while substantially reducing the number of annotations performed. The objective of this project is therefore to investigate machine learning methods that can identify optimal annotation strategies for achieving high model accuracy.

The project will seek optimal strategies for various types of annotation, including categorical labels (e.g. for classification problems) and semantic annotations (e.g. for segmentation or detection problems).
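As a rough illustration of the Active Learning idea for the classification case, one common heuristic is uncertainty sampling: ask the expert to annotate the images on which the current model is least certain. The sketch below is only one possible strategy, not the method the project will necessarily adopt. The function names, the annotation budget and the predicted probabilities are all made up for the example.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(pool_probs, budget):
    """Return indices of the `budget` unlabelled images whose predicted
    class distributions have the highest entropy (most uncertain first)."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabelled images (two classes).
pool_probs = [
    [0.98, 0.02],  # confident prediction
    [0.55, 0.45],  # uncertain
    [0.70, 0.30],
    [0.50, 0.50],  # maximally uncertain
]

chosen = select_for_annotation(pool_probs, budget=2)
print(chosen)  # indices of the images to send to the annotator
```

In a full loop, the chosen images would be labelled by an expert, added to the training set, and the model retrained before the next selection round. Semantic annotations such as segmentations would need a different uncertainty measure, for example one aggregated over pixels.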

FIELD WORK, SECONDMENTS, INDUSTRY PLACEMENTS AND TRAINING

The student will also have the opportunity to attend research seminars offered at the NDPH, the BDI and the Institute of Biomedical Engineering (IBME), as the primary supervisor is a member of the Imaging Hub at the IBME (https://eng.ox.ac.uk/biomedical-image-analysis/). The student will be expected to attend relevant seminars within the department and in the wider University. Subject-specific training will be provided through our group's weekly supervision meetings. Students will also attend external scientific conferences, where they will be expected to present their research findings.

PROSPECTIVE STUDENT

Degree in Computer Science, Statistics, Engineering or related discipline
Strong programming skills (preferably Python, or Matlab/C++ and a willingness to learn Python)
Experience of or interest in Machine Learning (Deep Learning) and Medical Image Analysis
Experience of, or enthusiasm for, working on clinically relevant problems

Supervisors

Thomas Nichols, Professor of Neuroimaging Statistics, Nuffield Department of Population Health

Bartek Papiez, Associate Professor
