Module Leads

Associate Professor Derrick Bennett, BSc MSc PhD CStat

Aiden Doherty

Imen Hammami


Learning Objectives

By the end of this module, students will have gained experience of how to:

  • Explore large datasets.
  • Prepare and clean large datasets for analysis (e.g. reformatting and pre-processing).
  • Wrangle data to obtain key exposures and outcomes.
  • Visualise and present raw, intermediate and final datasets to communicate the final results effectively.
  • Investigate different types of data using a machine learning framework.
  • Develop code and analyses using reproducible methods.


  1. Overview of data science and AI/ML methods
  2. Processing phenotype data
  3. Data visualisation using R
  4. Reproducible research
  5. Introduction to unsupervised machine learning
  6. Processing a complex exposure