Creation of the first national linked colorectal cancer dataset in Scotland: prospects for future research and a reflection on lessons learned.
Whenever a patient interacts with the healthcare system, data is routinely collected; this is called “Administrative Healthcare Data”. This data can be used to provide information on screening, surveillance, existing health conditions, diagnosis, treatments and patient outcomes. It can also be used to provide information on the real-world cost of healthcare. The data is held in individual datasets, which can be linked together to provide more information than any one dataset alone.
In the UK, Administrative Healthcare Datasets are generally held separately within each nation. In Scotland, cancer data is collected by the cancer registry. This dataset contains a lot of information such as the date of diagnosis and the type and stage of cancer, but it does not include detailed information on the treatment that was delivered. In order to be able to see a full picture of what the cancer services currently look like, the cancer registry data needs to be linked to other Administrative Healthcare Datasets.
The aim of this project is to create a linked dataset to allow mapping of the bowel cancer landscape in Scotland to identify differences in the treatment offered to patients and the outcomes associated with the different treatment approaches. An additional aim is to calculate the healthcare resource needed for bowel cancer diagnosis and treatment on a national scale, and the cost of providing this.
This manuscript documents the process of creating a specific and complete bowel cancer dataset for research in Scotland.
What we did:
There were four main stages in accessing and linking datasets on a national level.
Stage 1 – The first stage in accessing the data was to define the study requirements in order to apply to the Public Benefit and Privacy Panel (PBPP) for Health and Social Care in Scotland. The role of the PBPP is to weigh up the public benefits of granting access to healthcare data against the risks that the sharing of the data poses to an individual’s privacy.
Stage 2 – The second stage was to acquire the datasets and transfer them into the National Safe Haven (NSH). The NSH is a secure platform where the data can be used for research and analysis.
Stage 3 – All datasets that were to be released to the research team to analyse were checked by the electronic Data Research and Innovation Service (eDRIS) to make sure they matched the approved specification. The linkage of the datasets was performed by eDRIS once all the pre-checks had been completed.
Stage 4 – After the data linkages had been performed, the datasets were transferred to the National Safe Haven where researchers, with the correct approvals, could access the data. In this setting, all identifying patient information, such as names and addresses, had been removed.
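For readers who want a concrete picture of what "linking" and removing patient details mean, the short Python sketch below may help. It is an illustration only, not the method used by eDRIS: the records, the field names (`study_id`, `name`, `address`, `stage`, `treatment`) and the function `link_and_deidentify` are all hypothetical, and real linkage involves far more checking and governance than shown here.

```python
# Illustrative sketch only: all records and field names are made up.

# A toy "cancer registry" extract: diagnosis details plus identifying fields.
registry = [
    {"study_id": "A1", "name": "Patient One", "address": "1 High St",
     "diagnosis_date": "2015-03-02", "stage": "II"},
    {"study_id": "B2", "name": "Patient Two", "address": "2 Low Rd",
     "diagnosis_date": "2016-07-19", "stage": "III"},
]

# A toy "treatment" dataset held separately from the registry.
treatments = [
    {"study_id": "A1", "treatment": "surgery"},
    {"study_id": "A1", "treatment": "chemotherapy"},
]

# Fields assumed (for this sketch) to identify a person directly.
IDENTIFIERS = {"name", "address"}


def link_and_deidentify(registry_rows, treatment_rows):
    """Attach treatments to each registry record and drop identifying fields."""
    # Group treatments by the shared study identifier.
    treatments_by_id = {}
    for row in treatment_rows:
        treatments_by_id.setdefault(row["study_id"], []).append(row["treatment"])

    linked = []
    for record in registry_rows:
        # Copy the record without names and addresses.
        clean = {k: v for k, v in record.items() if k not in IDENTIFIERS}
        # Patients with no treatment rows get an empty list.
        clean["treatments"] = treatments_by_id.get(record["study_id"], [])
        linked.append(clean)
    return linked


linked = link_and_deidentify(registry, treatments)
for row in linked:
    print(row)
```

The key idea is that the two datasets share a common identifier, so information about the same patient can be brought together, while the fields that could reveal who the patient is are left out of what researchers see.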
Linked Administrative Healthcare Datasets have huge potential to aid understanding of how patients interact with healthcare services and provide a detailed picture of the care they receive. This project demonstrates that the creation of a national linked administrative dataset is possible, using bowel cancer data in Scotland as an example. It is, however, only possible through substantial effort and collaboration between researchers and the central team co-ordinating the data transfers and linkages.
The linked datasets have huge potential benefit for patients and the public by enabling researchers to analyse real-world cancer data to improve outcomes for patients as well as the delivery of cancer services.
Publication:
Catherine R Hanna, Elizabeth Lemmon, Holly Ennis, Robert J Jones, Joy Hay, Roger Halliday, Steve Clark, Pete Hall, Eva J A Morris. Creation of the first national linked colorectal cancer dataset in Scotland: prospects for future research and a reflection on lessons learned. International Journal of Population Data Science.