Completeness and accuracy of crash outcome data in a cohort of cyclists: a validation study.
Tin Tin S., Woodward A., Ameratunga S.
BACKGROUND: Bicycling, despite its health and other benefits, raises safety concerns for many people. However, reliable information on bicycle crash injury is scarce as current statistics rely on a single official database of limited quality. This paper evaluated the completeness and accuracy of crash data collected from multiple sources in a prospective cohort study involving cyclists.
METHODS: The study recruited 2438 adult cyclists from New Zealand's largest mass cycling event in November 2006 and another 190 in 2008, and obtained data regarding bicycle crashes that were attended by medical personnel or the police and occurred between the date of recruitment and 30 June 2011, through linkage to insurance claims, hospital discharges, mortality records and police reports. The quality of the linked data was assessed by capture-recapture methods and by comparison with self-reported injury data collected in a follow-up survey.
RESULTS: Of the 2590 cyclists who were resident in New Zealand at recruitment, 855 experienced 1336 crashes, of which 755 occurred on public roads and 120 involved a collision with a motor vehicle, during a median follow-up of 4.6 years. Log-linear models estimated that the linked data were 73.7% (95% CI: 68.0%-78.7%) complete with negligible differences between on- and off-road crashes. The data were 83.3% (95% CI: 78.9%-87.6%) complete for collisions. Agreement with the self-reported data was moderate (kappa: 0.55) and varied by personal factors, cycling exposure and confidence in recalling crash events. If self-reports were considered as the gold standard, the linked data had 63.1% sensitivity and 93.5% specificity for all crashes and 40.0% sensitivity and 99.9% specificity for collisions.
CONCLUSIONS: Routinely collected databases substantially underestimate the frequency of bicycle crashes. Self-reported crash data are also incomplete and inconsistent. It is necessary to improve the quality of individual data sources as well as record linkage techniques so that all available data sources can be used reliably.
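ILLUSTRATION: The quality measures named in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's analysis. The study fitted log-linear models across several linked sources, whereas the sketch substitutes the simpler two-source Chapman estimator to show the capture-recapture idea. The function names and all counts are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


def chapman_estimate(n_a: int, n_b: int, n_both: int) -> float:
    """Two-source Chapman estimate of the true number of events.

    n_a, n_b: events found in source A and in source B.
    n_both:   events found in both sources after record linkage.
    Assumes the two sources capture events independently (a strong
    assumption that multi-source log-linear models can relax).
    """
    return (n_a + 1) * (n_b + 1) / (n_both + 1) - 1


def completeness(n_a: int, n_b: int, n_both: int) -> float:
    """Share of the estimated true events seen in at least one source."""
    observed = n_a + n_b - n_both
    return observed / chapman_estimate(n_a, n_b, n_both)


@dataclass
class Agreement:
    sensitivity: float
    specificity: float
    kappa: float


def agreement(tp: int, fp: int, fn: int, tn: int) -> Agreement:
    """Compare linked data against a reference standard (here, self-report).

    tp: crash in both linked data and self-report
    fp: crash in linked data only
    fn: crash in self-report only
    tn: no crash in either
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the 2x2 table.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return Agreement(sensitivity, specificity, kappa)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(f"completeness: {completeness(60, 50, 30):.3f}")
    print(agreement(tp=40, fp=10, fn=20, tn=130))
```

Sensitivity and specificity depend on which source is treated as the reference, while kappa is symmetric in the two sources, which is why the abstract reports both kinds of measure.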