The importance of family-based sampling for biobanks
Professor Neil Davies
Tuesday, 27 January 2026, 1pm to 2pm
Richard Doll Building, Lecture Theatre
Speaker: Prof Neil Davies, UCL
Bio: Neil Davies is a Professor of Medical Statistics in the Division of Psychiatry, UCL, and holds a joint appointment with the Department of Statistical Sciences. He gained a BSc in Economics and Econometrics at the University of Bristol (2005) and completed an MSc in Economics (Bristol) in 2006. Under the supervision of Richard Martin, Frank Windmeijer and George Davey Smith, he completed his PhD in Epidemiology (Bristol, 2012). He held an ESRC Future Research Leaders Fellowship from 2014 to 2018. From 2018 to 2022 he led a stream of research at the MRC Integrative Epidemiology Unit that used family-based designs to improve causal inference in genetic epidemiology. He joined UCL in 2022 as a Professor of Medical Statistics.
Abstract: Large-scale population-based samples of genotyped individuals have transformed our understanding of the genetic and environmental causes of health and disease. A major advantage of molecular genetic data over other observational designs is its use for causal inference by exploiting the random transmission of genetic variants from parents to offspring as a natural experiment—an approach known as Mendelian randomization.
Most molecular genetic studies have relied on samples of unrelated individuals, with relatedness viewed primarily as a technical complication and potential source of bias. Mendelian randomization studies in these samples implicitly assume that the random allocation of variants within families also holds at the population level. However, a growing body of evidence from genotyped siblings and parent–offspring trios suggests this assumption is often violated, with population-based estimates susceptible to bias from population structure, assortative mating, and dynastic effects. These problems are compounded when using diverse samples, and larger samples of unrelated individuals will only yield more precise, but equally biased, estimates.
In this talk, I will first summarise evidence from family-based studies demonstrating the magnitude and pervasiveness of these biases. I will then argue that expanding the collection and use of family-based molecular genetic data is essential for improving causal inference in genetics. Finally, I will discuss the strengths and limitations of specific family-based designs, and what they can—and cannot—tell us about the causes of health and disease.
Tea, coffee and cakes will be served in Atrium 1 from 30 minutes before the lecture.

