Genome-wide identification of directed gene networks using large-scale population genomics data.
Luijk R., Dekkers KF., van Iterson M., Arindrarto W., Claringbould A., Hop P., Boomsma DI., van Duijn CM., van Greevenbroek MMJ., Veldink JH., Wijmenga C., Franke L., 't Hoen PAC., Jansen R., van Meurs J., Mei H., Slagboom PE., Heijmans BT., van Zwet EW., BIOS (Biobank-based Integrative Omics Study) Consortium.
Identification of causal drivers behind regulatory gene networks is crucial in understanding gene function. Here, we develop a method for the large-scale inference of gene-gene interactions in observational population genomics data that are both directed (using local genetic instruments as causal anchors, akin to Mendelian Randomization) and specific (by controlling for linkage disequilibrium and pleiotropy). Analysis of genotype and whole-blood RNA-sequencing data from 3072 individuals identified 49 genes as drivers of downstream transcriptional changes (Wald P < 7 × 10⁻¹⁰), among which transcription factors were overrepresented (Fisher's P = 3.3 × 10⁻⁷). Our analysis suggests new gene functions and targets, including for SENP7 (zinc-finger genes involved in retroviral repression) and BCL2A1 (target genes possibly involved in auditory dysfunction). Our work highlights the utility of population genomics data in deriving directed gene expression networks. A resource of trans-effects for all 6600 genes with a genetic instrument can be explored individually using a web-based browser.
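The idea of using a genetic instrument as a causal anchor can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a generic Wald ratio estimator run on simulated data, and it omits the corrections for linkage disequilibrium and pleiotropy that the abstract describes. The function names (`wald_ratio`, `ols_slope`, `simulate`), the allele frequency, and all effect sizes are assumptions made for illustration; only the sample size of 3072 echoes the abstract.

```python
import numpy as np


def wald_ratio(instrument, exposure, outcome):
    """Instrumental-variable (Wald ratio) estimate of the exposure's effect
    on the outcome: cov(instrument, outcome) / cov(instrument, exposure)."""
    g = instrument - instrument.mean()
    x = exposure - exposure.mean()
    y = outcome - outcome.mean()
    return float(g @ y) / float(g @ x)


def ols_slope(x, y):
    """Naive regression slope of y on x, which ignores confounding."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc) / float(xc @ xc)


def simulate(n=3072, effect=0.5, seed=0):
    """Simulate (hypothetical) data: a local genetic variant g that affects
    expression of gene x, which in turn affects expression of gene y.
    An unobserved confounder u influences both x and y."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype dosage 0/1/2
    u = rng.normal(size=n)                          # unobserved confounder
    x = 0.8 * g + u + rng.normal(size=n)            # driver gene expression
    y = effect * x + u + rng.normal(size=n)         # target gene expression
    return g, x, y


g, x, y = simulate()
print("naive OLS slope:    ", round(ols_slope(x, y), 3))
print("IV (Wald) estimate: ", round(wald_ratio(g, x, y), 3))
```

In this simulated setting the naive slope is inflated by the shared confounder, whereas the instrument-based estimate should lie close to the simulated effect, because the genotype is independent of the confounder. With real data, a variant in linkage disequilibrium with an instrument for a neighbouring gene, or one with pleiotropic effects, would violate that assumption, which is why the additional controls mentioned in the abstract matter.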