An Empirical Comparison of Joint and Stratified Frameworks for Studying G × E Interactions: Systolic Blood Pressure and Smoking in the CHARGE Gene-Lifestyle Interactions Working Group.
Sung YJ., Winkler TW., Manning AK., Aschard H., Gudnason V., Harris TB., Smith AV., Boerwinkle E., Brown MR., Morrison AC., Fornage M., Lin L-A., Richard M., Bartz TM., Psaty BM., Hayward C., Polasek O., Marten J., Rudan I., Feitosa MF., Kraja AT., Province MA., Deng X., Fisher VA., Zhou Y., Bielak LF., Smith J., Huffman JE., Padmanabhan S., Smith BH., Ding J., Liu Y., Lohman K., Bouchard C., Rankinen T., Rice TK., Arnett D., Schwander K., Guo X., Palmas W., Rotter JI., Alfred T., Bottinger EP., Loos RJF., Amin N., Franco OH., van Duijn CM., Vojinovic D., Chasman DI., Ridker PM., Rose LM., Kardia S., Zhu X., Rice K., Borecki IB., Rao DC., Gauderman WJ., Cupples LA.
Studying gene-environment (G × E) interactions is important, as it extends our knowledge of the genetic architecture of complex traits and may help to identify novel variants not detected via analysis of main effects alone. The main statistical framework for studying G × E interactions uses a single regression model that includes both the genetic main and G × E interaction effects (the "joint" framework). The alternative "stratified" framework combines results from genetic main-effect analyses carried out separately within the exposed and unexposed groups. Although there have been several investigations using theory and simulation, an empirical comparison of the two frameworks is lacking. Here, we compare the two frameworks using results from genome-wide association studies of systolic blood pressure for 3.2 million low-frequency and 6.5 million common variants across 20 cohorts of European ancestry, comprising 79,731 individuals. Our cohorts have sample sizes ranging from 456 to 22,983 and include both family-based and population-based samples. In cohort-specific analyses, the two frameworks provided similar inference for population-based cohorts, but agreement was reduced for family-based cohorts. In meta-analyses, agreement between the two frameworks was lower than that observed in cohort-specific analyses, despite the increased sample size, and depended on (1) the minor allele frequency, (2) the inclusion of family-based cohorts in the meta-analysis, and (3) the filtering scheme. The stratified framework appears to approximate the joint framework well only for common variants in population-based cohorts. We conclude that the joint framework is the preferred approach and should be used to control false positives when dealing with low-frequency variants and/or family-based cohorts.
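
As an illustration of how the stratified framework can approximate the joint one, the sketch below derives a 1-df interaction test and a 2-df joint test from stratum-specific genetic effect estimates. It is a minimal example under stated assumptions, not the procedure used in the study: the function name `stratified_to_joint` and its arguments are hypothetical, the two strata are assumed to be independent samples (which need not hold when relatives fall into different strata), and the test statistics are assumed to follow their large-sample distributions.

```python
import math


def stratified_to_joint(beta_unexp, se_unexp, beta_exp, se_exp):
    """Approximate interaction and joint tests from stratified estimates.

    Hypothetical helper for illustration. Assumes the exposed and unexposed
    strata are independent and that estimates are approximately normal.
    """
    # Interaction effect: difference in genetic effect between the strata.
    beta_int = beta_exp - beta_unexp
    # Under independence, the variances of the two estimates add.
    se_int = math.sqrt(se_exp ** 2 + se_unexp ** 2)
    z_int = beta_int / se_int
    # Two-sided p-value from the standard normal distribution.
    p_int = math.erfc(abs(z_int) / math.sqrt(2))

    # 2-df joint test: sum of squared stratum-specific z-scores.
    chi2_joint = (beta_unexp / se_unexp) ** 2 + (beta_exp / se_exp) ** 2
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    p_joint = math.exp(-chi2_joint / 2)

    return {
        "beta_int": beta_int,
        "se_int": se_int,
        "p_int": p_int,
        "chi2_joint": chi2_joint,
        "p_joint": p_joint,
    }


if __name__ == "__main__":
    # Made-up effect sizes (mmHg per allele) purely for demonstration.
    print(stratified_to_joint(0.20, 0.10, 0.60, 0.15))
```

In the joint framework, the corresponding quantities come directly from a single regression that includes the genetic main effect and the G × E interaction term. The approximation above depends on the independence and large-sample assumptions stated in the code.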