Scripts and data for the paper: gsQTL: Associating genetic risk variants with gene sets by exploiting their shared variability

doi:10.4121/0165ad4c-b3d5-42b5-a171-686621386afd.v1
The DOI above refers to this specific version of the dataset, which is currently the latest. Newer versions may be published in the future. For a link that always points to the latest version, use:
doi:10.4121/0165ad4c-b3d5-42b5-a171-686621386afd
DataCite citation style:
Bouland, Gerard; Mahfouz, Ahmed; Reinders, Marcel (2024): Scripts and data for the paper: gsQTL: Associating genetic risk variants with gene sets by exploiting their shared variability. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/0165ad4c-b3d5-42b5-a171-686621386afd.v1
Dataset

Scripts and data used for the paper gsQTL: Associating genetic risk variants with gene sets by exploiting their shared variability.


To investigate the functional significance of genetic risk loci identified through genome-wide association studies (GWASs), genetic loci are linked to genes based on their capacity to account for variation in gene expression, resulting in expression quantitative trait loci (eQTLs). Gene set analyses are then commonly used to gain insight into function. However, the efficacy of this approach is hampered by small effect sizes and the burden of multiple testing. We propose an alternative approach: instead of examining the cumulative associations of individual genes within a gene set, we consider the collective variation of the entire gene set. We introduce the concept of gene set QTL (gsQTL) and show that it is better at identifying links between genetic risk variants and specific gene sets. Notably, gsQTL is less susceptible to inflation or deflation of significant enrichments than conventional methods. Furthermore, we demonstrate the broader applicability of shared variability within gene sets, for example in the coordinated regulation of genes by a transcription factor or in coordinated differential expression.
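The contrast between per-gene associations and the collective variation of a gene set can be illustrated with a small simulation. The sketch below is not taken from the authors' R scripts. It assumes, purely for illustration, that the shared variability of a gene set is summarized by its first principal component, which this description does not specify, and that the summary is then correlated with genotype. All function names, parameters, and simulated values are hypothetical, and Python is used here only for a dependency-free example.

```python
import math
import random
import statistics


def standardize(values):
    """Scale a list of numbers to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def first_component_scores(expression, iterations=200):
    """Per-individual scores on the first principal component of a gene set.

    `expression` is a list of genes, each a list of per-individual values.
    ASSUMPTION: PC1 is used here as a stand-in for "shared variability".
    """
    z = [standardize(gene) for gene in expression]
    n_genes, n_ind = len(z), len(z[0])
    # Gene-by-gene correlation matrix of the standardized expression values.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n_ind
             for j in range(n_genes)] for i in range(n_genes)]
    # Power iteration for the leading eigenvector (non-uniform start vector).
    w = [1.0 / (i + 1) for i in range(n_genes)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(n_genes))
             for i in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    # Project every individual onto the leading eigenvector.
    return [sum(w[g] * z[g][k] for g in range(n_genes)) for k in range(n_ind)]


def simulate(n_individuals=300, n_genes=20, effect=0.15, seed=1):
    """Hypothetical data: a variant weakly shifts a factor shared by all genes."""
    rng = random.Random(seed)
    genotype = [rng.choice([0, 1, 2]) for _ in range(n_individuals)]
    shared = [effect * g + rng.gauss(0, 1) for g in genotype]
    expression = [[0.5 * s + rng.gauss(0, 1) for s in shared]
                  for _ in range(n_genes)]
    return genotype, expression


genotype, expression = simulate()
scores = first_component_scores(expression)
# The sign of a principal component is arbitrary, so compare absolute values.
set_level_r = abs(pearson(scores, genotype))
gene_level_r = [abs(pearson(gene, genotype)) for gene in expression]
print("gene-set level |r|:", round(set_level_r, 3))
print("mean per-gene  |r|:", round(statistics.fmean(gene_level_r), 3))
```

The set-level approach performs one test per variant and gene set rather than one per gene, which is where the reduced multiple-testing burden would come from. The numbers printed by this simulation depend entirely on the assumed effect size and noise levels and say nothing about the results in the paper.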

history
  • 2024-10-15 first online, published, posted
publisher
4TU.ResearchData
format
R scripts (.R), R data files (.rds), spreadsheets (.xlsx)
organizations
TU Delft, Faculty of Electrical Engineering, Mathematics, and Computer Science, The Delft Bioinformatics Lab
Leiden University Medical Center, Department of Human Genetics

DATA - not available

The data and scripts are already shared on GitHub (see references).
