TY - DATA
T1 - Data underlying the publication: "GONNECT: Coupling Biological Systems to Neural Networks for Improved Model Interpretability"
PY - 2025/11/07
AU - Martijn Lieftinck
AU - Timo Verlaan
AU - Marcel Reinders
UR -
DO - 10.4121/0d78788b-6bd7-4941-a942-245309107b6d.v1
KW - GONNECT
KW - TCGA
KW - GO
KW - Gene Ontology
KW - Gene Expression
KW - Machine learning
KW - Deep learning
KW - BINNs
KW - Biologically-informed Neural Networks
KW - Autoencoders
N2 -
This dataset contains all processed data required to reproduce the results of the GONNECT paper. In this work, we couple the structure of a neural network model to biological prior information to obtain interpretable activations in the network's hidden layers (see the preprint/publication for more information). The dataset includes processed gene expression data from The Cancer Genome Atlas (TCGA), used as input to the model, and both the raw and processed Gene Ontology (GO, https://geneontology.org/) knowledge base, from which the structure of the neural networks in this study is derived. It also includes miscellaneous files used to link genes to proteins. All data used in this study are publicly available.
The code accompanying the paper is available at https://github.com/DelftBioinformaticsLab/GONNECT
ER -