Replication Package for the master thesis "An Empirical Assessment on the Limits of Binary Code Summarisation with Transformer-based Models"
doi:10.4121/20301309.v1
The doi above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future.
For a link that will always point to the latest version, please use
doi: 10.4121/20301309
Datacite citation style:
Ali Al-Kaswan; Arie Van Deursen; Prem Devanbu; Toufique Ahmed; Maliheh Izadi et al. (2022): Replication Package for the master thesis "An Empirical Assessment on the Limits of Binary Code Summarisation with Transformer-based Models". Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/20301309.v1
Dataset
usage stats
801 views
278 downloads
licence
Apache-2.0
This dataset is published as part of the master thesis: "An Empirical Assessment on the Limits of Binary Code Summarisation with Transformer-based Models".
It includes the training and evaluation data as well as the trained models.
For more information, please refer to the data.md file or to the master thesis.
history
- 2022-07-14 first online, published, posted
publisher
4TU.ResearchData
format
A 7zipped collection of:
- 7zipped .jsonl files of training and evaluation data.
- Trained pytorch_model.bin files
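Once the archives are extracted, the .jsonl files can be read one record at a time. The sketch below assumes the usual JSON Lines convention of one JSON object per line; the helper name read_jsonl is hypothetical, and the field names inside each record are not listed on this page, so consult data.md for them.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a .jsonl file.

    Assumption: the file is UTF-8 encoded and follows the JSON Lines
    convention (one complete JSON object per line).
    """
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)
```

Because the function is a generator, a large training file can be streamed without loading it fully into memory; printing the keys of the first record is a quick way to discover the schema.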
derived from
organizations
TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science, Department of Intelligent Systems.
DATA
files (2)
- data.md, 1,813 bytes, MD5: 62817d2c44949e26b633bc16028720fe
- Data_and_Models.7z, 8,387,663,120 bytes, MD5: 7164df4b961c33653f26c4b8be0b6309
Total (all files): 8,387,664,933 bytes unzipped
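Since the main archive is over 8 GB, it may be worth checking a download against the MD5 sums in the file list before extracting it. The following is a minimal sketch; the function names md5_of_file and verify_downloads are hypothetical, and it assumes both files were saved under their original names in a single directory.

```python
import hashlib
from pathlib import Path
from typing import Dict

# MD5 sums as published in the file list of this dataset record.
EXPECTED_MD5: Dict[str, str] = {
    "data.md": "62817d2c44949e26b633bc16028720fe",
    "Data_and_Models.7z": "7164df4b961c33653f26c4b8be0b6309",
}


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory: Path) -> Dict[str, bool]:
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = directory / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results
```

Calling verify_downloads(Path(".")) in the download directory returns a dictionary with one boolean per file; a False entry means the file is missing or its checksum differs. Chunked reading keeps memory use constant regardless of archive size.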