Data underlying the BSc project: "An analysis of Java release practices on GitHub"

DOI: 10.4121/67a790fe-b65a-4c30-aae0-c5b2dc7e5d4d.v1
The DOI above refers to this specific version of the dataset, which is currently the latest. Newer versions may be published in the future. For a link that always points to the latest version, use:
DOI: 10.4121/67a790fe-b65a-4c30-aae0-c5b2dc7e5d4d
Datacite citation style:
Roest, Vivian (2024): Data underlying the BSc project: "An analysis of Java release practices on GitHub". Version 1. 4TU.ResearchData. software. https://doi.org/10.4121/67a790fe-b65a-4c30-aae0-c5b2dc7e5d4d.v1
Other citation styles (APA, Harvard, MLA, Vancouver, Chicago, IEEE) available at Datacite

Software

Usage statistics

115 views, 56 downloads

Time coverage

2023

Licence

CC0

This dataset contains the following inside a tar.zst file:

  1. A list of all Java repositories on GitHub, in CSV format
  2. The POM.xml file of each of those repositories, if one was present at the root of the repository
  3. A sample of 500 000 repositories that have been searched recursively for POM.xml files
  4. For each sampled repository that has a POM.xml file, a generated 'effective' POM.xml
  5. For each sampled repository that has distribution repositories configured, its GitHub workflow files, if they exist
  6. A report.json file that contains aggregate information about the sample
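
Because the data ships as a tar.zst file, it has to be decompressed with Zstandard before tar can read it. The following is a minimal sketch of one way to do that, assuming the zstd command-line tool and a standard tar are installed. The function name extract_tar_zst and the paths in the example call are placeholders chosen for illustration; they are not names taken from the dataset.

```shell
#!/bin/sh
# Sketch: unpack a .tar.zst archive into a target directory.
# extract_tar_zst is a hypothetical helper; archive and directory
# names are supplied by the caller and are not the dataset's real names.

extract_tar_zst() {
    archive=$1
    dest=$2
    # Create the destination directory if it does not exist yet.
    mkdir -p "$dest" || return 1
    # Decompress to stdout with zstd and let tar read the stream from stdin.
    zstd -dc -- "$archive" | tar -xf - -C "$dest"
}

# Example call (placeholder paths):
# extract_tar_zst dataset.tar.zst ./dataset
```

Streaming through a pipe avoids writing an intermediate uncompressed .tar file to disk, which may matter for an archive that covers several hundred thousand repositories.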


The scraper written to retrieve this data is also included.


This dataset was created for a Computer Science Bachelor Research Project titled "An analysis of Java release practices on GitHub" by Vivian Roest.

History

  • 2024-01-29 first online, published, posted

Publisher

4TU.ResearchData

Format

tar.zst archive of XML, CSV and JSON files, and a Git repository of source code

Organizations

TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science

DATA

To access the source code, use the following command:

git clone https://data.4tu.nl/v3/datasets/f38d8fdd-fe95-4bd4-968e-71db238de45e.git

Or download the latest commit as a ZIP.

Files (2)