Replication package for the ICSME 2018 paper "Threats of Aggregating Software Repository Data" by Martin P. Robillard, Mathieu Nassif, Shane McIntosh.

Download the data here: icsme2018-data.zip (21 MB). The file is a zip archive that extracts to the directory icsme2018-data.

This data complements the replication package of Nassif and Robillard, ICSME 2017 (https://doi.org/10.1109/ICSME.2017.64), page 272.

Available at: https://www.cs.mcgill.ca/~swevo/knowledgeloss/

Data for each project is contained in the directory with the name of the project. Each project data directory contains the following files:

properties/data_[timestamp].csv

CSV files containing the extended properties computed for each file present in the period. Each line is a comma-separated record describing one file present during the period. The header row names the properties, which correspond to the properties in Figure 1 of the paper.
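As a minimal sketch of how the properties files described above could be read, the following Python snippet loads every data_[timestamp].csv file of one project into memory. The function name load_properties and the UTF-8 encoding are assumptions made for illustration and are not part of the replication package. Column names come from each file's own header row, so none are hard-coded here.

```python
import csv
from pathlib import Path


def load_properties(project_dir):
    """Return {csv file name: list of row dicts} for one project directory.

    Hypothetical helper: assumes the files are UTF-8 encoded and that the
    first line of each file is the header row.
    """
    tables = {}
    properties_dir = Path(project_dir, "properties")
    for csv_path in sorted(properties_dir.glob("data_*.csv")):
        with open(csv_path, newline="", encoding="utf-8") as handle:
            # DictReader uses the header row as keys; all values stay strings.
            tables[csv_path.name] = list(csv.DictReader(handle))
    return tables
```

Calling load_properties on the path of a project directory inside icsme2018-data returns one list of records per period file. Values are left as strings, so any numeric property needs an explicit conversion.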

[project]-report.md

An automatically generated report that highlights key statistics for each event analyzed in the study.

map.txt

A map between files and module names. Each module name starts with a # character. All files that start with the prefix(es) below the module name get mapped to the module, in order. The prefix below #TRIM is removed from all input file names prior to matching.
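The mapping rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool used in the study: it assumes one prefix per line, ignores blank lines, and reads "in order" as "the first matching prefix in file order wins". The function names parse_map and module_of, and the sample content, are hypothetical.

```python
def parse_map(text):
    """Parse map.txt content into (trim_prefixes, rules).

    rules is a list of (prefix, module) pairs in file order.
    Assumes one prefix per line; blank lines are skipped.
    """
    trim_prefixes = []
    rules = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            current = line[1:].strip()      # start of a new module section
        elif current == "TRIM":
            trim_prefixes.append(line)      # prefix to strip before matching
        elif current is not None:
            rules.append((line, current))
    return trim_prefixes, rules


def module_of(path, trim_prefixes, rules):
    """Return the module for a file path, or None if no prefix matches."""
    for trim in trim_prefixes:
        if path.startswith(trim):
            path = path[len(trim):]
            break
    for prefix, module in rules:            # assumed: first match wins
        if path.startswith(prefix):
            return module
    return None


# Hypothetical map.txt content, for illustration only.
SAMPLE_MAP = """#TRIM
src/main/java/
#core
org/example/core/
#ui
org/example/ui/
org/example/widgets/
"""

trim_prefixes, rules = parse_map(SAMPLE_MAP)
print(module_of("src/main/java/org/example/ui/Button.java", trim_prefixes, rules))
```

Files that match no prefix map to None in this sketch. How the original tooling treats such files is not stated here.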

License and Attribution

This data artifact is provided under the terms of the Creative Commons Attribution 4.0 International License.

If you use this data, please include the following reference:

Martin P. Robillard, Mathieu Nassif, and Shane McIntosh. Threats of Aggregating Software Repository Data. In Proceedings of the 34th IEEE International Conference on Software Maintenance and Evolution, 2018.