We generate comprehensive data extracts for a set of Eclipse projects, drawing on data sources such as:
- Software Configuration Management (Eclipse git or GitHub),
- Issue tracking (Bugzilla or GitHub),
- Project metadata checks (PMI),
- Licensing and copyrights (ScanCode), and
- Static Code Analysis (SonarCloud) when available.
 
Each dataset is composed of:
- Compressed (gzip’d) CSV and JSON files for tool-specific data.
- A full bundle containing all of the above data files for a project.
- An R Markdown document that analyses the extracted files and gives some hints on how to use them. This document also serves as a validation step to identify empty or inconsistent datasets.
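
As a rough illustration of how the compressed CSV and JSON files could be read outside of R, here is a minimal Python sketch using only the standard library. The file names and column names (`issues.csv.gz`, `metadata.json.gz`, `id`, `summary`) are hypothetical placeholders, not the names used in the published datasets; the demo writes its own small sample files so that it runs on its own.

```python
import csv
import gzip
import json
import os
import tempfile


def read_gzip_csv(path):
    """Return the rows of a gzip-compressed CSV file as a list of dicts."""
    # "rt" decompresses and decodes to text; newline="" is what the csv module expects.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def read_gzip_json(path):
    """Return the parsed content of a gzip-compressed JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Hypothetical sample data: the real extracts have their own file and column names.
    with tempfile.TemporaryDirectory() as workdir:
        csv_path = os.path.join(workdir, "issues.csv.gz")
        with gzip.open(csv_path, "wt", encoding="utf-8", newline="") as handle:
            handle.write("id,summary\n1,First issue\n2,Second issue\n")

        json_path = os.path.join(workdir, "metadata.json.gz")
        with gzip.open(json_path, "wt", encoding="utf-8") as handle:
            json.dump({"name": "demo-project"}, handle)

        for row in read_gzip_csv(csv_path):
            print(row["id"], row["summary"])
        print(read_gzip_json(json_path)["name"])
```

Reading through `gzip.open` avoids having to decompress the files to disk first, which may be convenient for the larger extracts.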
 
These datasets are published under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) licence. Data is updated weekly, at 2am on Sundays. If you would like to add a project, please submit an issue.