Abstract

Software corpora are crucial for evaluating research artifacts and ensuring repeatability of outcomes. Corpora such as DaCapo and Defects4J provide a collection of real-world open-source projects for evaluating the robustness and performance of software tools like static analysers. However, what do we know about these corpora? What do we know about their composition? Are they really suited for our particular problem? We developed JFEATURE, an extensible static analysis tool that extracts syntactic and semantic features from Java programs, to assist developers in answering these questions. We demonstrate the potential of JFEATURE by applying it to four widely-used corpora in the program analysis area, and we suggest other applications, including longitudinal studies of individual Java projects and the creation of new corpora.

Original language: English
Title of host publication: Proceedings - 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation, SCAM 2022
Publisher: IEEE - Institute of Electrical and Electronics Engineers Inc.
Pages: 236-241
Number of pages: 6
ISBN (Electronic): 9781665496094
Publication status: Published - 2022
Event: 22nd IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2022 - Limassol, Cyprus
Duration: 2022 Oct 3 - 2022 Oct 4

Conference

Conference: 22nd IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2022
Country/Territory: Cyprus
City: Limassol
Period: 2022/10/03 - 2022/10/04

Subject classification (UKÄ)

  • Software Engineering

Free keywords

  • Software Corpora
  • Software Tools
  • Source-Code Analysis

Title: JFeature: Know Your Corpus