Improving Software Defect Prediction by Aggregated Change Metrics

To ensure the delivery of high quality software, it is necessary to ensure that all of its artifacts function properly, which is usually done by performing appropriate tests with limited resources. It is therefore desirable to identify defective artifacts so that they can be corrected before the tes...

Bibliographic Details
Main Authors: Lucija Sikic, Petar Afric, Adrian Satja Kurdija, Marin Silic
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Classification; feature engineering; process metrics; change metrics; software defect prediction
Online Access: https://ieeexplore.ieee.org/document/9336678/
id doaj-ed2e8fe04b344a5296b8ff2b989345fb
record_format Article
spelling doaj-ed2e8fe04b344a5296b8ff2b989345fb
2021-03-30T15:17:23Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
volume 9, pages 19391-19411
10.1109/ACCESS.2021.3054948
9336678
Improving Software Defect Prediction by Aggregated Change Metrics
Lucija Sikic, https://orcid.org/0000-0002-8011-1055, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Petar Afric, https://orcid.org/0000-0001-9270-5988, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Adrian Satja Kurdija, https://orcid.org/0000-0003-2313-0396, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Marin Silic, https://orcid.org/0000-0002-4896-7689, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
https://ieeexplore.ieee.org/document/9336678/
Classification
feature engineering
process metrics
change metrics
software defect prediction
collection DOAJ
language English
format Article
sources DOAJ
author Lucija Sikic
Petar Afric
Adrian Satja Kurdija
Marin Silic
title Improving Software Defect Prediction by Aggregated Change Metrics
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description Delivering high-quality software requires that all of its artifacts function properly, which is usually verified by testing under limited resources. It is therefore desirable to identify defective artifacts so that they can be corrected before the testing process. Researchers have proposed various predictive models for this purpose. Such models are typically trained on data from previous versions of a software project and then used to predict which artifacts in the new version are likely to be defective. However, the data representing a software project usually consists of measurable properties of the project or its modules and leaves out information about the timeline of the development process. To fill this gap, we propose a new set of metrics, aggregated change metrics, created by aggregating the data of all changes made to the software between two versions while taking the chronological order of the changes into account. In experiments on open-source projects written in Java, we show that the stability and performance of commonly used classification models improve when the feature set is extended to include both the measurable properties of the analyzed software and the aggregated change metrics.
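The record does not list the concrete aggregation functions used in the paper, so the following is only a minimal sketch of the general idea: collapsing the changes made to one artifact between two versions into a few features, one of which depends on chronological order. The `Change` type, the `aggregate_changes` function, the feature names, and the exponential `decay` weighting are all assumptions made for illustration, not the authors' definitions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class Change:
    """One change (e.g. a commit) touching an artifact. Hypothetical type."""
    timestamp: datetime
    lines_added: int
    lines_deleted: int


def aggregate_changes(changes: List[Change], decay: float = 0.5) -> Dict[str, float]:
    """Aggregate all changes between two versions into a feature dict.

    Assumption: recency is modeled with exponential decay, so the newest
    change has weight 1, the one before it `decay`, then `decay**2`, etc.
    """
    ordered = sorted(changes, key=lambda c: c.timestamp)  # oldest first
    n = len(ordered)
    total_churn = 0
    weighted_churn = 0.0
    for position, change in enumerate(ordered):
        churn = change.lines_added + change.lines_deleted
        total_churn += churn
        # Exponent is the number of changes that came after this one.
        weighted_churn += churn * decay ** (n - 1 - position)
    return {
        "change_count": float(n),
        "total_churn": float(total_churn),
        "recency_weighted_churn": weighted_churn,
    }


if __name__ == "__main__":
    history = [
        Change(datetime(2020, 1, 1), 6, 4),  # churn 10, oldest
        Change(datetime(2020, 2, 1), 3, 1),  # churn 4
        Change(datetime(2020, 3, 1), 2, 0),  # churn 2, newest
    ]
    print(aggregate_changes(history))
```

With the sample history, the order-independent `total_churn` is 16 while `recency_weighted_churn` is 6.5. Swapping the timestamps of the largest and smallest change leaves the total unchanged but raises the weighted value to 12.5, which is the kind of timeline information that static per-version metrics cannot express. Features of this sort would be appended to the usual static metrics before training a classifier.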
topic Classification
feature engineering
process metrics
change metrics
software defect prediction
url https://ieeexplore.ieee.org/document/9336678/