Experimental Study on 164 Algorithms Available in Software Tools for Solving Standard Non-Linear Regression Problems
In the specialized literature, researchers can find a large number of proposals for solving regression problems that come from different research areas. However, researchers tend to use only proposals from the area in which they are experts. This paper analyses the performance of a large number of regression algorithms available in some of the best-known and most widely used software tools, in order to help non-expert users from other areas properly solve their own regression problems, and to help specialized researchers develop well-founded future proposals by properly comparing algorithms and identifying those on which significant further developments should focus. In total, 164 algorithms from 14 different families available in 6 software tools (Neural Networks, Support Vector Machines, Regression Trees, Rule-Based Methods, Stacking, Random Forests, Model Trees, Generalized Linear Models, Nearest Neighbor methods, Partial Least Squares and Principal Component Regression, Multivariate Adaptive Regression Splines, Bagging, Boosting, and other methods) were analyzed over 52 datasets. A new measure is also proposed to quantify the goodness of each algorithm with respect to the others. Finally, a statistical analysis based on non-parametric tests was carried out over all the algorithms and over the best 30 algorithms, both with and without bagging. The results show that algorithms from the Random Forest, Model Tree, and Support Vector Machine families obtain the best positions in the rankings produced by the statistical tests when bagging is not considered. In addition, the use of bagging techniques significantly improves the performance of the algorithms without an excessive increase in computational time.
Main Authors: | Maria Jose Gacto (Department of Computer Science, University of Jaén, Jaén, Spain); Jose Manuel Soto-Hidalgo (Department of Electronics and Computer Engineering, University of Córdoba, Córdoba, Spain; ORCID 0000-0003-4412-5449); Jesus Alcala-Fdez (Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain); Rafael Alcala (Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain) |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2019.2933261 |
Pages: | 108916-108939 |
Subjects: | Data mining; supervised learning; regression algorithms; experimental study |
Source: | DOAJ |
Online Access: | https://ieeexplore.ieee.org/document/8788533/ |
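
The abstract's headline result, that bagging a strong base learner improves regression accuracy at moderate extra cost, can be illustrated in miniature. The sketch below is a hypothetical illustration only, not the authors' experimental pipeline, software tools, or 52-dataset benchmark: it wraps a scikit-learn support vector regressor in a bagging ensemble and compares cross-validated RMSE on a small bundled dataset, with all hyperparameters chosen arbitrarily.

```python
# Hypothetical sketch (not the paper's setup): compare a base SVR against a
# bagged SVR ensemble to illustrate the reported benefit of bagging.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Small bundled regression dataset, used here only as a placeholder.
X, y = load_diabetes(return_X_y=True)

# Base learner: RBF-kernel support vector regression with feature scaling.
base = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1))

# Bagged ensemble of the same base learner (10 bootstrap replicates).
bagged = BaggingRegressor(base, n_estimators=10, random_state=0)

for name, model in [("SVR", base), ("Bagged SVR", bagged)]:
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_root_mean_squared_error")
    print(f"{name}: RMSE = {-scores.mean():.2f} (+/- {scores.std():.2f})")
```

In the study itself, per-dataset errors of this kind are aggregated over 52 datasets into rankings and compared with non-parametric statistical tests, which is what places the Random Forest, Model Tree, and Support Vector Machine families at the top when bagging is not used.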