Ordinary Least Squares: the Adequacy of Linear Regression Solutions under Multicollinearity and without it

The article deals with the problem of economic adequacy of solving a linear regression problem by the OLS method. The study uses the following definition of adequacy: a linear regression solution is considered adequate if it not only has correct signs but also correctly reflects the relationship bet...

Bibliographic Details
Main Authors: Tyzhnenko Alexander G., Ryeznik Yevgen V.
Format: Article
Language: English
Published: PH "INZHEK" 2019-03-01
Series: Problemi Ekonomiki
Subjects:
OLS
Online Access: http://www.problecon.com/export_pdf/problems-of-economy-2019-1_0-pages-217_227.pdf
id doaj-fb079893b48e4fae90130112092fb2e2
record_format Article
spelling doaj-fb079893b48e4fae90130112092fb2e2
2020-11-24T22:19:42Z
eng
PH "INZHEK"
Problemi Ekonomiki
2222-0712
2311-1186
2019-03-01
139
217
227
https://doi.org/10.32983/2222-0712-2019-1-217-227
Ordinary Least Squares: the Adequacy of Linear Regression Solutions under Multicollinearity and without it
Tyzhnenko Alexander G. (Simon Kuznets Kharkiv National University of Economics)
Ryeznik Yevgen V. (Uppsala University)
collection DOAJ
language English
format Article
sources DOAJ
author Tyzhnenko Alexander G.
Ryeznik Yevgen V.
title Ordinary Least Squares: the Adequacy of Linear Regression Solutions under Multicollinearity and without it
publisher PH "INZHEK"
series Problemi Ekonomiki
issn 2222-0712
2311-1186
publishDate 2019-03-01
description The article addresses the economic adequacy of linear regression solutions obtained by the OLS method. The study defines adequacy as follows: a linear regression solution is adequate if it not only has the correct signs but also correctly reflects the relationship between the regression coefficients in the population. If, in addition, the coefficient of determination exceeds 0.8, the solution is considered economically adequate. As an indicator of adequacy, the authors propose a 10% level of the coefficient of variability (CV) of the regression coefficients. It is shown that OLS solutions may be inadequate with respect to the population solution even when they are physically correct (have the correct signs) and statistically significant. This result is obtained with the artificial data population (ADP) algorithm. The ADP generates data sets of any size with known population regression coefficients, which can be calculated from the OLS solution for a very large sample. The ADP algorithm makes it possible to vary the regular component of the regressors' influence on the response. In addition, the random changes of the regressors in the ADP are divided into two parts: the first is coherent with the changes in the response, while the second is completely random (incoherent). The latter makes it possible to change the near-collinearity level of the data by changing the variance of the incoherent noise in the regressors. Studies using the ADP have shown that, with high probability, OLS solutions are physically incorrect for sample sizes n < 23, physically correct but not adequate for 23 < n < 400, and adequate for n > 400. Furthermore, the authors note that if strongly correlated regressors are eliminated without economic justification, merely to lower the VIF, the results may be far from reality. In this regard, the authors state that using the MOLS removes the need to exclude strongly correlated regressors at all, since the accuracy of the MOLS solution increases as the VIF increases.
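The quantities named in the description (near-collinearity controlled by incoherent noise, the VIF, and the coefficient of variability of the regression coefficients) can be illustrated with a small simulation. The sketch below is not the authors' ADP algorithm, whose details the record does not give. It is a simplified stand-in that assumes two regressors built from a shared component plus independent noise, with hypothetical population coefficients b1 = 2 and b2 = 1 and hypothetical function names throughout.

```python
import random
import statistics


def simulate(n, noise_sd, rng, b1=2.0, b2=1.0):
    """Draw n observations of two near-collinear regressors and a response.

    Assumption (not from the source): each regressor is a shared component
    plus independent "incoherent" noise; a smaller noise_sd gives stronger
    collinearity.
    """
    x1, x2, y = [], [], []
    for _ in range(n):
        common = rng.gauss(0.0, 1.0)
        a = common + rng.gauss(0.0, noise_sd)
        b = common + rng.gauss(0.0, noise_sd)
        x1.append(a)
        x2.append(b)
        y.append(b1 * a + b2 * b + rng.gauss(0.0, 1.0))
    return x1, x2, y


def _centered_sums(x1, x2):
    """Return the centered sums of squares and cross-products of x1 and x2."""
    m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    return m1, m2, s11, s22, s12


def ols_two_regressors(x1, x2, y):
    """OLS slopes for y ~ intercept + x1 + x2 via the normal equations."""
    m1, m2, s11, s22, s12 = _centered_sums(x1, x2)
    my = statistics.fmean(y)
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def vif(x1, x2):
    """Variance inflation factor; with two regressors it is 1 / (1 - r^2)."""
    _, _, s11, s22, s12 = _centered_sums(x1, x2)
    return 1.0 / (1.0 - s12 ** 2 / (s11 * s22))


def coefficient_cv(n, noise_sd=0.3, trials=200, seed=0):
    """CV (standard deviation / |mean|) of each slope over repeated samples."""
    rng = random.Random(seed)
    estimates = [ols_two_regressors(*simulate(n, noise_sd, rng))
                 for _ in range(trials)]
    return tuple(
        statistics.stdev(col) / abs(statistics.fmean(col))
        for col in zip(*estimates)
    )
```

Calling `coefficient_cv(25)` and `coefficient_cv(400)` and comparing each result with a 10% threshold gives a rough feel for how sample size affects the stability of the estimates in this toy setup. The thresholds n = 23 and n = 400 reported in the description come from the authors' own ADP experiments and should not be expected to reproduce here.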
topic multicollinearity
OLS
data simulation
artificial population
physical correctness
adequacy
url http://www.problecon.com/export_pdf/problems-of-economy-2019-1_0-pages-217_227.pdf