Robust Multiple Regression

Bibliographic Details
Main Authors: David W. Scott, Zhipeng Wang
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Entropy
Subjects:
Online Access: https://www.mdpi.com/1099-4300/23/1/88
Description
Summary: As modern data analysis pushes the boundaries of classical statistics, it is timely to reexamine alternate approaches to dealing with outliers in multiple regression. As sample sizes and the number of predictors increase, interactive methodology becomes less effective. Likewise, with limited understanding of the underlying contamination process, diagnostics are likely to fail as well. In this article, we advocate for a non-likelihood procedure that attempts to quantify the fraction of bad data as a part of the estimation step. These ideas also allow for the selection of important predictors under some assumptions. As there are many robust algorithms available, running several and looking for interesting differences is a sensible strategy for understanding the nature of the outliers.
ISSN: 1099-4300
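
Illustration: the summary's closing suggestion, running several fits and looking for differences, can be sketched roughly as follows. This is not the authors' procedure. It assumes a single predictor for brevity, uses ordinary least squares and the Theil-Sen estimator as stand-ins for "several algorithms", and runs on synthetic data invented for this example. The function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, median


def ols_fit(x, y):
    """Ordinary least squares line; returns (intercept, slope)."""
    xb, yb = mean(x), mean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return yb - slope * xb, slope


def theil_sen_fit(x, y):
    """Theil-Sen line: median of pairwise slopes, median-based intercept."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope


def flag_outliers(x, y, fit, k=3.0):
    """Indices whose residual lies more than k scaled MADs from the median residual."""
    a, b = fit
    res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    med = median(res)
    mad = 1.4826 * median(abs(r - med) for r in res)
    return [i for i, r in enumerate(res) if abs(r - med) > k * mad]


# Synthetic data (an assumption for illustration): y = 1 + 2x plus small
# deterministic noise, with the last three responses shifted upward by 30.
x = [float(i) for i in range(20)]
y = [1.0 + 2.0 * xi + 0.1 * ((7 * i) % 5 - 2) for i, xi in enumerate(x)]
for i in (17, 18, 19):
    y[i] += 30.0

ols = ols_fit(x, y)
ts = theil_sen_fit(x, y)
flag_ols = flag_outliers(x, y, ols)
flag_ts = flag_outliers(x, y, ts)

print("OLS (intercept, slope):      ", ols)
print("Theil-Sen (intercept, slope):", ts)
print("Flagged under OLS:           ", flag_ols)
print("Flagged under Theil-Sen:     ", flag_ts)
```

A large gap between the two slopes, or a different set of flagged indices under each fit, is the kind of "interesting difference" that points to where the contamination sits.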