Probabilistic forecasting of replication studies.
Throughout the last decade, the so-called replication crisis has stimulated many researchers to conduct large-scale replication projects. With data from four of these projects, we computed probabilistic forecasts of the replication outcomes, which we then evaluated regarding discrimination, calibrat...
Main Authors: | Samuel Pawel, Leonhard Held |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2020-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0231416 |
id |
doaj-b2fc1f4e8913469f888b74d49f3dc5c8 |
---|---|
record_format |
Article |
spelling |
doaj-b2fc1f4e8913469f888b74d49f3dc5c8; 2021-03-03T21:39:59Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2020-01-01; 15; 4; e0231416; 10.1371/journal.pone.0231416; Probabilistic forecasting of replication studies.; Samuel Pawel; Leonhard Held; Throughout the last decade, the so-called replication crisis has stimulated many researchers to conduct large-scale replication projects. With data from four of these projects, we computed probabilistic forecasts of the replication outcomes, which we then evaluated regarding discrimination, calibration and sharpness. A novel model, which can take into account both inflation and heterogeneity of effects, was used and predicted the effect estimate of the replication study with good performance in two of the four data sets. In the other two data sets, predictive performance was still substantially improved compared to the naive model which does not consider inflation and heterogeneity of effects. The results suggest that many of the estimates from the original studies were inflated, possibly caused by publication bias or questionable research practices, and also that some degree of heterogeneity between original and replication effects should be expected. Moreover, the results indicate that the use of statistical significance as the only criterion for replication success may be questionable, since from a predictive viewpoint, non-significant replication results are often compatible with significant results from the original study. The developed statistical methods as well as the data sets are available in the R package ReplicationSuccess.; https://doi.org/10.1371/journal.pone.0231416 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Samuel Pawel, Leonhard Held |
spellingShingle |
Samuel Pawel, Leonhard Held. Probabilistic forecasting of replication studies. PLoS ONE |
author_facet |
Samuel Pawel, Leonhard Held |
author_sort |
Samuel Pawel |
title |
Probabilistic forecasting of replication studies. |
title_sort |
probabilistic forecasting of replication studies. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2020-01-01 |
description |
Throughout the last decade, the so-called replication crisis has stimulated many researchers to conduct large-scale replication projects. With data from four of these projects, we computed probabilistic forecasts of the replication outcomes, which we then evaluated regarding discrimination, calibration and sharpness. A novel model, which can take into account both inflation and heterogeneity of effects, was used and predicted the effect estimate of the replication study with good performance in two of the four data sets. In the other two data sets, predictive performance was still substantially improved compared to the naive model which does not consider inflation and heterogeneity of effects. The results suggest that many of the estimates from the original studies were inflated, possibly caused by publication bias or questionable research practices, and also that some degree of heterogeneity between original and replication effects should be expected. Moreover, the results indicate that the use of statistical significance as the only criterion for replication success may be questionable, since from a predictive viewpoint, non-significant replication results are often compatible with significant results from the original study. The developed statistical methods as well as the data sets are available in the R package ReplicationSuccess. |
url |
https://doi.org/10.1371/journal.pone.0231416 |