Summary: | In recent years, Neural Networks (NN) have become a popular data-analytic tool in Statistics,
Computer Science and many other fields. NNs can be used as universal approximators, that is, as
tools for regressing a dependent variable on a possibly complicated function of the explanatory variables.
The NN parameters, unfortunately, are notoriously hard to interpret. Under the Bayesian view,
we propose and discuss prior distributions for some of the network parameters that encourage
parsimony and reduce overfitting by eliminating redundancy and by promoting orthogonality, linearity or
additivity. Thus we consider more senses of parsimony than are discussed in the existing literature.
We investigate the predictive performance of networks fit under these various priors. The Deviance
Information Criterion (DIC) is briefly explored as a model selection criterion.
|
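For concreteness, a minimal sketch of the single-hidden-layer feed-forward regression model to which such priors are typically attached (standard notation; the paper's exact parameterization may differ):

\[
  y_i = \beta_0 + \sum_{j=1}^{k} \beta_j \,\psi\!\left( \gamma_{j0} + \sum_{h=1}^{p} \gamma_{jh} x_{ih} \right) + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathrm{N}(0, \sigma^2),
\]

where \( \psi \) is a fixed activation function (e.g. logistic). In this formulation the senses of parsimony mentioned above have concrete counterparts: a hidden unit \( j \) is redundant when \( \beta_j \approx 0 \) or its weight vector \( \gamma_j \) duplicates another unit's, and a unit acts nearly linearly or additively when \( \psi \) is confined to its approximately linear range.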