Some Constructive Suggestions on False Models
In the statistical literature, some knowledge about the true data-generating model is often assumed. But the very word `model' implies simplification and idealization, and various aspects of the so-called `true' model may not be fully precise. This is the sense in which George Box made his famous remark that all models are wrong, but some are useful. Models might therefore be ranked by their `usefulness', from very useful to essentially useless. A common problem in Statistics is to improve inference while working with false models that are not especially useful.

We first discuss a case in the context of temporal extreme-value data, where most models used in practice do not describe the characteristics of extreme events adequately or correctly. Several methods are used for analyzing extreme values, most of them based on the well-known extreme value limit distributions and the related Generalized Extreme Value (GEV) family. More recent approaches extend these methods using a Poisson point process view of high-level exceedances, which is more convenient for incorporating nonstationarity. All of these approaches, however, rest on the tacit assumption that the frequency of extreme events is approximately uniform over time or, at most, follows a pattern similar to that of the magnitudes of the extremes. To invoke this assumption, time-varying thresholds for identifying extreme events are employed, which are often subjective and sometimes unrealistic. If we instead work with a fixed threshold, which we believe is natural in many examples, these models may fairly be described as `false', especially when the observations show that the frequency and the magnitude of extreme events change in quite different patterns over time.
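To make the frequency/magnitude distinction concrete, here is a small illustrative simulation; the particular parametric trends (a linearly increasing exceedance rate and a constant generalized Pareto scale) are our own assumptions for the sketch, not the dissertation's fitted model. Exceedances of a fixed threshold arrive as a nonhomogeneous Poisson process simulated by thinning, while each exceedance size is drawn from a generalized Pareto distribution, so the two components can evolve in different patterns over time.

```python
import random

random.seed(42)

# Hypothetical parametric trends (illustrative assumptions only):
# the frequency of exceedances rises over time while the typical
# magnitude of an exceedance stays flat.
def rate(t):
    return 0.5 + 1.5 * t      # lambda(t): events per unit time, increasing

def gpd_scale(t):
    return 2.0                # sigma(t): GPD scale of exceedance sizes, constant

def simulate_exceedances(T, shape=0.1):
    """Nonhomogeneous Poisson process on [0, T] via thinning, with GPD marks."""
    lam_max = rate(T)         # rate(t) is increasing, so its max on [0, T] is at T
    t, events = 0.0, []
    while True:
        t += random.expovariate(lam_max)
        if t > T:
            break
        if random.random() < rate(t) / lam_max:   # thinning: accept with prob rate(t)/lam_max
            u = random.random()
            # Inverse-CDF draw from GPD(shape, gpd_scale(t))
            mark = gpd_scale(t) * ((1 - u) ** (-shape) - 1) / shape
            events.append((t, mark))
    return events

events = simulate_exceedances(T=10.0)
times = [t for t, _ in events]
first_half = sum(t <= 5.0 for t in times)
second_half = len(events) - first_half
print(len(events), first_half, second_half)
```

Because the rate increases while the mark distribution does not, the simulated record shows many more exceedances late in the period without any corresponding growth in their sizes, which is exactly the kind of behavior a single time-varying threshold struggles to capture.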
In such cases, we `correct' these models by extending the Poisson process approach to incorporate separate parametric functions for the frequency and the magnitude of extreme events. We discuss methodological aspects of this `improved' model and demonstrate, through applications to simulated and real datasets, that it can significantly outperform the usual extreme-value models.

We then focus on Bayesian Model Averaging (BMA) in the `model false' case in the context of linear regression. Given a dataset from some experiment, one of the most important questions is how to predict a future observation. In the Bayesian approach to model selection, this entails the use of a squared-error loss, and if we focus on obtaining the best predicted value rather than a particular model (i.e. the `best' available model), the answer turns out to be the BMA estimate. This provides a coherent mechanism for accounting for model uncertainty, which is ignored in other approaches. In a linear regression set-up with the usual variable selection approach, the BMA estimate is simply a weighted average of the optimal Bayesian predictors under the individual models, with weights equal to the posterior probabilities of the respective models. In the `model false' scenario, where all available models in the model space are wrong, BMA is often outperformed in linear models by non-Bayesian model averaging methods such as Stacking. This has already been demonstrated by many authors, especially in the context of classification. We investigate theoretically why BMA performs poorly in the `model false' scenario for linear regression models. Under the quite rich families of g-priors and mixtures of g-priors, we prove results that explain why BMA is worse than frequentist methods such as Stacking for datasets of moderate to large size. We use simulated data to demonstrate our findings and indicate how possible rectifications can be made.
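The contrast between the two averaging schemes can be sketched in a few lines. Everything below is an illustrative toy (the predictions, held-out data, and posterior probabilities are made-up numbers, and the grid search stands in for a proper constrained least-squares fit): BMA combines per-model predictions using posterior model probabilities, whereas stacking chooses its weights to minimize squared error on held-out predictions.

```python
# Per-model predictions of a future observation from two candidate
# linear models (hypothetical numbers for illustration).
preds = [3.0, 5.0]

# BMA: weights are the posterior probabilities of the two models.
post_probs = [0.8, 0.2]
bma_pred = sum(w * p for w, p in zip(post_probs, preds))

# Stacking: weights minimize cross-validated squared error instead.
cv_preds = [(2.9, 4.8), (3.2, 5.1), (2.7, 5.0)]  # held-out predictions (model 1, model 2)
cv_truth = [4.5, 4.9, 4.2]                        # held-out responses

def cv_error(w):
    """Squared CV error of the convex combination w*model1 + (1-w)*model2."""
    return sum((y - (w * a + (1 - w) * b)) ** 2
               for (a, b), y in zip(cv_preds, cv_truth))

# For two models the weight simplex is [w, 1-w]; a fine grid suffices here.
w_stack = min((i / 100 for i in range(101)), key=cv_error)
stack_pred = w_stack * preds[0] + (1 - w_stack) * preds[1]
print(round(bma_pred, 2), round(w_stack, 2), round(stack_pred, 2))
```

In this toy, both candidate models are biased, and the stacking weights are free to repair that bias on held-out data, while the BMA weights are tied to posterior model probabilities, which reward fit to the observed data under the (false) assumption that one of the models is true. This is the qualitative gap the dissertation analyzes under g-priors and mixtures of g-priors.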
Ghosh, Purdue University.