The following are my thoughts on the paper “Beyond subjective and objective in statistics” by Gelman & Hennig (JRSS A, 2017), which was read at an ordinary meeting of the RSS on Wednesday. Overall, I really liked the paper. From the title and abstract, I was worried that it would be either a pointless philosophical argument about Bayes vs. frequentist or a statement of the obvious, but it was neither. Instead, the authors argue against tribalism in statistics and attempt to provide some universal guidelines for statistical practice.
Robert Grant (@robertstats), 12 April 2017: “Two brilliant slides from Philip Dawid responding to Hennig & Gelman” pic.twitter.com/UXaD7CY00X
Statistics is an essential element of modern science, and has been for quite some time. As such, statistical procedures should be evaluated with regard to the philosophy of science. Towards this goal, the authors propose seven statistical virtues that could serve as a guide for authors (and reviewers) of scientific papers. The chief of these is transparency: thorough documentation of the choices, assumptions and limitations of the analysis. These choices need to be justified within the context of the scientific study. Given the ‘no free lunch’ theorems (Wolpert, 1996), such contextual dependence is a necessary property of any useful method.
The authors argue that “subjective” and “objective” are ambiguous terms that harm statistical discourse. No methodology has an exclusive claim to objectivity, since even null hypothesis significance testing (NHST) involves a choice of sampling distribution, as well as the infamous α=0.05. The use of default priors, as in Objective Bayes, requires ignoring any available information about the parameters of interest. This can conflict with other goals, such as identifiability and regularisation. The seven virtues are intended to be universal: they apply irrespective of whether the chosen methodology is frequentist or Bayesian. Indeed, the authors advocate a methodology that combines features of both.
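To make the point about the sampling distribution concrete, here is a small sketch using the textbook coin-tossing example (my choice of illustration, not taken from the paper). The same 9 heads and 3 tails give different one-sided p-values for the null of a fair coin, depending on whether the analyst assumes the number of tosses was fixed in advance or that tossing continued until the third tail. The function names are hypothetical.

```python
from math import comb


def binomial_p_value(heads, tails):
    """One-sided p-value, assuming the total number of tosses was fixed in advance."""
    n = heads + tails
    # P(at least `heads` heads in n fair tosses)
    return sum(comb(n, k) for k in range(heads, n + 1)) / 2**n


def negative_binomial_p_value(heads, tails):
    """One-sided p-value, assuming tossing continued until `tails` tails were seen."""
    # Seeing at least `heads` heads before the final tail is the same event as
    # seeing at most tails - 1 tails among the first heads + tails - 1 tosses.
    m = heads + tails - 1
    return sum(comb(m, k) for k in range(tails)) / 2**m


p_fixed_n = binomial_p_value(9, 3)               # 299/4096, about 0.073
p_fixed_tails = negative_binomial_p_value(9, 3)  # 67/2048, about 0.033

print(f"fixed n:     p = {p_fixed_n:.4f}")
print(f"fixed tails: p = {p_fixed_tails:.4f}")
```

With α=0.05, the first analysis fails to reject and the second rejects, even though the observed data are identical. The choice between them has to be justified by how the study was actually run.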
There have been many other attempts to reconcile frequentist and Bayesian approaches to produce a grand unified theory of statistics. The main feature of the methodology in Section 5.5, which the authors call falsificationist Bayesianism, is iterative refinement of the model (including priors and tuning parameters) to better fit the observed data. Rather than Bayesian updating or model choice, the suggested procedure involves graphical summaries of model fit (Gelman et al., 2013). This has connections with well-calibrated Bayes (Dawid, 1982) and hypothetico-deductive Bayes (Gelman & Shalizi, 2013). I think that this is a good approach, albeit saddled with an unfortunate misnomer.
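As a rough illustration of what such a check of model fit can look like, the sketch below runs a posterior predictive check for a deliberately simple model: normal data with known standard deviation and a conjugate normal prior on the mean. The model, the toy data and the function name are all my own assumptions rather than anything from the paper, and the numerical tail proportion stands in for the histogram one would normally draw.

```python
import random
import statistics


def posterior_predictive_tail(y, stat, n_rep=2000, sigma=1.0,
                              prior_mean=0.0, prior_sd=10.0, seed=0):
    """Proportion of replicated data sets whose statistic is >= the observed one."""
    rng = random.Random(seed)
    n = len(y)
    # Conjugate update for the mean of a normal with known sigma.
    prior_prec = 1.0 / prior_sd**2
    data_prec = n / sigma**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * statistics.fmean(y))
    post_sd = post_var**0.5

    observed = stat(y)
    exceed = 0
    for _ in range(n_rep):
        mu = rng.gauss(post_mean, post_sd)                # draw a parameter value
        y_rep = [rng.gauss(mu, sigma) for _ in range(n)]  # simulate replicated data
        if stat(y_rep) >= observed:
            exceed += 1
    return exceed / n_rep


# Toy data (made up for illustration): 19 unremarkable values and one outlier.
y = [-1.2, 0.4, 0.3, -0.5, 1.1, 0.0, -0.8, 0.7, 0.2, -0.1,
     0.9, -0.4, 0.6, -0.9, 0.1, 1.3, -0.6, 0.5, -0.2, 8.0]

tail_max = posterior_predictive_tail(y, max)
tail_mean = posterior_predictive_tail(y, statistics.fmean)
print("max: ", tail_max)
print("mean:", tail_mean)
```

The maximum exposes the misfit (almost no replicated data set produces a value as large as 8), whereas the mean does not, because the model is fitted to reproduce it. In the spirit of the procedure described above, the next step would be to revise the model, for example with a heavier-tailed likelihood, rather than to discard it.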
The term “falsificationist” might be slightly less clumsy than “hypothetico-deductive,” but it nevertheless seems misleading. Leaving aside the question of whether statistical hypotheses are falsifiable at all, except in the limit of infinite data, falsification in the Popperian sense is really not the goal. Falsification would imply abandoning an inadequate model and starting again from scratch. As stated by Gelman (2007),
“…the purpose of model checking (as we see it) is not to reject a model but rather to understand the ways in which it does not fit the data.”
Furthermore, this approach is not limited to posterior predictive distributions. It could be applied to any generative model, not necessarily a Bayesian one. Thus, falsificationist Bayesianism as presented in this paper is neither falsificationist nor Bayesian, but it is an excellent approach nevertheless.
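To illustrate that the same kind of check works without a posterior, here is a sketch with no prior at all: a Poisson model fitted by maximum likelihood and then checked by simulating replicated data sets from the fitted model (a parametric bootstrap). The counts, the dispersion statistic and the function names are my own assumptions, chosen only for illustration.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def dispersion(counts):
    """Variance-to-mean ratio, which should be close to 1 for Poisson data."""
    mean = statistics.fmean(counts)
    return statistics.variance(counts) / mean if mean > 0 else 0.0


def bootstrap_tail(counts, stat, n_rep=2000, seed=0):
    """Proportion of data sets simulated from the fitted model with stat >= observed."""
    rng = random.Random(seed)
    lam_hat = statistics.fmean(counts)  # maximum likelihood estimate of the rate
    observed = stat(counts)
    exceed = 0
    for _ in range(n_rep):
        rep = [poisson_draw(rng, lam_hat) for _ in counts]
        if stat(rep) >= observed:
            exceed += 1
    return exceed / n_rep


# Made-up counts with far more zeros and large values than a Poisson would give.
counts = [0, 0, 0, 0, 1, 0, 0, 12, 0, 0, 9, 0, 1, 0, 0, 15, 0, 0, 0, 2]

tail_dispersion = bootstrap_tail(counts, dispersion)
print("observed dispersion:", round(dispersion(counts), 2))
print("tail proportion:    ", tail_dispersion)
```

As with the posterior predictive version, the output says how the model fails (here, overdispersion), which points towards a specific revision such as a negative binomial or zero-inflated model.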