A thought on resistance to Bayesian statistics

I'm not a statistician, and as a quantum theorist of a relatively abstract sort, I've done little actual data analysis.  But because of my abstract interests, the nature of probability and its use in making inferences from data are of great interest.  I have some relatively ill-informed thoughts on why the "classical statistics" community seems to have been quite resistant to "Bayesian statistics", at least for a while, that may be of interest, or at least worth logging for my own reference. Take this post in the original (?) spirit of the term "web log", rather than as a polished piece of the sort many blogs, functioning more in the spirit of online magazines, seem to aim at nowadays.

The main idea is this. Suppose doing Bayesian statistics is thought of as actually adopting a prior which specifies, say, one's initial estimate of the probabilities of several hypotheses, and then, on the basis of the data, computing the posterior probability of the hypotheses. This is what is usually called "Bayesian inference". That may be a poor way of presenting the results of an experiment, although it is a good way for individuals to reason about how the results of the experiment should affect their beliefs and decisions. The problem is that different users of the experimental results, e.g. different readers of a published study, may have different priors. What one would like is rather to present these users with a statistic, that is, some function of the data, much more succinct than simply publishing the data themselves, but just as useful, or almost as useful, in making the transition from prior probabilities to posterior probabilities, that is, in updating one's beliefs about the hypotheses of interest to take into account the new data. Of course, for a compressed version of the data (a statistic) to be useful, it is probably necessary that the users share certain basic assumptions about the nature of the experiment. These assumptions might involve the probabilities of various experimental outcomes, or sets of data, if various hypotheses are true (or if a parameter takes various values), i.e., the likelihood function; they might also involve a restriction on the class of priors for which the statistic is likely to be useful. These should be spelled out, and, if it is not obvious, how the statistic can be used in computing posterior probabilities should be spelled out as well.
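To make this concrete, here is a minimal sketch of the simplest case I can think of; it is my own toy illustration, not a recipe from the statistics literature. It assumes the experiment consists of independent yes/no trials with a common unknown success probability (the shared likelihood), and that each reader's prior is a Beta distribution (the restriction on the class of priors). Under those assumptions the pair (successes, trials) is all a reader needs. The reader names, their priors, and the published counts are all hypothetical.

```python
def beta_binomial_update(prior_a, prior_b, successes, trials):
    """Return Beta posterior parameters given a Beta(prior_a, prior_b) prior
    and the published summary (successes, trials).

    Assumes independent Bernoulli trials with a common success probability.
    """
    failures = trials - successes
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical published summary: 7 successes in 10 trials.
successes, trials = 7, 10

# Hypothetical readers, each with a different Beta(a, b) prior.
readers = {
    "uniform": (1.0, 1.0),
    "skeptic": (2.0, 8.0),
    "optimist": (8.0, 2.0),
}

# Each reader updates from the same two published numbers.
posterior_means = {}
for name, (a, b) in readers.items():
    post_a, post_b = beta_binomial_update(a, b, successes, trials)
    posterior_means[name] = beta_mean(post_a, post_b)
    print(f"{name}: posterior Beta({post_a}, {post_b}), mean {posterior_means[name]:.3f}")
```

The point of the sketch is only that the publisher never has to pick a prior: the two numbers are reported once, and each reader does the last step with their own prior.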

It seems to me likely that many classical or "frequentist" statistics may be used in such a way; but, quite possibly, classical language, like saying that statistical inference leads to "acceptance" or "rejection" of hypotheses, tends to obscure this more desirable use of the statistic as a potential input to the computation of posterior probabilities. In fact, I think people naturally want some notion of what the posterior probability of a hypothesis is; this is one source of the erroneous tendency, still sometimes found among the public, to confuse confidence levels with probabilities. Sometimes an advocacy of classical statistical tests may go with an ideological resistance to the computation of posterior probabilities, but I suppose not always. It also seems likely that in many cases, publishing actual Bayesian computations may be a good alternative to classical procedures, particularly if one is able to summarize in a formula what the data imply about posterior probabilities, for a broad enough range of priors that many or most users would find their prior beliefs adequately approximated by them. But in any case, I think it is essential, in order to properly understand the meaning of reports of classical statistical tests, to understand how they can be used as inputs to Bayesian inference. There may be other issues as well, e.g. that in some cases classical tests may make suboptimal use of the information available in the data. In other words, they may not provide a sufficient statistic: a function of the data that contains all the information available in the data about some random variable of interest (say, whether a particular hypothesis is true or not). Of course, whether or not a statistic is sufficient will depend on how one models the situation.
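As a rough illustration of using a classical report as an input to Bayesian inference, here is a sketch under strong simplifying assumptions of my own: the study reports an estimate and its standard error, the estimate is normally distributed around the true value, and the reader is comparing just two point hypotheses (true value mu0 versus true value mu1). Then posterior odds are prior odds times a likelihood ratio that can be computed from the reported numbers alone. The function names and the example numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(estimate, std_error, mu0, mu1):
    """Likelihood of the reported estimate under H1 (mean mu1) relative to H0 (mean mu0)."""
    return normal_pdf(estimate, mu1, std_error) / normal_pdf(estimate, mu0, std_error)


def posterior_prob_h1(estimate, std_error, mu0, mu1, prior_h1):
    """Posterior probability of H1 when H0 and H1 are the only hypotheses considered."""
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = prior_odds * likelihood_ratio(estimate, std_error, mu0, mu1)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical report: estimate 2.0 with standard error 1.0 (two standard errors from zero).
estimate, std_error = 2.0, 1.0

# Two hypothetical readers who differ only in their prior probability for H1: mu = 2.0.
for prior_h1 in (0.5, 0.1):
    post = posterior_prob_h1(estimate, std_error, mu0=0.0, mu1=2.0, prior_h1=prior_h1)
    print(f"prior P(H1) = {prior_h1}: posterior P(H1) = {post:.3f}")
```

The same report that would be called "significant at the 5% level" leaves the two readers with quite different posterior probabilities, which is the sense in which the significance level should not be read as a probability of the hypothesis.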

Most of this is old hat, but it is worth keeping in mind, especially as a Bayesian trying to understand what is going on when "frequentist" statisticians get defensive about general Bayesian critiques of their methods.

2 thoughts on “A thought on resistance to Bayesian statistics”

  1. I think, apart from a few diehards, one is more likely to encounter compatibilists in statistics departments these days, who don't think of Bayesian and frequentist methods as fundamentally opposed. This is partly due to increased computational power, which renders Bayesian methods much more tractable than they used to be, along with the success of Bayesian methods in machine learning and in "big data" in general. It is much harder to ignore Bayesian methods as irrelevant nowadays.

    What I often hear from statisticians these days is that it is good to use Bayesian methods, but classical methods provide a means to check the veracity of a proposed Bayesian method. I do not quite understand what they mean by this, but I think they are talking at a much more practical level than the abstract subjective vs. frequentist debate in the foundations of probability, which obviously would not countenance such a thing.

    However, I think you are on to something when you point out that the aim of doing a statistical analysis is not to update your own prior but to summarize the data in a way that is useful for other people who have "reasonable" beliefs. In fact, it is safe to say that we almost never do a strict Bayesian analysis of our own individual beliefs, and only tend to use formal probabilistic and statistical analysis when we are trying to prove something to others, i.e. when we are trying to form a consensus. It has always puzzled me that Bayesians insist that probability is a calculus of consistent individual belief when we almost never use it in this way. In fact, it seems to me that the decision scenarios facing a single individual are often idiosyncratic enough that we would not expect the decision theoretic arguments to go through for them, whereas the scenario of a large group trying to come to consensus is much better suited for the application of probability. Another way of saying this is that a rational individual is rarely compelled to be a Bayesian agent, but a large group of rational people often is. If I were formulating my own foundation for probability theory, this is where I would start.

    Finally, it is also true that Bayesian methods have themselves evolved away from arbitrary priors. I find the idea of "least informative priors" interesting not because they are a good way of representing individual belief, but rather because they provide a way of letting the data do the talking and coming up with a conclusion that even a person who has very little prior commitment should be able to agree with. In this way, I think such methods provide a way of summarizing what the data say, rather than what anyone should think about them, which is what you are after.

  2. Hi Matthew---

    About being likely to find Bayesian-classical compatibilists nowadays, I suspect you're right, and right about your reasons. I have seen some very interesting talks suggesting that the brain implements Bayesian networks for at least some purposes (will try to recall more details sometime). I still occasionally encounter some fairly strong anti-Bayesian sentiment though... Cosma Shalizi's blog is the main place I remember recently (i.e. within the last few years).

    You wrote: "What I often hear from statisticians these days is that it is good to use Bayesian methods, but classical methods provide a means to check the veracity of a proposed Bayesian method." Like you, I don't understand what this means; at least, it's not obvious on the face of it. Whereas one commonly hears something like the reverse from Bayesians... something like, that one needs to use Bayesian analysis to understand what classical methods are really telling us...

    Your third paragraph is extremely interesting, and takes up a theme that is also in your winning FQXi essay http://fqxi.org/community/essay/winners/2013.1#Leifer (congratulations again!). I am pretty sure I don't understand what you have in mind, and that it is worth understanding, so hope to spend some time doing it. You write "It has always puzzled me that Bayesians insist that probability is a calculus of consistent individual belief when we almost never use it in this way." One take on this point is that it still might be a consistency/rationality criterion even if we don't actually consciously use the calculus. We might, for example, use heuristics that have roughly a Bayesian representation, but not know what that representation is. We might possibly be moved to revise our beliefs, when an inconsistency with Bayesian representation is pointed out. Another, perhaps somewhat less Bayesian, but still with a lot of the spirit of Bayesianism, point that might be related to your statement is that most of us have very far from complete preferences over the possible choices we can envision making, but that there are (I don't have the references, but encountered them while delving into the idea of rational decisionmaking under non-probabilistic uncertainty in econ grad school) representations of such incomplete preference orderings over uncertain alternatives, in terms of a convex set of utility functions and a convex set of probability distributions, as long as the incomplete preference ordering obeys certain rationality axioms similar to the ones in standard Savage or Fishburn type proofs. Or perhaps it is a convex set of pairs of a probability distribution and associated utility function. 
    I don't recall the details, but suppose that perhaps one is "rationally required" to choose one alternative over another if it has higher expected utility according to everything in the convex set, and "rationally free to" if it has higher expected utility according to at least one thing in the convex set. Less important (in my opinion) deviations from representability of our decisionmaking via standard Bayesian subjective-expected-utility maximization might be such things as lexical utility or lexical probabilities (i.e. not real-valued, but still in terms of ordered sets, even if lacking some "technical" axiom that ensures a real representation). There is also some work (not sure where I first heard of it, but Peter Fishburn mentioned it when I visited AT&T in 2000) on preferences satisfying what he seemed to think of as all the reasonable rationality requirements applicable to the situation, but because they are not required to be defined over a sufficiently rich set of alternatives (rather, just a finite set, or maybe countable set...), not representable as expected utility maximization. (I should track these references down... I am curious if they could be embedded in some lexical utility/probability framework, or not). This latter may be closer to what you allude to in your FQXi essay when you point to the limited number of decisions that we actually face. But I might be happier than you to impose a kind of "counterfactual consistency" on our decisionmaking... I have much more thinking to do here to understand what you are getting at in that essay.

    Getting back to statistical practice, I think perhaps a lot of the problem with classical methods nowadays is the way they are used and misused and understood and misunderstood by the press and by relatively unsophisticated users, rather than by the academic statistics community. E.g. press trumpeting that a study "found no statistically significant relation between X and Y" when the study lacks enough cases satisfying the relevant conditions to have much power to detect a relation between X and Y. It may be, though, that the only way to really counter that sort of thing is a greater penetration of a Bayesian understanding of things into the world of journalism and of relatively casual users of "canned" statistical analysis packages...
