Picking up the pieces
Thanks to Nancy Cartwright, a little ad hoc discussion group has formed: “PhilErrorStat: LSE: Three weeks in (Nov-Dec) 2011.” I’ll be posting related items on this blog, in the column to your left, over its short lifetime. We’re taking a look at some articles and issues leading up to a paper I’m putting together to give in Madrid next month on the Birnbaum-likelihood principle business (“Breaking Through the Breakthrough”) at a conference (“The Controversy about Hypothesis Testing,” Madrid, December 15-16, 2011). I hope also to get this group’s feedback as I follow through on responses I’ve been promising to some of the comments and queries I’ve received these past few months.
Our very first meeting already reminded me of an issue Christian Robert raised in his blog about Error and Inference: Is the frequentist (error-statistical) interest in probing discrepancies, and the ways in which statistical hypotheses and models can be false, akin to a Bayesian call for setting out rival hypotheses with prior probability assignments?
The answer is no, quite the opposite. Since I see Robert has put his remarks together into a kind of overall document, I will refer to its pages directly: http://arxiv.org/abs/1111.5827.
Referring approvingly to our remark that
“Virtually all (...) models are to some extent provisional, which is precisely what is expected in the building up of knowledge.”—D. Cox and D. Mayo, p.283, Error and Inference, 2010.
Robert maintains that:
“This transience of working models is a good reason in my opinion to account for the possibility of alternative working models from the start of the statistical analysis. Hence for an inclusion of those models in the statistical analysis equally from the start. Which leads almost inevitably to a Bayesian formulation of the testing problem.” (Robert 2011: http://arxiv.org/abs/1111.5827 p. 8)
The opposite is true. To the extent that we regard theories and models in science as provisional, we can’t even begin to list all the rival theories that will arise, nor all the models that may be discovered to eventually replace the one now under consideration. Why would we want an account that requires delineating all possible rivals just to get a single inquiry going, much less one that also requires that they be assigned degrees of probability (however interpreted)? So it cannot be that our predicament leads “almost inevitably to a Bayesian formulation”. Scientific inquiry and learning are far too open-ended for that.
On the contrary, our situation speaks to the need for an approach that lets us break apart inquiries piecemeal in order to ask one question at a time. The models and methods of frequentist error statistics lead us to “get small” and identify a constrained question that we can probe severely. (Please see my Nov. 3 blog post.)
If we take Robert’s remark seriously (but, to be fair, it’s just a blog post that I think he wants to turn into a review of E & I), it’s like saying that, since all the theories we have in front of us right now are incomplete, and bound to come up short or at best be approximate, we need a statistical account that requires listing all the theories, including those not in front of us, or at least requires assigning a degree of probability to a lump disjunction: the “catchall hypothesis.” No wonder Robert claims not to see the difference between alternative statistical hypotheses (couched within a model)[i] and what philosophers typically mean by the “catchall hypothesis” in relation to a hypothesis H—all possible rivals to H, including claims not yet thought of.
There is a big difference between exhausting the space of answers to a given question, as in a Neyman-Pearson test, and exhausting all possible theories in a domain. The former is something we can get going with; it’s one of the most valuable aspects of the piecemeal error-statistical approach. While certain low-level claims (about parameters, directions of effect, observed correlations) may appear overly simple if they’re perceived as the final object of study, when it comes to exhausting the space of answers about a local error they’re just the ticket!
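To make the contrast concrete, here is a schematic sketch (my own notation, not drawn from the exchange): within a simple normal model, the hypotheses of an N-P test exhaust the values the parameter can take, whereas the catchall to a hypothesis H ranges over everything outside the model as well, including rivals not yet conceived.

\[
\underbrace{H_0:\ \mu \le \mu_0 \quad \text{vs.} \quad H_1:\ \mu > \mu_0}_{\text{exhausts the parameter space within the model } N(\mu,\sigma^2)}
\qquad \text{versus} \qquad
\underbrace{\neg H = \text{all possible rivals to } H}_{\text{the catchall: not enumerable in advance}}
\]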
Many people love to repeat the refrain that “all models are wrong.” They’re missing an essential point. Even granting that all models are wrong, and all theories are strictly false, we’re not barred from correctly understanding aspects of the world. We’re not prevented from distinguishing and ruling out erroneous interpretations of data, or from making correct inferences about stable patterns or about how to reliably trigger effects. We can exhaust the space with respect to a single question which, if shrewdly posed, can teach us which, if any, variants on our models hold, and how to improve them. Even if we err, we can arrange things so that there’s a good chance that the error will be detected in subsequent checks or attempts to replicate. In this way we can arrive at partial models that pass severely (within the time of a typical set of inquiries or research endeavor), and this knowledge remains through theory change.
If Robert thinks about it, I’m guessing he would agree with the following: to acknowledge that current theories and models are “provisional,” that they are likely to be reinterpreted and replaced by others in the future, does not lead to our wanting an account in which they’d all have to be trotted out now, along with their prior probabilities, in order to make progress today. My guess is that he would agree further that this acknowledgment actually argues in favor of an account that helps us to build and test local rival models, and even mere directional alternatives. Even if one believes that an ideal world would be one in which scientists arranged all possible alternative theories in a domain and ran a Bayesian algorithm over them (to me this would be, essentially, a denial of learning, and would rob us of the creative value of finding things out), this is not how we actually go forward in the complex and messy real world. Frequentist methods and principles may not give Bayesians what they want (or what they think they’d want), but if they tried sometimes, they just might find... they give what we need!
[i] On p. 2 he claims that my use of the “catchall hypothesis” is merely a new name for the more standard “alternative” hypothesis. It is not.