Error Statistics Philosophy: U-PHIL (3): Stephen Senn on Stephen Senn!

I am grateful to Deborah Mayo for having highlighted my recent piece. I am not sure that it deserves the attention it is receiving. Deborah has spotted a flaw in my discussion of pragmatic Bayesianism. In praising the use of background knowledge I can neither be talking about automatic Bayesianism nor about subjective Bayesianism. It is clear that background knowledge ought not generally to lead to uninformative priors (whatever they might be) and so is not really what objective Bayesianism is about. On the other hand, all subjective Bayesians care about is coherence and it is easy to produce examples where Bayesians quite logically will react differently to evidence, so what exactly is ‘background knowledge’?

Nevertheless, if we start at a rather different point, a point at which I think most applied statistics starts, we might end up with a rather different attitude. The point is to say that about most problems we encounter we have some background experience and it is appropriate to consider this carefully when deciding a) what data to collect and b) how to interpret them.

A favourite example of mine is cross-over trials. You cannot make a sensible analysis of a cross-over trial without considering carry-over. Standard frequentist approaches are either to assume it does not exist or to grant that it might be anything at all and these two extremes lead to startlingly different inferences. In principle, a Bayesian has more options and can mix things. However, if (s)he wants to model carry-over claiming to have used background knowledge, then to convince me that this has merit, I shall have to be shown that the length of the wash-out period compared to the treatment period has been reasonably incorporated into the model as has sensible belief about likely duration of response based on general background pharmacology and in general that carry-over and treatment effects have been modelled as mutually dependent phenomena. Andy Grieve and I published a paper together on analysing cross-over trials in 1998, he from a Bayesian and I from a frequentist approach. I came to the conclusion that I liked his approach far more than what had for many years been assumed to be the proper way to do a frequentist analysis of cross-over trials. So I have quite a friendly attitude to anybody who is prepared to be locally Bayesian and try a recipe of judiciously chosen prior distributions (based on experience) plus suitable likelihood, provided that they take a suitably realistic and humble attitude to what they have achieved and don’t ram the ‘this is the only way to think’ attitude down my throat. Basically I regard such calculations as acceptable (and in some cases very useful) ‘subjective’ contributions to an ongoing objective program of testing and verification. In particular they can provide attractive ways (in principle!) of dealing with nuisance parameters.

I meant it quite seriously when I said that there was value in being prepared to use all four systems of inference. If I look at my own practice, I use maximum likelihood and significance tests (Fisher), confidence intervals and power calculations (Neyman-Pearson) and Bayesian decision analysis (De Finetti, Wald) and find uses for all of these. I am well aware that there are areas in which I could do better. For example, I think that most medical statisticians, myself included, have paid far too much attention to the power approach to sample size determination. We ought to be using approaches based on Bayesian decision theory as well.
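For readers less familiar with the "power approach" to sample size mentioned above, the following is a minimal sketch of the textbook normal-approximation calculation for a two-arm parallel-group trial with a continuous outcome. The function name `n_per_arm` and the numerical inputs are illustrative assumptions, not taken from the post, and the sketch says nothing about the decision-theoretic alternatives.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2.
    delta: clinically relevant difference (assumed input).
    sigma: common standard deviation (assumed input).
    """
    z = NormalDist()                      # standard Normal
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the two-sided test
    z_beta = z.inv_cdf(power)             # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: difference of half a standard deviation.
    print(n_per_arm(delta=0.5, sigma=1.0))
    print(n_per_arm(delta=0.5, sigma=1.0, power=0.90))
```

Note that the only inputs are a target difference, a standard deviation and two error rates; costs, prior beliefs about the effect and the consequences of the decision play no role, which is roughly why a decision-theoretic approach might add something.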

The only one of the four systems I don’t use is automatic objective Bayes, largely because in the way it is currently applied, as far as I can see, it is pretty much redundant. To take a field I often work in, great claims have been made for such approaches to meta-analysis but the choice of frequentist or Bayesian framework (as most usually applied) seems to make almost no difference. On the other hand, decisions such as whether to treat the main effect of trial as fixed or random, whether to allow for a random trial by treatment interaction and if so whether to model the effect for the ‘average’ trial or patient are far more important. I don’t exclude, however, that if sensibly applied, Jeffreys’ approach could be very useful. The problem is that so-called Bayesians have enthusiastically embraced one-half of it (‘uninformative’ prior) whilst finding no use for the part that Jeffreys actually considered his greatest contribution (significance tests). I have not really tried to use the combination of these two in the way that Jeffreys himself suggested and maybe if I did I would be pleasantly surprised. My excuse is that in not trying Bayesian significance tests I am following the practice of by far the great majority of ‘Bayesians’.
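One of the modelling choices listed above, whether to allow a random trial-by-treatment interaction, can be illustrated with a rough sketch. It compares an inverse-variance fixed-effect pooled estimate with a random-effects estimate using the DerSimonian-Laird moment estimator of between-trial variance. That estimator is chosen here only because it is simple; it is an assumption of this sketch and not a method the post endorses. The trial estimates and variances are invented for illustration.

```python
from math import sqrt


def fixed_effect(estimates, variances):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return pooled, sqrt(1.0 / sum(weights))


def dersimonian_laird(estimates, variances):
    """Random-effects pooled estimate, standard error and tau^2 (moment estimator)."""
    w = [1.0 / v for v in variances]
    pooled_fixed, _ = fixed_effect(estimates, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - pooled_fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-trial variance, truncated at zero.
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return pooled, sqrt(1.0 / sum(w_star)), tau2


if __name__ == "__main__":
    # Invented per-trial treatment estimates and their variances.
    estimates = [0.1, 0.5, 0.9]
    variances = [0.01, 0.04, 0.09]
    print("fixed :", fixed_effect(estimates, variances))
    print("random:", dersimonian_laird(estimates, variances))
```

When the trial estimates disagree by more than their sampling variances would suggest, the random-effects weights are more nearly equal and the standard error is larger, so the two models can lead to visibly different inferences from the same data.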


Tuesday, January 24, 2012
