This is a first draft of part II of the presentation begun in the December 6 blogpost. This completes the proposed presentation. I expect
errors, and I will be grateful for feedback! (NOTE: I did not need to actually rip a cover of EGEK to obtain this effect!)
You have observed y”, the .05 significant result from E”, the optional stopping experiment that ended at n = 100.
Birnbaum claims he can show that you, as a frequentist error statistician, must grant that this is equivalent to having fixed n = 100 at the start (i.e., experiment E’).
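Why would the error statistician balk at that equivalence? Because the two experiments assign very different error probabilities to the same nominally .05-significant result. A minimal simulation sketch (Python), assuming (as the example figures below suggest) a normal model with known σ = 1, a two-sided 1.96 cutoff, and checking for significance after every observation; these are my own illustrative choices, not anything in Birnbaum:

```python
import numpy as np

rng = np.random.default_rng(1)
N_MAX, CUTOFF, N_SIM = 100, 1.96, 50_000   # n = 100, nominal .05 cutoff, simulated trials

# Simulate N_SIM sequences of 100 N(0, 1) observations under the null (mu = 0, sigma = 1).
y = rng.standard_normal((N_SIM, N_MAX))
z_running = np.cumsum(y, axis=1) / np.sqrt(np.arange(1, N_MAX + 1))  # sqrt(n) * running mean

# E': fixed n = 100 -- look only at the final z statistic.
fixed_rate = np.mean(np.abs(z_running[:, -1]) >= CUTOFF)

# E'': optional stopping -- count a rejection if the 1.96 cutoff is ever crossed by n = 100.
stop_rate = np.mean(np.any(np.abs(z_running) >= CUTOFF, axis=1))

print(f"E'  (fixed n = 100):     actual error probability ~ {fixed_rate:.3f}")  # close to .05
print(f"E'' (optional stopping): actual error probability ~ {stop_rate:.3f}")   # far above .05
```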
Reminder: The (strong) Likelihood Principle (LP) is a universal conditional claim: if two data sets y’ and y”, from experiments E’ and E” respectively, have likelihood functions that are functions of the same parameter(s) µ and are proportional to each other, then y’ and y” should lead to identical inferential conclusions about µ.
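In symbols (writing InfrE(y), as below, for the inference drawn from outcome y of experiment E), the LP is the conditional:

\[
\text{If } L_{E'}(\mu;\,y') = c\,L_{E''}(\mu;\,y'') \text{ for all } \mu \text{ (some constant } c>0\text{)}, \text{ then } \mathrm{Infr}_{E'}(y') = \mathrm{Infr}_{E''}(y'').
\]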
As with proofs of conditional claims generally, we assume the antecedent and try to derive the consequent, or, equivalently, show that a contradiction results whenever the antecedent holds and the consequent does not (a reductio proof).
LP Violation Pairs
Start with any violation of the LP, that is, a case where the antecedent of the LP holds and the consequent does not, and show that you get a contradiction.
Assume, then, that the pair of outcomes y’ and y”, from E’ and E” respectively, represents a violation of the LP. We may call them LP pairs.
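Why does the optional stopping example supply such a pair? A one-line sketch, under the normal model with known σ assumed in this example: a stopping rule that depends only on the data already observed contributes only a factor free of µ, so data y” that stop at n = 100 with sample mean ȳ have a likelihood proportional to that of fixed-n = 100 data y’ with the same mean,

\[
L_{E''}(\mu;\,y'') \;\propto\; \prod_{i=1}^{100} f(y_i;\,\mu) \;\propto\; \exp\!\Big(-\tfrac{100\,(\bar{y}-\mu)^2}{2\sigma^2}\Big) \;\propto\; L_{E'}(\mu;\,y'),
\]

while, as the simulation above illustrates, the two experiments give that same result very different error probabilities: the antecedent of the LP holds and (for the error statistician) the consequent fails.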
Step 1:
Birnbaum will describe a funny kind of ‘mixture’ experiment based on an LP pair. You observed y”, say, from experiment E”.
Having observed y” from the optional stopping experiment (stopped, say, at n = 100), you are to imagine it resulted from getting heads on the toss of a fair coin, where tails would have meant performing the fixed sample size experiment with n = 100 from the start.
Next, erase the fact that y” came from E” and report (y’, E’).
Call this the Birnbaum test statistic, TBB:
Case 1: If you observe y” (from E”) and y” has an LP pair in E’, just report (y’, E’).
Case 2: If your observed outcome does not have an LP pair, just report it as usual.
(Any outcome from the optional stopping experiment E” has an LP pair in the corresponding fixed sample size experiment E’.)
Only Case 1 results matter for the points of the proof we need to consider.
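A minimal sketch of the TBB reporting rule as code (Python); the function name, the dict lookup, and summarizing an outcome by its sample mean are my own illustrative stand-ins, not Birnbaum's notation:

```python
def t_bb(outcome, lp_pairs):
    """Birnbaumize an outcome.

    outcome:  (experiment_label, y) actually observed.
    lp_pairs: dict mapping an outcome that has an LP pair in E' to that pair y'.

    Case 1: the observed outcome has an LP pair in E' -> report (E', y'),
            erasing which experiment actually produced the data.
    Case 2: no LP pair -> report the outcome as usual.
    """
    if outcome in lp_pairs:          # Case 1
        return ("E'", lp_pairs[outcome])
    return outcome                   # Case 2

# y'' = mean .196 after optional stopping at n = 100 pairs with y' = mean .196
# from the fixed n = 100 experiment (the figures used in this post).
lp_pairs = {("E''", 0.196): 0.196}

print(t_bb(("E''", 0.196), lp_pairs))   # ("E'", 0.196)
print(t_bb(("E'", 0.196), lp_pairs))    # ("E'", 0.196) -- same report either way
```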
I said it was a funny kind of mixture; two things make it funny:
· First, it didn’t happen: you only observed y” from E”.
· Second, you are to report an outcome as y’ from E’ even though you actually observed y” from E” (and further, you are to report the mixture).
We may call it Birnbaumizing the result you got: whenever you have observed a potential LP violation, “Birnbaumize” it as above.
If you observe y” (from E”) and y” has an LP pair in E’, just report y’ (i.e., report (y’, E’)).
So you’d report this whether you actually observed y’ or y”.
-----------------------------
We said our inference would be in the form of p-values.
Now, to obtain the p-value, we must use the defined sampling distribution of TBB, namely the convex combination:
In reporting a p-value associated with y” we are to report the average of p’ and p”: (p’ + p”)/2.
(The ½ comes from the imagined fair coin.)
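That is, the p-value computed under the TBB (mixture) sampling distribution averages the two conditional tail probabilities, each weighted by the ½ probability of the corresponding coin outcome (writing pBB for the Birnbaumized p-value):

\[
p_{\mathrm{BB}} \;=\; \tfrac{1}{2}\,p' \;+\; \tfrac{1}{2}\,p'',
\]

where p’ is the p-value y’ would receive within E’ and p” is the p-value y” would receive within E”.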
Having thus “Birnbaumized” the member of the LP pair that you actually observed, it appears that you must treat y’ as evidentially equivalent to its LP pair, y”.
The test statistic TBB is, technically, a sufficient statistic, but the rest of the argument overlooks that an error statistician still must take into account the sampling distribution at each step.
At this step, the relevant sampling distribution is that of TBB.
But it changes in the second step, and that’s what dooms the ‘proof’, as we will now see.
0. Let y’ and y” (from E’ and E”) be any LP violation pair, and say y” from E” has been observed (in the running example, s = 1, y’ = .196).
1. Premise 1: Inferences from y’ and y”, using the sampling distribution of the convex combination, are equivalent (Birnbaumization):
InfrE’(y’) is equal to InfrE”(y”) [both are equal to (p’ + p”)/2].
2. Premise 2(a): An inference from y’ using (i.e., conditioning on) the sampling distribution of E’ (the experiment that produced it) is p’:
InfrE’(y’) equals p’.
Premise 2(b): An inference from y” using (i.e., conditioning on) the sampling distribution of E” (the experiment that produced it) is p”:
InfrE”(y”) equals p”.
From (1), (2a), and (2b): InfrE’(y’) equals InfrE”(y”).
Which is, or looks like, the LP!
It would follow, of course, that p’ equals p”!
But from (0), y’ and y” form an LP violation, so p’ is not equal to p”.
(p’ was .05, p” ≈ .3.)
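To see the numbers side by side (using the example figures and the averaged p-value from the Birnbaumization step above):
Premise 1 (sampling distribution of TBB): InfrE’(y’) = InfrE”(y”) = (.05 + .3)/2 ≈ .175.
Premises 2(a) and 2(b) (conditioning on the experiment actually run): InfrE’(y’) = .05, InfrE”(y”) ≈ .3.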
Thus it would appear the frequentist is led into a contradiction.
The problem? There are different ways to show it, as always; here I allowed the premises to be true.
In that case this is an invalid argument: we have all true premises and a false conclusion.
I can consistently hold all the premises and the denial of the conclusion:
1. The two outcomes get the same convex-combination p-value if I play the Birnbaumization game.
2. If I condition, the inferences from y” and y’ are p” and p’, respectively.
Denial of conclusion: p’ is not equal to p” (.05 is not equal to .3).
No contradiction.
We can put it in a valid form, but then the premises can never both be true at the same time. (It’s not even so easy to put it in a valid form; see my paper for several attempts.)
Premise 1: Inferences from y’ and y” are evidentially equivalent:
InfrE’(y’) is equal to InfrE”(y”).
Premise 2(a): An inference from y’ should use (i.e., condition on) the sampling distribution of E’ (the experiment that produced it):
InfrE’(y’) equals p’.
Premise 2(b): An inference from y” should use (i.e., condition on) the sampling distribution of E” (the experiment that produced it):
InfrE”(y”) equals p”.
Usually the proofs just give the bold parts.
From (1), (2a), and (2b): InfrE’(y’) equals InfrE”(y”).
Which is the LP!
Contradicting the assumption that y’ and y” form an LP violation!
The problem now is this: in order to infer the conclusion, the premises of the argument must be true, and it is impossible to have premises (1) and (2) true at the same time:
Premise (1) is true only if we use the sampling distribution given by the convex combination (averaging over the LP pairs).
· This is the sampling distribution of TBB.
· Yet to draw inferences using this sampling distribution renders both (2a) and (2b) false.
· The truth of (2a) and (2b) requires ‘conditioning’ on the experiment actually performed, or rather, it requires that we not ‘Birnbaumize’ the experiment from which the observed LP pair is known to have actually come!
I plan to give the audience a handout of chapter 7(III) of Error and Inference (Mayo and Cox 2010); then I can point them to pages: http://www.phil.vt.edu/dmayo/personal_website/ch%207%20mayo%20birnbaum%20proof.pdf
Although I have allowed
premise (1) for the sake of argument, the very idea is extremely far-fetched
and unmotivated.[iii]
Pre-data, the frequentist
would really need to consider all possible pairs that could be LP violations
and average over them….
It is worth noting that Birnbaum himself rejected the LP
(Birnbaum 1969, 128): “Thus it seems that the likelihood concept cannot be
construed so as to allow useful appraisal, and thereby possible control, of erroneous
interpretations.”
*I gratefully acknowledge Sir David Cox's advice and encouragement on this and numerous earlier drafts.
[2] In the context of error statistical inference, this is based on the particular statistic and sampling distribution specified by E.
[3] See EGEK, p. 355 for discussion.
[ii] We think
this captures the generally agreed upon meaning of the LP although statements
may be found that seem stronger.
For example, in Pratt, Raiffa, and Schlaifer, 1995:
If, in a given situation, two
random variables are observable, and if the value x of the first and the
value y of the second give rise to the same likelihood function, then
observing the value x of the first and observing the value y of the second are
equivalent in the sense that they should
give the same inference, analysis, conclusion, decision, action, or anything
else. (Pratt, Raiffa, Schlaifer 1995, 542; emphasis added)
[iii] Cox thinks I should say more about the very idea of premise (1). He is right, but this is to be a very short talk, and this is not a short topic. References will be added shortly.
REFERENCES (incomplete)
Armitage, P. (1975). Sequential Medical Trials, 2nd ed. New York: John Wiley & Sons.
Birnbaum, A. (1962). On the Foundations of Statistical Inference (with discussion), Journal of the American Statistical Association, 57: 269–326.
Birnbaum, A. (1969). Concepts of Statistical Evidence. In Philosophy, Science, and Method: Essays in Honor of Ernest Nagel, edited by S. Morgenbesser, P. Suppes, and M. White, New York: St. Martin’s Press: 112-143.
Berger, J. O., and Wolpert, R. L. (1988). The Likelihood Principle, Institute of Mathematical Statistics, Hayward, CA.
Cox, D.R. (1977). “The Role of Significance Tests (with Discussion),” Scandinavian Journal of Statistics, 4: 49–70.
Cox, D. R. and Mayo, D. (2010). "Objectivity and Conditionality in Frequentist Inference" in Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science, edited by D. Mayo and A. Spanos, Cambridge: Cambridge University Press: 276-304.
Edwards, W., Lindman, H., and Savage, L. (1963). Bayesian Statistical Inference for Psychological Research, Psychological Review, 70: 193-242.
Jaynes, E. T. (1976). Common Sense as an Interface. In Foundations of Probability Theory, Statistical Inference and Statistical Theories of Science, Volume 2, edited by W. L. Harper and C. A. Hooker, Dordrecht, The Netherlands: D. Reidel: 218-257.
Joshi, V. M. (1976). “A Note on Birnbaum’s Theory of the Likelihood Principle.” Journal of the American Statistical Association 71, 345-346.
Joshi, V. M. (1990). “Fallacy in the Proof of Birnbaum’s Theorem.” Journal of Statistical Planning and Inference 26, 111-112.
Lindley, D. V. (1976). Bayesian Statistics. In Foundations of Probability Theory, Statistical Inference and Statistical Theories of Science, Volume 2, edited by W. L. Harper and C. A. Hooker, Dordrecht, The Netherlands: D. Reidel: 353-362.
Mayo, D. (1996). Error and the Growth of Experimental Knowledge. The University of Chicago Press (Series in Conceptual Foundations of Science).
Mayo, D. (2010). "An Error in the Argument from Conditionality and Sufficiency to the Likelihood Principle." In Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science, edited by D. Mayo and A. Spanos, Cambridge: Cambridge University Press: 305-314.
Mayo, D. and D. R. Cox. (2011). “Statistical Scientist Meets a Philosopher of Science: A Conversation with Sir David Cox.” Rationality, Markets and Morals (RMM): Studies at the Intersection of Philosophy and Economics, edited by M. Albert, H. Kliemt and B. Lahno, an open access journal published by the Frankfurt School Verlag, Volume 2 (2011): 103-114. http://www.rmm-journal.de/htdocs/st01.html
Mayo D. and A. Spanos, eds. (2010). Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science, Cambridge: Cambridge University Press.
Pratt, J. W., H. Raiffa and R. Schlaifer. (1995). Introduction to Statistical Decision Theory. Cambridge, MA: The MIT Press.
Savage, L., ed. (1962a), The Foundations of Statistical Inference: A Discussion. London: Methuen & Co.
Savage, L. (1962b). “Discussion on Birnbaum (1962),” Journal of the American Statistical Association, 57: 307-8.