I claim that all but the first of the “dirty hands” argument’s five premises are flawed. Even the first premise too directly identifies a policy decision with a statistical report. But the key flaws begin with premise 2. Although risk policies may be based on a statistical report of evidence, it does not follow that the considerations suitable for judging risk policies are the ones suitable for judging the statistical report. They are not. The latter, of course, should not be reduced to some kind of unthinking accept/reject report. If responsible, it must clearly and completely report the nature and extent of (risk-related) effects that are and are not indicated by the data, making plain how the methodological choices made in the generation, modeling, and interpretation of data raise or lower the chances of finding evidence of specific risks. These choices may be called risk assessment policy (RAP) choices.
Granted, values do arise from data interpretation, but they reflect the value of responsibly reporting the evidence of risk. Some ethicists argue that scientists should favor public and environmental values over those of polluters, developers, and others with power. Maybe they should, but it is irrelevant. Even if one were to grant this (and it would be a matter of ethics), it still would be irresponsible (on scientific grounds) to interpret what the data indicate about the risk in the light of policy advancement, even assuming that the vulnerable parties would prefer that policy. The job of the scientist is to unearth what is and is not known about the substance, practice, or technology.
The critics are right that in issuing “clean bills of health” there is a concern that the probability of a type II error may be too high. But the solution is not to try to minimize it. Rather, we should use this information to argue:
If the Prob(test T accepts H0; increased risk d is present) is very high, then accepting H0 with test T is poor evidence that an increased risk d is absent.
Although H0 “passed” test T, the test it passed was not severe: it is very probable that H0 would pass this test, even if the increased risk is actually as large as d. Therefore, a failure to reject H0 with test T does not license inferring that the increased risk is less than d. We could also use the negative result in order to find a value for the increased risk, call it d*, such that so negative a result is very improbable if the increased risk were as high as d*. Then the negative result allows inferring that the increased risk is less than d*. (We are back to “rule M” from the formaldehyde hearings.) Even getting this right, however, only takes one to the level of the statistical report and not to subsequent risk policy decisions.
NOTE: This is not the same as what some are calling “observed power”—Oy! I’ll have to come back to this later, I am seeing this curious animal on some blogs on the top 50 list!
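To make the reasoning about test T concrete, here is a minimal numerical sketch. It rests on assumptions that are not in the argument above: a one-sided Normal test with known standard deviation sigma, sample size n, and H0 of "no increased risk" (mean increase of 0), rejected when the sample mean exceeds a cutoff. The function names `prob_accept_h0` and `d_star` are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, sd 1


def prob_accept_h0(d, sigma, n, alpha=0.05):
    """Prob(test T accepts H0; increased risk d is present).

    Assumed test T: reject H0 (no increase) when the sample mean
    exceeds cutoff = z_(1 - alpha) * sigma / sqrt(n).
    """
    se = sigma / sqrt(n)
    cutoff = STD_NORMAL.inv_cdf(1 - alpha) * se
    return STD_NORMAL.cdf((cutoff - d) / se)


def d_star(xbar_obs, sigma, n, improbability=0.05):
    """Value d* such that a sample mean as small as xbar_obs has
    probability `improbability` when the true increase equals d*."""
    se = sigma / sqrt(n)
    return xbar_obs + STD_NORMAL.inv_cdf(1 - improbability) * se


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not taken from any study).
    sigma, n = 1.0, 25
    print(prob_accept_h0(d=0.1, sigma=sigma, n=n))
    print(d_star(xbar_obs=0.1, sigma=sigma, n=n))
```

If the first number is high, accepting H0 is poor evidence that an increase as large as d is absent. The second number is the bound d* that the negative result does warrant under these assumptions: the increased risk is inferred to be less than d*.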
Far from promoting the public good, the ethics in evidence argument, taken seriously, would be tantamount to saying that the evidence does not matter much: what matters, for an ethical interpretation of data, are the preferred policy consequences, the preference being decided on one or another ethical ground.
I do not think proponents of the ethics in evidence position would wish to accept this logical consequence of their position. Not only does it have the untoward consequence of discounting or diminishing the role of evidence, but it should also be kept in mind that ethical grounds can shift and be used for conflicting ends (e.g., preventing starvation, and avoiding risks of GM foods). Most importantly, a position that can imply that evidence does not matter much is going to be (and has been) regarded as anti-science, greatly diminishing the voice of those who rightly wish to press for more responsible science. If it is all or largely a matter of political and social values, then more and better evidence cannot help. What better excuse for those happy not to have to provide better evidence!