Wednesday, April 13, 2016

The intrinsic unreliability of science - by William A. Wilson and comments from Vox Day

More and more investigations of quasi-scientific shenanigans are demonstrating the need for more precision in the language used to describe the field that is too broadly and misleadingly known as "science":

The problem with science is that so much of it simply isn’t. Last summer, the Open Science Collaboration announced that it had tried to replicate one hundred published psychology experiments sampled from three of the most prestigious journals in the field. Scientific claims rest on the idea that experiments repeated under nearly identical conditions ought to yield approximately the same results, but until very recently, very few had bothered to check in a systematic way whether this was actually the case. The OSC was the biggest attempt yet to check a field’s results, and the most shocking. In many cases, they had used original experimental materials, and sometimes even performed the experiments under the guidance of the original researchers. Of the studies that had originally reported positive results, an astonishing 65 percent failed to show statistical significance on replication, and many of the remainder showed greatly reduced effect sizes.

Their findings made the news, and quickly became a club with which to bash the social sciences. But the problem isn’t just with psychology. There’s an unspoken rule in the pharmaceutical industry that half of all academic biomedical research will ultimately prove false, and in 2011 a group of researchers at Bayer decided to test it. Looking at sixty-seven recent drug discovery projects based on preclinical cancer biology research, they found that in more than 75 percent of cases the published data did not match up with their in-house attempts to replicate. These were not studies published in fly-by-night oncology journals, but blockbuster research featured in Science, Nature, Cell, and the like. The Bayer researchers were drowning in bad studies, and it was to this, in part, that they attributed the mysteriously declining yields of drug pipelines. Perhaps so many of these new drugs fail to have an effect because the basic research on which their development was based isn’t valid....

Paradoxically, the situation is actually made worse by the fact that a promising connection is often studied by several independent teams. To see why, suppose that three groups of researchers are studying a phenomenon, and when all the data are analyzed, one group announces that it has discovered a connection, but the other two find nothing of note. Assuming that all the tests involved have a high statistical power, the lone positive finding is almost certainly the spurious one. However, when it comes time to report these findings, what happens? The teams that found a negative result may not even bother to write up their non-discovery. After all, a report that a fanciful connection probably isn’t true is not the stuff of which scientific prizes, grant money, and tenure decisions are made.
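
The three-team example can be checked with a quick Bayes calculation. The sketch below is only an illustration: the function name and every number in it (a 50 percent prior, 90 percent power, a 5 percent false-positive rate) are assumptions chosen for the example, not figures from the essay, and it treats the three studies as independent.

```python
from math import comb


def posterior_effect_is_real(n_studies, n_positive, prior, power, alpha):
    """Posterior probability that an effect is real, given how many of
    n_studies independent studies reported a positive result.

    All parameters are illustrative assumptions, not figures from the essay:
      prior - probability the effect is real before any study is run
      power - chance a study detects the effect when it is real
      alpha - chance a study reports a false positive when it is not
    """
    n_negative = n_studies - n_positive
    ways = comb(n_studies, n_positive)
    # Likelihood of this mix of results if the effect is real ...
    like_real = ways * power**n_positive * (1 - power)**n_negative
    # ... and if there is no effect at all.
    like_null = ways * alpha**n_positive * (1 - alpha)**n_negative
    return prior * like_real / (prior * like_real + (1 - prior) * like_null)


if __name__ == "__main__":
    # One positive finding out of three teams, as in the example above.
    p = posterior_effect_is_real(
        n_studies=3, n_positive=1, prior=0.5, power=0.9, alpha=0.05
    )
    print(f"P(effect is real | 1 of 3 positive) = {p:.3f}")
```

Under these assumed values the effect comes out as real only about one time in six, and a more skeptical prior pushes that lower still. With lower power the two negative results count for less, which is why the argument depends on the tests being high-powered.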

And even if they did write it up, it probably wouldn’t be accepted for publication. Journals are in competition with one another for attention and “impact factor,” and are always more eager to report a new, exciting finding than a killjoy failure to find an association. In fact, both of these effects can be quantified. Since the majority of all investigated hypotheses are false, if positive and negative evidence were written up and accepted for publication in equal proportions, then the majority of articles in scientific journals should report no findings. When tallies are actually made, though, the precise opposite turns out to be true: Nearly every published scientific article reports the presence of an association. There must be massive bias at work. 
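
The arithmetic behind that claim is simple enough to sketch. In the example below, every input is a hypothetical value picked for illustration (10 percent of tested hypotheses true, 80 percent power, a 5 percent false-positive rate, and two made-up publication rates). None of these numbers comes from the essay or from any actual tally of the literature.

```python
def positive_share_of_all_results(prior, power, alpha):
    """Fraction of all studies that would report a positive finding,
    counting true positives and false positives together."""
    return prior * power + (1 - prior) * alpha


def positive_share_of_published(prior, power, alpha, publish_pos, publish_neg):
    """Fraction of published articles reporting a positive finding when
    positive and negative results are published at different rates."""
    pos = positive_share_of_all_results(prior, power, alpha)
    neg = 1 - pos
    published_pos = pos * publish_pos
    published_neg = neg * publish_neg
    return published_pos / (published_pos + published_neg)


if __name__ == "__main__":
    # Assumed inputs: 10% of hypotheses true, 80% power, 5% false positives.
    unbiased = positive_share_of_all_results(prior=0.1, power=0.8, alpha=0.05)
    # Assumed filter: 90% of positives get published, 5% of negatives do.
    biased = positive_share_of_published(
        prior=0.1, power=0.8, alpha=0.05, publish_pos=0.9, publish_neg=0.05
    )
    print(f"positive share with no publication filter: {unbiased:.3f}")
    print(f"positive share after the assumed filter:   {biased:.3f}")
```

With no filter, these assumed inputs put positive findings at 12.5 percent of all results. With the assumed filter applied, they make up 72 percent of what gets published. The gap between the two figures is the kind of bias the paragraph above describes.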

Ioannidis’s argument would be potent even if all scientists were angels motivated by the best of intentions, but when the human element is considered, the picture becomes truly dismal. Scientists have long been aware of something euphemistically called the “experimenter effect”: the curious fact that when a phenomenon is investigated by a researcher who happens to believe in the phenomenon, it is far more likely to be detected. Much of the effect can likely be explained by researchers unconsciously giving hints or suggestions to their human or animal subjects, perhaps in something as subtle as body language or tone of voice. Even those with the best of intentions have been caught fudging measurements, or making small errors in rounding or in statistical analysis that happen to give a more favorable result. Very often, this is just the result of an honest statistical error that leads to a desirable outcome, and therefore it isn’t checked as deliberately as it might have been had it pointed in the opposite direction. 

But, and there is no putting it nicely, deliberate fraud is far more widespread than the scientific establishment is generally willing to admit.

Never confuse either scientistry or sciensophy for scientody. To paraphrase, and reject, Daniel Dennett's contention, do not trust biologists or sociologists or climatologists, or anyone else who calls himself a scientist, simply because physicists get amazingly accurate results.