October 10, 2015


Making It All Up: The behavioral sciences scandal (ANDREW FERGUSON, 10/19/15, Weekly Standard)

For one thing, the "reproducibility crisis" is not unique to the social sciences, and it shouldn't be a surprise that it has touched social psychology too. The widespread failure to replicate findings has afflicted physics, chemistry, geology, and other real sciences. Ten years ago a Stanford researcher named John Ioannidis published a paper called "Why Most Published Research Findings Are False."

"For most study designs and settings," Ioannidis wrote, "it is more likely for a research claim to be false than true." He used medical research as an example, and since then most systematic efforts at replication in his field have borne him out. His main criticism involved the misuse of statistics: He pointed out that almost any pile of data, if sifted carefully, could be manipulated to show a result that is "statistically significant." 
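Ioannidis's point about sifting data can be made concrete with a small simulation (a hedged sketch, not anything from the article itself): if an experiment with no real effect measures twenty outcome variables, at least one of them will usually clear the conventional |t| > 1.96 bar by chance alone.

```python
import math
import random


def t_statistic(a, b):
    """Welch's t-statistic for two independent samples."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)


def null_experiment(n=50, outcomes=20, rng=random):
    """Compare two groups of pure noise on many outcome measures;
    return how many clear the conventional |t| > 1.96 threshold."""
    hits = 0
    for _ in range(outcomes):
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        if abs(t_statistic(group_a, group_b)) > 1.96:
            hits += 1
    return hits


random.seed(1)  # fixed seed so the simulation is repeatable
trials = 500
any_hit = sum(1 for _ in range(trials) if null_experiment() > 0)
print(f"{any_hit / trials:.0%} of pure-noise experiments found a 'significant' effect")
```

With twenty independent tests at the 5 percent level, the chance that at least one comes up "significant" is roughly 1 - 0.95^20, about 64 percent, even though every effect in the data is an accident.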

Statistical significance is the holy grail of social science research, the sign that an effect in an experiment is real and not an accident. It has its uses. It is indispensable in opinion polling, where a randomly selected sample of people can be statistically weighted and then assumed to represent a much larger population.

But the participants in behavioral science experiments are almost never randomly selected, and the samples are often quite small. Even the wizardry of statistical significance cannot show them to be representative of any people other than themselves. 

This is a crippling defect for experiments that are supposed to help predict the behavior of people in general. Two economists recently wrote a little book called The Cult of Statistical Significance, which demonstrated how easily a range of methodological flaws can be obscured when a researcher strains to make his experimental data statistically significant. The book was widely read and promptly ignored, perhaps because its theme, if incorporated into behavioral science, would lay waste to vast stretches of the literature. 

Behavioral science shares other weaknesses with every field of experimental science, especially in what the trade calls "publication bias." A researcher runs a gauntlet of perverse incentives that encourages him to produce positive rather than negative results. Publish or perish is a pitiless mandate. Editors want to publish articles that will get their publications noticed, and researchers, hoping to get published and hired, oblige the tastes of editors, who are especially pleased to gain the attention of journalists, who hunger for something interesting to write about. 

Negative results, which show that an experiment does not produce a predicted result, are just as valuable scientifically but unlikely to rouse the interest of Shankar Vedantam and his colleagues. And positive results come relatively easily. Behavioral science experiments yield mounds of data. A researcher assumes, like the boy in the old joke, that there must be a pony in there somewhere. After some data are selected and others left aside, the result is often a "false positive": interesting if true, but not true.
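The hunt for the pony can also be sketched in code (a hypothetical simulation, not the article's own analysis): give an analyst a dataset with no real treatment effect plus license to test arbitrary subgroup splits, and a "significant" result turns up far more often than the nominal five percent of the time.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def t_stat(a, b):
    """Welch's t-statistic for two independent samples."""
    na, nb = len(a), len(b)
    va = sum((x - mean(a)) ** 2 for x in a) / (na - 1)
    vb = sum((x - mean(b)) ** 2 for x in b) / (nb - 1)
    return (mean(a) - mean(b)) / math.sqrt(va / na + vb / nb)


def sifted_study(n=100, slices=10, rng=random):
    """A null study: the treatment does nothing, but the analyst also
    tests `slices` arbitrary subgroup splits and keeps the first
    comparison that clears |t| > 1.96."""
    treated = [rng.gauss(0, 1) for _ in range(n)]
    control = [rng.gauss(0, 1) for _ in range(n)]
    if abs(t_stat(treated, control)) > 1.96:
        return True
    for _ in range(slices):
        mask = [rng.random() < 0.5 for _ in range(n)]  # a made-up covariate
        sub_a = [x for x, m in zip(treated, mask) if m]
        sub_b = [x for x, m in zip(control, mask) if m]
        if abs(t_stat(sub_a, sub_b)) > 1.96:
            return True
    return False


random.seed(3)  # fixed seed so the simulation is repeatable
studies = 400
published = sum(sifted_study() for _ in range(studies))
print(f"{published / studies:.0%} of pure-noise studies yield a publishable 'effect'")
```

Each individual comparison has only about a 5 percent false-positive rate, but eleven bites at the apple push the study-level rate into the tens of percent, which is the mechanism behind the floodtide described below.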

Publication bias, compounded with statistical weakness, makes a floodtide of false positives. "Much of the scientific literature, perhaps half, may simply be untrue," wrote the editor of the medical journal The Lancet not long ago. Following the Reproducibility Project, we now know his guess was probably too low, at least in the behavioral sciences. The literature, continued the editor, is "afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance."

...is that conservatives are close-minded because we don't blindly accept scientific findings.
Posted by at October 10, 2015 9:03 AM
