The importance of replication

Not just definitions but individual acts of experimentation must be checked for reliability. Research must be repeated before a finding can be accepted as well-established.

Findings obtained at one time might not hold true at another time, with different researchers or different experimental subjects. To check the reliability of a finding, one must replicate the research. That means repeating the research in all its important details.

What is replication? Why is replication "vital to science"? How are operational definitions helpful?

Replication is vital to science. It helps make science a self-correcting system. Any time a result is surprising, researchers will try to replicate it, to see if the phenomenon is dependable or just a fluke (a one-time occurrence).

Operational definitions are critically important in aiding replication. Why? An operational definition spells out exactly how to measure something.

To replicate an experiment, one must know how the original researcher performed measurements. Hence the operational definitions must be known precisely before the research can be replicated.

What are common reasons a replication fails?

If a replication fails, that does not imply somebody lied or cheated. Most scientists are honest.

Most failures of replication are due to differences in how the research was performed. Possible differences include the subjects involved (Nationality? Age? Volunteers or paid subjects?), the time of day, the weather, the expectations of the experimenter (a powerful effect), and many other variables.

Seemingly minor details could influence the outcome of an experiment. Replication failures may also expose what statisticians call a Type 1 error (a false positive): a finding of statistical significance that is due to random error or luck of the draw.
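The idea of a Type 1 error can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a procedure from the research discussed here. It assumes a two-group experiment with no true effect, normally distributed scores with a known standard deviation of 1, and the conventional .05 significance level. The function names, group size, and number of trials are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05  # conventional significance level (an assumption of this sketch)
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided cutoff, about 1.96


def run_experiment(rng, n=20, true_effect=0.0):
    """Simulate one two-group study; return True if it is 'significant'."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    # z-test for a difference in means, standard deviation assumed known (1.0)
    z = (mean(treated) - mean(control)) / (2.0 / n) ** 0.5
    return abs(z) > Z_CRIT


def false_positive_rate(trials=20000, seed=1):
    """Share of no-effect studies that reach significance by chance alone."""
    rng = random.Random(seed)
    hits = sum(run_experiment(rng) for _ in range(trials))
    return hits / trials


def replication_rate_of_flukes(trials=20000, seed=2):
    """Among chance 'findings', the share that an independent rerun confirms."""
    rng = random.Random(seed)
    originals = 0
    replicated = 0
    for _ in range(trials):
        if run_experiment(rng):        # original study: a fluke, since the effect is zero
            originals += 1
            if run_experiment(rng):    # independent replication attempt
                replicated += 1
    return replicated / originals if originals else 0.0


if __name__ == "__main__":
    print("False positive rate:", false_positive_rate())
    print("Flukes that replicate:", replication_rate_of_flukes())
```

Under these assumptions, roughly one no-effect study in twenty should come out "significant," and a fluke should survive an independent replication attempt only about as often. That is why a single significant result is weaker evidence than a result that has been replicated.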

If an effect cannot be repeated reliably, scientists need to find out about it. That is why replications are performed.

Until a phenomenon can be reproduced reliably, even by researchers who are skeptical about it, one cannot be fully confident the finding will prove true in the long run. Indeed, in its older sense the word "prove" means "test" (as in "proving ground," a place where automobiles are tested). A finding cannot be accepted as true until it is "proven" in a variety of circumstances.

What does the word "prove" mean?

An example is research on extrasensory perception or ESP. Positive findings about ESP are often reported, but (so far) they always disappear when somebody skeptical about ESP tries to replicate them.

The same thing can happen in education and health research. Honest researchers may produce positive results for their favored ideas because of enthusiasm or other side-effects of positive expectations.

That is why double-blind research is the gold standard of science. It prevents expectations from altering the outcome of research, as we will discuss presently.

When proper controls such as double-blind design are used, researchers may be unable to repeat a finding first reported by enthusiasts. If so, the finding is open to doubt.

There are also occasions when a dishonest researcher commits fraud. The field of social psychology was traumatized in 2012 when several Dutch researchers were found to have published fraudulent data.

This type of misbehavior can also be detected by failures of replication. An example of a fraud detected by replication failures is the "spotted mice" scandal at the prestigious Sloan-Kettering Institute in 1974.

A research scientist was working under pressure to produce successful skin grafts from black-furred mice to white-furred mice. In all previous experiments, such grafts had failed because the host animal rejected the foreign tissue. This was important research, because organ rejection commonly occurs after transplants.

The scientist who faked the skin graft was riding in an elevator early one morning with a batch of his white-furred mice and a permanent ink marking pen. Suddenly and impulsively, he leaned over and made some dark spots on the fur of his white mice. Then he claimed success in transplanting patches of skin from black to white mice.

The reports of a successful transplant surprised other scientists. They tried to replicate the experiment, and they could not. In this case, after failed replications, the scientist confessed what he had done, so the mysterious results were explained.

The consequences of fraud are devastating for a scientist, leading to dishonor and usually to the loss of job and career. Scientists know any important result will be subjected to attempts at replication. This provides a powerful incentive for honesty among researchers.

What is a "powerful incentive for honesty among researchers"?

Students are taught that a research report should include all the details necessary to permit replication. However, a published report often does not contain every relevant detail about how the research was conducted.

Sometimes, to carry out an exact replication, one must contact earlier researchers to learn details of a procedure. Gasparikova-Krasnec and Ging (1987) found that researchers were generally cooperative in providing information needed for replications. A month's wait was normally all that was required.

Researchers typically realize that double-checking surprising results is important to science. Naturally they hope for confirmation of their earlier efforts, but if an attempted replication fails, this is not entirely bad news (as long as no fraud is involved). A failed replication may have a stimulating effect on a field of research.

Replication failures inspire new studies to figure out why an attempt to use the "same" procedures led to different results. A fine-grained analysis of the experimental procedures may reveal key details that differed between the original study and the replication.

How is even a failed replication "not entirely bad news"?

If a replication fails, but the original researchers believe their original finding is correct, they will suggest ways to tighten up controls or other procedures to improve the chances of a successful replication. They hope the results will come back if another replication is attempted with improved techniques.

Diminishing Returns with Repeated Replications

On some occasions, replication failures continue. That is bad news for the original researchers, because it suggests their finding was inaccurate or a fluke.

False claims (including those that start as honest mistakes) produce a distinctive pattern during successive attempts at replication: the effects get smaller and smaller as more replications are conducted.

What pattern occurs with attempts to replicate a false claim?

This happened, for example, in the case of cold fusion: a desktop apparatus was said to produce fusion energy. In psychology, it happened with cardiac conditioning: claims that heart rates could be altered directly through conditioning procedures.

Diminishing effects with repeated replications occur not because an actual effect is disappearing, but because scientists are eliminating errors with better controls, as they make additional attempts at replication. A solid, scientific finding will gain more support as people continue to test it. A false lead or quack science claim will become less solid as people continue to test it.
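The pattern described above can be sketched with a toy model. This is only an illustration and is not drawn from the cold fusion or cardiac conditioning research. It assumes each study measures the true effect plus a procedural bias plus random noise, and that the bias is cut in half every time controls are tightened. The function names, the starting bias, the halving rule, and the sample size are all hypothetical.

```python
import random
from statistics import mean


def observed_effect(rng, bias, n=200, true_effect=0.0):
    """Average measured effect in one study: true effect + bias + noise."""
    return mean(true_effect + bias + rng.gauss(0.0, 1.0) for _ in range(n))


def replication_series(true_effect, rounds=6, start_bias=0.8, seed=3):
    """Measured effects over successive replications with improving controls."""
    rng = random.Random(seed)
    bias = start_bias
    results = []
    for _ in range(rounds):
        results.append(observed_effect(rng, bias, true_effect=true_effect))
        bias *= 0.5  # assumption: each round of better controls halves the bias
    return results


if __name__ == "__main__":
    false_claim = replication_series(true_effect=0.0)
    solid_finding = replication_series(true_effect=0.5)
    for round_number, (f, s) in enumerate(zip(false_claim, solid_finding), 1):
        print(f"Replication {round_number}: false claim {f:+.2f}, solid finding {s:+.2f}")
```

In this model the measured effect for the false claim shrinks toward zero as the bias is removed, while the solid finding settles near its true size. Nothing real is disappearing in the first case. The improved controls are simply removing what was never an effect to begin with.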

---------------------
Reference:

Gasparikova-Krasnec, M. & Ging, N. (1987). Experimental replication and professional cooperation. American Psychologist, 42, 266-267.

Write to Dr. Dewey at psywww@gmail.com.