Reacting to Replication Attempts

This is the first post in a three-part mini-series on replication research, which will include posts on:

  • Why we should welcome replication attempts of our work
  • My own experience selecting and conducting replication studies
  • The case for offering up our own studies for replication, and how to do it via StudySwap

We should enthusiastically welcome replication attempts

How should we feel and how should we react when we learn that an independent research team either plans to conduct or has conducted a replication attempt of a finding we originally reported? I’ve prepared this flowchart to guide our reactions and elaborated a bit below.

[Flowchart: how to react to a replication attempt of your work]

Replication attempts are often perceived and labeled as “tear down” missions. This perception is counterproductive, and we need to reframe the discussion surrounding replication attempts. To hear an excellent example of how we can do this, do yourself a favor and listen to this episode of the Black Goat. Sanjay Srivastava, Alexa Tullett, Simine Vazire, and Rich Lucas had a very interesting conversation about replication research, and Rich shared some of his actual motivations for conducting replications (spoiler alert: it isn’t to crush souls and destroy careers).

As a starting point for my take on more productive responses to replication attempts of your work, let us assume that you are confident in the finding in question. If you are not, well, that’s another discussion for another time.

If you are confident in the finding, a replication attempt should be taken as a form of flattery and a chance to enhance the visibility of your work. It suggests that someone in the field thinks the finding is important enough that we should have an accurate understanding of it, or an accurate estimate of the size of the effect. If the replication attempt is ultimately published, then other members of the field agree on its importance.

The attempt “succeeds”

For example, the replication study finds an effect size very similar to your originally published effect size. Yay! An independent research team has supported the original finding and your confidence in the effect has grown with very little work on your part. You have been cited and received a big pat on the back from the data.

The attempt “fails”

For example, the replication study finds no effect or a much smaller effect size than you did originally. Of course, this will be initially frustrating. BUT, remember, you are confident in the finding. You have essentially been invited to a low-effort publication. Why? The journal will now almost certainly welcome a submission from you showing that you can, in fact, still get the finding. Heck, perhaps you and the replicating team can even work together to figure out what’s going on! This was exactly the positive and productive cycle that developed after we failed to replicate part of the Elaboration Likelihood Model’s predictions in Many Labs 3.

Original -> ML3 -> Response w/ data -> Response to the response w/ data

Charlie Ebersole has even provided some empirical evidence on how responses to “failed” replications are perceived. tl;dr: if you operate as a scientist should, earnestly pursuing the truth and collaborating with replicators, that behavior will win you friends and enhance your scientific reputation.

So, buy your replicators a beer. You owe them one!

My next two posts will focus on my own experience selecting effects for replication attempts and how to offer up one’s own effects for independent replication.

SURE THING Hypothesis Testing

Studies Until Results Expected, Thinks Hypothesis Is Now Golden

My sons watch a cartoon called Daniel Tiger’s Neighborhood. In one episode, which they (and by extension I) have watched at least one hundred times, Daniel and co. sing a little song that I imagine will repeat in my head for decades. The chorus goes:

“Keep trying, you’ll get better.”

The episode and song have a really nice message. Daniel is struggling to hit a baseball, but his friends encourage him to work at it until he improves.

What does this song have to do with experimental psychology? One interpretation of the lyric could be that of a researcher refining her craft to improve the research she conducts and strengthen the quality of evidence her studies produce. I can’t help but hear it another way.

“Keep trying, you’ll get better…results.”

As in, if at first your hypothesis is not supported, dust yourself off and try again. I think many of us have done too much SURE THING hypothesis testing.

A Twist on an Excellent Cartoon

“Bullseyes” by Charlie Hankin

This cartoon elegantly captures the concept of HARKing: Hypothesizing After Results are Known. SURE THING hypothesis testing definitely isn’t HARKing. The hypothesis in question is often established well before any results, and certainly before the supporting results, are known. The researcher simply tries and tries and tries, all the while making “improvements” or “tweaks” with the best of intentions, until the target is struck.

It also isn’t really p-hacking, a practice in which we exercise myriad researcher degrees of freedom, typically within a single study, until our results reach statistical significance. I think that both p-hacking and SURE THING hypothesis testing deserve their own cartoons. I am not a cartoonist, nor do I know Charlie Hankin, so allow me to simply describe the needed cartoons. The artistically inclined reader is invited to produce these cartoons in exchange for fame and glory.

  • The “p-hacking bullseyes” cartoon: Targets are drawn beforehand, but they cover approximately 67% of the possible arrow-landing surface (a figure drawn from Simmons & Simonsohn’s simulations of how bad things can get if we really go off the p-hacking rails).
    • The King’s shot has landed on one of the targets, and the assistant exclaims, “Excellent shot, my lord!”
  • The “SURE THING bullseyes” cartoon: This one will need multiple panels, as SURE THING hypothesis testing is more episodic than HARKing or p-hacking. The target is drawn beforehand.
    • The King shoots and misses. “No worries, my lord. The arrow must be faulty. Allow me to retrieve and refine it.”
    • The King shoots again and misses again. “Ah, I know the problem. Let us quickly tighten your bowstring.”
    • The King shoots again and misses again. “Perhaps we shall try again in better lighting and wind conditions tonight.”
    • At night. The King shoots again and hits! “Excellent shot, my lord!”

If you shoot until you hit, then success is a

SURE THING.

Of course, others have described this process in scientific experimentation. Perhaps my favorite description comes from the Planet Money podcast episode on the replication crisis. They describe flipping coins over and over until one of them hits an unlikely sequence of results. What I think hasn’t yet been discussed adequately is that many of the proposals of the open science movement (pre-registration, open data, open materials) provide only weak defenses against SURE THING hypothesis testing.
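
To make the arithmetic behind that intuition concrete, here is a minimal simulation sketch (my own illustration, not taken from the podcast or from any published analysis) of a researcher who keeps running the same two-group study of a true null effect until one attempt clears p < .05. The sample sizes, seed, and alpha are assumptions chosen only for the example.

```python
# A sketch of SURE THING hypothesis testing: keep running fresh two-group
# studies of a true null effect until one of them reaches p < .05.
# Sample sizes, seed, and alpha are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def studies_until_significant(n_per_group=50, alpha=0.05, max_tries=500):
    """Return how many independent null-effect studies it takes to 'succeed'."""
    for attempt in range(1, max_tries + 1):
        control = rng.normal(size=n_per_group)    # true effect is exactly zero
        treatment = rng.normal(size=n_per_group)  # true effect is exactly zero
        _, p_value = stats.ttest_ind(control, treatment)
        if p_value < alpha:
            return attempt
    return max_tries

waits = [studies_until_significant() for _ in range(1000)]
print(f"Median number of studies before a 'successful' result: {np.median(waits):.0f}")
```

Under these assumptions, each attempt has roughly a 5% chance of “working,” so the number of studies needed follows a geometric distribution: the expected wait is about 20 attempts and the median about 14. Shoot until you hit, and a hit is guaranteed.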

An Illustrative Hypothetical Scenario

In my last post, I discussed Comprehensive Public Dissemination (CPD) of empirical research. This and the following hypothetical scenario will help outline why I think CPD can be so powerful.

One researcher pre-registers and runs attempt after attempt at essentially the same study, “tweaking and refining” with the best of intentions as he goes. Eventually,

bang!

p < .05.

Publish.

How do we feel about this?

An Alternative Hypothetical Scenario

A different researcher has a hypothesis about a potentially cool new effect. She engages in CPD. She clearly identifies on her CPD log a series of studies intended to pilot methods and establish the necessary conditions for the effect to occur. Once she thinks she has established solid methods, she runs a pre-registered confirmatory study and

bang!

p < .05.

Publish.

How do we feel about this?

If We Are SURE THING Hypothesis Testing, We Aren’t Hypothesis Testing at All

Ditch the File Drawer: Comprehensive Public Dissemination

How can you help fight the file drawer problem? Eliminate your file drawer!

Comprehensive public dissemination (CPD) is a commitment by an individual researcher to publicly post the basic methods and results of all empirical research that they conduct. Some researchers are already leading the way by tracking their research workflows in open and transparent ways (Lorne Campbell, Katie Corker). CPD can serve as an important extension to these practices. As Will Gervais noted in his excellent post on emptying one’s file drawer, pre-prints offer a low-effort mechanism for the dissemination of null results or other unpublished work. Taking the final step of briefly summarizing and sharing the methods and results of your data collection projects is not overly burdensome, and it could greatly benefit the consumers of your research.

Draft CPD Initiative Statement and Guidelines

The results of scientific research must be comprehensively disseminated for researchers and the public to fully evaluate the evidence for any scientific finding and to generate cumulative knowledge. If a research project is worth conducting, its outcomes are worth disseminating. By publicly disseminating all research results, scientists can help combat problems that distort the body of scientific evidence, such as the file drawer problem and publication bias. I therefore agree that, from this date forward,

I will publicly disseminate the methodology and outcomes of all of my scientific work.

Suggested standards to become a comprehensive public disseminator:

  1. Create a comprehensive public dissemination (CPD) log with version control (spreadsheet on OSF, for example; a sketch of one possible log format follows this list)
  2. Share a link to your CPD log (on your homepage, OSF account, twitter profile, etc.)
  3. At the beginning of data collection for any project, add the project to your CPD log and post an initiated date
  4. At the conclusion of data collection, post a completion date
  5. Disseminate your work in any manner you desire: publish a paper, present at a conference, write a brief summary and post it to a public repository, etc.
  6. Provide a link to the dissemination product on your CPD log
  7. Repeat steps 3 through 6 for all projects
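
To make steps 3 through 6 concrete, here is one hypothetical sketch of a CPD log kept as a version-controlled CSV file. The file name, column names, and example entry are my own assumptions for illustration, not an official CPD format.

```python
# A hypothetical CPD log kept as a CSV file under version control (for example,
# on the OSF or GitHub). Column names and entries are illustrative only.
import csv
import os
from datetime import date

LOG_PATH = "cpd_log.csv"  # placeholder file name
FIELDS = ["project", "initiated", "data_collection_completed", "dissemination_link"]

def append_entry(project, initiated="", completed="", link=""):
    """Add one project row to the log, writing the header row on first use."""
    write_header = not os.path.exists(LOG_PATH) or os.path.getsize(LOG_PATH) == 0
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow({"project": project,
                         "initiated": initiated,
                         "data_collection_completed": completed,
                         "dissemination_link": link})

# Step 3: add the project with an initiated date when data collection begins.
append_entry("Study 1 (hypothetical example)", initiated=date.today().isoformat())
# Steps 4 and 6 would later fill in the completion date and a link to the
# dissemination product (a paper, conference talk, pre-print, or posted summary).
```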

Why Would I?

You may be wondering, “What’s in it for the researcher?” First, I assume you mean, “What’s in it for the researcher besides doing their part to save the entire enterprise of science?” By signing on to CPD you can send a strong signal to others that you take open and transparent science seriously and are willing to “play ball.” CPD will also increase the confidence that others have in your published work. Joe Hilgard made a similar point in this post on publishing null results.

An Illustrative Hypothetical Scenario

CPD can complement and amplify the efficacy of other open science practices by providing the full data collection context for new findings. Imagine that researchers X and Y have each found a novel and exciting effect. Both publish papers on these effects, with what appears to be equally strong evidence from a single pre-registered study with a large N. If both have signed on for CPD, the full context of their findings is available for you to assess. You look at the CPD logs of each.

  • Researcher X has conducted two pilot studies on the new effect to refine methods and materials, followed by the pre-registered study.
  • Researcher Y has conducted 17 similar pre-registered studies that appear to be close variants of the published study.

Are you equally confident in the replicability of the effects published by researcher X and researcher Y? I am not. Without CPD we would not have this important context for the published evidence. The promise of pre-registration is that it demarcates exploratory from confirmatory research. That promise may not be fully realized without CPD.
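
To put a rough number on the intuition (a back-of-the-envelope assumption of my own, not a claim made in the scenario): if the true effect were zero and each of researcher Y’s 18 similar studies were an independent test at alpha = .05, the chance that at least one comes up “significant” is 1 − .95^18, or roughly 60%. Persistence alone can produce a publishable-looking result.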

Just imagine if Daryl Bem had kept a CPD log.

I am actively seeking folks who want to contribute to the development and dissemination of this idea. Drop me a line at cchartie@ashland.edu.

My new CPD log.