How to run experiments, when you can’t afford a randomized trial

Author: Tom Wein

Experiments are powerful, but randomized controlled trials (RCTs) run in the field can be expensive. From survey experiments to work in the lab, there are other ways of gathering causal evidence.

Good evidence delivers better programming. Often, to get the best evidence, we turn to experimentation. There has been an explosion in the number of field trials for international development in recent years. In humanitarian work, to give just a couple of examples, World Vision and IRC are both running major experimental trials around their cash programming.

But not everyone can or should run a full field experimental trial. Trials are expensive; one 3ie blog cites an average cost of US$400,000. They can take a long time and be disruptive. In humanitarian emergencies, even though they might provide the most reliable evidence on the effects of interventions, they will not always be the best choice.

Luckily, there are other ways of including experimentation in your work.

Survey experiments: One option is to run a survey experiment. If you are going to be running a survey anyway, it’s very little trouble to program in a few versions of the same question, each with a different prompt, and randomly assign respondents to one version to see whether the prompt leads to different responses. Even simpler is to run the survey on Amazon’s Mechanical Turk (MTurk) service.
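To make this concrete, here is a minimal sketch in Python of randomly assigning respondents to one of two question wordings and then testing whether the responses differ. The wordings, respondent IDs, and response data are all invented for illustration; this is a sketch of the general approach, not a complete survey pipeline.

```python
import random
from statistics import mean
from scipy import stats

# Two hypothetical wordings of the same question (1-5 satisfaction scale).
WORDINGS = {
    "A": "How satisfied are you with the cash transfer programme?",
    "B": "How satisfied are you with the support you received?",
}

def assign_wording(respondent_id: str) -> str:
    """Randomly assign a respondent to one of the question wordings."""
    return random.choice(sorted(WORDINGS))

# At survey time, each respondent sees only the version they were assigned.
print(assign_wording("respondent-001"))

# Hypothetical responses collected under each wording.
responses_a = [4, 5, 3, 4, 4, 5, 3, 4]
responses_b = [3, 3, 4, 2, 3, 4, 3, 2]

# Compare the two arms with a two-sample t-test.
t_stat, p_value = stats.ttest_ind(responses_a, responses_b)
print(f"Mean A: {mean(responses_a):.2f}, Mean B: {mean(responses_b):.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

Because assignment is random, any systematic difference between the two arms can be attributed to the wording itself.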

Natural experiments: The basic logic of a natural experiment is simple: you follow a group of people who are doing what they would normally do and compare them with a similar group that is not receiving your programme. If you can make a careful, credible argument about why your comparison group really is similar, you can compare outcomes in both groups. It’s not perfect, but it can tell you more than only asking the people who benefited from your programme. More and more humanitarian organisations are already doing this; Oxfam found success with this approach in Zambia.
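If you also have outcome data from before the programme for both groups, one common way to use a natural comparison group is a difference-in-differences calculation. The sketch below uses invented income figures and assumes, crucially, that the two groups would have followed the same trend without the programme.

```python
# Hypothetical average household income (USD per month), before and after.
programme_before, programme_after = 52.0, 71.0
comparison_before, comparison_after = 50.0, 58.0

# Change over time within each group.
programme_change = programme_after - programme_before      # 19.0
comparison_change = comparison_after - comparison_before   # 8.0

# Difference-in-differences: the extra change in the programme group,
# over and above the trend the comparison group experienced anyway.
did_estimate = programme_change - comparison_change
print(f"Estimated programme effect: {did_estimate:.1f} USD per month")
```

The estimate is only as credible as the argument that the comparison group really is similar; that argument, not the arithmetic, is where the hard work lies.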

Laboratory experiments: Lab experiments can have huge power as a controlled test-bed for low-risk trials. You can set them up to look at the effects of actions analogous to what would happen in the field. Careful design work allows you to draw credible links between your experiment and reality. If the lab infrastructure and willing participants are already available, you can run them far more quickly than a typical RCT.

Forecasting: Although not technically experimental, you might want to look at the burgeoning science of forecasting. Even though people are mostly bad at prediction, some people are better than others. If you ask the question in the right way, and the crowd is wise enough, forecasting can allow you to accurately rank the effectiveness of different ideas.
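As a toy illustration of the crowd-aggregation idea, the sketch below averages hypothetical forecaster predictions for a few candidate programme ideas and ranks them. The ideas, probabilities, and forecasters are all invented; real forecasting tournaments use more sophisticated scoring and aggregation.

```python
from statistics import mean

# Hypothetical forecasts: each forecaster gives the probability (0-1)
# that an idea will reach its target outcome.
forecasts = {
    "cash transfers":    [0.70, 0.65, 0.80, 0.75],
    "skills training":   [0.40, 0.55, 0.45, 0.50],
    "information drive": [0.30, 0.25, 0.40, 0.35],
}

# Aggregate by taking the mean forecast per idea, then rank.
ranked = sorted(forecasts.items(), key=lambda kv: mean(kv[1]), reverse=True)
for idea, predictions in ranked:
    print(f"{idea}: mean forecast {mean(predictions):.2f}")
```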

Digitization: If your programme has digital elements and data are recorded automatically, experimentation becomes far easier. Tech companies run hundreds of so-called A/B tests a day, and many organisations now run email or SMS experiments. If you don’t have the funds to experiment right now, consider using what cash you do have to digitize your programme as much as possible. It will then be far cheaper to do all kinds of evaluation later – an under-discussed benefit of the digitization agenda.
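To show how simple an A/B test can be once responses are logged automatically, here is a minimal sketch comparing response rates to two hypothetical SMS message versions with a two-proportion z-test. The recipient counts and response numbers are invented.

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical SMS experiment: recipients and responses per message version.
n_a, responses_a = 1000, 120   # version A: 12.0% responded
n_b, responses_b = 1000, 150   # version B: 15.0% responded

p_a, p_b = responses_a / n_a, responses_b / n_b
pooled = (responses_a + responses_b) / (n_a + n_b)

# Two-proportion z-test for a difference in response rates.
se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - norm.cdf(abs(z)))
print(f"A: {p_a:.1%}, B: {p_b:.1%}, z = {z:.2f}, p = {p_value:.3f}")
```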

Conclusion

RCTs are still hugely powerful. When you can do them, you should. They are the most realistic test available of whether your programme works. But sometimes second best is good enough.

In considering whether to do an RCT, there are great resources out there to help you. Raising Voices, a Ugandan charity, has published lessons from running an RCT. Gugerty and Karlan’s ‘The Goldilocks Challenge’ explores ‘right-fit evidence’ and the many approaches you can take, starting with a clear theory of change and good monitoring systems. Evidence Aid’s new practice guide has a whole section on quasi-experimental methods. When an RCT doesn’t fit, but experimentation is still valuable, these tools offer different ways of doing it.

About the author:


Tom Wein is a research consultant. He works to advance justice and create better governance through useful research. His website is tomweinresearch.me, and he tweets @tom_wein.


Keywords: evaluation, experimentation, laboratory, RCT, forecasting, surveys
