by Quentin Wodon
Having argued in the first post in this series of three that we need more impact evaluations in Rotary, the next question is: How are such evaluations to be done? One must first choose the evaluation question, and then use an appropriate technique to answer the question. The purpose of this post is to briefly describe these two steps. A useful resource for those interested in knowing more is an open access book entitled Impact Evaluation in Practice published by the World Bank a few years ago. The book is thorough, yet not technical (or at least not mathematical), and thereby accessible to a large audience.
As mentioned in the first post in this series, impact evaluations seek to answer cause-and-effect questions such as: what is the impact of a specific program or intervention on a specific outcome? Not every project requires an impact evaluation – but it makes sense to evaluate the impact of selected projects that are especially innovative and relatively untested, replicable at larger scale, strategically relevant for the aims of the organization implementing them, and potentially influential if successful. It is also a good practice to combine impact evaluations with a cost-effectiveness analysis, but this will not be discussed here.
An impact evaluation starts with a specific project and a question to be asked about that project. Consider the dictionary project whereby hundreds if not thousands of Rotary clubs distribute free dictionaries to primary school students, mostly in the United States. This project has been going on for many years in many clubs. In Washington DC where I work, local Rotary clubs – and especially the Rotary Club of Washington DC – distribute close to 5,000 dictionaries every year to third graders. Some 50,000 dictionaries have been distributed in the last ten years. This is the investment made in just one city. My guess is that millions of dictionaries have been distributed by Rotarians in schools throughout the US.
The dictionary project is a fun, feel-good activity for Rotarians, and it helps bring club members together because it is easy for many of them to participate. I have distributed dictionaries in schools several times, the last time with my daughters and two other Interactors. Everybody was happy, especially the students, who received their dictionaries with big smiles. Who could argue against providing free dictionaries in public schools for children, many of whom are from underprivileged backgrounds?
I am not going to argue here against the dictionary project. But for this project, as for many others, I would like to know whether it works to improve the prospects and lives of beneficiaries – in this case the children who receive the dictionaries. It could perhaps be enough to justify the project that the children are happy to receive their own dictionary and that a few use it at home. But the project does have a cost, not only the direct cost of purchasing the dictionaries, but also the opportunity cost of the time Rotarians spend going to the schools to distribute them. Rotary clubs could decide to continue the project even if it were shown to have limited or no medium-term impact on various measures of learning for the children. But having information on impact, as well as on potential ways to increase impact, would be useful in deciding whether or not to continue this type of service project. It would not matter much if dictionaries were distributed only by a few clubs in a few schools – but this is a rather large project for clubs in the US.
An impact evaluation question for the project would be of the form: “What is the impact of the distribution of free dictionaries on X?” X could be – among many other possibilities – the success rates at an English exam for the children, the propensity for children to read more at home, a measure of new vocabulary gained by children, or an assessment of the quality of the spelling in the children’s writing. One could come up with other potential outcomes that the project could affect. In order to assess impact, one would need to compare students in schools where children did receive dictionaries to students in schools where children did not. This could be done some time after the dictionaries have been distributed.
About two years ago I tried to find out whether any impact evaluation of the dictionary project had been done. I could not find any. Maybe I missed something (let me know if I did), but it seems that this project, which requires quite a bit of funding from clubs as well as a lot of time spent by thousands of Rotarians every year, has not been evaluated properly. It would be nice to know whether the project actually achieves results. This is precisely what impact evaluations are designed to do.
In order to estimate project impacts, data collection is required. Impact evaluations typically use quantitative data. For the dictionary project, one could have children take a vocabulary test before receiving the dictionary and again one year after having received it. One would then compare a “treatment” group (those who received the dictionary) to a “control” group (those who did not). This could be done using data specifically collected for the evaluation, or using other information such as standardized tests administered by schools. Relying on existing tests would reduce the cost of an impact evaluation substantially, but it would also limit the outcomes considered to those on which students are being tested by schools.
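To make the comparison concrete, here is a minimal sketch in Python of how the before-and-after test scores of the two groups could be compared. It computes the average gain in each group and then the difference between those gains, which is in effect a very basic difference-in-differences calculation. The function names and the scores are made up purely for illustration; they are not data from any actual dictionary project, and a real evaluation would also need a proper sample, standard errors, and a credible way of forming the two groups.

```python
from statistics import mean


def average_gain(records):
    """Average change in test score for a list of (before, after) pairs."""
    return mean([after - before for before, after in records])


def difference_in_gains(treatment, control):
    """Average gain of the treatment group minus average gain of the control group."""
    return average_gain(treatment) - average_gain(control)


# Hypothetical vocabulary scores (before, after), invented for this example only.
treatment_scores = [(10, 16), (12, 17), (8, 15)]  # children who received a dictionary
control_scores = [(11, 14), (9, 13), (10, 12)]    # children who did not

estimated_impact = difference_in_gains(treatment_scores, control_scores)
print(estimated_impact)
```

With these invented numbers, the treatment group gains 6 points on average and the control group gains 3, so the estimated impact is 3 points. Subtracting the control group's gain matters because children's vocabulary would improve over a year even without a dictionary.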
The gold standard for establishing the treatment and control groups is the randomized controlled trial (RCT). Under this design, a number of schools would be randomly selected to receive dictionaries, while other schools would not. Under most circumstances, comparisons of outcomes (say, reading proficiency) between students in schools with and without dictionaries would yield unbiased estimates of impacts. In many interventions, the randomization is applied to direct beneficiaries – here the students. But for the dictionary project that would probably not work. It would seem too unfair to give dictionaries to some students in a given school and not to others, and the impact on some students could spill over to the other students, thereby making the impact evaluation less clean than it should be (even if there may be ways to control for that). This issue of fairness in choosing beneficiaries in an RCT is very important, and typically the design of RCT evaluations has to be vetted ethically by institutional review boards (IRBs).
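For readers who like to see things spelled out, randomizing at the school level could look like the short Python sketch below. The school names, the number of schools, and the `assign_schools` function are all hypothetical, chosen only to illustrate the idea; an actual evaluation would follow a protocol approved by the relevant review board and would often balance the groups on school characteristics as well.

```python
import random


def assign_schools(school_ids, n_treatment, seed=0):
    """Randomly split schools into a treatment group and a control group.

    A fixed seed makes the draw reproducible, so the assignment can be audited.
    """
    if not 0 <= n_treatment <= len(school_ids):
        raise ValueError("n_treatment must be between 0 and the number of schools")
    rng = random.Random(seed)
    shuffled = list(school_ids)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    treatment = sorted(shuffled[:n_treatment])
    control = sorted(shuffled[n_treatment:])
    return treatment, control


# Hypothetical list of 20 schools, invented for this example.
schools = [f"school_{i:02d}" for i in range(1, 21)]
treatment_schools, control_schools = assign_schools(schools, n_treatment=10, seed=42)

print("Receive dictionaries:", treatment_schools)
print("Comparison group:", control_schools)
```

Because whole schools are assigned rather than individual students, every child in a given school is treated the same way, which addresses the fairness and spillover concerns mentioned above.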
A number of other statistical and econometric techniques can be used to evaluate impacts when an RCT is not feasible or appropriate. These include (among others) regression discontinuity design, difference-in-differences estimation, and matching estimation. I will not discuss these techniques here because this would be too technical, but the open access Impact Evaluation in Practice book that I mentioned earlier does this very well.
Finally, apart from measuring the impact of programs through evaluations, it is also useful to better understand the factors that lead to impact or lack thereof – what is often referred to as the “theory of change” for how an intervention achieves impact. The question here is not whether a project is having the desired impact, but why it does or does not. This can be done in different ways, using both qualitative and quantitative data. For example, for the dictionary project, a few basic questions could be asked, such as: 1) did the child already have access to another dictionary at home when s/he received the dictionary provided by Rotary? 2) how many times has the child looked at the dictionary over the last month? 3) did the dictionary provided by Rotary have unique features that led the child to learn new things? Having answers to these types of questions helps in interpreting the results of impact evaluations.
Only so much can be discussed in one post, and the question of how to implement impact evaluations is complex. Still, I hope that this post gave you a few ideas and some basic understanding of how impact evaluations are done, and why they can be useful. If you are considering an impact evaluation, please let me know, and if I can help I will be happy to. In the next and final post in this series, I will discuss some of the limits of impact evaluations.
Note: This post is part of a series of three on impact evaluations. The three posts are available here: Part 1, Part 2, and Part 3.