Counting all the things


Since it’s all the rage to talk about measurement in aid and development projects these days, I thought I’d share an indicator I just found in a national AIDS strategy:

Outcome: fewer people have sex under conditions that could impair their judgement. % of [people] who had sex when they were drunk or when their partner was drunk reduced from 4% to 2% for women and 5% to 2% for men [over 8 year period].

Some questions:

  • How do we reliably measure whether people (or their partners) were drunk? What counts as drunk? How far back do people have to think?
  • What tells us that we know how to achieve the hoped-for changes (4% to 2% for women; 5% to 2% for men)?
  • What tells us that we can achieve a bigger change in men than in women?

I’m not saying impaired judgement isn’t a factor. I’m not saying tackling alcohol use is a bad idea for health programmes. And this critique can be cut and pasted for a bunch more indicators in the same document.

I’m just wondering why on earth we bother making promises we don’t know how to keep and measure them in ways that lack any credibility.



Relative risk; absolute risk


Howard White of 3ie discusses some of the problems we often see in how people handle data in this post, “Using the Causal Chain to Make Sense of the Numbers”. The essay makes many excellent points which are relevant both to how programmes are designed and how they are evaluated.

However, I have to take issue with one section:

And different ways of presenting regression models can give a misleading sense of impact. A large reduction in relative risk – a ‘good odds ratio’ – can reflect quite a small change in absolute risk. Three randomised controlled trials have found circumcision reduces the risk of transmission during unprotected sex by around 50 percent. The reduction in risk was from around 3.5 percent to 1.5 percent. Just a 2 percentage point absolute reduction, so 50 men need to be circumcised to avoid one new case of HIV/AIDS.

This is a bit misleading. In assessing effects we are interested in both relative and absolute effects, yes. But White fails to acknowledge here that the baseline absolute risk (3.5% in the pooled results of the three trials) is a characteristic of the populations being studied, and indeed it differed across the three study populations (Kenya, South Africa, Uganda). If study subjects had come from a population with higher pre-existing HIV prevalence and higher risk factors (including unprotected sex), the baseline absolute risk would have been more than 3.5%; if the risk factors had been lower, so would the baseline risk. White’s estimate that 50 men need to be circumcised to avert one infection is therefore not universally valid: in some places it will be many more, in others fewer. This is one of the reasons male circumcision is promoted primarily in higher HIV prevalence settings.

Having said that, “just” a 2 percentage point absolute reduction is actually pretty good compared to other HIV prevention interventions, especially when you consider that once a man is circumcised, he stays circumcised, so the risk reduction is permanent. Look at it another way: if the intervention being tested had led to a 100% relative risk reduction, then (by the logic of White’s post) that would be “only” a 3.5 percentage point reduction in absolute terms. Still doesn’t look very impressive, does it? Except in this case there would be no new infections whatsoever.

The reason the results of these trials (and any trials) are reported as relative risks is that if you want to estimate the effects of the intervention in another population, you have to apply the relative risk reduction to the absolute risk in that population. Reporting it any other way is misleading. White is of course correct that absolute risk reduction is what matters when looking at the overall effect of an intervention or policy, but absolute risk reduction is a function not just of the relative risk reduction of the intervention, but of all the other relevant factors.
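The arithmetic here can be sketched in a few lines of Python. The relative risk reduction below is derived from the pooled figures quoted in the post (3.5% falling to 1.5%, which works out at roughly 57% rather than exactly 50%); the alternative baseline risks are hypothetical, purely to illustrate how the number needed to treat moves with baseline risk:

```python
def nnt(baseline_risk: float, relative_risk_reduction: float) -> float:
    """Number needed to treat = 1 / absolute risk reduction."""
    absolute_risk_reduction = baseline_risk * relative_risk_reduction
    return 1 / absolute_risk_reduction

# Relative risk reduction implied by the pooled trial figures (3.5% -> 1.5%)
rrr = 1 - 0.015 / 0.035  # roughly 0.57

# Same intervention, different (hypothetical) baseline risks
for baseline in (0.035, 0.07, 0.01):
    print(f"baseline {baseline:.1%}: NNT = {nnt(baseline, rrr):.0f}")
```

With the pooled 3.5% baseline this reproduces White’s figure of about 50 circumcisions per infection averted; doubling the baseline risk halves the NNT, and a much lower baseline inflates it several-fold, which is the point about applying the relative effect to each population’s own absolute risk.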

When aid vanishes


The greatest trick the devil ever pulled was convincing the world he didn’t exist. (The Usual Suspects)


I’m not one of those people who puts aid donors on a par with the devil. Just as I think of welfare programmes and the NHS in the UK as mechanisms by which the government can use tax income to help guarantee people’s rights, I see aid as a form of redistribution from the better off to the worse off. I couldn’t care less that economies, tax revenues and financial systems are bound by national borders, and I find the argument that governments in rich countries should only “look after their own” to be absurd and offensive.

However, like most people who have been involved in aid and development assistance for any amount of time, I’m well aware of the politics of aid, and I often wonder what aid and assistance mean to the countries or people which are supposed to gain from them. This isn’t a post about donors’ need to have their logos prominently displayed on every document folder and product and computer and grain bag they’ve paid for – although that is probably still a conversation worth having. This post is more about the ownership of aid, and the ways in which the design of aid gets in people’s faces, possibly to the detriment of what it is trying to achieve.

A lot of attention has been paid to a new study of people who have been in some way involved in aid, as recipients or as agencies or intermediaries. “Time to Listen: Hearing People on the Receiving End of International Aid” is based on 6000 interviews and investigates exactly this question. It has caused a fair amount of cringing and nodding in the aid community. The first question I asked myself when I heard about the study was this: how did the researchers define “aid recipients”? The second question: did the aid recipients they spoke to all know that they were aid recipients? The answer was: not necessarily. And an important finding of the book is that even if people know they’ve “received aid”, they don’t necessarily differentiate between sources of aid.


Quite a lot of my work involves reviewing and evaluating programmes. Different aspects of programmes – from the services being provided to communities, to the mechanisms by which money gets channelled towards those services, to the overarching national plans that these services are supposed to be part of.  Whatever level I am working at, these reviews will generally involve some interaction with some of the people who are getting these services.  My clients (NGOs, donors, UN agencies) are understandably keen to know what difference their advice, ideas or money made.  But of course, when you speak to people in the community, while they often care passionately and speak eloquently about the conditions people live in, the progressions and the regressions, and the services that are available, the mechanisms by which these changes did or did not happen are of little concern to them.  They care somewhat less about whether it was this NGO or that NGO that made it happen, or whether it was aid money or government money that paid for it.  And while they might happen to know who paid for it, this fact is far less important than what went right or what went wrong.

This, in my view, is exactly how it should be, even if it means a reviewer can’t honestly report which changes, or how much change, can be attributed to whom. 


This poses a problem because most agencies implementing social welfare or development programmes are desperate to be able to attribute progress to themselves. I’m often asked to find out from “aid recipients” what they thought of this particular project. This is not just a case of neediness – whether the money is coming from a government department, a foreign donor, a philanthropist or a publicly funded NGO, they all want to be able to show their results, and most wish to learn from and adapt what they are doing. Well-designed experimental evaluations are an important tool for properly assessing the efficacy of interventions and policies, and there is a lively ongoing debate about how evidence is constructed and what the “results agenda” in development work means. But while this debate is important, it is also important to recognise that most evaluations of development efforts do not require this level of rigour. It is not necessary to re-evaluate every single programme using a randomised controlled trial, because once good experimental evidence is available for a given intervention or policy, you should try to roll it out to as many people as possible: withholding it from a control group is no longer necessary or ethical. At this stage, evaluation should be more about how an intervention works in the real world, whether available evidence on efficacy was properly applied, and what the effects of doing this were on the entire population rather than only in an intervention group. This sort of evaluation can be called “programme evaluation”, to differentiate it from experimental evaluation.

But however sophisticated they are, most programme evaluations can never truthfully attribute effects to a given programme, since most effects are likely to be the result of a combination of factors. Any changes that can be identified certainly can’t be attributed to whoever paid, because money is fungible. So doesn’t it make more sense for programme evaluations to be designed in ways that assess the contribution of a combination of inputs rather than the attribution of effects to each input, and that reflect the realities and concerns of the people at the receiving end? If so, shouldn’t the design of these evaluations focus on helping people to identify changes and to identify what could be done better, rather than on asking them if they remember which projects took place? And if this is the case, does it not also have implications for how programmes are designed and delivered in the first place? Make accountability grow; make aid vanish.