The Guardian reports today that the UK Government has released a review of its Mandatory Work Activity Programme, and that the review finds the programme has had no success in getting people back to work. The review was written by the National Institute for Economic and Social Research (NIESR) and can be obtained from the blog of the NIESR's director, who is disappointed that the Government announced an increase in funding after the review was released, even though it found no effect. The MWA programme is essentially one of those work-for-the-dole schemes that force people to take an unpaid job for four weeks if they want to keep their unemployment benefits.

I strongly oppose work-for-the-dole schemes, and I wouldn't be surprised to find that they fail at their supposed central goal of getting the unemployed back to work; I also don't believe that really is their central goal, and I think they're a terrible economic idea. But I can also see that they could succeed in getting people back to work (at least in a functioning economy, so probably not in the UK). So I was interested to check the report and see whether its research methodology really supports its claims. I don't think it does, and I don't think this report forms a solid basis for a critical review of the programme.

The report collects data on a couple of thousand people who were referred to the MWA between May and July and follows them to the end of November (I think). They are compared with a group of non-referred benefit recipients identified during the same May-July window, who are randomly assigned "pseudo-referral dates" to match the distribution of real referral dates; the two groups are then matched using propensity score matching. The outcomes are the differences between the treatment and control groups in the proportion of people in work, or in receipt of a benefit payment, at each month. The results find no difference in work outcomes between the two groups.
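For anyone unfamiliar with the matching step, here is roughly what propensity score matching looks like in practice. This is only a generic sketch on made-up data, not the report's actual procedure, and every column name in it (age, months_on_benefit, referred) is invented for illustration.

```python
# Generic propensity score matching sketch (hypothetical data, not the report's):
# estimate each person's propensity to be referred from observed covariates,
# then pair each referred person with the nearest non-referred person on that score.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "age": rng.integers(18, 60, n),
    "months_on_benefit": rng.integers(1, 36, n),
    "referred": rng.integers(0, 2, n),   # 1 = referred to MWA (made-up flag)
})

# Propensity score: modelled probability of referral given the covariates.
X = df[["age", "months_on_benefit"]]
df["pscore"] = LogisticRegression().fit(X, df["referred"]).predict_proba(X)[:, 1]

treated = df[df["referred"] == 1]
controls = df[df["referred"] == 0]

# 1:1 nearest-neighbour matching (with replacement) on the propensity score.
nn = NearestNeighbors(n_neighbors=1).fit(controls[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_controls = controls.iloc[idx.ravel()]

# Outcomes would then be compared between `treated` and `matched_controls`.
```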

I think this method won't show what it aims to show, because it doesn't adjust for serial dependence or right-censoring. Only 5% of the sample was recruited in May and 40% in June, so the July sample, the remaining 55% or so, was followed for one to two months less than the rest. Two months' less follow-up out of six is a lot of lost follow-up, and it could have been adjusted for using survival analysis. I also don't know why they bothered assigning pseudo-referral dates to the controls. With survival analysis, everyone in the sample can be treated as a control until they receive a referral, at which point they switch into the treatment group and their subsequent follow-up time is counted as treated. This also allows people who entered work before they had a chance to be referred to the MWA to be included in the analysis.

Finally, by analyzing each week separately (as in the charts on the director's blog) rather than analyzing time-to-employment, the analysis includes a heavy element of serial dependence: the proportion of people out of work in any one week is made up largely of the same people as the week before. This serial dependence biases the estimates of precision, because the weekly proportions are treated as if they were independent, and it doesn't handle the long-term unemployed properly: if someone had already been on benefits for two years in the first week of July, chances are they'll still be there, contributing to the proportion, in the second week of July. This should be adjusted for more carefully, I think.
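To make the "control until referred" idea concrete, here is a rough sketch of the setup I have in mind, using lifelines' CoxTimeVaryingFitter on simulated data. All of the numbers and column names are invented; the point is only the structure: each person contributes unreferred person-time from entry until any referral, referred person-time afterwards, and is right-censored at the end of follow-up if they haven't entered work by then.

```python
# Sketch of a time-varying-treatment survival analysis (simulated data only).
import numpy as np
import pandas as pd
from lifelines import CoxTimeVaryingFitter

rng = np.random.default_rng(1)
rows = []
for pid in range(500):
    end = 26                                              # weeks of follow-up
    referral = int(rng.integers(1, 20)) if rng.random() < 0.4 else None
    work_week = int(rng.integers(1, 40))                  # week they'd enter work
    event = work_week if work_week < end else None        # None = right-censored
    stop = event if event is not None else end
    if referral is None or referral >= stop:
        # Never referred within follow-up: one unreferred episode.
        rows.append((pid, 0, stop, 0, int(event is not None)))
    else:
        # Unreferred until the referral week, referred afterwards.
        rows.append((pid, 0, referral, 0, 0))
        rows.append((pid, referral, stop, 1, int(event is not None)))

long_df = pd.DataFrame(
    rows, columns=["id", "start", "stop", "referred", "entered_work"])

ctv = CoxTimeVaryingFitter()
ctv.fit(long_df, id_col="id", start_col="start", stop_col="stop",
        event_col="entered_work")
ctv.print_summary()   # exp(coef) on "referred" is the hazard ratio for entering work
```

With this structure there is no need for pseudo-referral dates at all, and people who find work before any referral still contribute unreferred person-time rather than being dropped.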

If survival analysis is used for this problem, there is no reason to stop collecting referral data at the end of July: referrals can be included right up to the end of the data, because the reduced follow-up time of later entrants is handled implicitly by the method. Also, comparing proportions in terms of simple differences, as this report does, is not a good plan: a 10 percentage point change from a baseline of 80% is very different from a 10 percentage point change from a baseline of 50%. Odds ratios or relative risks should always be used. Survival analysis gives hazard ratios, which can be interpreted much like relative risks in a cohort study.
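To see why the baseline matters, here is a quick back-of-the-envelope calculation (the numbers are purely illustrative): the same 10 percentage point drop corresponds to quite different relative risks and odds ratios depending on where you start from.

```python
# Same risk difference, different relative risks and odds ratios.
def risk_difference(p1, p0):
    return p1 - p0

def relative_risk(p1, p0):
    return p1 / p0

def odds_ratio(p1, p0):
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

for baseline in (0.80, 0.50):
    treated = baseline - 0.10   # a 10 percentage point drop
    print(f"baseline {baseline:.0%}: RD = {risk_difference(treated, baseline):+.2f}, "
          f"RR = {relative_risk(treated, baseline):.2f}, "
          f"OR = {odds_ratio(treated, baseline):.2f}")

# baseline 80%: RD = -0.10, RR = 0.88, OR = 0.58
# baseline 50%: RD = -0.10, RR = 0.80, OR = 0.67
```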

Since this method wasn't used, I don't think the findings are robust. It's hard to tell from the report exactly what the outcomes were, but the apparent lack of effect on employment could simply reflect a failure to properly handle follow-up time and right-censoring. The data should be re-analyzed using survival analysis on all the available information, rather than the strange and non-standard process of assigning pseudo-referral dates to people who were never referred and then comparing differences in weekly proportions. Strange!