‘English was polite but unapologetic: “Some families are broken.”’
This was Bill English, quoted in a recent article that appeared to be about how government data could be used to identify and fix at-risk families. The author said: “English wanted to reach into these families in an effort to steer young people away from crime.” From the article, it sounds like the Government wants to use big data collected by various agencies to find these families and help fix them.
The first point to note is that this won’t fix the big issue facing families, which, as we have discussed before, is poverty. The best way to fix that problem is to give them money. While the majority of poor parents won’t fritter away extra money as the public might assume, there is a small core of families for which money won’t make a difference. So while the idea championed by Mr English won’t fix the big problem, it might in theory be part of the solution.
Using big data to target families is a whole lot more complex than it first appears, and there are reasons to doubt whether it is even possible. The first question is whether the tool is accurate at targeting kids at risk. The second is whether we can actually help kids once they are targeted, or whether we just make things worse, e.g. by stigmatising them.
To understand these issues we need to go back in time.
It All Starts with The Risk Prediction Project
Nasty name, but you may recall there was a lot of talk a year or so back about a tool the Ministry of Social Development and Auckland University were working on. They were using data available from the Ministry of Social Development (for WINZ, CYFS and other government providers) to see whether they could predict in advance which children were most at risk of child abuse and/or neglect (CAN), with the ultimate goal of preventing it from happening. The study had got to a phase where they had used the tool on historical data to see if it was good at ‘red flagging’ kids in the database who had in fact been maltreated.
The study authors found the tool had ‘good sensitivity and specificity’ – which basically means the tool did an OK job of not missing kids who had been maltreated (sensitivity) and of not wrongly flagging kids who had not (specificity). The next part of developing that tool was to test it in real life: follow a group of children (who would all still get standard risk identification procedures from MSD and CYFS) and see if the tool was BETTER at identifying those at risk of maltreatment than the current process. That, however, went down like a bucket of cold sick and was pretty much shelved. The whole history of it has been captured here.
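For anyone who wants to see what those two terms measure, here is a minimal Python sketch. The counts are invented purely for illustration and are not from the MSD and Auckland University study, and the function name `screening_metrics` is our own. It also works out, as an extra, what share of flagged kids had actually been maltreated, since that number depends on how rare maltreatment is in the group being screened.

```python
def screening_metrics(true_pos, false_neg, false_pos, true_neg):
    """Summarise how a flagging tool performed against known outcomes.

    All four inputs are simple counts of children:
      true_pos  - maltreated and flagged
      false_neg - maltreated but missed
      false_pos - not maltreated but flagged anyway
      true_neg  - not maltreated and not flagged
    """
    # Sensitivity: of the kids who were maltreated, what share did the tool flag?
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: of the kids who were not maltreated, what share did it leave alone?
    specificity = true_neg / (true_neg + false_pos)
    # Share of flagged kids who really were maltreated (positive predictive value).
    flagged_correctly = true_pos / (true_pos + false_pos)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "flagged_correctly": flagged_correctly,
    }


# Hypothetical example (made-up numbers): 10,000 children, 500 of them maltreated.
metrics = screening_metrics(true_pos=400, false_neg=100,
                            false_pos=1900, true_neg=7600)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these made-up counts the tool catches 80% of maltreated kids and correctly leaves alone 80% of the rest, yet only about 17% of the kids it flags were actually maltreated, because maltreated kids are a small share of the total. Real figures could look quite different; the point is only that "good sensitivity and specificity" does not by itself tell you how many flagged families are false alarms.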
Individual Risk Prediction has Problems
So, along with us not actually knowing whether this individual risk prediction tool would work any better, or at a lower cost, than our current risk identification procedures, there are a number of problems with such tools. The main one is the ethics of targeting.
Here is the scenario: a woman or couple gets pregnant. She (or he) is in the MSD database (as with about half of Kiwis, at some point in her or his life she or he received a benefit), and the tool flags her (or her partner) as having all the factors present to put their child at high risk of being maltreated.
What now? Does it become mandatory for that family to attend an intervention, and how does that work? Are we forcing families to do something on the basis that they may turn out to be abusers? Or if it is voluntary, what happens when they choose not to attend? Are they then ‘monitored’ by a team anyway, just in case? What impact does close supervision of that child and family have? Is there going to be an ‘observer effect’ where the child is put at greater risk simply because her parents are under pressure and observation from government services and labelled potential abusers? Shades of George Orwell’s “1984” are gathering.
What about the potential intervention we would use? Home-visiting programmes, for example, can work to improve outcomes for families, but only if they are supportive, non-judgemental, culturally relevant, professionally led, well-funded, long-term, community-driven, and strongly evidence-based. The problem is we have a pretty shoddy track record in NZ of delivering this type of programme, sticking to it, and measuring its impact properly.
So even if this tool were to turn out to be able to predict accurately which kids are more likely to be maltreated, ethically there are some big questions about it, and there are also some big questions about whether we could actually deliver any interventions in a way that made a difference to these families and kids.
BUT this is NOT what Bill is talking about
What Bill thinks will save us now (given that The Risk Prediction tool looks like a duck with one leg) is “Integrated Data Infrastructure”. This is where a massive pool of ‘administrative data’ is created and we can look across any number of points of information to find out what factors are leading to poor outcomes for our kids and target our social services much better.
There is a big thing to keep in mind here: this data is anonymised. That means no one can look at any individual’s or family’s data and identify them from it. What we can do is look at clusters of factors present in people’s lives and see which factors or combinations are more likely to predict, for example, a child ending up in prison as an adult.
Treasury has already done some of this analysis and come up with pretty much what every other country that does this stuff already knows, and what researchers (in New Zealand) have been telling us for years. A kid who grows up in a family in receipt of the benefit, with a mother who did not complete school, a parent in contact with the criminal justice system, and a CYF finding of neglect or abuse is much more likely to have serious issues as an adult and end up costing the taxpayer by being a beneficiary and not contributing to the government coffers.
So far so good. You know we love good data and science here at the Morgan Foundation, and the government getting better at using the data it does collect is the right thing to do. BUT what now? Well, we already know that this data cannot be used to reach in and identify individual families at risk, as the article seems to be suggesting. We still have to find a way to identify and target them, so we are back to relying on existing agencies already working in at-risk communities to look for families with these types of factors present (so kind of what we already do anyway…). And then we have this other rather MASSIVE elephant in the room: we are still not looking at what actually works to improve the outcomes for these families and kids once we find them.
Knowing the Risks STILL does not tell us what works
Understanding what puts kids at risk of having a pretty grim childhood and adult life is really important, but it is not the same as knowing what interventions work to improve their outcomes. Of course we need ‘epidemiological’ data to understand the factors that are present and help us design interventions. We need to know that unemployment, economic deprivation, abuse, and having a parent with low education levels put a child at risk. We need to use this data to put together an intervention that might hit a lot of these issues, and guess what?
That approach might work, but the funny thing with interventions is that sometimes what you think should work does not, or only works a little bit, or works but costs so much money it is pointless. That is because the risk factors we identify with our data can turn out to be only loosely related to a whole lot of other factors that are actually causing poor outcomes for kids. For example, it is not being on a benefit that is the actual issue; it is likely to be the stress associated with being poor, being stigmatised and feeling a sense of hopelessness that then feeds into the relationship with a child and that child’s development and well-being. Unsurprisingly, social and psychological issues are complicated, so that is why we have to look really hard at the evidence about what works to improve outcomes. This may not always line up with what we think we know from the data about what causes the problem in the first place.
An example of this very kind of simplified thinking was seen this week with the proposal that we fund schools on the basis of risk factors that lead to poor adult outcomes (the ones we discussed above from Treasury). But jumping straight from identifying the presence of risk factors to implementing a new policy is fraught; entire swaths of information are missing, not least whether these risk factors are what causes educational underachievement (or whether it is some other related factor), whether using school funding as an intervention changes outcomes, or whether giving more school-based funding on the basis of these particular risk factors works better than another intervention to improve outcomes for kids.
So, you know, good on Bill for wanting better data, but he has a long way to go to find out what actually works to improve outcomes for these kids.