Coming soon: tech to predict the crimes you may soon commit
Computers are getting pretty good at predicting the future. In many cases they do it better than people. That's how Amazon figures out what you're likely to buy, how Netflix knows what you might want to watch, and how meteorologists come up with accurate 10-day forecasts.
Now a team of scientists has demonstrated that a computer can outperform human judges in predicting who will commit a violent crime. In a paper published last month, they described how they built a system that started with people already arrested for domestic violence, then figured out which of them would be most likely to commit the same crime again.
The technology could potentially spare victims from being injured, or even killed. It could also keep the least dangerous offenders from going to jail unnecessarily. And yet, there's something unnerving about using machines to decide what should happen to people. If targeted advertising misfires, nobody's liberty is at stake.
For two decades, police departments have used computers to identify times and places where crimes are more likely to occur, guiding the deployment of officers and detectives. Now they're going another step: using vast data sets to identify individuals who are criminally inclined.
They're doing this with varying levels of transparency and scientific testing. A system called Beware, for example, is capable of rating citizens of the US city of Fresno, California, as posing a high, medium or low level of threat. Press accounts say the system amasses data not only on past crimes but on web searches, property records and social networking posts.
Critics are warning that the new technology has been rushed into use without enough public discussion.
One question is precisely how the software works, which remains the manufacturer's trade secret.
Another is whether there's scientific evidence that such technology works as advertised.
By contrast, the recent paper on the system that forecasts domestic violence lays out what it can do and how well it can do it.
One of the creators of that system, University of Pennsylvania statistician Richard Berk, said he only works with publicly available data on people who have already been arrested.
The system isn't scooping up and crunching data on ordinary citizens, he said, but is making the same forecasts that judges or police officers previously had to make when it came time to decide whether to detain or release a suspect.
He started working on crime forecasting more than a decade ago, and by 2008 had created a computerised system that beat the experts in picking which parolees were most likely to reoffend.
He used a machine learning system - feeding a computer lots of different kinds of data until it discovered patterns that it could use to make predictions, which then can be tested against known data.
Machine learning doesn't necessarily yield an algorithm that people can understand. Users know which parameters get considered but not how the machine uses them to get its answers.
In the domestic violence paper, published in February in the 'Journal of Empirical Legal Studies', Berk and Penn psychologist Susan Sorenson looked at data from about 100,000 cases, all occurring between 2009 and 2013. Here, too, they used a machine learning system, feeding a computer data on age, sex, zip code, age at first arrest, and a long list of possible previous charges for such things as drunk driving, animal mistreatment, and firearms crimes.
They did not use race, though Berk said the system isn't completely race blind because some inferences about race can be drawn from a person's zip code.
The researchers used about two-thirds of the data to "train" the system, giving the machine access to the input data as well as the outcome - whether or not these people were arrested a second time for domestic violence.
The other third of the data they used to test the system, giving the computer only the information that a judge could know at arraignment, and seeing how well the system predicted who would be arrested for domestic violence again.
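The two-thirds/one-third division described above is a standard hold-out split. A minimal sketch of the idea follows, using made-up records; the field names, the `split_cases` helper and the random seed are illustrative assumptions, not details from the paper.

```python
import random


def split_cases(cases, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of the cases and cut it into training and test parts."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records: a few input fields plus the known outcome.
cases = [
    {"id": i, "age": 20 + i % 40, "prior_charges": i % 5, "rearrested": i % 4 == 0}
    for i in range(99)
]

train, test = split_cases(cases)

# At test time the model sees only the inputs; the outcomes are held back
# and used afterwards to score its predictions.
test_inputs = [
    {key: value for key, value in case.items() if key != "rearrested"}
    for case in test
]
test_outcomes = [case["rearrested"] for case in test]
```

Keeping the test outcomes out of the model's reach is what makes the comparison with judges meaningful: the system is scored on cases it has never seen.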
It would be easy to reduce the number of repeat offenses to zero by simply locking up everyone accused of domestic violence, but there's a cost to jailing people who aren't going to be dangerous, said Berk. Currently, about half of those arrested for domestic violence are released, he said.
The challenge he and Sorenson faced was to continue to release half but pick a less dangerous half. The result: about 20pc of those released by judges were later arrested for the same crime. Among those the computer would have released, the figure was only 10pc. (Bloomberg)
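The "release half, but a less dangerous half" comparison can be sketched as a simple ranking exercise. The risk scores and outcomes below are invented for illustration, and the helper names are assumptions; the article does not say how the real system turns its output into release decisions.

```python
def release_lowest_risk(risk_scores, release_fraction=0.5):
    """Return the indices of the lowest-scoring cases, up to the given fraction."""
    n_release = int(len(risk_scores) * release_fraction)
    ranked = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i])
    return ranked[:n_release]


def rearrest_rate(rearrested, released):
    """Share of the released cases that were later arrested again."""
    if not released:
        return 0.0
    return sum(rearrested[i] for i in released) / len(released)


# Made-up scores (higher means riskier) and made-up outcomes.
risk_scores = [0.9, 0.1, 0.4, 0.8, 0.2, 0.7, 0.3, 0.6]
rearrested = [True, False, False, True, False, False, True, False]

released = release_lowest_risk(risk_scores)
rate = rearrest_rate(rearrested, released)
print(f"Released {len(released)} of {len(risk_scores)}; re-arrest rate {rate:.0%}")
```

Running the same rate calculation on the half that judges actually released gives the baseline the researchers compared against.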