War is, you might think, unpredictable, especially when it comes to insurgent attacks carried out by loosely organized factions. But while strikes might appear to come from nowhere, researchers have now shown that crunching through WikiLeaks data can predict where attacks will happen.
The research, conducted at the University of Edinburgh and published in PNAS, used the "Afghan War Diary" as a data set. Dumped onto the internet two years ago by WikiLeaks, it contains 77,000 military logs dated between 2004 and 2009 and has turned out to be a goldmine for data lovers.
The math behind the work is incredibly complex, drawing on ideas from statistics, signal processing, and even ecology, but the result is a software tool that can identify underlying trends in the attack data. It's all made possible, of course, by the huge sample size, which makes it much easier to separate signal from noise.
It works surprisingly well. The software, for instance, predicted that in the Baghlan province of Afghanistan the number of incidents would rise from 100 in 2009 to 228 in 2010. In reality, the total for 2010 was 222. Amazingly, it's even able to predict long-term trends in the most volatile parts of the country, though obviously with less accuracy, and validation of the model shows that across all 32 of Afghanistan's provinces its predictions hold up, at least in a statistical sense.
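To put the Baghlan numbers in perspective, here is a quick back-of-the-envelope check in Python. It is only an illustration of how close the forecast landed, not the researchers' method; the function name `relative_error` and the variable names are made up for this sketch, and the three figures are the ones quoted above.

```python
def relative_error(predicted: float, actual: float) -> float:
    """Absolute prediction error expressed as a fraction of the actual value."""
    return abs(predicted - actual) / actual


# Figures quoted in the article for Baghlan province.
baseline_2009 = 100   # incidents recorded in 2009
predicted_2010 = 228  # the model's forecast for 2010
actual_2010 = 222     # incidents actually recorded in 2010

error = relative_error(predicted_2010, actual_2010)
predicted_growth = (predicted_2010 - baseline_2009) / baseline_2009
actual_growth = (actual_2010 - baseline_2009) / baseline_2009

print(f"Forecast missed by {predicted_2010 - actual_2010} incidents ({error:.1%})")
print(f"Predicted growth: {predicted_growth:.0%}, actual growth: {actual_growth:.0%}")
```

In other words, the forecast overshot by six incidents, a miss of under 3 percent, in a province where activity more than doubled year on year.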