Why Data Mining Won't Stop Terrorists
Security and tech guru Bruce Schneier writes the definitive rebuttal of data mining as a counterterrorism tool.
Rule number one:
Data mining works best when there's a well-defined profile you're searching for, a reasonable number of attacks per year, and a low cost of false alarms.
Example: credit card fraud. By examining records of your transactions, credit card companies can spot a spending pattern that indicates something nefarious may be afoot. Terrorism doesn't fit that mold:
Terrorist plots are different. There is no well-defined profile, and attacks are very rare. Taken together, these facts mean that data mining systems won't uncover any terrorist plots until they are very accurate, and that even very accurate systems will be so flooded with false alarms that they will be useless.
All data mining systems fail in two different ways: false positives and false negatives. A false positive is when the system identifies a terrorist plot that really isn't one. A false negative is when the system misses an actual terrorist plot. Depending on how you "tune" your detection algorithms, you can err on one side or the other: you can increase the number of false positives to ensure that you are less likely to miss an actual terrorist plot, or you can reduce the number of false positives at the expense of missing terrorist plots.
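The trade-off Schneier describes can be seen in a toy example. This is only a sketch: the suspicion scores below are invented for illustration, and `count_errors` is a hypothetical helper, not anything from his essay. Each event gets a score, and anything at or above the threshold is flagged.

```python
# Hypothetical suspicion scores (0 to 1) for made-up events; not real data.
innocent_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.50, 0.60, 0.70]
plot_scores = [0.40, 0.65, 0.90]


def count_errors(threshold):
    """Flag every event scoring at or above threshold.

    Returns (false positives, false negatives).
    """
    # Innocent events that get flagged anyway.
    false_positives = sum(score >= threshold for score in innocent_scores)
    # Real plots that slip under the threshold.
    false_negatives = sum(score < threshold for score in plot_scores)
    return false_positives, false_negatives


for threshold in (0.3, 0.5, 0.8):
    fp, fn = count_errors(threshold)
    print(f"threshold {threshold}: {fp} false positives, {fn} false negatives")
```

Raising the threshold cuts the false positives but lets more of the real plots through, and lowering it does the opposite. No setting removes both kinds of error at once.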
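Data mining for terrorists is the equivalent of searching for the proverbial needle in a haystack. Schneier crunches some numbers, assuming a system with a 1-in-100 false-positive rate and a 1-in-1,000 false-negative rate that sifts through a trillion possible indicators a year, ten of which are real plots, and reports: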
This unrealistically-accurate system will generate one billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999% and you're still chasing 2,750 false alarms per day -- but that will inevitably raise your false negatives, and you're going to miss some of those ten real plots.
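The arithmetic behind those figures is easy to check. The sketch below uses the assumptions Schneier's essay sets up just before this passage: one trillion indicators a year, ten real plots, a 1-in-100 false-positive rate and a 1-in-1,000 false-negative rate. The function and variable names are my own, chosen for illustration.

```python
def false_alarm_load(events_per_year, real_plots,
                     false_positive_rate, false_negative_rate):
    """Return (false alarms per year, per day, per real plot caught)."""
    # Everything that is not a real plot can still trigger a false alarm.
    innocent_events = events_per_year - real_plots
    false_alarms = innocent_events * false_positive_rate
    # Real plots the system actually catches.
    plots_caught = real_plots * (1 - false_negative_rate)
    return false_alarms, false_alarms / 365, false_alarms / plots_caught


EVENTS_PER_YEAR = 10**12   # assumed: about ten events per U.S. resident per day
REAL_PLOTS = 10            # assumed: roughly one real plot a month

# 99% accurate on false positives, 99.9% on false negatives.
per_year, per_day, per_plot = false_alarm_load(
    EVENTS_PER_YEAR, REAL_PLOTS, 0.01, 0.001)
print(f"{per_day:,.0f} false alarms per day")
print(f"{per_plot:,.0f} false alarms per real plot caught")

# Tighten the false-positive rate to 1 in a million (99.9999% accurate).
_, per_day_tight, _ = false_alarm_load(
    EVENTS_PER_YEAR, REAL_PLOTS, 0.000001, 0.001)
print(f"{per_day_tight:,.0f} false alarms per day at 99.9999%")
```

Under these assumptions the first case works out to about 27 million false alarms a day and about a billion per plot caught. The tightened case comes to about 2,740 a day, in line with the 2,750 Schneier cites.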
After some more examples of where data mining can be useful (think Amazon or Netflix suggesting books or movies you might like based on your past purchases or reviews), Schneier writes:
Finding terrorism plots is not a problem that lends itself to data mining. It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier. We'd be far better off putting people in charge of investigating potential plots and letting them direct the computers, instead of putting the computers in charge and letting them decide who should be investigated.
Makes sense to me. Unfortunately, Total Information Awareness (TIA) lives on, as the National Journal reported a few weeks ago. It just went into the equivalent of a witness protection program: it changed its name and moved to the Defense Department.