During his remarks on the Senate floor about the National Security Agency’s domestic surveillance program, Senator Ron Wyden exhibited a characteristic and well-meaning, but somewhat misleading, response to mass data collection and data mining for security purposes. The substance of his challenge to the “dragnet” collection of data about Americans (and others) was that it was unnecessary because, “In every instance in which the NSA has searched through these bulk phone records, it had enough evidence to get a court order for the information it was searching for.” In other words, the NSA already knew who its targets were, already had enough evidence to get a warrant, and was simply using the program to bypass the inconvenience of having to write one up and get it approved.
While Wyden’s observations may be accurate, they miss the heart of the shift in the mentality that guides new database-driven forms of surveillance. In the era of big data and predictive analytics, the standard logic of surveillance is reversed. You don’t first identify a target and then unleash the full force of the surveillance apparatus. You start with the population (and some priorities and preconceptions) and mine data about it in order to generate leads and suspects. This may not have been the approach taken by the investigations Wyden described, and it may not even have had any demonstrated successes (otherwise, we would likely have heard intimations of them during the defense of NSA surveillance mounted by the Obama administration in the wake of the Snowden revelations), but it is the speculative model upon which the intelligence apparatus is building its case for mass data collection.
The model is not an unfamiliar one: it is borrowed from marketing strategies that use data patterns to identify potential targets, as the CIA’s Chief Technology Officer, Gus Hunt, has enthusiastically noted: “We have these astounding commercial capabilities that have emerged in the market space that allow us to do things with information we’ve never been able to do before.” The paradigm shift he learned from Google et al. is based on what he describes as the importance of “moving away from search as a paradigm to pre-correlating data in advance to tell us what’s going on.” This is a somewhat opaque way of referring to data mining generally and predictive analytics in particular: the goal is not to find out more information about a particular target, but to learn from the data who should be a target in the first place.
In this context, it is not quite right to say that just because everyone is monitored, everyone is being treated as a suspect (although no one is ruled out in advance). Rather, the point is that the vast majority of data will necessarily be collected about non-suspects in order to provide the background against which the actual suspects emerge. For data mining purposes the target is the population: the entire population, and the full range of data that can be collected about it, for as long as possible. The “complete” picture is needed in order to allow the clearest patterns to emerge over time. As Hunt put it, “The value of any piece of information is only known when you can connect it with something else that arrives at a future point in time... Since you can’t connect dots you don’t have, it drives us into a mode of, we fundamentally try to collect everything and hang on to it forever.”
These are the watchwords of the new data surveillance era: “collect everything about everyone forever,” or at least as much as the current sensing apparatus and storage capabilities allow.
The fact that the target is the population means that critiques based on particular individual targets (“since you already knew so-and-so was a ‘person of interest’ you could have gotten a warrant”) are unlikely to have much purchase on a system that has already committed itself to the possibility that data mining might help generate new targets and pre-empt threats, rather than simply providing evidence to act against existing ones. It has made this commitment without much in the way of concrete, publicly available evidence as of yet. The reversal of the relationship between targeting and surveillance means we are unlikely to see the surveillance sector back away from the programs revealed by Snowden and others. Rather, the pressure will go the other way: toward flexible access to an ever-growing range of data about everyone.