February 8, 2006


US plans massive data sweep: Little-known data-collection system could troll news, blogs, even e-mails. Will it go too far? (Mark Clayton, 2/09/06, The Christian Science Monitor)

A major part of ADVISE involves data-mining - or "dataveillance," as some call it. It means sifting through data to look for patterns. If a supermarket finds that customers who buy cider also tend to buy fresh-baked bread, it might group the two together. To prevent fraud, credit-card issuers use data-mining to look for patterns of suspicious activity.

What sets ADVISE apart is its scope. It would collect a vast array of corporate and public online information - from financial records to CNN news stories - and cross-reference it against US intelligence and law-enforcement records. The system would then store it as "entities" - linked data about people, places, things, organizations, and events, according to a report summarizing a 2004 DHS conference in Alexandria, Va. The storage requirements alone are huge - enough to retain information about 1 quadrillion entities, the report estimated. If each entity were a penny, they would collectively form a cube a half-mile high - roughly double the height of the Empire State Building.
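The penny comparison can be checked with rough arithmetic. The sketch below assumes standard US cent dimensions (19.05 mm across, 1.52 mm thick) and simple square stacking; neither assumption comes from the article, so treat the result as a ballpark figure only.

```python
# Assumed US cent dimensions (not stated in the article).
PENNY_DIAMETER_MM = 19.05
PENNY_THICKNESS_MM = 1.52
ENTITIES = 10**15          # 1 quadrillion entities, one penny each
MM_PER_MILE = 1_609_344    # 1 mile = 1,609.344 m

# Assume square stacking: each penny fills a diameter x diameter x thickness box.
box_volume_mm3 = PENNY_DIAMETER_MM ** 2 * PENNY_THICKNESS_MM
total_volume_mm3 = box_volume_mm3 * ENTITIES

# Side length of a cube holding that total volume.
cube_side_mm = total_volume_mm3 ** (1 / 3)
cube_side_miles = cube_side_mm / MM_PER_MILE

print(f"cube side: {cube_side_mm / 1000:.0f} m, about {cube_side_miles:.2f} miles")
```

Under these assumptions the cube's side comes out at roughly half a mile, in line with the report's figure.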

Note they can't compare it to the height of the WTC?

Posted by Orrin Judd at February 8, 2006 6:06 PM

Indeed; why not compare it to the combined height of the Twin Towers?

Posted by: toe at February 8, 2006 6:37 PM

This is what happens when legal buzz-words are allowed to take over rules of law.

We have seen the monstrosities which arise when the slogan, "wall of separation of church and state," is allowed to displace the First Amendment. Likewise, the Fourth Amendment buzz-words, "reasonable expectation of privacy," displace in the minds of the half-learned the protection against unreasonable searches and seizures.

Over and over, the notion is floated that this public record or that public discourse should be deemed private just because someone did not expect it to be examined.


Private is private, and public is public. Public does not become private because one doubted the diligence of the public to whom it had been laid open.

Briefly: the fact that technology makes public information easier to examine may affect one's expectation of privacy, but what is affected is only a buzz-word once used by a judge to announce a holding, not a right.

Posted by: Lou Gots at February 8, 2006 7:43 PM

Companies have been doing this for decades. In 1991 I worked for a subcontractor on a project for American Express where they used a CM-5 supercomputer to correlate purchase patterns for merchants.

All data mining (DM) does is find statistical patterns. For example, DM can determine that men who purchase diapers also purchase beer. So to increase beer sales, the DM analyst would recommend that the convenience store manager move the Budweiser nearer the Pampers.

What DM does not do is find a needle in a haystack. It does not pop up a picture of Johnny Jihadi that says "He's gonna blow up the Lincoln Center". All DM does is find new correlations, stuff like "people who do X tend to do Y also".
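A minimal sketch of what "people who do X tend to do Y also" looks like in practice, using the diapers-and-beer case. The basket data is made up for illustration, and `rule_stats` is a hypothetical helper, not part of any real system or library; it computes the usual support, confidence, and lift figures for a single rule.

```python
def rule_stats(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    with_a = sum(1 for t in transactions if antecedent in t)
    with_c = sum(1 for t in transactions if consequent in t)
    with_both = sum(1 for t in transactions if antecedent in t and consequent in t)

    support = with_both / n              # share of baskets containing both items
    confidence = with_both / with_a      # share of antecedent baskets that also have the consequent
    lift = confidence / (with_c / n)     # above 1.0 means the items co-occur more than chance
    return {"support": support, "confidence": confidence, "lift": lift}


# Made-up baskets, purely illustrative.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
]

stats = rule_stats(baskets, "diapers", "beer")
print(stats)
```

Note that the output is a statement about the whole set of baskets. Nothing in it singles out an individual shopper, which is the point being made above.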

Posted by: Gideon at February 8, 2006 7:54 PM

And that, Gideon, allows them to focus on behavior and actions, not the other characteristics that gave profiling a bad name. It provides that reasonable suspicion based upon articulable facts.

Posted by: Mikey at February 9, 2006 8:26 AM