Science and Technology : Data Mining 101: Finding out ALL About YOU

anAfrican

Well-Known Member
REGISTERED MEMBER
Feb 1, 2005
3,472
710
StreetNationEarth: Seattle
Occupation
The Meek !Shall! Inherit the Earth.
Data Mining 101: Finding Subversives with Amazon Wishlists

This article describes how anybody in the world can develop sophisticated profiles of hundreds of thousands of U.S. citizens, using only free and publicly available resources.
Vast deposits of personal information sit in databases across the internet. Terms used in phone conversations have become the grounds for federal investigation. Reputable organizations like the Catholic Worker, Greenpeace, and the Vegan Community Project, have come under scrutiny by FBI "counterterrorism" agents.

"Data mining" of all that information and communication is at the heart of the furor over the recent disclosure of government snooping. "U.S. President George W. Bush and his aides have said his executive order allowing eavesdropping without warrants was limited to monitoring international phone and e-mail communications linked to people with connections to al-Qaeda. What has not been acknowledged, according to the Times, is that NSA technicians combed large amounts of phone and Internet traffic seeking patterns pointing to terrorism suspects.

"Some officials described the program as a large data mining operation, the Times said, and described it as much larger than the White House has acknowledged." (Reuters)

Combining a data mining operation with the Patriot Act's power to access information makes it all too easy for the federal government to violate the Constitution's prohibition against unreasonable search. Ars Technica has an article, The new technology at the root of the NSA wiretap scandal, that describes the ease with which widespread wiretapping can now be implemented. It quotes Philip Zimmermann, the creator of the PGP encryption software:

"A year after the CALEA [Communications Assistance for Law Enforcement Act] passed [in 1994], the FBI disclosed plans to require the phone companies to build into their infrastructure the capacity to simultaneously wiretap 1 percent of all phone calls in all major U.S. cities. This would represent more than a thousandfold increase over previous levels in the number of phones that could be wiretapped. In previous years, there were only about a thousand court-ordered wiretaps in the United States per year, at the federal, state, and local levels combined. It's hard to see how the government could even employ enough judges to sign enough wiretap orders to wiretap 1 percent of all our phone calls, much less hire enough federal agents to sit and listen to all that traffic in real time. The only plausible way of processing that amount of traffic is a massive Orwellian application of automated voice recognition technology to sift through it all, searching for interesting keywords or searching for a particular speaker's voice. If the government doesn't find the target in the first 1 percent sample, the wiretaps can be shifted over to a different 1 percent until the target is found, or until everyone's phone line has been checked for subversive traffic. The FBI said they need this capacity to plan for the future. This plan sparked such outrage that it was defeated in Congress. But the mere fact that the FBI even asked for these broad powers is revealing of their agenda."

It used to be you had to get a warrant to monitor a person or a group of people. Today, it is increasingly easy to monitor ideas. And then track them back to people. Most of us don't have access to the databases, software, or computing power of the NSA, FBI, and other government agencies. But an individual with access to the internet can still develop a fairly sophisticated profile of hundreds of thousands of U.S. citizens using free and publicly available resources. Here's an example.

There are many websites and databases that could be used for this project, but few things tell you as much about a person as the books he chooses to read. Isn't that why the Patriot Act specifically requires libraries to release information on who's reading what? For this reason, I chose to focus on the information contained in the popular Amazon wishlists.

Amazon wishlists lets anyone bookmark books for later purchase. By default these lists are public and available to anybody who searches by name. If the wishlist creator specifies a shipping address, someone else can even purchase the book on Amazon and have it shipped directly as a gift. The wishlist creator's city and state are made public on the wishlist, but the street address remains private. Amazon's popularity has created a vast database of wishlists. No index of all wishlists is available, but it remains possible to view all wishlists by people of a particular first name. A recent search for people named Mark returned 124,887 publicly viewable wishlists.

For an all inclusive search by name, you could compile a comprehensive list of first names and nicknames from the baby names databases available on the internet. Armed with this list, and by recording the search results for each first name, it is possible for you to retrieve the vast majority of public wishlists on Amazon.
Data Mining 101: Finding Subversives with Amazon Wishlists
 
anAfrican said:
Data Mining 101: Finding Subversives with Amazon Wishlists

This article describes how anybody in the world can develop sophisticated profiles of hundreds of thousands of U.S. citizens, using only free and publicly available resources.

Data Mining 101: Finding Subversives with Amazon Wishlists

I don't know about sophisticated profiles using only amazon- but you can glean information easily. This is actually my area of expertise and there are open source tools that will enable one to get very deep in this area.

If you know java try here - pentaho project has a data mining and olap component...
http://pentaho.org/index.php
 

Donate

Support destee.com, the oldest, most respectful, online black community in the world - PayPal or CashApp

Latest profile posts

TractorsPakistan.com is one of the leading tractor exporters from Pakistan to Africa and the Caribbean regions.
HODEE wrote on Etophil's profile.
Welcome to Destee
@Etophil
Back
Top