Question
Question: The AOL search-query database released on the Web (Section 2.1.2) included the search query “How to kill your wife” and other related queries by the same person. Give arguments for and
against allowing law enforcement agents to search the query databases of search engine companies
periodically to detect plans for murders, terrorist attacks, or other serious crimes so that they can
try to prevent them.
Here is Section 2.1.2 attached
2.1.2 New Technology, New Risks
Computers, the Internet, and a whole array of digital devices, with their astounding
increases in speed, storage space, and connectivity, make the collection, searching,
analysis, storage, access, and distribution of huge amounts of information and images
much easier, cheaper, and faster than ever before. These are great benefits. But when the
information is about us, the same capabilities threaten our privacy.
Today there are thousands (probably millions) of databases, both government and
private, containing personal information about us. In the past, there was simply no
record of some of this information, such as our specific purchases of groceries and books.
Government documents like divorce and bankruptcy records have long been in public
records, but accessing such information took a lot of time and effort. When we browsed in
a library or store, no one knew what we read or looked at. It was not easy to link together
our financial, work, and family records. Now, large companies that operate video, email,
social network, and search services can combine information from a member’s use of
all of them to obtain a detailed picture of the person’s interests, opinions, relationships,
habits, and activities. Even if we do not log in as members, software tracks our activity
on the Web. In the past, conversations disappeared when people finished speaking, and
only the sender and the recipient normally read personal communications. Now, when we
communicate by texting, email, social networks, and so on, there is a record of our words
that others can copy, forward, distribute widely, and read years later. Miniaturization
of processors and sensors put tiny cameras in cellphones that millions of people carry
everywhere. Cameras in some 3-D television sets warn children if they are sitting too
close. What else might such cameras record, and who might see it? The wireless appliances
we carry contain GPS and other location devices. They enable others to determine our
location and track our movements. Patients refill prescriptions and check the results of
medical tests on the Web. They correspond with doctors by email. We store our photos
2.1 Privacy Risks and Principles 51
and videos, do our taxes, and create and store documents and financial spreadsheets in a
cloud of remote servers instead of on our own computer. Power and water providers might
soon have metering and analysis systems sophisticated enough to deduce what appliances
we are using, when we shower (and for how long), and when we sleep. Law enforcement
agencies have very sophisticated tools for eavesdropping, surveillance, and collecting and
analyzing data about people’s activities, tools that can help reduce crime and increase
security, or threaten privacy and liberty.
Combining powerful new tools and applications can have astonishing results. It is
possible to snap a photo of someone on the street, match the photo to one on a social
network, and use a trove of publicly accessible information to guess, with high probability
of accuracy, the person’s name, birth date, and most of his or her Social Security number.
This does not require a supercomputer; it is done with a smartphone app. We see such
systems in television shows and movies, but to most people they seem exaggerated or way
off in the future.
All these gadgets, services, and activities have benefits, of course, but they expose us
to new risks. The implications for privacy are profound.
Patient medical information is confidential. It should not be discussed
in a public place.
- A sign, aimed at doctors and staff, in an elevator in a medical office
building, a reminder to prevent low-tech privacy leaks.
Example: Search query data
After a person enters a phrase into a search engine, views some results, then goes on to
another task, he or she expects that the phrase is gone, gone like a conversation with a
friend or a few words spoken to a clerk in a store. After all, with millions of people doing
searches each day for work, school, or personal uses, how could the search company store
it all? And who would want all that trivial information anyway? That is what most people
thought about search queries until two incidents demonstrated that it is indeed stored, it
can be released, and it matters.
Search engines collect many terabytes of data daily. A terabyte is a trillion bytes.
It would have been absurdly expensive to store that much data in the recent past, but
no longer. Why do search engine companies store search queries? It is tempting to say
“because they can.” But there are many uses for the data. Suppose, for example, you search
for “Milky Way.” Whether you get lots of astronomy pages or information about the
candy bar or a local restaurant can depend on your search history and other information
about you. Search engine companies want to know how many pages of search results
users actually look at, how many they click on, how they refine their search queries, and
what spelling errors they commonly make. The companies analyze the data to improve
52 Chapter 2 Privacy
search services, to target advertising better, and to develop new services. The database of
past queries also provides realistic input for testing and evaluating modifications in the
algorithms search engines use to select and rank results. Search query data are valuable to
many companies besides search engine companies. By analyzing search queries, companies
draw conclusions about what kinds of products and features people are looking for. They
modify their products to meet consumer preferences.
But who else gets to see this mass of data? And why should we care?
If your own Web searches have been on innocuous topics, and you do not care who
sees your queries, consider a few topics people might search for and think about why
they might want to keep them private: health and psychological problems, bankruptcy,
uncontrolled gambling, right-wing conspiracies, left-wing conspiracies, alcoholism,
anti-abortion information, pro-abortion information, erotica, illegal drugs. What are some
possible consequences for a person doing extensive research on the Web for a suspense
novel about terrorists who plan to blow up chemical factories?
In 2006, the federal government presented Google with a subpoena[1] for two months
of user search queries and all the Web addresses† that Google indexes.‡ Google protested,
bringing the issue to public attention. Although the subpoena did not ask for names of
users, the idea of the government gaining access to the details of people’s searches horrified
privacy advocates and many people who use search engines. Google and privacy advocates
opposed the precedent of government access to large masses of such data. A court reduced
the scope of the subpoena, removing user queries.4
A few months later, release of a huge database of search queries at AOL showed that
privacy violations occur even when the company does not associate the queries with
people’s names. Against company policy, an employee put the data on a website for search
technology researchers. This data included more than 20 million search queries by more
than 650,000 people from a three-month period. The data identified people by coded
ID numbers, not by name. However, it was not difficult to deduce the identity of some
people, especially those who searched on their own name or address. A process called
re-identification identified others. Re-identification means identifying the individual from
a set of anonymous data. Journalists and acquaintances identified people in small
communities who searched on numerous specific topics, such as the cars they own, the sports
teams they follow, their health problems, and their hobbies. Once identified, a person is
linked to all his or her other searches. AOL quickly removed the data, but journalists,
[1] A subpoena is a court order for someone to give testimony or provide documents or other information for an
investigation or a trial.
† We use the term Web address informally for identifiers, or addresses, or URLs of pages or documents on the Web
(the string of characters one types in a Web browser).
‡ It wanted the data to respond to court challenges to the Child Online Protection Act (COPA), a law intended to
protect children from online material “harmful to minors.” (We discuss COPA in Section 3.2.2.)
researchers, and others had already copied it. Some made the whole data set available on
the Web again.5[1]
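The re-identification idea described in this example can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the log entries, the names, the clue sets, and the function names `reidentify` and `linked_queries` are hypothetical and are not taken from the AOL data or from any real tool. It only shows the general pattern: group queries by coded ID, match them against facts an acquaintance might already know, and then read off every other query tied to the same ID.

```python
from collections import defaultdict

# Hypothetical, made-up query log: (coded_user_id, query). Not real data.
QUERY_LOG = [
    (1001, "plumbers in smallville"),
    (1001, "jane roe smallville"),
    (1001, "knee pain after running"),
    (2002, "milky way candy calories"),
    (2002, "used honda civic prices"),
]

# Hypothetical facts a journalist or acquaintance might already know.
KNOWN_PEOPLE = {
    "Jane Roe": {"jane roe", "smallville"},
    "John Doe": {"john doe", "honda civic"},
}


def reidentify(query_log, known_people, min_matches=2):
    """Guess which coded ID belongs to which known person.

    An ID is linked to a person when at least `min_matches` of that
    person's clues appear somewhere in the ID's queries.
    """
    queries_by_id = defaultdict(list)
    for user_id, query in query_log:
        queries_by_id[user_id].append(query.lower())

    matches = {}
    for user_id, queries in queries_by_id.items():
        text = " | ".join(queries)
        for name, clues in known_people.items():
            hits = sum(1 for clue in clues if clue in text)
            if hits >= min_matches:
                matches[user_id] = name
    return matches


def linked_queries(query_log, user_id):
    """Return every query recorded under one coded ID."""
    return [query for uid, query in query_log if uid == user_id]


if __name__ == "__main__":
    for uid, name in reidentify(QUERY_LOG, KNOWN_PEOPLE).items():
        print(name, "->", linked_queries(QUERY_LOG, uid))
```

In this toy data, a single matching clue is deliberately not enough to link an ID to a person; raising or lowering `min_matches` trades missed identifications against false ones. Once an ID is linked, the unrelated health query under the same ID is exposed as well, which is the point the example makes about anonymized logs.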
Example: Smartphones
With so many clever, useful, and free smartphone apps available, who thinks twice about
downloading them? Researchers and journalists took a close look at smartphone software
and apps and found some surprises.
Some Android phones and iPhones send location data (essentially the location of
nearby cell towers) to Google and Apple, respectively. Companies use the data to build
location-based services that can be quite valuable for the public and for the companies.
(Industry researchers estimate the market for location services to be in the billions of
dollars.) The location data is supposed to be anonymous, but researchers found, in some
cases, that it included the phone ID.
Roughly half the apps in one test sent the phone’s ID number or location to other
companies (in addition to the one that provided the app). Some sent age and gender
information to advertising companies. The apps sent the data without the user’s knowledge
or consent. Various apps copy the user’s contact list to remote servers. Android phones
and iPhones allow apps to copy photos (and, for example, post them on the Internet) if
the user permits the app to do certain other things that have nothing to do with photos.
(Google said this capability dated from when photos were on removable memory cards
and thus less vulnerable.6 This is a reminder that designers must regularly review and
update security design decisions.)
A major bank announced that its free mobile banking app inadvertently stored
account numbers and security access codes in a hidden file on the user’s phone. A phone
maker found a flaw in its phones that allowed apps to access email addresses and texting
data without the owner’s permission. Some iPhones stored months of data, in a hidden
file, about where the phone had been and when, even if the user had turned off location
services. Data in such files are vulnerable to loss, hacking, and misuse. If you do not know
the phone stores the information, you do not know to erase it. Given the complexity of
smartphone software, it is possible that the companies honestly did not intend the phones
to do these things.†
Why does it matter? Our contact lists and photos are ours; we should have control of
them. Thieves can use our account information to rob us. Apps use features on phones
that indicate the phone’s location, the light level, movement of the phone, the presence
of other phones nearby, and so on. Knowing where we have been over a period of time
(combined with other information from a phone) can tell a lot about our activities and
interests, as well as with whom we associate (and whether the lights were on). As we
mentioned in Section 1.2.1, it can also indicate where we are likely to be at a particular
time in the future.
[1] Members of AOL sued the company for releasing their search queries, claiming the release violated roughly 10
federal and state laws.
† The various companies provided software updates for these problems.
1. Files on hundreds of thousands of students, applicants, faculty, and/or alumni from the
University of California, Harvard, Georgia Tech, Kent State, and several other universities,
some with Social Security numbers and birth dates (stolen by hackers).
2. Names, birth dates, and possibly credit card numbers of 77 million people who play video
games online using Sony’s PlayStation (stolen by hackers). Another 24 million accounts
were exposed when hackers broke into Sony Online Entertainment’s PC-game service.
3. Records of roughly 40 million customers of TJX discount clothing stores (T.J. Maxx,
Marshalls, and others), including credit and debit card numbers and some
driver’s license numbers (stolen by hackers). (More about the TJX incident: Section 5.2.5.)
4. Bank of America disks with account information (lost or stolen in transit).
5. Credit histories and other personal data for 163,000 people (purchased from a huge
database company by a fraud ring posing as legitimate businesses).
6. Patient names, Social Security numbers, addresses, dates of birth, and medical billing
information for perhaps 400,000 patients at a hospital (on a laptop stolen from a hospital
employee’s car).
7. More than 1000 Commerce Department laptops, some with personal data from Census
questionnaires. (Thieves stole some from the cars of temporary Census employees; others,
employees simply kept.)
8. Confidential contact information for more than one million job seekers (stolen from
Monster.com by hackers using servers in Ukraine).
Figure 2.1 Lost or stolen personal information.7
Some of the problems we described here will have been addressed by the time you
read this; the point is that we are likely to see similar (but similarly unexpected) privacy
risks and breaches in each new kind of gadget or capability.
Stolen and lost data
Criminals steal personal data by hacking into computer systems, by stealing computers
and disks, by buying or requesting records under false pretenses, and by bribing employees
of companies that store the data. (Hacking: Section 5.2.) Shady information brokers sell data
(including cellphone records, credit reports, credit card statements,
medical and work records, and location of relatives, as well as information about financial
and investment accounts) that they obtain illegally or by questionable means. Criminals,
lawyers, private investigators, spouses, ex-spouses, and law enforcement agents are among
the buyers. A private investigator could have obtained some of this information in the
past, but not nearly so easily, cheaply, and quickly.
Another risk is accidental (sometimes quite careless) loss. Businesses, government
agencies, and other institutions lose computers, disks, memory cards, and laptops
containing sensitive personal data (such as Social Security numbers and credit card numbers)
on thousands or millions of people, exposing people to potential misuse of their
information and lingering uncertainty. They inadvertently allow sensitive files to be public
on the Web. Researchers found medical information, Social Security numbers, and other
sensitive personal or confidential information about thousands of people in files on the
Web that simply had the wrong access status.
The websites of some businesses, organizations, and government agencies that make
account information available on the Web do not sufficiently authenticate the person
accessing the information, allowing imposters access. (More about authentication techniques:
Section 5.3.2.) Data thieves often get sensitive information by telephone by pretending to be
the person whose records they seek. They provide some personal information
about their target to make their request seem legitimate. That is one
reason why it is important to be cautious even with data that is not particularly sensitive
by itself.
Figure 2.1 shows a small sample of incidents of stolen or lost personal information
(the Privacy Rights Clearinghouse lists thousands of such incidents on its website). In
many incidents, the goal of thieves is to collect data for use in identity theft and fraud,
crimes we discuss in detail in Chapter 5.
A summary of risks
The examples we described illustrate numerous points about personal data. We summarize
here:
• Anything we do in cyberspace is recorded, at least briefly, and linked to our
computer or phone, and possibly our name.
• With the huge amount of storage space available, companies, organizations, and
governments save huge amounts of data that no one would have imagined saving
in the recent past.
• People often are not aware of the collection of information about them and their
activities.
• Software is extremely complex. Sometimes businesses, organizations, and website
managers do not even know what the software they use collects and stores.8
• Leaks happen. The existence of the data presents a risk.
• A collection of many small items of information can give a fairly detailed picture
of a person’s life.
• Direct association with a person’s name is not essential for compromising privacy.
Re-identification has become much easier due to the quantity of personal
information stored and the power of data search and analysis tools.
• If information is on a public website, people other than those for whom it was
intended will find it. It is available to everyone.
• Once information goes on the Internet or into a database, it seems to last forever.
People (and automated software) quickly make and distribute copies. It is almost
impossible to remove released information from circulation.
• It is extremely likely that data collected for one purpose (such as making a phone
call or responding to a search query) will find other uses (such as business planning,
tracking, marketing, or criminal investigations).
• The government sometimes requests or demands sensitive personal data held by
businesses and organizations.
• We often cannot directly protect information about ourselves. We depend on the
businesses and organizations that manage it to protect it from thieves, accidental
collection, leaks, and government prying.