Privacy Lapse at Google JotSpot
This page documents the sensitive data I have found on Google JotSpot servers, analyzes likely harms, compares JotSpot's practices with its promises, and considers implications in light of Google's broad and growing efforts to store user data on centralized servers.
JotSpot's Privacy Leaks
For wikis hosted at Google JotSpot, "user management" pages offer lists of all registered users. See Screenshot 1. For each listed user, a link offers access to a detail page presenting more information about that specific user. Detail pages report usernames, full names, and full email addresses, along with technical details such as preferred edit style, time zone, and (for some users) instant message usernames. See Screenshot 2.
This data is posted for thousands of wikis hosted at Google JotSpot. Searching for "user management" at jot.com (via the Google search ["user management" site:jot.com]) yields 2,400+ results. See Screenshot 3.
This data is available even for secure wikis. For example, when I request codebook.jot.com, I am redirected to the login screen shown in Screenshot 4. By all indications, the operators of this wiki have elected to deny access by the general public. Yet the codebook wiki's user listing is widely available. See Screenshot 5 (user list) and Screenshots 6-8 (user details for codebook participants as distinguished as uber-cyberprof Lawrence Lessig).
JotSpot also allows the public to view the special roles of group administrators. See Screenshot 9, showing the various levels of administrators in the codebook Jot group.
I see three distinct harms from these JotSpot postings.
1) Posting email addresses on publicly-available web pages invites massive unsolicited commercial email. Crawlers read publicly-accessible pages and add listed addresses to their distribution lists. A 2003 study by the Center for Democracy and Technology found that more than 95% of tested spam was sent to addresses publicly posted on the web -- confirming the inadvisability of posting email addresses for all to see. Yet Google JotSpot posts email addresses without taking any steps to protect users from spam. In particular, JotSpot posts addresses in ordinary easily-readable text and in ordinary unencoded HTML, without the encoding many sites now recommend (1, 2, 3).
2) Full names and group memberships are reasonably viewed as sensitive and unsuitable for public distribution by JotSpot. Users and administrators have a particularly strong expectation of privacy in the context of closed groups like codebook.jot.com, where access to group discussion requires registration. Moreover, page titles on user-detail pages explicitly indicate that the pages are "restricted" -- falsely suggesting that the contents of the pages are not available to the general public, when in fact the pages are available to anyone who cares to look.
3) The additional data provided by JotSpot exposes users to unpredictable cross-cutting attacks. For example, with a combination of a user's name, email address, JotSpot group membership and role, and instant message username, a perpetrator could send a compelling social engineering attack -- perhaps pretending to be a group administrator seeking assistance or document review.
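To make the first harm concrete, here is a minimal sketch of the kind of encoding JotSpot omitted. It is an illustration only: the function name obfuscate_email and the sample address are my own assumptions, not anything drawn from JotSpot or from the recommendations cited above. The sketch rewrites each character of an address as a numeric HTML entity, so browsers still display the address normally while a crawler scanning raw HTML for "name@domain" patterns finds nothing to match.

```python
import html


def obfuscate_email(address: str) -> str:
    """Return the address with every character written as a decimal HTML entity.

    Hypothetical helper for illustration; not part of any JotSpot API.
    """
    return "".join(f"&#{ord(ch)};" for ch in address)


encoded = obfuscate_email("a@b.c")
print(encoded)                 # &#97;&#64;&#98;&#46;&#99;
print(html.unescape(encoded))  # a@b.c  (what a browser would render)
```

Entity encoding deters only simple harvesters; a crawler that decodes entities before searching will still recover the address. It reduces exposure but is no substitute for keeping addresses off public pages.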
JotSpot's Promises: Privacy and Security
JotSpot's Privacy Lapse in a Google'd World of Server-Based Computing
Google's recent services present a vision of server-based computing -- with users' search history, email, calendar, documents, presentations, spreadsheets, and even medical history all stored on systems Google operates.
Google's centralized approach to data storage reflects a major change from current practice. At present, users (and their employers) generally directly control the systems that house their data -- so users (or employers) can examine security practices first-hand, can personally assess security glitches, and can discuss relevant practices with responsible designers and administrators. Not so in Google's world, where implementation is delegated to Google, where Google typically does not provide robust customer support, and where Google is unlikely to discuss the details of its privacy and security policies. Indeed, users are asked to trust Google's approach without any apparent way to verify what protections Google has implemented on their behalf. Furthermore, Google's terms of service and other agreements systematically disclaim any promise that systems will be secure.
In fact, a series of recent vulnerabilities has shown the limits of relying on Google security. For example, a July 2008 Google glitch let any user obtain the full name associated with a Gmail account. A September 2007 vulnerability let arbitrary web sites modify users' Gmail accounts to forward mail to attackers, if users were logged in to Gmail with their passwords saved. A January 2007 vulnerability let arbitrary web sites retrieve users' Gmail contact lists.
Publicly-reported vulnerabilities probably significantly understate the true scope of privacy lapses at Google. Consider Google's likely response when its staff find vulnerabilities. For companies covered by the data breach notification laws present in at least 44 states, consumer notification is generally compulsory. But Google is generally not subject to these notification requirements: While Google collects extensive information about its users, Google's records typically do not include the specific data elements (e.g. social security numbers and financial information) that trigger notification statutes. As a result, there is no guarantee that Google would tell users about whatever further privacy lapses Google uncovers; certainly Google's privacy policies make no such guarantee. Thus, there's strong reason to suspect that Google has actually faced additional data breaches beyond those known to the public.
Managing potential vulnerabilities becomes that much harder as Google's services grow in number and complexity. Meanwhile, as these services become increasingly widely used, each slip-up exposes an ever-larger amount of data. So far few users seem concerned, but I suspect these hidden challenges will ultimately impede the server-based applications Google envisions.
Google responded to c|net coverage of this privacy lapse by claiming its systems are operating just as intended. Google argued: "The information in these wikis is accessible because they have been set to public on the Site Permissions page. Users are always in control of the information they share. If wikis are set to private, no information will be publicly accessible."
I see four separate problems with Google's argument:
1) Users never agreed to the postings at issue. As best I can tell, users nowhere agreed to have their email addresses (and other personal information) posted for all to see. For example, a JotSpot account creation page requested user details without mentioning where or how this information would be displayed.
If users actually agreed to have their email addresses and other data shared by JotSpot, at least some users should remember granting that permission. It might be informative to survey a large number of affected users to see how they thought their data would be shared. To get started, I checked with a computer security expert whose details I found within JotSpot listings. Based on his industry expertise, he might reasonably be expected to recall agreeing to share the information JotSpot posted. But he told me he does not recall being asked, nor does he believe he granted consent for his details to be posted.
Rather, as detailed below, JotSpot's posting of user data stemmed not from user decisions, but from decisions made by the administrators who configure wikis hosted at JotSpot. In fact, the third sentence of Google's response confirms that administrators, not ordinary users, play the key role; ordinary users cannot set wikis to public or private. Thus, Google errs in its second sentence, where Google claims "users" control their information sharing.
2) Administrator permission is insufficient to justify posting sensitive data about specific individual users. The difference between administrator permission and user permission is crucial for the data at issue here. Users' email addresses pertain not to the group as a whole, but to the corresponding individual users. It is nonsensical to ask administrators to grant permission to share data that is not theirs to give.
3) Administrators' supposed decisions were ambiguous and ill-informed. As best I can tell, JotSpot user lists (including email addresses) became publicly available if an administrator used JotSpot's "Global Settings" screen to set "guest user priveleges" to include "read pages: yes." (See screenshot 3 of JotSpot's GlobalSettingsManageDoc reference.) But notice the plain language of this setting, letting administrators specify whether guest users may "read pages" (emphasis added). Making "pages" publicly available in no way implies similar distribution of a user list -- not to mention users' email addresses. There is no reason to think an administrator who chose to let the public "read pages" also intended to distribute user lists and user email addresses.
A savvy JotSpot administrator might find JotSpot's "BrowseUsersListDoc" reference. In a final paragraph, that page opaquely mentions the security implications of the user list function: "Administrators may set page permissions so that user profiles are not visible. This could be a requirement for teams with higher security requirements." But this terse description offers little benefit to typical administrators. For one, this text appears in the documentation of an entirely separate administrative function; once an administrator sets a site to let guests "read pages," user lists and email addresses are already available -- without the administrator ever finding this BrowseUsersListDoc reference. Furthermore, the quoted text is remarkably hard to understand. Compare the following alternative: "By default, if you set your site contents to be visible to the general public, then you will also provide the general public with your full user list, including user email addresses."
4) JotSpot's approach to user lists and user email addresses is unreasonable and ill-advised. What administrator would want to share user email addresses given the well-known risk of spam from email harvesting? Google rightly masks email addresses in Google Groups, in Orkut, and in other Google systems that might otherwise provide fodder for address harvesters. There's no good reason to proceed differently here. Instead, Google should prioritize defaults and options that accommodate reasonable users, reasonable administrators, and standard use cases. If Google elects to offer privacy settings widely viewed as unwise or insecure, Google could helpfully alert administrators to their possible errors -- rather than making such errors so natural that they are virtually inevitable.
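The masking mentioned in the fourth point can be sketched in a few lines. This is an assumption-laden illustration, not a description of how Google Groups or Orkut actually implement masking: the function name mask_email, the keep parameter, and the "..." placeholder are all hypothetical. The idea is simply to show part of the local name, enough for group members to recognize a colleague, while withholding the rest from harvesters.

```python
def mask_email(address: str, keep: int = 3) -> str:
    """Hide most of the local part of an address, e.g. 'al...@example.com'.

    Hypothetical helper: shows at most `keep` characters, and never more
    than half of the local part, so short names are not fully revealed.
    """
    local, sep, domain = address.partition("@")
    if not sep:
        # Not an email address; return it unchanged rather than guess.
        return address
    visible = local[: min(keep, len(local) // 2)]
    return f"{visible}...@{domain}"


print(mask_email("alice@example.com"))  # al...@example.com
```

Unlike entity encoding, masking discards information, so a harvester cannot recover the full address from the page at all.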
On the most charitable view, these privacy lapses stemmed from Google JotSpot's complexity -- from the subtle interactions between user preferences, administrator preferences, user disclosures, and administrative disclosures. JotSpot's complexity certainly creates a heightened opportunity for confusion, error, and unexpected outcomes. But such complexity is inherent in the multi-user, collaborative systems Google increasingly offers. Who can adjust security and privacy settings for a shared Google Docs draft? A shared calendar? A family's shared medical records? These questions deserve clear and easy answers. Users shouldn't have to ponder obscure or convoluted documentation to figure out where their data may end up.
Going forward, developers of collaborative software should clarify exactly which users have the power to show or hide what user data. Such clarity could manifest itself in sites' engineering plans, user interfaces, privacy policies, and documentation. For JotSpot, I suspect this approach would yield an alternative design -- perhaps posting a user's details only if both 1) the user accepted such posting, and 2) the site administrator enabled such posting. Whatever the details, I'm confident that careful evaluation would yield an approach importantly superior to JotSpot's current practice.
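The two-condition design suggested above might look something like the following sketch. Every name here is hypothetical (UserProfile, SiteSettings, allow_public_listing, guests_may_view_user_list); none comes from JotSpot's actual software. The point is that guest access to pages and guest access to the user list are separate settings, both default to off, and a user's details appear only when the administrator and that user have each opted in.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserProfile:
    username: str
    email: str
    # Hypothetical per-user opt-in; defaults to private.
    allow_public_listing: bool = False


@dataclass
class SiteSettings:
    # Letting guests read pages says nothing about the user list.
    guests_may_read_pages: bool = False
    # Hypothetical separate switch controlled by the administrator.
    guests_may_view_user_list: bool = False


def publicly_visible_profiles(
    site: SiteSettings, users: List[UserProfile]
) -> List[UserProfile]:
    """Return only profiles that both the admin and the user agreed to publish."""
    if not site.guests_may_view_user_list:
        return []
    return [u for u in users if u.allow_public_listing]
```

Under this sketch, an administrator who sets only "read pages: yes" exposes no user details, and an administrator who enables the user list still cannot publish the details of a user who has not consented.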
I notified Google of this privacy lapse on October 23. On October 27, at least some of the affected sites were modified to prevent the disclosures. As of October 30, when this lapse began to attract media attention, my tests indicate that every affected site was modified to prevent the disclosures. I received a message from JotSpot staff indicating that all User Management pages have been set to private.
Posted: October 30, 2008.