Measuring Bias in "Organic" Web Search - Benjamin Edelman and Benjamin Lockwood
Numerous web sites, courts, and public interest groups have raised concerns about possible search engine bias. In this appendix, we review their allegations and approaches.
A series of web sites argue that search engines' listings are biased, pointing to unfavorable changes in those listings over time. For example, blog search engine Technorati told the FT in July 2010 that it had "certainly [been] penalized" by Google, in that its search rankings had "tumbled" on multiple occasions. British vertical search service Foundem complained in 2009 that its site had "effectively disappeared from the Internet" during a 3.5-year period in which Google's PageRank assessment of Foundem dropped from 10 to 1 and Foundem's rank for key terms fell from the top 5 to position 100 or lower. On this change-over-time theory, a search engine began with appropriate results but at some point decided to disfavor certain sites. Complaining sites seek a return to a prior version of the search engine's algorithm and/or policy.
Some might dismiss arguments grounded in change over time as mere coincidence. After all, with every change to search engine algorithms, some sites are bound to become more prominent while others must drop to make room. But complainants typically raise further factors that make their concerns more persuasive. Complaints from well-known sites tend to ring especially true: most users expect well-known sites to be ranked favorably, so reduced prominence of such sites may appear ill-advised. Similar arguments may apply when a site suffers multiple losses simultaneously (e.g., a drop in algorithmic search prominence along with an increase in minimum bids for advertisement purchases). Finally, public scrutiny following a site's complaint sometimes yields a partial or complete restoration of the site's prior links. Such reversals seem to indicate a search engine's recognition that the lower ranking was improper.
Inferences grounded in change over time suffer from important limits. For one, some circumstances undermine the underlying assumption that a site's listings should remain equally prominent: conditions might have changed, yielding an appropriate reduction in a site's prominence. Even well-known sites can become less appealing to users and, hence, properly be made less prominent in search results. Furthermore, if a search engine uncovers a site engaged in some form of impropriety -- perhaps using trickery to artificially inflate its search rankings -- a penalty might be appropriate. Even if a search engine later reverses course and restores a site's links, that reversal need not reflect an admission of wrongdoing or error; a search engine might rationally respond to public scrutiny by reversing a decision made for proper reasons. In short, change over time, standing alone, does not generally provide compelling proof of bias.
Some sites argue that search engines' listings improperly disfavor sites that compete with some portion of search engines' offerings. Such complaints are particularly common from vertical search services -- sites that search some portion of the web, e.g. travel, electronics, or B2B supplies. For example, in antitrust litigation against Google, TradeComet alleged:
[S]ites like SourceTool ... posed a substantial threat to Google's dominance in search advertising and would attract highly-valued search traffic from Google and, as a result, advertisers from Google's highly profitable advertising platform. Faced with this threat to its business, Google undertook a variety of actions to exclude vertical search sites from the search advertising market. (TradeComet v. Google, complaint, S.D.N.Y. Case No. 09 Civ. 1400.)
Similarly, Foundem argues:
Google search penalties [are] increasingly targeted at perfectly legitimate vertical search and directory services. It may not be coincidence that, collectively, these services present a nascent competitive threat to Google's share of online advertising revenues.
More recently, we have been struck by the reduced prominence of algorithmic listings for sites that compete with Google's local results. A search for "best burrito boston" last year returned top algorithmic listings from Yelp and Chowhound; these days, a large Google Map tends to fill much of the page.
To the extent that a search engine distinctively disfavors companies that could grow into competitors, the search engine engages in conduct that many people might view as improper. But search engines offer a counterargument: that they disfavor sites without original content -- sites that, search engines argue, users find significantly less useful. Thus, search engines argue, reduced prominence of vertical search sites should be seen not as harming competitors or would-be competitors, but rather as protecting users against unwanted links and web spam.
Here too, it is difficult to resolve the disagreement without making difficult judgments about the value and importance of individual sites. Such judgments may be possible as to specific sites like TradeComet and Foundem, where hands-on testing and industry accolades confirm site quality. But even if some data source offered insight into the value of a given site, e.g. by measuring click-through rates to that site or time spent on that site, this data would indicate only whether that specific site appeared to be useful, not whether Google's various rules and penalties are appropriate in their treatment of scores of other sites.
Some companies argue that search engines' listings entail improper power over sites simply because sites have no alternative to search engines -- particularly to Google, whose 85% worldwide market share (and 95%+ market share in many countries) leaves web sites little alternative. For example, the court in Navx took this approach, concluding that "Google's dominant position in online search advertising gives it ... the responsibility to implement its AdWords policies in an objective, transparent, and non-discriminatory manner" (translation by Benjamin Lockwood).
The Navx court is surely correct in flagging Google's staggering power relative to a typical web site or retailer. But American antitrust analysis does not penalize companies solely for high market share. Furthermore, while the Navx court calls for objective, transparent, and non-discriminatory listings, the court offers no clear method to assess whether a given search engine in fact offers those laudable characteristics. We attempt that task in the body of our paper.
In Hard-Coding Bias in Google "Algorithmic" Search Results, co-author Edelman compares results for certain pairs of searches -- for example, searches with and without a trailing punctuation mark. In general, trailing punctuation does not affect search results. But for certain searches, adding a trailing punctuation mark causes links to Google's own services to disappear, while omitting the punctuation places Google's links in the topmost position. Edelman argues that this combination of results is best explained by Google intentionally and manually "hard-coding" its own links to appear first for certain predetermined search terms.
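The comparison Edelman describes can be sketched in code. The following Python fragment is an illustrative assumption, not the paper's actual methodology or data: the domain list and the sample result lists are hypothetical, and in practice the two result lists would come from scraping or logging actual search result pages. The sketch simply flags Google-owned links that rank for the plain query but vanish when a trailing punctuation mark is appended -- the asymmetry treated as a hard-coding signal.

```python
# Hypothetical sketch of the with/without-punctuation comparison.
# GOOGLE_DOMAINS is an illustrative, incomplete sample of Google-owned
# properties; a real study would need a vetted list.
GOOGLE_DOMAINS = ("www.google.com/finance", "maps.google.com", "news.google.com")

def is_google_link(url: str) -> bool:
    """True if the URL points at one of our sample Google-owned services."""
    return any(url.startswith("http://" + d) or url.startswith("https://" + d)
               for d in GOOGLE_DOMAINS)

def hard_coding_signal(results_plain: list[str], results_punct: list[str]) -> list[str]:
    """Return Google links that appear in the plain query's results but are
    absent once a trailing punctuation mark is appended to the query."""
    punct_google = {u for u in results_punct if is_google_link(u)}
    return [u for u in results_plain
            if is_google_link(u) and u not in punct_google]

# Illustrative (fabricated) result lists for one query pair:
plain = ["https://www.google.com/finance/quote", "https://finance.yahoo.com"]
punct = ["https://finance.yahoo.com", "https://www.bloomberg.com"]
print(hard_coding_signal(plain, punct))
# prints ['https://www.google.com/finance/quote']
```

A nonempty result for many query pairs, with Google's link consistently in the topmost position of the plain query, is the pattern the paper interprets as evidence of manual intervention.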