January 22, 2018 - updated January 25, 2018

Who should pay to store and distribute the litigation records in US federal courts? The answer is surprisingly contentious – and, by all indications, getting more so.

When the general public wants to read documents in US federal litigation, the standard method is PACER – the federal courts’ electronic record system. One might think this would be free to access – an efficient information system run from a few web servers with no significant incremental cost per document copied. But PACER dates back to 1988, when information was provided by modem, with genuine costs for courts to obtain and provide dial-up hardware, not to mention servers and storage. Thus began a tradition of charging, which continues today with fees of $0.10 per page. That sounds low, and it is… until a big session, perhaps looking for the proverbial needle in a haystack, requires browsing hundreds or even thousands of documents at corresponding expense. Meanwhile, with a fee for every page, PACER tells users that the meter is always running and browsing should be done lightly if at all – inconsistent with some readers’ preference to explore unfettered. Certain readers are distinctively at risk. For example, journalists often have limited budgets. For some pro se litigants, virtually any expense is unaffordable. Others are conducting research outside the country or are undocumented, unbanked, or for other reasons have no United States debit or credit card to use for payment.

In principle, litigation might put a check on public record expenses. Following up on a 2014 lawsuit, several non-profits sued the United States in 2016 over PACER fees, alleging that the E-Government Act of 2002 allows only “reasonable” fees and then “only to the extent necessary.” The plaintiffs argue that PACER charges are higher than reasonable, are not necessary, and indeed yield revenues that courts divert to other projects. For example, Quartz reports the federal courts spending 28% of PACER revenue on other court technology, such as monitors and sound systems -- meritorious, no doubt, yet not obviously related to distributing court documents. Litigation is ongoing. But the disposition of the 2014 lawsuit gives reason to doubt the effectiveness of litigation to constrain PACER fees. For one, a sitting federal judge has every reason to defer to colleagues sitting on the Judicial Conference (the body of judges that sets the PACER fee schedule). Indeed, any reduction in PACER fees would deprive the judicial branch of one of its primary revenue sources -- a particular challenge if judges believe that Congress does not adequately fund the courts.

In the short run, the most promising response to PACER fees is RECAP, a browser plug-in that lets users share what they read on PACER. The idea is simple and elegant: Whenever a RECAP user reads a document on PACER, the RECAP plug-in sends a copy to RECAP’s servers which in turn store it for others to read free of charge. RECAP has used this approach since 2009, when it began as a graduate student project at Princeton University’s Center for Information Technology Policy (CITP). RECAP later grew with funding from non-profits and the Think Computer Foundation, as well as smaller monetary and in-kind donations from a variety of other groups and legal startups. There have been technical glitches, but on the whole the system has worked well.
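The sharing idea can be sketched in a few lines of Python. This is only an illustration, not RECAP's actual code: the `SharedArchive` class, the `read_document` function, and the docket identifier are hypothetical names, and the sketch assumes a reader checks the shared archive before paying PACER. The only figure taken from the discussion above is the $0.10 per-page fee.

```python
from dataclasses import dataclass, field

FEE_CENTS_PER_PAGE = 10  # the $0.10 per-page PACER fee, in cents


@dataclass
class SharedArchive:
    """Hypothetical stand-in for RECAP's servers: holds copies shared by users."""
    documents: dict = field(default_factory=dict)

    def get(self, doc_id):
        return self.documents.get(doc_id)

    def put(self, doc_id, text):
        self.documents[doc_id] = text


def read_document(doc_id, pacer, archive):
    """Return (text, cost_in_cents) for one reader's request.

    `pacer` maps a document id to (text, page_count); it stands in for the
    paid system and is an assumption of this sketch.
    """
    cached = archive.get(doc_id)
    if cached is not None:
        return cached, 0  # an earlier reader already paid; this read is free
    text, pages = pacer[doc_id]
    archive.put(doc_id, text)  # the plug-in's step: share the purchased copy
    return text, pages * FEE_CENTS_PER_PAGE


# Hypothetical 12-page filing read by two different users in turn.
pacer = {"hypothetical-docket-1": ("Complaint text", 12)}
archive = SharedArchive()
first = read_document("hypothetical-docket-1", pacer, archive)
second = read_document("hypothetical-docket-1", pacer, archive)
print(first[1], second[1])  # 120 0
```

The first reader pays 120 cents for the 12 pages, and every later reader of the same document pays nothing. That is the whole economic point of the design: each document needs to be bought from PACER only once.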

A surprise development came in November 2017 when the Free Law Project (FLP), the operator of RECAP ever since the original Princeton team graduated and disbanded, announced major changes in how RECAP-collected data will be distributed in the future. Under the new plan, rather than making all RECAP-collected documents available to the public on the Internet Archive (IA) as soon as possible, FLP would hold the documents until the end of each quarter for a batch update to IA. FLP also proposed to upload litigation materials to IA in only machine-readable formats compressed into enormous multi-gigabyte tarballs, ending the human-readable individual HTML files that have for years made it easy for normal users with standard web browsers to see court records.

FLP says these changes are “necessary for RECAP to thrive” which I understand to summarize concern about FLP funding. I’m sensitive to the need to keep RECAP sustainable. But I question whether the November 2017 changes offer the right approach. Four specific concerns:

More generally, I struggle to reconcile FLP’s changes with the organization’s non-profit status and with its overall position favoring free and unrestricted access to court documents.

A final twist is that FLP has already arranged for only its own CourtListener site to get premium access to RECAP’s documents as they are gathered by users and uploaded by the RECAP plug-in. FLP envisions that other sites and the public will get access only once per quarter unless they pay an undisclosed fee. Yet a few other sites have already sprung up to collect RECAP-gathered court records, repackage them in some way, and distribute them to the interested public. It’s hard to see a principled reason why only CourtListener should get superior and more frequent access to the documents. They’re public documents, gathered by the US courts themselves, and then submitted for public archival by participating users who support the “Free Law” RECAP concept. Nothing in users’ or donors’ understanding, or in their agreement with RECAP, calls for RECAP to provide the documents to one site but not others.

Tensions have been brewing, including a pointed critique from Aaron Greenspan (whose PlainSite service is among the victims if RECAP begins to withhold data from other services), as well as Internet Archive staff questioning the purpose and effect of the proposed changes. But getting this right requires more users speaking up about what they want from RECAP and what they expect of its leadership. For myself, I want a RECAP that lives up to the principles articulated in 2009 -- gather court documents and distribute them without charge or restriction.

(Added January 25, 2018) On GitHub, user @johnhawkinson pointed out that FLP is still posting RECAP data to CourtListener, and indeed indicates that it will continue to do so promptly (even though IA uploads are to be delayed). Indeed. And at present, no CourtListener TOS limits how users access the site; it seems users could download thousands of documents for their own purposes, even scrape the site and its contents if that better matches their requirements. Kudos to CL for not banning (or purporting to ban) those methods. Yet favoring CourtListener, rather than the authoritative and widely-trusted Internet Archive, seems to me a troubling change. FLP's stated reason for putting data only on CL, and not promptly on IA, is to make the service "sustainable," which I take to mean an effort to raise funds, which in turn entails withholding features or data (or both) from those who don't pay. And FLP is already asking for payments from those sites that seek full access to bulk RECAP data. Such restrictions and charges are exactly what RECAP's historic mission did not contemplate or indeed allow.