first thoughts on the Google Books settlement

As promised at the end of my last post, here are my thoughts on the Google Books settlement. As I noted before, I'm a nonprofessional librarian and involved in a number of scanning projects (including trying to get Google Books to scan MITSFS's collection), so I'm not a disinterested party. As a further caveat, I'm not a lawyer, and I don't play one on TV; I'm just an interested layperson, and this is not legal advice and should not be construed as such.

There are a bunch of good reasons to oppose the settlement. It needs stronger privacy protections, for one. Remember how back when the PATRIOT Act was passed we were all worried about the FBI getting reading records out of libraries without a warrant, and putting a gag order on the libraries preventing them from talking about it? Now scale that up to Google. On the one hand, Google has a lot more resources to fight a records request and a gag order if they want to; on the other hand, they're a single point of failure, and if they decide that it's not evil to give your reading records to the government, you're screwed. Having privacy policies spelled out in the settlement would give private citizens some recourse.

There are also some more general points:

  • There is a perception that once Google's scanning project is done, no one else will ever bother to scan all these books again, which seems naïve to me. True, it's not strictly a question of money: scanning means wear and tear on the books, and older and rarer volumes especially should undergo it as few times as possible. But if Google's archive turns out to be too proprietary or too low-quality, I don't see what would stop some other company, or a non-profit with sufficient funding, from replicating it by doing the same work Google did: striking agreements with libraries and scanning their collections.
  • A corollary to that is that, while it would be nice of Google to set up a foundation and donate the collection to it, to ensure that (as the Open Book Alliance wants) Google does not have an exclusive set of access and distribution rights to its database, I see no reason why Google should be forced to do so. Google has sunk a decent amount of money into the project, and I don't see why they should be compelled by the courts to let a competitor (say Microsoft) come in and profit off their work for free. That said, I could see a reasonable arrangement along these lines: Google agrees in the settlement to donate the database to a foundation and in return gets a free license to use it for for-profit purposes (subject always to relevant copyright limitations), anyone can use the database for non-profit purposes (subject again to copyright limitations), and any competitor who wants to use the database for for-profit purposes has to pay a reasonable licensing fee (subject again &c.).
  • Generalizing from the above, I agree that the scope of not-for-profit activities allowed on the database should be expanded substantially — scholars and the general public should be allowed to do more than make "non-consumptive" use of it. (I take "non-consumptive use" to mean something like running a MapReduce query that does word frequency analysis on the corpus — as opposed to actually, like, reading the books.)
  • I believe the libraries whose collections were scanned retain some rights to the scans of their books, so I'm not sure how Google could end up with "exclusive" access and distribution rights anyway. I need to read over such a contract to be sure (which should be happening RSN, actually). I guess if the library's rights only proceed from Google's rights that could be a problem, and it's something I'll be looking out for.
  • There seems to be a certain amount of anger on the part of authors and publishers and legal effort from same directed at Google simply because they scanned said entities' works. This is utter bullshit. In the same way that I can legally time-shift and format-shift other copyrighted works — compare ripping a CD to MP3 or digitizing videotapes and burning them to DVD for personal use — it is legal for me to scan a book and digitize it for personal use, and I don't see how the mere fact that Google scanned these books, or that these libraries allowed Google to scan these books, is any different. Speaking now of how I believe the law should work and not how it necessarily does work, I believe that it should be legal for a library to scan its books (or get Google to scan its books) for the use of its patrons and only its patrons. (Now, redistributing the books is another matter, and these authors and publishers might argue that even allowing the public to search the books and see relevant passages, to say nothing of allowing the public to read whole pages of the books, counts as redistribution and isn't covered under fair use. That's a minefield, and I'm not going there yet.)
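
For the curious, here is roughly the kind of thing I have in mind when I say "non-consumptive use" above: a toy word-frequency count written in map/reduce style. This is only a sketch on two made-up sentences, not anything Google actually runs, and the function names (map_word_counts, reduce_word_counts, corpus_word_frequencies) are my own inventions for illustration.

```python
import re
from collections import Counter
from functools import reduce


def map_word_counts(text):
    """Map step: turn one book's text into a Counter of lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def reduce_word_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def corpus_word_frequencies(books):
    """Run the map step over every book, then fold the results together."""
    return reduce(reduce_word_counts, map(map_word_counts, books), Counter())


# Two made-up "books" standing in for a scanned corpus (hypothetical data).
books = ["The quick brown fox.", "The lazy dog and the fox."]
frequencies = corpus_word_frequencies(books)
print(frequencies.most_common(2))
```

The point is that the query only ever sees aggregate counts; nobody running it gets to sit down and read the books.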

A lot of the more interesting complaints about the settlement revolve around the treatment of orphan works. It's arguable that in the settlement Google is trying to get adjudicated some pretty significant changes to U.S. copyright law (or, from a different perspective, trying admirably to patch the deficiencies of U.S. copyright law). I'll cover that and more about the arguments around copyright in the settlement in next week's post.

Edit: The third post of this series is available here.