copyright and the Google Books settlement

As I said in my last post, the Google Books settlement has some fairly serious copyright implications. I was going to write up a lengthy discussion of them, but in researching them I discovered that Prof. Lawrence Lessig has an excellent article on the subject up at The New Republic. His argument is too nuanced to capture in a pull-quote, so go read it.[1] It's well-written and not terribly long. I'll wait.

In the settlement, Google really is trying to adjudicate significant changes to copyright law, notably a registry of works and a policy on orphaned works. I've heard a number of authors, in talking about the settlement, say, "It would have been fine if Google asked me before scanning my books, but since Google didn't ask me, they can't have them." The problem is that it's impossible for Google to individually ask every author or rightsholder for every work under copyright for permission to scan their books. Since creative expression is automatically under copyright once it is created, and the United States has no central registry of copyrighted works, every book published after Steamboat Willie is presumptively under copyright to someone. That means that, for Google to ask permission before scanning, they would, for each work, need to go through a lot of trouble and expense to locate the current rightsholder, and if that person can't be found, Google would be barred from scanning the book. It doesn't matter if the book is rare and falling apart — if it's copyrighted, they can't scan it. (And the legal tangle around who got what rights when an author died, and what they did with them, is almost always a serious mess, or there wouldn't be a project up to make sure that authors have wills.)

To make it even worse, suppose an author has quoted a song lyric in her book and, as a good, copyright-observant author, obtained the rights to use that lyric from the company that owns it. In this proposed world order, Google now has two problems: first to track down the author of the book, and second to track down whoever now owns the rights to the song (which may have changed hands since the author did her due diligence) and clear that separately. If the author has quoted multiple songs owned by different companies, Google has to find the rightsholder for each song, no matter how small the quotation. And even if the author allowed Google to scan her book, if even one of the song companies says no, Google is faced with the choice of using the book without the offending quotation or not using the book at all. If it removes the quotation and that happened to be a major plot point? Oops, sorry. Now repeat as necessary if that lyric quotes part of another song, and so on ad infinitum, and combine with ever-increasing copyright terms (pushing a century these days), for a scary picture of what the future could look like.

Basically, if Google had asked permission first, they'd never have scanned anything. Some people would say that would be right — Google shouldn't have scanned anything. I'm too much of an information archivist and a librarian to agree with them.

Lessig is right that, considering the sad state of US copyright law, the settlement is really quite a good patch on it. He's also right that it's another worrying step in the trend towards the world outlined above, where copyright goes recursive and strangles culture abed. Then again, if the settlement falls through or gets amended as the complainants want, it will also be a step towards that world, and perhaps an even bigger one. Everybody is right that the US needs serious copyright reform, preferably spearheaded by Congress, sooner rather than later. Whether that will actually result in a better system, or any system at all, is unclear, given that body's current state of gridlock, but the current system is clearly broken, and Lessig has some good ideas on how to fix it.

[1] The Open Book Alliance's blog post on it, which brought it to my attention, misses the heart of it, I think. Their goals, in my understanding, are exactly what Prof. Lessig is worried about.

first thoughts on the Google Books settlement

As promised at the end of my last post, here are my thoughts on the Google Books settlement. As I noted before, I'm a nonprofessional librarian and involved in a number of scanning projects (including trying to get Google Books to scan MITSFS's collection), so I'm not a disinterested party. To further caveat, I'm not a lawyer, and I don't play one on TV — I'm just an interested layperson, and this is not legal advice and should not be construed as such.

There are a bunch of good reasons to oppose the settlement. It needs stronger privacy protections, for one. Remember how back when the PATRIOT Act was passed we were all worried about the FBI getting reading records out of libraries without a warrant, and putting a gag order on the libraries preventing them from talking about it? Now magnify that to Google. On the one hand, Google has a lot more resources to fight a records request and a gag order if they want to; on the other hand, they're a single point of failure, and if they decide that it's not evil to give your reading records to the government, you're screwed. Having some policies specified would give private citizens a recourse.

There are also some more general points:

  • There is a perception that once Google's scanning project is done, no one else will bother to scan all these books ever again, which seems naïve to me. It's not strictly a question of money, true — scanning means wear and tear on the books, and especially older and rarer volumes should undergo it as few times as possible — but I don't see how, if Google's archive is too proprietary, or too low-quality, some other company, or a non-profit with sufficient funding, couldn't come along and replicate it by doing the same work Google did of striking agreements with libraries and scanning their collections.
  • A corollary to that is that, while it would be nice of Google to set up a foundation and donate the collection to it, to ensure that (as the Open Book Alliance wants) Google does not have an exclusive set of access and distribution rights to its database, I see no reason why Google should be forced to do so. Google has sunk a decent amount of money into the project, and I don't see why they should be compelled by the courts to let a competitor (say Microsoft) come in and profit off their work for free. That said, if Google agreed in the settlement to donate the database to a foundation and in return had a free license to use it for for-profit purposes (subject always to relevant copyright limitations), anyone could use the database for non-profit purposes (subject again to copyright limitations), and any competitor who wanted to use the database for for-profit purposes had to pay a reasonable licensing fee (subject again &c.), that could be reasonable.
  • Generalizing from the above, I agree that the scope of not-for-profit activities allowed on the database should be expanded substantially — scholars and the general public should be allowed to do more than make "non-consumptive" use of it. (I take "non-consumptive use" to mean something like running a MapReduce query that does word frequency analysis on the corpus — as opposed to actually, like, reading the books.)
  • I believe the libraries whose collections were scanned retain some rights to the scans of their books, so I'm not sure how Google could end up with "exclusive" access and distribution rights anyway. I need to read over such a contract to be sure (which should be happening RSN, actually). I guess if the library's rights only proceed from Google's rights that could be a problem, and it's something I'll be looking out for.
  • There seems to be a certain amount of anger on the part of authors and publishers and legal effort from same directed at Google simply because they scanned said entities' works. This is utter bullshit. In the same way that I can legally time-shift and format-shift other copyrighted works — compare ripping a CD to MP3 or digitizing videotapes and burning them to DVD for personal use — it is legal for me to scan a book and digitize it for personal use, and I don't see how the mere fact that Google scanned these books, or that these libraries allowed Google to scan these books, is any different. Speaking now of how I believe the law should work and not how it necessarily does work, I believe that it should be legal for a library to scan its books (or get Google to scan its books) for the use of its patrons and only its patrons. (Now, redistributing the books is another matter, and these authors and publishers might argue that even allowing the public to search the books and see relevant passages, to say nothing of allowing the public to read whole pages of the books, counts as redistribution and isn't covered under fair use. That's a minefield, and I'm not going there yet.)
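To make the "non-consumptive use" bullet above a bit more concrete, here is a minimal sketch of the kind of map-and-reduce word frequency analysis I have in mind. Everything in it is an assumption for illustration: the two-book `corpus` is a made-up stand-in for the scanned collection, the function names are mine, and this is plain Python rather than anything Google actually runs.

```python
from collections import Counter
from functools import reduce

# Hypothetical stand-in corpus; the real one would be the text of the scanned books.
corpus = {
    "book_a": "the whale and the sea",
    "book_b": "the sea was calm",
}

def map_word_counts(text):
    """Map step: tally the words in a single book."""
    return Counter(text.lower().split())

def reduce_word_counts(left, right):
    """Reduce step: merge two per-book tallies into one."""
    return left + right

def corpus_word_frequencies(books):
    """Run the map step over every book, then fold the tallies together."""
    per_book = map(map_word_counts, books.values())
    return reduce(reduce_word_counts, per_book, Counter())

frequencies = corpus_word_frequencies(corpus)
print(frequencies.most_common(2))  # [('the', 3), ('sea', 2)]
```

The point is that the analysis only ever produces aggregate counts; nobody running it gets to sit down and read either book.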

A lot of the more interesting complaints about the settlement revolve around the treatment of orphan works. It's arguable that in the settlement Google is trying to get adjudicated some pretty significant changes to U.S. copyright law (or, from a different perspective, trying admirably to patch the deficiencies of U.S. copyright law). I'll cover that and more about the arguments around copyright in the settlement in next week's post.

Edit: The third post of this series is available here.

a brief summary of the Google Book Search lawsuit

In October of 2004, Google announced that it would be adding book search to its search engine, under the name Google Print. In December of 2004, Google announced that it would be digitizing the collections of several libraries and adding their books to its search engine, as well as making the scans of some of the books available to the public. In September of 2005, a bunch of people sued Google over what had become the Google Books project, which aimed to scan a bunch of libraries' collections and make them available at least for search online, and perhaps for browsing, reading, and download. Litigation ensued. In October of 2008, a settlement was announced, and in November of 2009 a revised settlement was announced, amended largely over antitrust concerns. Here is Google's take on the revised settlement. In December of 2009, author Ursula K. Le Guin resigned from the Authors Guild over its acceptance of the Google Books settlement agreement, and a number of other noted authors followed suit. Between the beginning of the suit and the settlement, Google scanned a lot of libraries' collections, some of which are available under various terms on their website and in their search results.[0] And that is where the matter rests today as I understand it.

What's all the fuss about? The Public-Interest Book Search Initiative at the New York Law School has compiled a list of objections to the settlement and the response to those objections, if any, in the revised settlement. The EFF is against it on privacy grounds. The Science Fiction and Fantasy Writers of America, the National Writers Union, and the American Society of Journalists and Authors are opposed to it under the consortium name of the Open Book Alliance, which offers this list of ways it fails to meet their requirements.

It's a huge issue, and I can hardly do it justice with a single blog post. Implicit in it are arguments about copyright, monopoly power, international relations, and privacy issues. I am myself torn — on the one hand, I'm a nonprofessional librarian and dedicated to the preservation and dissemination of knowledge, a reader and a lover of books; and on the other hand, I'm keenly aware that the books I so love wouldn't exist if authors and publishers weren't able to make a living off their work. It's possible that some of the objectors are right, and legislation rather than litigation is the right way to tackle this issue, though there's no less sausage-making on display there than in the judicial process. In short, I don't know what the right thing is to do.

I have some more concrete personal thoughts on aspects of the settlement, but it's late and I'm tired, so that will be the topic of my next blog post. Until then, here are a couple pictures of how I spent Saturday afternoon:

my afternoon

2010: Library Two


Footnotes[N]:
0: ^ Full disclosure: I've been trying to get Google Books to scan the MITSFS collection, and am engaging in other scanning projects, so I am neither a disinterested party nor blameless in this.
N: ^…since footnotes appear to be the Iron Blogger secret ingredient this week.[N+1]
N+1: ^ http://en.wikipedia.org/wiki/Iron_Chef#Theme_ingredients

Edit: The second post of this series is available here.

Edit: The third post of this series is available here.

book reviews — Stratford Man and Julian Comstock

Some weeks, I have nothing interesting to blog about. This is one of those weeks.

I spent it variously at my parents’ in Iowa, in transit, and out on the Cape over New Year’s with friends, and I got very little productive done. Between the time spent on planes and the time spent just chilling, I did, however, finish a couple books and write up reviews of them for MITSFS’s reviews site. (They’re not posted on MITSFS’s site yet, so I’ve linked them locally for now, and I’ll edit the links to point to MITSFS’s site when they’re up.) Here they are:

I don’t think it’s spoiling the reviews to say that I enjoyed both books, really. Oh, also, apparently since I bought a membership to last year’s WorldCon, I’m eligible to nominate for this year’s Hugo Awards. Anybody have suggestions for things published in 2009 that I should read in the next couple months because they might be worth nominating? (No promises, of course.)

Edit: My reviews are now up on the MITSFS book reviews site! Here is the review of The Stratford Man, and here the review of Julian Comstock.