

Inside Google Book Search:


If I handed you a book and asked whether it was in copyright or in the public domain, you'd probably turn to the copyright page first. Unfortunately, a copyright page can't answer that question definitively -- at best, it could tell you when the book in your hands was published, and who owned the rights to it at that time. Ownership can change, though: rights revert to authors, and after enough time has passed, the book enters the public domain, letting people copy and adapt it as they wish.

So how much time is "enough"? It varies, often depending on the country, on when the book was published, and on whether the author is living. For U.S. books published between 1923 and 1963, the rights holder needed to submit a form to the U.S. Copyright Office renewing the copyright 28 years after publication. In most cases, books that were never renewed are now in the public domain. Estimates of how many books were renewed vary, but everyone agrees that most books weren't renewed. If true, that means that the majority of U.S. books published between 1923 and 1963 are freely usable.
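As a rough illustration, the renewal rule described above fits in a few lines of Python. The function names, the return strings, and the `was_renewed` flag are assumptions made for this sketch, not part of any official tool, and the result is a first-pass guess rather than a legal determination.

```python
def renewal_due_year(pub_year: int) -> int:
    """Year in which a renewal had to be filed: 28 years after publication."""
    return pub_year + 28


def renewal_rule_status(pub_year: int, was_renewed: bool) -> str:
    """First-pass guess based only on the 1923-1963 renewal rule.

    Hypothetical helper: it ignores foreign publication, restoration,
    and every other factor that can change the answer.
    """
    if not 1923 <= pub_year <= 1963:
        return "outside the 1923-1963 renewal window; this rule does not apply"
    if was_renewed:
        return "renewed; likely still in copyright"
    return "never renewed; likely in the public domain"
```

For a book published in 1940, `renewal_due_year(1940)` gives 1968, which is the year to look for in the renewal records.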

How do you find out whether a book was renewed? You have to check the U.S. Copyright Office records. Records from 1978 onward are online (see http://www.copyright.gov/records) but not downloadable in bulk. The Copyright Office hasn't digitized their earlier records, but Carnegie Mellon scanned them as part of their Universal Library Project, and the tireless folks at Project Gutenberg and the Distributed Proofreaders painstakingly corrected the OCR.

Thanks to the efforts of Google software engineer Jarkko Hietaniemi, we've gathered the records from both sources, massaged them a bit for easier parsing, and combined them into a single XML file available for download here.

There are undoubtedly errors in these records, but we believe this is the best and most comprehensive set of renewal records available today. These records are free and in the public domain, and we hope you're able to use them to determine the copyright status of books that interest you.
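For anyone who wants to search a large XML file of records like this one, here is a minimal sketch in Python. The post does not describe the file's schema, so the tag names `record` and `title` are placeholders to be replaced with whatever the real file uses, and the sample data is made up for the demonstration. The sketch streams the file with `iterparse` rather than loading it all into memory.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample data; the real file's element names may differ.
SAMPLE_XML = (
    b"<records>"
    b"<record><title>The Example Book</title><year>1950</year></record>"
    b"<record><title>Another Title</title><year>1951</year></record>"
    b"</records>"
)


def find_renewals(source, title_fragment, record_tag="record", title_tag="title"):
    """Yield each record whose title contains title_fragment (case-insensitive).

    source is a file path or a binary file object. record_tag and title_tag
    are assumed names, not taken from the actual download.
    """
    needle = title_fragment.casefold()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        title = elem.findtext(title_tag) or ""
        if needle in title.casefold():
            # Copy the record's direct child fields into a plain dict.
            yield {child.tag: (child.text or "").strip() for child in elem}
        elem.clear()  # release the record's children once it has been handled
```

Calling `list(find_renewals(io.BytesIO(SAMPLE_XML), "example"))` returns the one matching sample record. An empty result from the real file would only mean that no record matched the search string, not that the book was never renewed, since the records are known to contain errors.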

At Google, we're committed to making as many books available online to users as possible while respecting copyright, and this is one example of that commitment. Watch this space for more to come.

This is great news for historians, journalists, researchers, publishers, and librarians. It's also great for the Open Content Alliance and other book digitization projects.

Of course, this does not help much with books published and copyrighted outside of the United States. But that's always a complication.

However, I wonder if Google itself is going to use these records to change the format of many of the scanned books published between 1923 and 1963. Currently, these are only available in "snippet" form. Will Google Book Search change significantly now that this file is available?


Comments (2)

According to the basic guidelines, pre-1923 works should be PD. However, there are many of these, including old copies of classics, which are blocked on Google. For example, pre-1923 issues of Publishers' Weekly should not be blocked but they are.

Sometimes this is because some enterprising entrepreneur has taken the PG or MS or G files and "published" them with a print on demand system. They don't get new copyrights on the PD material (only something new they added like illustrations, an introduction, index, etc) yet they are blocked on Google Books.

I think Google is trying to avoid antagonizing too many publishers. The end result is that they are granting them a copyright extension beyond the statutory limit.

James

Peter Hirtle on July 17, 2008 5:09 PM:

Siva, it is not just foreign works that are not helped by the Google records. Copyright restoration has made it almost impossible to determine US copyright status, as I describe in my article "Copyright Renewal, Copyright Restoration, and the Difficulty of Determining Copyright" at http://www.dlib.org/dlib/july08/hirtle/07hirtle.html.

Here is the conclusion:

This paper has demonstrated that it is almost impossible to determine with certainty whether a work published from 1923 through 1963 in the US is in the public domain because of copyright restoration of foreign works. First you have to determine if the work was also published abroad or if it is based on or derived from a work published abroad. If a foreign edition is found, one then has to establish the order of publication, and whether the foreign publication occurred less than 30 days before the US publication. If foreign publication was more than 30 days before American publication, one next needs to determine if publication occurred in an eligible country and if at least one of the authors of the work was living in or a citizen of an eligible nation. Checking the copyright renewal database is still important, but only after one has determined that the work's foreign copyright was not restored or that it does not draw upon subsisting foreign copyrights.

Copyright restoration has been criticized for unnecessarily removing thousands of foreign-published works from the public domain in the United States. What has been little noticed up to now is its negative impact on the determination of the potential public domain status of works published in the US. In many cases the impossibility of determining with certainty the absence of subsisting foreign copyrights in American publications that otherwise would be in the public domain means that American institutions will either have to keep these works inaccessible to the general public or risk the possibility of an infringement suit.
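The order of checks in the conclusion quoted above can be sketched as a small Python function. Everything here is an illustrative assumption: the `Work` fields, the status strings, and the treatment of a gap of exactly 30 days, which the quoted text does not specify. It is a reading aid for the checklist, not a way to settle the status of a real book.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Work:
    us_publication: date
    renewed: bool
    # None means no foreign edition was found, which is not proof that none exists.
    foreign_publication: Optional[date] = None
    published_in_eligible_country: bool = False
    author_eligible: bool = False  # an author living in or a citizen of an eligible nation
    derived_from_foreign_work: bool = False


def first_pass_status(work: Work) -> str:
    """Walk the checks in the order the quoted conclusion gives them."""
    if work.derived_from_foreign_work:
        return "check the underlying foreign work for a subsisting copyright"
    if work.foreign_publication is not None:
        lead_days = (work.us_publication - work.foreign_publication).days
        # Assumption: a gap of exactly 30 days is treated as not "more than 30 days".
        if lead_days > 30 and work.published_in_eligible_country and work.author_eligible:
            return "foreign copyright may have been restored"
    # Only now does the renewal database become the deciding check.
    if work.renewed:
        return "renewed; likely still in copyright"
    return "never renewed; likely in the public domain"
```

The point of the sketch is the ordering: the renewal lookup comes last, and each earlier input is a fact that may be hard or impossible to establish with certainty.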


A book in progress by

Siva Vaidhyanathan


This blog, the result of a collaboration between me and the Institute for the Future of the Book, is dedicated to exploring the process of writing a critical interpretation of the actions and intentions behind the cultural behemoth that is Google, Inc. The book will answer three key questions: What does the world look like through the lens of Google? How is Google's ubiquity affecting the production and dissemination of knowledge? And how has the corporation altered the rules and practices that govern other companies, institutions, and states?
