August 9, 2006

University of Calif. joins Google book scan push

By Eric Auchard

SAN FRANCISCO -- The University of California has joined Google Inc.'s bid to scan the book collections of the world's great libraries, the organizations said on Tuesday, marking renewed momentum for a project nearly derailed by stiff resistance from publishers.

The top Web search company said it will fund the scanning of "several million" of the 34 million titles in the University of California's libraries, as part of a year-and-a-half-old project to make major library collections searchable online.

The University of California (UC) operates 100 libraries on 10 campuses across the state and ranks as the largest research and academic library in the world. California joins Harvard, Oxford, Stanford, the University of Michigan and the New York Public Library in the Google Book Search project.

Google is working with the U.S. Library of Congress on a similar effort.

For Google, the new momentum for its Book Search project is the latest in a string of high-profile deals it has announced over the past week. In that time it signed a major search and advertising contract with News Corp. and a video advertising and delivery deal with Viacom.

"We know that we will be digitizing several million volumes but not the entire 34 million" books in the California system, said Jennifer Colvin, a spokeswoman for the University of California's digital library arm.

But authors' and publishers' groups sued Google last year to block scanning of copyrighted library books, arguing that, akin to Napster's effect on the music industry, the digitizing of books might tempt consumers to stop buying printed works.

Google has countered that it is creating the electronic equivalent of a library card catalog for copyrighted works and that the library project plans to publish the full texts only of out-of-copyright books in the public domain.

For works under copyright protection, Google Book Search publishes only short snippets, a few sentences on either side of mentions of words a user has searched for. What online readers see is similar to Amazon.com's "Search Inside the Book" feature.


In response to the legal threats, several of Google's library backers said last year they would proceed with the scanning of public domain works, but deferred plans to digitize copyrighted books in order to steer clear of the controversy.

Michigan was alone in saying it planned to proceed with the scanning of both in-copyright and out-of-copyright materials.

Colvin said the University of California Libraries shared Michigan's view that Google's project enjoys "fair use" protection and had agreed to scan copyrighted works.

"UC and Google are both really committed to respecting copyrights," Colvin said.

A contract was hammered out only two days ago, Colvin said, adding that book scanning should begin within several weeks.

The University of California, along with the University of Toronto, is among the founders of a competing book scanning project called the Open Content Alliance (OCA), which is backed by Yahoo Inc. and Microsoft Corp.

In contrast to Google, the OCA has trodden gingerly around the issue of scanning copyrighted works and is focused on public domain works. Together with the non-profit Internet Archive, the OCA is aiming to create an online clearinghouse for historic books, audio and films.

The Google Book Search project is far larger in scope than the university's undertaking with the Yahoo- and Microsoft-funded group.

"OCA is on a smaller scale," Colvin said. "There won't be as many books as we are doing through the Google partnership."

Google Books product manager Adam Smith confirmed that the project would scan books numbering "in the millions," but declined to offer specific targets in terms of the number of books or the scope of financing Google planned to provide.