
Microsoft is Busily Playing Catch-Up in the British Library

February 6, 2008

A good place to see the true frontline in Microsoft’s battle with Google is deep in the bowels of the British Library in London.

In a room under the library’s red brick building near St Pancras railway station, a Microsoft-funded team is working 14 hours a day to scan shelf upon shelf of books.

Launched a year ago, the project will scan 25 million carefully preserved pages of the British Library’s 19th-century archive, around 100,000 books, over the next two years. Together with collections from other libraries including Yale and Cornell University, the pages are destined for Live Search Books, Microsoft’s answer to Google Book Search.

It’s a field where Microsoft is playing catch-up with Google, whose mass digitisation project already has around one million books online, 10,000 publishers and almost 30 major world libraries on board.

If the recent bid for Yahoo by the maker of the dominant PC operating system is an effort to win more online advertising, this painstaking copying points to a deeper flaw behind its weakness in advertising: search.

“When we are able to do a better job of answering people’s questions we are going to build loyalty and then ultimately increase the size of our user community,” said Cliff Guren, Microsoft director of Publisher Evangelism, a title the company hopes will attract libraries and publishers to its scheme.

“By doing this we increase our query share which helps us increase our advertising rates and that’s how our business makes money,” he said. Query share is the percentage of individual consumer web search requests attracted by services like Google and Microsoft.

Internet audience measurement firm comScore estimates only four per cent of internet searches worldwide use Microsoft’s engine, against 77 per cent through Google’s. Yahoo, the second-largest web search provider, has a 16 per cent share.

But Microsoft’s problem is not just that Google is bigger. As search technology advances, the real headache for the company, whose software currently drives most of the world’s computers, is that Google has its eye on a more sophisticated prize.

Google’s mission is “to organise the world’s information and make it universally accessible and useful” and to do that it is hoovering up not only books, but any data it can grab. An example is how anyone using a Google e-mail account is invited never to delete anything: that’s data Google can use.

Its initiative in mass digitisation stems from a vision of internet search as a tool to hunt not only the words we type in, but also the things we might have meant to type in, said Colin Gillis, an internet analyst at brokerage Canaccord Adams.

Like many concepts in the internet industry, this idea goes by an arcane title: the Semantic Web. It is already partly realised on some web sites, for instance when search engines recognise a common typing error. Achieving it requires data.

“The important component of the Semantic Web is mass digitisation,” Mr Gillis said. “You have to have all the little bits of data, all the little pieces. From comprehensive data sets come deeper insights.”

Even though the list of libraries and publishers on both sides in the Google-Microsoft digitisation race is growing fast, the challenge goes far beyond the world’s books. Jason Hanley, who manages Google’s Book Search partnerships in the UK, said there was no competition for exclusivity between his company and Microsoft.

“I wouldn’t say there is any arms race in terms of picking off certain people to work with and excluding others,” Mr Hanley said. “That wouldn’t make any sense as we’re trying to be comprehensive. These things are massive undertakings.”

Neither company will say how much they are investing, but Mr Guren said it was “a very substantial financial commitment”. The projects are strategic, said Danny Sullivan, editor-in-chief of SearchEngineLand.com.

Mr Sullivan said Google sets the tone by spending large sums of money to develop new businesses without rushing to make money back.

“Microsoft and Google are both building libraries and the way you get the books off the shelves at these digital libraries is through their search engine. Their search engine is an electronic librarian,” Mr Sullivan said. “The battle shouldn’t be over getting the books, the battle should be over who is building the best librarian.”

Latecomer Microsoft says it is taking a selective approach.

“Google’s general mission has been to organise the world’s information. With that they bring in the flotsam and the jetsam, the good with the bad because they’re casting their net far and wide,” Mr Guren said.

“We are taking a much more focused approach to figuring out what content we need to drive user satisfaction.”

(c) 2008 Birmingham Post; Birmingham (UK). Provided by ProQuest Information and Learning. All rights reserved.