December 21, 2003


The Great Library of Amazonia: 120,000 fully searchable texts and counting … Jeff Bezos is building the world's biggest digital book archive. It's an info-age dream come true - and the best way to sell books ever. (Gary Wolf, December 2003, Wired)

The fondest dream of the information age is to create an archive of all knowledge. You might call it the Alexandrian fantasy, after the great library founded by Ptolemy I in 286 BC. Through centuries of aggressive acquisition, the librarians of Alexandria, Egypt, collected hundreds of thousands of texts. None survives. During a final wave of destruction, in AD 641, invaders fed the bound volumes and papyrus scrolls into the furnaces of the public baths, where they are said to have burned for six months. "The lesson," says Brewster Kahle, founder of the Internet Archive, "is to keep more than one copy."

Kahle recently gave a copy of his digital archive of 10 billion Web pages to a new library in Alexandria. On a visit to the city last year, he sat down with Suzanne Mubarak, the wife of Egypt's president, and discussed his gift, which has all the advantages of a modern electronic resource: It can be instantly updated, easily searched, and endlessly replicated. Mubarak, with diplomatic politeness, allowed that she was impressed. Still, she ventured a protest: "But I love books!"

Therein lies a problem. Books are an ancient and proven medium. Their physical form inspires passion. But their very physicality makes books inaccessible to the multi-terabyte databases of modern Alexandrian projects. Books take time to transport. Their text vanishes and their pages yellow in a rash of foxing. Most important, it's still shockingly difficult to find information buried in books. Even as the Internet has revived hope of a universal library and Google seems to promise an answer to every query, books have remained a dark region in the universe of information. We want books to be as accessible and searchable as the Web. On the other hand, we still want them to be books.

An ingenious attempt to illuminate the dark region of books is under way at Amazon. Over the past spring and summer, the company created an unrivaled digital archive of more than 120,000 books. The goal is to quickly add most of Amazon's multimillion-title catalog. The entire collection, which went live Oct. 23, is searchable, and every page is viewable.

To build the archive, Amazon CEO Jeff Bezos has had to unravel a tangle of technological and copyright problems. His solution promises to remake the publishing business and give Amazon a powerful new weapon in its battle against online competitors such as Yahoo, Google, and eBay. But the most interesting thing about the archive is the way it resolves the paradox of the book, respecting its physical form while transcending its limits.

Like all the best aspects of the Internet, it's easier to see how this will be useful than how it will generate any revenues in the long run.

Posted by Orrin Judd at December 21, 2003 11:04 AM

So Amazon is finally becoming the cultural and economic revolutionary force its boosters have long claimed it was (even when it was just a mail order business)?

Posted by: M. at December 21, 2003 11:51 AM

Build it, and they will come.

Posted by: Robert D at December 21, 2003 12:02 PM

They also said that about public transport, Robert.

Posted by: M. at December 21, 2003 1:34 PM


The problem was we built roads too.

Posted by: oj at December 21, 2003 2:50 PM

The revenues question is the interesting one. Is this expected to sell books?

One doubts it will work. Having public libraries certainly doesn't help sell books, or so one would surmise from the example of Japan, which has wretched libraries and splendiferous bookstores. (I used to own a bookstore, and a Japanese bookseller once introduced himself. HIS bookstore was an entire four-story building! I've felt puny ever since.)

Posted by: John Weidner at December 21, 2003 3:15 PM

120,000 is the output of new titles in English in 4 years. A start but just a baby step.

Query: how many here have downloaded a text from such sites as the U. of Michigan's? Did you pay the donation? If you did it once, did you do it twice?

Interesting that the Wired writer would use the example of the Great Library of Alexandria and that, despite Google, he got the facts wrong. The library was destroyed, as a library, by Christian monks in the 2nd-3rd century. When the Muslims got there, there was a pile of scrolls but that was no more a library than the bales of recyclable paper down at the junkyard.

Making copies is a good idea. The two countries with the most complete remains of their medieval records are England and Korea.

In England, it just happened. Korea is the only society between Assyria and the modern age to deliberately plan to preserve its written records.

Three copies of each royal document were made. One set was kept at the palace, the other sets were deposited in remote monasteries where it was expected that they might be bypassed in the periodic invasions.

Curiously, it was the capital archive that survived and the monastery copies that didn't.

Posted by: Harry Eagar at December 21, 2003 6:20 PM

The library was destroyed, as a library, by Christian monks in the 2nd-3rd century.

Harry, what is your source for this statement? I've done some reading about the L of A, and from what I've read, the facts about its destruction aren't entirely clear. It appears to have suffered at the hands of many, Father Time no less than the others. Those who put the blame entirely on the Christians or the Moslems usually have an axe to grind against one or both.

Posted by: R.W. at December 22, 2003 12:32 AM



Posted by: oj at December 22, 2003 8:51 AM

A couple of observations from a librarian (although I am a law librarian and that is a bit different).

1. I take issue with the statement "it is still shockingly difficult to find information in books". Well, that is if you don't have good indexing. I would rather have a good index than keyword searching 9 times out of 10. For the lawyers out there, in most instances I'd much rather use the West Topic and Key Number System to find cases rather than a terms and connectors search on Westlaw or Lexis. And natural language searching just blows. BTW, the topic and key number system is essentially a huge indexing system for case law.

2. 120,000 volumes really is a drop in the bucket. I know you gotta start somewhere but that is not an impressive amount.

3. Suzanne Mubarak? Suzanne? Yeah, I remember Pharaoh Ramses married a Suzanne.

Posted by: pchuck at December 22, 2003 12:03 PM

"The Great Library" by some Italian whose name escapes me now is a handy summary of the extant texts regarding the library at Alexandria.

Other historians have urged that the author's conclusions should be used "with caution," meaning that the texts are more obscure than we'd like.

The most recent biography of Hypatia, called, cleverly, "Hypatia," makes it clear enough that the monks did most of the damage. It was the destruction of the Serapeum, which was the research institute associated with the library, which was, after all, just a roomful of scrolls, that doomed the library.

The monks ripped Hypatia limb from limb in their charming way.

By the time the Arabs got there, there wasn't much left to destroy.

Latest research tends to confirm Gibbon, as usual.

Posted by: Harry Eagar at December 22, 2003 3:21 PM

Hmm. I evidently have misremembered the title of the Italian's book, too, and I'm not at home now to find it.

The biography of "Hypatia of Alexandria" is by Maria Dzielska.

Posted by: Harry Eagar at December 22, 2003 3:32 PM

Having a senior moment, I guess.

The book is "The Vanished Library" by Luciano Canfora.

There is also something on the subject in an excellent but out-of-print history of the survival of the Organon, Burgess Laughlin's "The Aristotle Adventure."

Posted by: Harry Eagar at December 22, 2003 4:03 PM

I thought you might be referring to the Serapeum. Its destruction, however, occurred in 391 A.D., not in the 2nd or 3rd c. Battles would have been the cause of damage to the library in those centuries, not fanatical monks. How much of the library's holdings were stored in the Serapeum in 391--or even still existed at that date in the city of Alexandria--is pretty much a matter of guesswork. As you and I both say, definitive facts are hard to come by.

Whatever the uses of the Serapeum may have been, it existed as a pagan temple, and that is why it was destroyed. The Emperor Theodosius had made outlawing paganism his chief project, and the Serapeum was razed in order to supplant it with a church. The edifice was the target, not the contents--and again, we don't really know what the contents were.

Gibbon was well-informed but mistaken. Bias does that to the best of us. I'm glad you didn't mention Carl Sagan, who wasn't even well-informed.

Posted by: R.W. at December 22, 2003 10:28 PM

BTW, if it wasn't clear from the final statement in my first post, Harry is definitely right about one thing: the story, repeated in the Wired article, that Moslem invaders burned the library's books to heat the bath--that story is not held to be of much account.

Posted by: R.W. at December 22, 2003 10:32 PM

The destruction of the faculty of the Serapeum occurred earlier.

I think the dearth of evidence is itself very good evidence that it stopped being a functioning library long before the Arabs got there.

Posted by: Harry Eagar at December 23, 2003 1:56 AM