VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

Sender: "VICUG-L: Visually Impaired Computer Users' Group List" <[log in to unmask]>
Date: Mon, 12 May 2003 09:21:42 -0700
From: Jeff Samco <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

It is not clear in the article, but I presume digitizing means an image scan
rather than OCR processing to an electronic text version.  If this is so,
the article is interesting but does not offer much progress toward improved
access for those of us who cannot read electronic images any better than paper-based
ones.  Or, might we be able to run such text images through an OCR engine?

Jeff
Grass Valley, California
-----Original Message-----
From: VICUG-L: Visually Impaired Computer Users' Group List
[mailto:[log in to unmask]]On Behalf Of Kelly Ford
Sent: Sunday, May 11, 2003 10:27 PM
To: [log in to unmask]
Subject: NYTimes.com Article: The Evelyn Wood of Digitized Book Scanners

This article from NYTimes.com
has been sent to you by [log in to unmask]


The Evelyn Wood of Digitized Book Scanners

May 12, 2003
By JOHN MARKOFF

PALO ALTO, Calif., May 10 - Putting the world's most
advanced scholarly and scientific knowledge on the Internet
has been a long-held ambition for Michael Keller, head
librarian at Stanford University. But achieving this goal
means digitizing the texts of millions of books, journals
and magazines - a slow process that involves turning each
page, flattening it and scanning the words into a computer
database.

Mr. Keller, however, has recently added a tool to his
crusade. On a recent afternoon, he unlocked an unmarked
door in the basement of the Stanford library to demonstrate
the newest agent in the march toward digitization. Inside
the room a Swiss-designed robot about the size of a sport
utility vehicle was rapidly turning the pages of an old
book and scanning the text. The machine can turn the pages
of both small and large books as well as bound newspaper
volumes and scan at speeds of more than 1,000 pages an
hour.

Occasionally the robot will stumble, turning more than a
single page. When that happens, the machine will pause
briefly and send out a puff of compressed air to separate
the sticking pages.

For Mr. Keller, the robot, made by 4DigitalBooks, one of
two companies now introducing the first automated
digitization systems, is a boon.

"Think about the power of bringing our library to little
schools in the middle of Africa," Mr. Keller said. "Would
it make a difference for those who now have their minds
closed to the idea of democracy?"

The first book-scanning robots were introduced this spring
by 4DigitalBooks of St. Aubin, Switzerland, and Kirtas
Technologies of Victor, N.Y. The machines have already
begun to generate interest from libraries and private and
nonprofit groups now working to digitize books.

Until now, the job has been done mostly by students or
armies of low-cost workers in countries like India and the
Philippines. But manual digitization presents significant
logistical problems. Book collections may have to be moved
long distances to digitization centers.

And in some cases the process of scanning has damaged old
books and journals, making it necessary to rebind them
afterward.

The digitizing machines, by contrast, can be located close
to book collections and offer speed and quality control
unattainable by manual systems.

Even so, manual processing is still less expensive in many
cases than acquiring a robot. The 4DigitalBooks robot,
whose price neither the company nor Stanford officials
would disclose, becomes cost effective on projects larger
than 5.5 million pages, said Ivo Iossiger, the company's
chief technology officer and a co-founder. It seems likely
that the vast majority of digitization over the next
several years will be done by hand.

Mr. Keller admits that his dream to have the entire
Stanford library in a digital database is unlikely in the
foreseeable future because such an undertaking - involving
eight million volumes - could cost upward of $250 million.

In the meantime, the Stanford librarians have begun
digitizing books and documents that face no thorny
copyright barriers and that have important historical and
political significance.

The newly installed robot is currently finishing two pilot
projects, scanning books published by Stanford's Center for
the Study of Language and Information and works for the
Medieval and Modern Thought Text Digitization Project. It
will soon begin work on the 2,500 titles published by the
Stanford University Press.

Not long ago Stanford helped finance the manual
digitization of the presidential papers of Eduardo Frei,
the former president of Chile, who was concerned that
records of his administration could be lost in a coup.

And beginning in 1999, the Stanford library system sent a
team of specialists and students to Europe, where the
university is engaged in a multiyear project to digitize
selected documents produced by the General Agreement on
Tariffs and Trade and its successor organization, the World
Trade Organization in Geneva. The project, which will take
five years, will ultimately scan about 2.2 million pages of
information.

Other ambitious undertakings like Carnegie Mellon
University's Million Book Project will also continue to
rely on manual digitization for several more years. Another
project, led by the Internet Archive in San Francisco,
recently shipped 80 tons of old books acquired from the
Kansas City Library to Hyderabad, India, where they will be
scanned, according to Michael Lesk, a former National
Science Foundation official and digital library expert who
works with the archive.

Mr. Lesk said that currently in India or the Philippines it
is possible to scan and digitize a book for $1 to $4. But
he acknowledged that there were significant costs in
quality control.

For Mr. Keller the most vexing challenges are neither labor
costs nor technology. Librarians, he said, must find a way
to address the copyright restrictions that appear to be
tightening as a result of new federal laws like the Digital
Millennium Copyright Act of 1998.

Stanford is struggling to comply with copyright
restrictions while making works that have recently lost
their copyright protection available digitally. Mr. Keller
said the library increased the circulation of its
collection by 50 percent when it computerized its card
catalog. Digitizing out-of-print books could likewise make
them available to a much wider audience, he said. The
payoff for building such a digital collection, he added, is
vastly improved availability of a huge store of knowledge
and information for teaching, learning and research.

http://www.nytimes.com/2003/05/12/technology/12TURN.html?ex=1053717211&ei=1&en=36b40b488d286d54



Copyright 2003 The New York Times Company


VICUG-L is the Visually Impaired Computer User Group List.
To join or leave the list, send a message to
[log in to unmask]  In the body of the message, simply type
"subscribe vicug-l" or "unsubscribe vicug-l" without the quotations.
VICUG-L is archived on the World Wide Web at
http://maelstrom.stjohns.edu/archives/vicug-l.html



