VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

Subject:
From:
Kelly Pierce <[log in to unmask]>
Reply To:
Kelly Pierce <[log in to unmask]>
Date:
Mon, 10 May 1999 22:14:56 -0500
From the web page
http://www.dinf.org/csun_99/session1012.html

Web Posted on: May 10, 1999


 CSUN 99 Papers

                        Dueling Scanners

Peter M. Scialli, Ph.D.

The second annual Dueling Scanners Event was held on Tuesday,
March 16, 1999 in conjunction with the CSUN Conference,
Technology and Persons with Disabilities. It is the world's only
venue in which vendors of computer based reading products for
the blind can present their products for side-by-side
comparisons before a live international audience. This report is
a result of what was heard and seen during that session. The
report was written entirely by the judges whose participation
was approved, in advance, by the vendors.

Following the text of the report are written responses from the
vendors who participated in the session. Here is the report:

As Jim Fruchterman, President of Arkenstone, put it, "Thanks,
Peter, for providing a deadline for adaptive OCR products to
announce new innovation." It is true, too: Dueling Scanners has
become the event of the year for putting competitors face to
face, or in this case, page to page.

Since this event receives wide coverage, and since this report
will be read by many in the blindness field, it is vital that as
much information as possible is presented so as to allow
individuals to make a more informed decision as to what OCR
software might best suit their needs, or the needs of their
clients.

This year the vendors and their representatives who participated
were: Jim Bliss from Jbliss Imaging, Jim Fruchterman and Mike
May from Arkenstone, and David Bradburn and Steven Baum from
Kurzweil Educational Systems Group (KESG).

Mr. Bliss showed VIP Info System running on a Pentium II 333 MHz
machine with a Microtek scanner connected via USB.

Arkenstone premiered Open Book: Ruby Edition, version 4.0, using
a Pentium II 400 MHz machine with an HP 5200C scanner using the
USB port.

Kurzweil Educational Systems Group showed its K1000 version 4.0
on a Pentium Celeron 333 MHz with an HP 4C scanner using a SCSI
connection.

The session was moderated by Peter Scialli of Shrink Wrap
Computer Products, and judged by Rich Ring, Supervisor,
International Braille and Technology Center at the National
Federation of the Blind and Larry Skutchan, Director of
Technology Projects at the American Printing House for the
Blind.

The participants were provided with a list of tasks and
questions to perform and answer ahead of time, so all three came
to the event with the same information. This list of questions
and tasks was also available to the audience upon arrival at the
session.

High audience participation and overall interest made the
answers to specific questions take a bit longer than time
permitted, so some of the questions were combined in hope that
all the material could be covered in the three-hour time slot.
The list of questions was also shortened because, with three
vendors taking part, it was somewhat time consuming for each
representative to discuss the system he had brought. While three
hours may appear to be a long time
to expect an audience to maintain interest, there was no
sleeping in this session.

The judges began by providing two notoriously difficult pages
for each vendor to scan. The first of these samples was a page
from "Workforce Diversity" magazine. Each vendor scanned the
first page and pointed out basic operation of their system in
the process. Each vendor described the difficult page and
pointed out how their product handled the various issues
presented by this document. This page, though it did not contain
graphic material, contained some unusual fonts that were handled
differently by each OCR system. VIP attempted to render this
material, and did an extremely poor job with the portion of the
page that contained the unusual font style. Ruby did about the
same. Kurzweil, on the other hand, skipped this portion of the
page entirely! In its default configuration, Kurzweil 1000 will
ignore "degraded" text. Though this setting can be changed, the
problem here is that in this case, since the blind user wouldn't
know that there was a portion of the page being skipped, he/she
would not realize that in order to see the entire page, for
better or for worse, this setting would in fact need to be
changed. Though in this instance, the page actually made sense
even with this portion left out, we believe that this does the
blind user somewhat of a disservice, since he/she would have no
way of knowing that there was in fact text on the page that
wasn't being properly recognized. It would be interesting to
understand what the Kurzweil 1000 program defines as degraded
text, since to some extent this, like so many other things, could
be subjective.
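
The judges' concern can be illustrated with a short sketch. How
Kurzweil 1000 actually defines or detects degraded text is not
stated in this report, so everything below is an assumption: the
confidence scores, the 0.6 threshold, and the read_page function
are hypothetical. The sketch only shows one way a reading system
could skip poorly recognized regions and still tell the user that
something was left out.

```python
def read_page(regions, min_confidence=0.6, skip_degraded=True):
    """Filter recognized regions by confidence (hypothetical model).

    regions: list of (text, confidence) pairs, confidence in 0..1.
    Returns (kept_texts, notice); notice is empty if nothing was skipped.
    """
    kept, skipped = [], 0
    for text, confidence in regions:
        if skip_degraded and confidence < min_confidence:
            skipped += 1          # drop the region but remember that we did
        else:
            kept.append(text)
    notice = f"{skipped} low-confidence region(s) skipped" if skipped else ""
    return kept, notice


# Hypothetical page: one clean region and one in an ornamental font.
page = [("Workforce news", 0.95), ("~#@ l|l", 0.2)]
kept, notice = read_page(page)
```

With skip_degraded=False, both regions are returned and the notice
is empty, which corresponds to changing the default setting
described above.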

The second difficult page provided by the judges was a page from
the Damark catalog that contained many graphics scattered
throughout the page, multiple columns, and many random sidebars
of information in different fonts. One of the most dramatic
moments of the show came after VIP performed in the mediocre
manner most of the audience expected from experience with
current OCR systems, and Ruby nearly flawlessly rendered the
page. It should be noted that Kurzweil 1000 did almost as well
with this page. Interestingly, Kurzweil and Ruby presented the
material in different order. Each vendor justified why their
system made the correct decision on how the text was presented,
and the arguments for each were persuasive. The page was, in
fact, one that a sighted person could read in more than one
order, as well, illustrating just one of the subjective
decisions these systems have to make. Optical character
recognition, in order to be effective, is far more than simply a
matter of turning scanned images into readable text. Retaining a
page layout that still allows the user to determine the contents
of a page in some kind of rational order is essential.
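
As a rough illustration of why two systems can present the same
page in different orders, consider the sketch below. It is not
how either product works; the Block type, the coordinates, and
the fixed column width are assumptions made for the example. It
shows two defensible reading orders for the same four text
blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    left: int  # x coordinate of the block's left edge
    top: int   # y coordinate of the block's top edge


def row_first(blocks):
    """Read across the page: top to bottom, then left to right."""
    ordered = sorted(blocks, key=lambda b: (b.top, b.left))
    return [b.text for b in ordered]


def column_first(blocks, column_width):
    """Read down each column in turn, assuming fixed-width columns."""
    ordered = sorted(blocks, key=lambda b: (b.left // column_width, b.top))
    return [b.text for b in ordered]


# Hypothetical two-column page with a sidebar in the right column.
page = [
    Block("Headline", 0, 0),
    Block("Sidebar", 300, 0),
    Block("Body text", 0, 100),
    Block("Caption", 300, 100),
]
```

Here row_first(page) gives Headline, Sidebar, Body text, Caption,
while column_first(page, 300) gives Headline, Body text, Sidebar,
Caption. Neither order is wrong, which is the point the vendors
were arguing.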

Vendors then went on to describe what it was about their product
that made it unique. Bliss demonstrated some impressive features
designed for low vision users. His system scanned the page
twice, once to gather a black-and-white image of the page for
OCR, and another time to acquire the color image of the page. If
the page contained more than one image, the user could save each
color image to a separate file. This process looked simple, but
it would clearly require some vision in order to choose which
images one wanted to save.

Given that Dueling Scanners 99 was billed as a comparison of
systems especially designed for the totally blind user, the
judges did not find this feature particularly valuable, but it
should be noted that it could indeed be a useful tool for a
person with usable vision.

Bliss also demonstrated a unique feature in scanning software, a
hand-held camera that could be used to enlarge paper documents
on the user's PC screen. This camera didn't have the resolution
to provide an image that would be suitable for optical character
recognition; it was strictly used for enlargement. Again, while
this was a nice feature for those who can benefit from it, many
members of the audience were especially looking for features
geared toward the totally blind user.

Using Xerox's Textbridge recognition engine, the Bliss system
did not provide the kind of accuracy demonstrated by either
Kurzweil 1000 or Open Book: Ruby Edition.

One problem that was obvious was a result of not using a scanner
with AccuPage. Many times during the session, brightness level
adjustments had to be made manually for VIP to read any text at
all.

Like the other vendors, Jbliss Imaging has wrapped a clean user
interface around their component of the system. VIP's simple
interface consisted of using the four corners of the numeric
keypad for control of the application. The interface seemed
simple and consistent, but the judges believe that including a
standard Windows interface would be a useful addition to the
product. VIP uses AT&T's FlexTalk for its speaking voice, and it
does not support other dedicated speech synthesizers as both
Ruby and K1000 do. This would not be an issue for someone who is
purchasing their first computer, but it is surely a drawback for
those considering adding optical character recognition software
to an already existing system, especially if one has a speech
synthesizer to which one has grown accustomed. Note that all
three systems come with a software text-to-speech engine that
allows the user to obtain speech through a sound card.

Featuring a revamped interface, Open Book: Ruby Edition is a
totally redesigned program which has maintained support for the
familiar "classic" Open Book menus for those who desire to use
them. Ruby is a far more powerful and flexible package than its
predecessor, and one would have to say that it was the most
improved program being shown. Ruby's new features include
editing functions such as cutting, pasting, moving, and
inserting pages, and a spell checker and thesaurus, as well as
dual-recognition engine processing. The new user interface
follows standard Windows user interface guidelines; therefore,
if a user is already familiar with the menus and dialog boxes
presented in Windows 95, 98 or NT, Ruby will be simple to
understand and work with. Since the "classic" Open Book menus
are retained as an option in this version, those who feel
comfortable with them need not venture into unknown territory.

Arkenstone's Ruby also incorporates many features that would be
especially useful to those with low vision. Those features
include, but are not limited to: support for numerous fonts and
font sizes, contrast adjustments, the ability to adjust the
spacing of words, lines and sentences, and the ability to
highlight text as it is spoken. Ruby also has an "exact view"
mode where an image of the original page, including graphical
elements, may be displayed and magnified during reading.

Ruby's dual-engine recognition process impressed the judges: It
submits the scanned image to the first engine for deskewing and
decolumnization, then uses the second engine to recognize the
individual components from the first engine. This innovation
represents a departure from the traditional use of a
commercially available OCR product in a specialty product for
the blind. It bespeaks some interesting technical possibilities
for the future. Ruby also supports the recognition of languages
other than US English. It will not support the recognition of
multiple languages on the same page, nor will it automatically
switch text-to-speech engines in order to accommodate those
languages. However, it does come with ViaVoice Outloud, which
has six text-to-speech engines for several other languages. A
total of thirteen recognition languages are furnished on the CD.
During the installation process, one is not given the
opportunity to install those engines automatically.

Both Ruby and Kurzweil support direct grade II Braille
translation and, according to Jim Fruchterman, the screen could
be read in interactively translated grade II if one has a screen
reader that supports this feature. The judges assume this is
true of all three systems. Arkenstone uses TurboBraille for its
grade II translation, and Kurzweil uses NFB Trans. The judges do
not know which Braille translation software program VIP is
using. The judges do not know if the translators involved had
been recompiled for Windows or if they are using the DOS version
of the software. The judges would not see the fact that a DOS
version of any Braille translation program was being used as a
disadvantage.

The programs do a reasonably good job when it comes to creating
quick and dirty Braille translation. This feature would be
useful in settings where Braille documents were required ASAP.
One example of this would be in an educational environment,
where a blind student needed Braille copies of handouts that
were being distributed to the sighted members of his/her class.
One thing to keep in mind here is that, no matter which of the
three programs one is using, the ability to scan and then emboss
will result in readable Braille, but it will not in most cases
result in properly formatted Braille. To create Braille
documents whose format is correct, a working knowledge of a
Braille translation software package and a word processor is
required.

K1000's host of new features highlighted the introduction of
version 4.0. This system comes with both Flex Talk and Lernout &
Hauspie text-to-speech engines. This feature impressed the
audience later in the session when David Bradburn scanned a page
from a pamphlet he found in his hotel room. The page was an AT&T
instruction page on how to make a long distance call. The same
information was presented in several languages. Amazingly, the
K1000 not only changed its recognition language, but also
switched text to speech engines to immediately read the
different languages as they appeared on the page. Neither of the
judges is fluent in German, French, or Spanish, but the system
sounded like it was recognizing and announcing the text
properly.

Ever since version 3.0, the K1000 has had some impressive
editing features that made document management a snap. These
included features like re-scanning a page, moving and deleting
pages, and direct editing capabilities. While both Ruby and
K1000 now support spell checking, Kurzweil sports a
simple-to-use feature that lets you put common scanning mistakes
and their correct spellings into a list so the software always
replaces the misrecognized word with its correct version during
the recognition process. The K1000 also optionally removes the
hyphens at the ends of lines when the system determines they
belong to a hyphenated word. If you scan a lot of books, this is
a function that you probably already perform with your word
processing software with search and replace or with macros, but
bringing this useful function directly to the user interface
exemplifies the kinds of new conveniences introduced by this
release.
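
For readers who currently do this cleanup by hand, the two
functions described above can be sketched in a few lines. This is
not Kurzweil's implementation; the correction entries, the
vocabulary set, and the function names are all assumptions made
for the example. The hyphen rule joins a word split at a line end
only when the joined form is a known word, and otherwise keeps
the hyphen.

```python
import re

# Hypothetical correction list: misrecognized form -> intended word.
CORRECTIONS = {"tbe": "the", "rnodern": "modern"}


def apply_corrections(text, corrections):
    """Replace whole-word OCR misreadings with their corrections."""
    if not corrections:
        return text
    alternatives = "|".join(re.escape(word) for word in corrections)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda m: corrections[m.group(1)], text)


def join_hyphenated_lines(text, vocabulary):
    """Rejoin words split by a hyphen at the end of a line.

    If the two halves form a word in `vocabulary`, the hyphen is
    dropped; otherwise it is kept (as in "side-by-side"). The line
    break is removed in both cases.
    """
    def join(match):
        first, second = match.group(1), match.group(2)
        if (first + second).lower() in vocabulary:
            return first + second
        return first + "-" + second

    return re.sub(r"(\w+)-\n(\w+)", join, text)


sample = "tbe rnodern recog-\nnition of side-\nby-side pages"
corrected = apply_corrections(sample, CORRECTIONS)
cleaned = join_hyphenated_lines(corrected, {"recognition"})
```

A real product would presumably use a full dictionary rather than
the one-word vocabulary used here.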

As stated earlier, one feature that both impressed and disturbed
the judges was the K1000's tendency to throw out poorly
recognized text entirely. David Bradburn made a convincing
argument for it, complete with a dramatic demonstration in which
the K1000's page sounded better than the same page scanned by the
other two systems. He demonstrated this on a sidebar that was
both a different size from the rest of the page and in a totally
different font. The continuity of the page seemed better when the
text was completely rejected. He also
assured them that it was possible to retain this partially
recognized text if the user was interested in placement on the
page or some other aspect of the less-than-perfect recognition.

We have already pointed out what we feel the drawbacks to this
approach are. Bradburn explained that K1000's distinguishing
qualities were its high degree of accuracy, host of features,
including international recognition and speech, and its
simplicity. Noting that most users could be up and running
within ten or fifteen minutes, Bradburn added that the system
came with a quick-start videotape and a Braille manual. The
setup was also the first self-voicing installation program in
the industry, according to Steven Baum. He made his point fairly
clear when he responded to a question about the feasibility of a
totally blind person installing the system by saying that if you
could insert the CD into the drive and come back in fifteen
minutes, you would have it installed.

The judges asked each vendor to bring and scan a page that would
highlight the unique capabilities of his particular system. This
is where David Bradburn brought that AT&T pamphlet page with the
multiple languages.

One of the vendors brought a page that had straight text at the
top of the page and skewed text toward the bottom of the page.
K1000 did a fine job on this page, but Arkenstone decided the
whole page was skewed and only got the skewed part of the page
right. VIP didn't do well, for some reason, on either part of
this page.

One of the judges' questions involved the issue of self-voicing
applications having the problem of user interfaces that don't
always provide adequate review capabilities. This is often a
problem with self-voicing applications when the user gets a
prompt that he/she didn't understand correctly the first time.
The question is, how does one get to repeat that information?
When you use a screen reader with off-the-shelf applications,
you use your screen reader's review commands to examine the
material you want to hear again. Since these self-voicing
applications provide their own interface and you don't run your
screen reader while using these programs, they must provide a
way to repeat relevant material.

VIP's answer to this problem was to press Escape and repeat the
command sequence that originated the prompt in the first place.
The problem with this approach is, of course, that there are
some situations where you won't want to press escape to cancel
the current operation, because you don't necessarily know or
remember what the current operation is. It is possible that you
will cancel something you don't want to cancel. Ruby and K1000
both provide a way to repeat prompts, but we did see cases where
the interface wasn't always as cleanly implemented as it could
be.

Each of the vendors demonstrated their product's ability to
store documents in a variety of word processor formats. Both
Ruby and K1000 provided direct translation to several word
processor formats including--in K1000's case--PeachTree Write,
an obscure format that nevertheless delighted the audience with
the sheer volume of word processors supported. VIP used the
standard Windows cut-and-paste capabilities, which have both
advantages and disadvantages. All three programs do permit
permanent storage of scanned documents in a variety of formats,
a feature that also makes it easier to emboss accurately
translated and formatted Braille.

Each of the three products supports ways of changing the color
of the foreground and background text and the size of the font.
All three programs also support a feature that shows a highly
visible mark that moves word by word as the program reads the
text. This is an extremely useful feature for low vision or LD
users.

When it came to showing other features not specifically outlined
already in the session, each vendor took a few minutes to
elaborate. Jim Bliss discussed how the hand-held camera that
came with VIP allowed the low vision user to examine hand
written notes and packaged goods--items, in other words, that
would not lend themselves to optical character recognition. This
is, again, a feature that requires some vision to use. Bliss
also showed how VIP could be used for email and Internet www
browsing. The VIP system has its own self-voicing web browser,
similar to IBM's Home Page Reader or The Productivity Works'
pwWebSpeak. Though such features are fine in their place, we do
not feel that they are important when it comes to an optical
character recognition package.

David Bradburn showed how the K1000 could be combined with voice
recognition software to provide a solution for people who can't
use the keyboard. While the judges do not feel that voice
recognition technology is advanced enough to allow the novice
user complete control over the system, the capability has
possibilities for individuals with limited motor skills. It
should also be noted that adding voice recognition capabilities
should be possible for all three systems.

In concluding, Jim Fruchterman emphasized Ruby's retention of
the Classic Open Book user interface as an option for
traditional users of the product. At the same time, Ruby offers
the full power of a true Windows 32-bit application. He also
described the detailed context-sensitive help that is now
present in the product, making Ruby a tool that almost anyone
should be able to use quickly and easily.

What conclusions might be drawn from Dueling Scanners 99?
Unfortunately, it is not a simple matter of one program being
vastly superior to another. Each has strengths and weaknesses.
But perhaps we can provide some general statements that might
assist one in determining which of these packages would be the
correct choice. For a totally blind user, the choice is
difficult. Both Arkenstone's Open Book: Ruby Edition and
Kurzweil Educational Systems Group's Kurzweil 1000 offer
accurate recognition and a wealth of features. Both programs can
be easily learned by novice and advanced users alike. Both
programs are fairly easy to install. Both of them allow the user
to work with a standard Windows interface or one that utilizes
the keys on a PC's numeric keypad. Insofar as recognition
accuracy is concerned, the differences in performance between
these two programs were not great enough to declare a clear
winner. Since in these judges' opinion, the most important
determining factor is accuracy, one could not go wrong with
either of these packages.

Having said that overall recognition accuracy is far too
close to call, one must make a choice between these two programs
based upon their respective feature sets. And, to some extent,
this is not an easy thing to do. Both programs allow one to
edit, use a spell checker, dictionary, and thesaurus, and they
both allow one to set up a list of launchable applications that
can be run without closing the respective programs. One feature
that we particularly like that is present in Kurzweil 1000 and
not available in Open Book: Ruby Edition, is the ability to
create and edit a list of corrections that can be applied to the
recognition process either automatically or upon the user's
request. But would this feature alone be enough to make a user
decide to purchase Kurzweil 1000 over Open Book: Ruby Edition?
We think not. We believe that users and rehabilitation
professionals alike owe it to themselves to examine each of
these programs carefully, keeping in mind their particular needs
as well as those of their clients.

It will be noted that thus far in these conclusions, no mention
has been made of VIP from Jbliss Imaging Systems. This is
because, in the opinion of these judges, this program is simply
not appropriate for use by a totally blind individual. There is
simply not enough verbal feedback built into this package. In
fact, even if an individual has some useable vision and his/her
primary purpose for purchasing the program is optical character
recognition, we believe that the greater accuracy provided by
both Open Book: Ruby Edition and Kurzweil 1000, and the low
vision features built into those programs, would still make
either of them a better choice. However, we feel that if some of
the other capabilities that VIP provides such as the ability to
process pictures and the use of a digital camera are important
to the low vision user, VIP is clearly the program to buy.

Though these conclusions do not provide the easy answers that
all of us long for, we feel that many of the features and
functions of these programs have been discussed in this report,
and that it is now up to you to decide which of these programs
is right for you or your client.

Here is the Response of Kurzweil Educational Systems Group to
the above report:

Before we begin with our response to some of the observations
and conclusions made by the judges at 'Dueling Scanners,' we
would like to extend our thanks to Peter Scialli for organizing
this event. It was an enjoyable start to CSUN 1999.

Here are our comments:
  * Price and Performance: No one addressed performance in the
    report. Halfway through the session, an audience member
    commented that Kurzweil 1000 (K1000) appeared to be the
    fastest system. That was confirmed by timings. KESI's
    response was that we were glad to hear that because we were
    using the slowest PC of all the vendors - a 333MHz Celeron
    computer. In addition, we announced that K1000 pricing was
    now $995 ($1,195 with DECtalk Access-32 speech). We also
    think it is worth noting that Arkenstone used a hardware
    DECTalk synthesizer in their system. We are not sure why,
    given that their product now includes software speech, but
    it does, of course, have a significant impact on the final
    price to the customer.
  * Degraded Text: The K1000 is the only product to offer the
    choice of recognizing everything on a page - poor quality
    print and coffee stains alike - or ignoring them for a
    clearer understanding of a document. We find that most
    people don't want to hear long sequences of punctuation
    marks. None of the systems were capable of recognizing the
    text printed in a highly ornamental font. Only the K1000 was
    capable of ignoring it entirely, if the operator so chose.
  * Accupage: A reference indicated that the high level of
    recognition accuracy exhibited by K1000 [and Ruby] was
    through using a scanner with Accupage. K1000 did not make
    use of Accupage at any time during the session. Nor did we
    adjust the brightness setting.
  * Editing: K1000 has included the Editing feature since v2.0,
    which we launched in the summer of 1997.
  * Test Documents: While no mention was made of the documents
    brought by the other vendors, it was KESI who provided the
    skewed document. The AT&T pamphlet mentioned was actually
    used to address the question of international languages.
    This document demonstrated K1000's unique ability to
    correctly OCR and speak [in the relevant language] by
    paragraph. As we recall, clapping was involved at this
    point.
  * Closing Statements: Not all of what was said by KESI was
    captured in this report. Of specific note, we mentioned our
    industry-leading 12 months of FREE updates and superior
    customer support that is available 12 hours per day, a claim
    the other vendors did not choose to contest when invited to.
    We demonstrated another unique feature, 'Text
    Summarization,' which summarizes the open document.
  * We agree with the judges' conclusion that it is best for a
    user to examine each of these programs for themselves,
    keeping in mind their own particular needs. We were the only
    vendor who distributed free demonstration CDs at this
    session and throughout CSUN, making such an examination
    possible.

Following is the response from Arkenstone, Inc.

I was personally delighted by the opportunity to demonstrate the
newest version of Open Book at Dueling Scanners. We appreciate
the effort put forth by the judges in crafting a well-written
report that highlights the state of the art in reading systems
for people with visual impairments, and of course Peter Scialli
for organizing it.

The most important point I can make is to encourage prospective
users to test out the products they are interested in on their
own books and documents. Seeing your own documents read is the
fundamental test of a reading system, and allows you to assess
that most important factor: accuracy. We feel that our new
dual-OCR technology will provide the best possible results, but
please judge this for yourself! Testing the program also gives
valuable feedback about the design and support built into the
product. At Arkenstone, we pride ourselves on the care and
attention we place on design issues and incorporating feedback
from our many thousands of users around the world.

We would like to expand on a few of the issues mentioned in the
report. Our thirteen OCR recognition languages are all installed
automatically, so that our users can scan in documents in many
different languages. The French, German, Italian and Spanish
ViaVoice Outloud speech synthesizers do require a separate
install to be run on our Ruby CD. Our self-voicing install is
based on the industry standard InstallShield technology, and
offers the user the choice between a typical automatic install
or a custom installation. The custom installation is important
so that our users can make their own decisions about major
installation issues, such as which speech synthesizer drivers
they want to be available inside Open Book.

Our Windows-standard user interface is designed to provide our
users with a very well behaved example of the Windows User
Interface. Ruby not only repeats prompts, but also makes it easy
to spell out difficult strings, such as file names. Open Book
has long served as a first product for people just getting started
with PCs, and we think that our careful attention to design has
extended Open Book's strengths in this area. Users have always
appreciated our ability to read documents smoothly and
naturally, taking care of issues like removing hyphens from
scanned text. At the same time, we've built in many of the
powerful capabilities requested by our users.

In conclusion, Arkenstone values our relationship with each of
our more than 20,000 users, and our goal is to provide them with
the best possible reading tools. Our commitment to this shows in
our products, our service and our attitude. We hope that
everyone interested in reading tools will take the time to test
our commitment and our products!

Jim Fruchterman

The following response was provided by Jim Bliss of Jbliss
Imaging:

Dueling Scanners 99 provided an excellent and informative
comparison among three products. Even though this session was
billed as a comparison of systems especially designed for the
totally blind user, there were many in the audience who were
interested in the needs of users with low vision. In evaluating
the conclusions, it is important to keep in mind that VIP is
intended for a low vision audience, as well as blind, which is a
different target audience than the other two products. With this
audience, it is important that the visual displays, as well as
the speech, enable reading to be as fast and easy as possible.
VIP's wide range of text attribute adjustments, and choice of
four different viewing modes, means that it can be optimized for
most visual impairments. We also believe that the combination of
features in VIP, (e.g., picture viewing, e-mail, memo writing,
Internet, address database, etc. as well as scanned document
reading) meets the needs of our target audience in an easy to
learn, efficient, and cost effective manner.

With respect to the ease of installation and learning, VIP has
made this straightforward and simple without vision. The
installation CD is fully voiced and there is built-in contextual
help in both speech and large print, as well as a full manual on
line. In addition there is a six-cassette tape audio tutorial as
well as an audio CD tutorial.

Since the CSUN Conference, VIP's scanning and OCR accuracy have
been improved.

----------


VICUG-L is the Visually Impaired Computer User Group List.
To join or leave the list, send a message to
[log in to unmask]  In the body of the message, simply type
"subscribe vicug-l" or "unsubscribe vicug-l" without the quotations.
 VICUG-L is archived on the World Wide Web at
http://maelstrom.stjohns.edu/archives/vicug-l.html

