From The New York Times
April 3, 1997
New Software Greatly Advances Computer Dictation
By STEPHEN C. MILLER
Continuous speech has been the Holy Grail of the voice/speech
recognition industry for years, and while it hasn't quite been
achieved yet, Dragon Systems Inc. unveiled a new computer dictation
system on Wednesday that represents a significant step forward in
that quest.
"Everybody knows that we will be able to talk to our computers,"
the actor Richard Dreyfuss told a news conference in New York
announcing the new system, known as Naturally Speaking. "But the
question has always been: when?" Dreyfuss, who identified himself
as an unpaid, unofficial spokesman for the Newton, Mass., company
("I'm just a fan"), implied that the when was now.
Naturally Speaking is a so-called continuous-speech program,
meaning that it allows a user to speak normally to the computer and
have the words appear on the screen. Current speech or
voice-recognition products require users to talk in what is known
as discrete speech, meaning that each word must be pronounced
individually, separated from the next word by a pause.
Joel Gould, Dragon Systems' engineering manager, gave the most
compelling demonstration of the product by using it while
delivering his prepared remarks so that the words he was speaking
appeared simultaneously on a giant computer screen. The system
produced very few errors, and the mistakes it did make were easily
corrected. To prove that it was not just a canned demonstration,
Gould read from today's front page of The New York Times.
"Acknowledged" was interpreted as "a knowledge" and Webster
Hubbell's last name printed out as "howl," but the system generally
worked well.
A new feature of the program is the integration of command and
dictate modes. Dragon Systems' current product, Dragon Dictate,
requires the user to switch between command mode (for example, when
the speaker says "new paragraph and indent") and dictate mode (when
the speaker says, "The quick brown fox . . ."). Not having to change
modes makes the product easier to use.
Yet some complications are unavoidable. For example, the user must
specify all the punctuation, which represents a new skill for most
people and can be cumbersome.
Another challenge in designing dictation programs is making them
speaker independent. A speaker-independent system will understand
the speech of anyone, but that is very difficult to achieve since
accents, regional pronunciations and everyday mumbling make for
significant differences in how people speak.
Speaker-dependent software makes a user "train" the program to
understand that particular user's speech patterns. It can take days
to train some dictation programs to a degree of accuracy that makes
the program useful. Usually the user has to recite a long list of
words that represent the program's "vocabulary." But even after
training, something as simple as a case of the sniffles can make
that user unintelligible to the software.
Dragon Systems isn't claiming total speaker independence, but Dr.
Janet Baker, the company's president and cofounder, said that
Naturally Speaking significantly reduced training time. "It takes
about 18 minutes to train the program," Baker said. That amount of
time, she asserted, is insignificant when compared to how long it
takes most people to learn to type.
Scott Miller, a senior industry analyst at Dataquest, a San Jose,
Calif., research firm, said he was impressed after being given an
advance peek at the software. "It really works," Miller said. "And
it gets better the more you use it."
______________________________________________________________
The big unknown, as usual, is what Microsoft is cooking up.
______________________________________________________________
While Miller was quick to acknowledge the product's imperfections,
he said that it was ahead of most of the competition. He said that
the major advantage of Naturally Speaking was that it represented
the first step in getting people past the biggest barrier to using
computers -- the need to type. Just about everyone can talk.
The big unknown in the voice-recognition industry right now is what
Microsoft Corporation has up its sleeve. The maker of the Windows
operating systems has long made clear its interest in incorporating
voice recognition technologies in future products. Several years
ago, Microsoft licensed some technology from Dragon, but it is
currently traveling its own road and keeping its plans under wraps.