C.B.R.C. TORCH
December 2000
BUT IT'S ALL SO EASY!
By Duane Christianson
Note: This is the first in a series of articles about what
the Computer Access Training department at Hines (known locally
as CATS) is doing with speech-input technology to control
computers. Since I am supposed to be the department's guru on
this subject, I get to make my observations as well as
confessions about this kind of technology public.
I admit it. I'm getting lazy. I don't want to sit up straight for
hours in an office chair anymore in front of my personal computer
(PC) and pound away on the keyboard. I want comfort, access to
the fridge, and a pot of my favorite English tea. As a matter of
fact, I want to sit on the couch and talk to my PC the way
Captain Kirk talks to his computer on the bridge of the
Enterprise. Yes. I want a lot, and I'm not going to get it any
time soon.
I can tell you, however, what I can get right now. And that is
useful speech-input technology. It can control computers,
specifically microcomputers running under the "Windows 98" or
"Windows NT" operating systems, but it involves compromises. They
force me to think about what I really need, not just what I want.
All rehabilitation technology involves compromise, and this is no
exception.
The CATS program considers visually impaired or blind veterans
for speech-input training only if it is determined that they
have serious physical problems using computer keyboards. So far,
all of those who have been accepted have been able to press at
least a couple of keys on the keyboard.
This has meant that we have not had to create a system that can
be operated completely hands-free. Such systems for visually
impaired persons are quite difficult to control and to learn.
We make sure that our students have the skills to use speech-
input: They must remember specific verbal commands, have speech
patterns the computer can translate into English, and either hear
what the computer is saying to them or see what is displayed on
the screen. They also need to be able to think through the problems
that inevitably develop.
At the moment, we are using a program called "Dragon Naturally
Speaking Professional." (Don't worry about the spelling -
computer companies have a bad habit of joining words together.)
Because we lack a mandate to do research into using other
software programs that handle speech input, we have worked with
the best one that we know about. It is good and is getting
better, but don't expect that it is easy. Captain Kirk would have
a tantrum if he had to use it. But in "Star Trek" he could get
the Enterprise computer to create a Beef Wellington dinner along
with a good Rhine wine out of some kind of energy soup. We just
want documents that are spelled and formatted correctly.
When we first began working with speech input several years ago,
back in the days before Microsoft "Windows" had taken over more
than 85% of the world's microcomputers, we used a product called
"Dragon Dictate." It was slow and understood only one word at a
time - maybe. If you happened to be totally blind, you had to
listen to each word being spoken and then spelled after you said
it. At the moment, we are using "continuous speech recognition."
That only means that you can say an entire phrase rather than
just a single word without freaking out the computer. Both speed
of recognition and accuracy have increased enormously. Words are
spoken but they are not spelled back to you as you say them,
however, so there is enormous pressure on the computer user to
listen for odd pronunciations by the speech output system. That
doesn't guarantee that the correct word has been placed on the
screen. The words "to" (as in toward), "too" (as in also), "two"
(as the written number), and "2" (as in the Arabic numeral) all
sound the same. Now, before I scare you too much, I should say
that it is possible to go back and check how something is spelled,
but that requires the will to go back and check.
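For readers who like to tinker, here is a minimal sketch in Python of the kind of "go back and check" pass described above. It is not part of Dragon and does not use any Dragon feature; the function name flag_homophones and the short word list are made up purely for illustration. It only points out words that have sound-alikes, so that a person can decide which spelling was meant.

```python
import re

# Hypothetical, hand-picked sets of sound-alike words (illustration only).
# A real list would be far longer.
HOMOPHONE_SETS = [
    {"to", "too", "two", "2"},
    {"rain", "reign", "rein"},
    {"way", "whey", "weigh"},
]


def flag_homophones(text):
    """Return (position, word, alternatives) for each word with sound-alikes."""
    flagged = []
    # Split the dictated text into simple word tokens.
    words = re.findall(r"[A-Za-z0-9']+", text)
    for position, word in enumerate(words):
        for group in HOMOPHONE_SETS:
            if word.lower() in group:
                alternatives = sorted(group - {word.lower()})
                flagged.append((position, word, alternatives))
    return flagged


if __name__ == "__main__":
    dictated = "I went too the store in the rain"
    for position, word, alternatives in flag_homophones(dictated):
        print(f"word {position + 1}: '{word}' could also be {alternatives}")
```

The sketch cannot tell which spelling is right. It only narrows down where to look, and the checking is still up to the person doing the dictating.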
There are some wonderful myths about speech input technology.
Here is my short list of myths:
1. It is easier to learn than to type on a keyboard.
2. The computer knows the word you want and its correct spelling,
so you don't need spelling skills.
3. It will figure out the punctuation and grammar, so you don't
need to worry about all that stuff you hated when taking English
in grammar school.
4. It won't respond to or write profanity.
5. You can control any software program with it.
6. It will work with any speech output or large print output
program.
7. You can take old tape recordings and have them automatically
turned into text.
All of these things are false. I did say "myths," didn't I? There
should be no surprise here; you still have to be smarter than the
computer. And the great thing is that you are!
Now, just to give you an idea of how seriously we treat this
technology, let me list a few of my favorite rules for dealing
with Dragon Systems' "Naturally Speaking" or any other dragons,
for that matter: Pause. Think. Think very carefully. Plan. Don't
panic. Assign panic to someone else! NEVER rush. Absolutely
NEVER! If you ignore these rules, the Dragon will chase you, catch you, and eat
you up slowly, surely and painfully.
Now, what in the world could I mean by the last crazy warning?
Well, Dragon Naturally Speaking listens to what you say and takes
a good guess, only a guess that you, as the computer user, have
to verify or correct. Basically, Dragon tries to match the
English noises you make with written English words. You want to
make sure that Dragon types "cat" when you say "cat." And you
want to make sure that Dragon doesn't save a memory of the
mistake so that "cat" is remembered as "rat." Now, you see why
talking to a dragon can be dangerous.
Dragon is confronted by a lot of problems from the start. I'll
just mention one in this paragraph, and that is words that rhyme.
Let's say you said, "rain" (as in precipitation), but Dragon
typed in "reign" (as what kings are supposed to do for a living).
Oh, there are worse choices. You can come up with your own list
of rhyming words. English is full of them.
There are also what one might consider phrases that remind you
almost of rhyming. Let me give you a few examples of the phrases
that one of my students said. A spoken phrase is followed in
parentheses by what the computer heard the student say.
grounded in dreams
(grounded in geraniums)
a bird is singing
(burgers sinking)
fraternity
(truck 19)
lecturing us on
(luxury us on)
I've rarely
(highbrow early)
And now let me put two nightmarish sentences together for you.
(They appear in quotation marks.)
"The I.D. of a sit e is vary auld. It jest hap ends to be the
whey most of OZ live in the twentieth cent cherry." Now, who
could be so stupid as to write like that? Well, a computer, of
course!
What are the real sentences? "The idea of a city is very old. It
just happens to be the way most of us live in the Twentieth
Century."
So what do we suggest to help avoid such problems? Enunciate!
Enunciate! Enunciate. After that, check your work carefully as
you proceed because Dragon learns your speech patterns, and you
don't want it to learn the wrong lessons. Also assume that you
are talking to a hall full of people, not just to a friend. Speak
clearly and distinctly. Friends can understand when you are tired
or ticked off or kind of mumbling through sentences. Friends can
also understand your particular dialect of English. People
understand you in quite a different way than computers are
prepared to.
Make no mistake. Dialect is significant. Any dialect of English
that de-emphasizes word endings or produces an inordinate number
of rhyming words can cause a lot of problems. I have had students
who swore on a stack of computer manuals that they said "desks"
when they said only "desk." I have also had students whose
pronunciations of "fund," "fun" and "fond" were indistinguishable.
And let's not forget the weak verbs. If you can't hear the
difference between "dive" and "dived," the computer won't either.
It is not smart enough yet to figure out what you intend from the
context. But stay tuned during the next few years for
improvements.
What can we do with this technology? Think about writing and
editing before you think of absolutely anything else. That is the
basic part of any CATS program, and we won't abandon it as the
core of our training in favor of the jazzier side of computing,
namely, surfing the Web or sending E-mail. Now, you know why one
of our students got a pretty curt answer from me when he said he
wanted to use Dragon in a Chat Room on the Internet. If you say
the right thing, but the system gets it wrong, can you ever seem
like an escapee from a mental institution! The following text
is really not a good opening line:
"Hay, sweet Art, what's year @?"
Translated, this should be "Hey, sweetheart, what's your sign?"
Of course, we teach speech-input one-on-one rather than in
classes. Imagine classes with everyone talking at once and the
computers talking back.
Now, you may think, after reading the previous paragraphs, that I
am trying to scare you off from this kind of technology. I hope
not, but I am providing some cautions. When I have the
speech-input system working here with large print output, I can
dictate faster than I can type - and I type at more than 60 words
per minute. I am obliged to pay very close attention to what I am
doing at any speed. What is my greatest difficulty?
Thinking through what I really want to say and what I hope people
will hear and understand. I'll bet that sounds familiar.
The two articles concluding this series deal with how blind and
partially sighted people can use speech-input technology. The
article titles are: "I See What I'm Saying" and "Let the Dragon
Tell You What You Said."
###