VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

Subject:
From:
Steve Zielinski <[log in to unmask]>
Reply To:
Date:
Tue, 30 Nov 1999 06:20:03 -0600
Just ran across this little article.  It sounds like an interesting way
of providing more realistic human sounding speech.  I don't know anything
more about this development other than what appears in the forwarded
article.

Steve


---------- Forwarded message ----------
Date: Tue, 30 Nov 1999 18:44:02 +1100
From: Steve Pattison <[log in to unmask]>
To: Multiple recipients of NFBnet GUI-TALK Mailing List <[log in to unmask]>
Subject: Fwd: VIP-L: article possible new screen reader

acb-l Message from "Jesus Garcia" [log in to unmask]

I found this in the Miami Herald.  It appears to have piqued the
interest of individuals who judge numerous science projects annually.

Miami Herald Online

Visionary teen twins create text-to-speech software for blind
Breakthrough program wins Westinghouse science award

PHIL LONG
[log in to unmask]

VERO BEACH -- Inspired by the work of their crusading
great-grandmother, Joseph and William Pechter -- 17-year-old twins --
may be on the verge of a breakthrough in communications for the blind.

The St. Edward's Upper School seniors have developed a computer
program that seeks to solve some of the most difficult problems in
translating the written word into human-sounding speech.

``When you do a science project, you want to benefit someone through
your research,'' William Pechter said. ``This helps the blind to
read.''

Because the program does it so quickly, accurately and with one
keystroke, the Pechters' Hybrid Text-To-Speech 2000 may also give blind
people unprecedented access to the Internet and to e-mail -- using
almost any modern laptop or desktop computer.

If only their great-grandmother, Norma Newman Cohen, had had something
like this. Cohen was founder of the Florida Fight for Sight and
co-founder in New York of the National Fight for Sight. A longtime
resident of Palm Beach County, she raised money for education and
research to help blind people.

Although Joseph and William never knew their great-grandmother,
stories of her accomplishments have been handed down from generation
to generation. While they were contemplating what kind of science
project to attempt, they were looking for one that was a challenge and
would help people.

10,000 WORDS

Hybrid 2000 is built on a 10,000-word dictionary created by the
brothers. It puts words in context using a combination of matrices
designed by the twins. It adds inflection to make sentences realistic
and it uses the brothers' voices to make the end product sound more
human.

So impressed were judges at the prestigious Siemens Westinghouse
Southeast regional science fair at Georgia Tech last week that they
awarded the Pechters first place, gave them $30,000 and invited them
to compete at the national finals next week in Washington, D.C.
They've reached the top 12 of 30,000 applicants in the contest.

With an accuracy rate of 96 percent of words recognized, Hybrid 2000
is an inspiring piece of work, experts in the field say.

``It's remarkable. It is nice to know that they had the foresight to
figure out what blind people need,'' said Curtis Chong, director of
technology for the National Federation of the Blind in Baltimore.
``It sounds like a good first cut at the next generation of speech
synthesis.''

WRITTEN TO SPOKEN

Besides capturing text and e-mail from the Internet, there's another
method of conversion. If a blind person wants to hear today's
newspaper or a magazine story, he or someone else can place that story
on a ``flatbed scanner'' -- a machine that copies the words into the
computer instead of onto paper. The Hybrid 2000 program then turns the
story into a text file and converts the words into spoken language.

There have been text-to-speech programs for years, priced from $17 to
$10,000. But the Pechters' design targets some of the remaining
shortfalls and puts it in a program small enough to fit onto a compact
disk, William explained.

William and Joseph are straight-A students among 350 classmates, but
they are anything but computer nerds.

WELL-ROUNDED

Both are on St. Edward's varsity tennis and soccer teams. They also
won this year's state high school stock market contest, parlaying a
mythical $100,000 into nearly $500,000 in 10 weeks.

Joseph plays Rachmaninoff and other classics on the piano. William is
an accomplished guitarist.

Both volunteer at the Red Cross and the Humane Society. They've helped
teach computers to senior citizens and to elementary school students.

Because of the program's easy application to the Internet and e-mail,
added Joseph, it will allow blind people to harness the power of
cyberspace.

``Our goal was to create the most accurate and most understandable
text-to-speech program,'' said William.

Misunderstood words slow down the text-to-speech process and that is
what the Pechters targeted.

Here's the nontechnical version of how Hybrid 2000 works:

HOW IT WORKS

The computer compares each word to the 10,000 words in the dictionary.

In the rare case that the word doesn't appear in the dictionary, the
program sounds out the word, breaking each letter into its possible
sounds, searching to come up with a correct word.
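
The lookup-then-sound-out step described above might be sketched
roughly as follows. This is only an illustration, not the Pechters'
code: the tiny dictionary, the ARPAbet-style sound symbols, the
one-sound-per-letter table and the function name pronounce are all
assumptions made for the example, and the real program reportedly
tries several possible sounds per letter rather than just one.

```python
# Hypothetical toy dictionary; the article does not describe the real format.
PRONUNCIATIONS = {
    "the": "DH AH",
    "clock": "K L AA K",
    "blowing": "B L OW IH NG",
}

# Hypothetical, deliberately crude fallback: one sound per letter.
LETTER_SOUNDS = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F",
    "g": "G", "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L",
    "m": "M", "n": "N", "o": "AA", "p": "P", "q": "K", "r": "R",
    "s": "S", "t": "T", "u": "AH", "v": "V", "w": "W", "x": "K S",
    "y": "Y", "z": "Z",
}


def pronounce(word):
    """Return (sounds, source): dictionary entry if known, else sounded out."""
    key = word.lower()
    if key in PRONUNCIATIONS:
        return PRONUNCIATIONS[key], "dictionary"
    # Unknown word: fall back to spelling it out letter by letter.
    sounds = [LETTER_SOUNDS[ch] for ch in key if ch in LETTER_SOUNDS]
    return " ".join(sounds), "sounded out"


print(pronounce("clock"))
print(pronounce("zib"))
```

The second call uses a made-up word so that the fallback path runs.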

If the word is in the dictionary and it doesn't have a homograph --
another word spelled the same way but with a different meaning -- it
lets the word go to the next step.

A major strength the Pechters built into the Hybrid 2000 is its
ability to differentiate hundreds of homographs.

Humans select the right homograph because the brain puts the word in
context with other words around it.

For example, when we see: ``The wind is blowing,'' we pronounce
w-i-n-d with a short ``i'' because we also see the word ``blowing,''
and recognize it as something that wind does.

When we see: ``Please wind the grandfather clock,'' we pronounce the
word with a long ``i'' because of its relationship to the word
``clock.''

ALL THE HOMOGRAPHS

To make their program do the same thing, Joseph and William created a
dictionary of every word that has a homograph -- like ``wind.''

Then they installed a grammar dictionary of 230,000 words identified
by their parts of speech. Words like ``wind'' are pronounced
differently based on whether they are nouns or verbs.

Next, the program looks at the two words before and after the target
word.

Like the human brain, Hybrid 2000 identifies the connection between
``wind'' and ``blowing,'' and recognizes it as different from the
pronunciation called for if ``wind'' had been two words away from
``clock.''
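
A two-words-either-side context check of the kind described above
could look something like this minimal sketch. The cue-word lists, the
labels and the function names are hypothetical, and simple cue counting
stands in for whatever matrices the brothers actually used. Note that
in the article's own clock sentence, ``clock'' sits three words after
``wind,'' so a strict two-word window would not reach it; a shorter
sentence is used here instead.

```python
# Hypothetical cue words for each pronunciation of "wind".
CUES = {
    "wind": {
        "short i (noun)": {"blowing", "blows", "cold", "north"},
        "long i (verb)": {"clock", "watch", "up", "spring"},
    }
}


def context_window(words, index, size=2):
    """Return up to `size` words before and after position `index`."""
    before = words[max(0, index - size):index]
    after = words[index + 1:index + 1 + size]
    return before + after


def choose_pronunciation(sentence, target):
    """Pick the pronunciation whose cue words appear most in the window."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    window = context_window(words, words.index(target))
    scores = {
        label: sum(w in cues for w in window)
        for label, cues in CUES[target].items()
    }
    # On a tie, max() keeps the first label listed in CUES.
    return max(scores, key=scores.get)


print(choose_pronunciation("The wind is blowing", "wind"))
print(choose_pronunciation("Please wind the clock", "wind"))
```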

To ``teach'' the program to make that kind of association correctly,
the Pechters fed in 300,000 sentences from 10 complete novels.
Somewhere in those 300,000 sentences is likely to be the same
combination as the one the program is looking at.
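
One way such associations could be learned from example sentences is
by counting which neighbouring words turn up with each pronunciation.
The sketch below assumes the training sentences are already labelled
with the correct pronunciation, which the article does not say; the
four sentences, the labels and the function names are invented for
illustration only.

```python
from collections import Counter, defaultdict


def neighbours(words, index, size=2):
    """Words within `size` positions of `index`, excluding the word itself."""
    return words[max(0, index - size):index] + words[index + 1:index + 1 + size]


def tokenize(sentence):
    return [w.strip(".,!?").lower() for w in sentence.split()]


def train(labelled_sentences, target):
    """Count neighbouring words seen with each pronunciation label."""
    counts = defaultdict(Counter)
    for sentence, label in labelled_sentences:
        words = tokenize(sentence)
        for i, word in enumerate(words):
            if word == target:
                counts[label].update(neighbours(words, i))
    return counts


def predict(counts, sentence, target):
    """Choose the label whose training neighbours best match this sentence."""
    words = tokenize(sentence)
    nearby = neighbours(words, words.index(target))
    scores = {label: sum(c[w] for w in nearby) for label, c in counts.items()}
    return max(scores, key=scores.get)


# Hypothetical labelled examples standing in for sentences from novels.
examples = [
    ("The wind was blowing hard", "short i"),
    ("A cold wind came in", "short i"),
    ("Wind up your watch", "long i"),
    ("Remember to wind that clock", "long i"),
]
model = train(examples, "wind")
print(predict(model, "The wind is blowing", "wind"))
print(predict(model, "Please wind that clock", "wind"))
```

With only four sentences the counts are very sparse; the point of
feeding in hundreds of thousands of sentences would be to make it
likely that a given combination has been seen before.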

FINAL CHALLENGE

Once Hybrid 2000 has accomplished these steps, it tackles the last big
problem.

Joseph: ``We didn't want it to sound like a robot.''

The brothers paid close attention to inflection -- changing the pitch
or loudness of the voice. If the program sees a question mark at the
end of a sentence, for example, it changes the inflection.

To create the proper inflection for each situation, the Pechters
recorded many words at least three times. Take the word ``now,'' for
example.

``Go there now!'' requires a different pronunciation of the word
``now'' than: ``Go there now?''

The sentence: ``Now we are going to have dinner,'' requires yet a
third emphasis on the word.

The Hybrid 2000 reads the question mark, then locates the
pronunciation of ``now'' that goes with the question mark.
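
Choosing among several recordings of the same word by the sentence's
closing punctuation might be sketched as below. The recording file
names, the three category names and both function names are
assumptions for the example; the article says only that many words
were recorded at least three times.

```python
# Hypothetical recordings of "now", one per inflection.
RECORDINGS = {
    "now": {
        "statement": "now_neutral.wav",
        "exclamation": "now_emphatic.wav",
        "question": "now_rising.wav",
    }
}


def sentence_type(sentence):
    """Classify a sentence by its final punctuation mark."""
    text = sentence.strip()
    if text.endswith("?"):
        return "question"
    if text.endswith("!"):
        return "exclamation"
    return "statement"


def pick_recording(sentence, word):
    """Return the recording of `word` that matches the sentence's inflection."""
    return RECORDINGS[word.lower()][sentence_type(sentence)]


for s in ("Go there now!", "Go there now?", "Now we are going to have dinner."):
    print(s, "->", pick_recording(s, "now"))
```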

Hybrid 2000 can do all the steps explained in the preceding paragraphs
in one-thousandth of a second.

Regards Steve,
mailto:[log in to unmask]


VICUG-L is the Visually Impaired Computer User Group List.
To join or leave the list, send a message to
[log in to unmask]  In the body of the message, simply type
"subscribe vicug-l" or "unsubscribe vicug-l" without the quotations.
 VICUG-L is archived on the World Wide Web at
http://maelstrom.stjohns.edu/archives/vicug-l.html

