BLIND-DEV Archives

Development of Adaptive Hardware & Software for the Blind/VI

BLIND-DEV@LISTSERV.ICORS.ORG

Subject:
From: "Pete ." <[log in to unmask]>
Reply-To: "BLIND-DEV: Development of Adaptive Hardware & Software for the Blind/VI" <[log in to unmask]>
Date: Thu, 1 Apr 1999 13:56:15 -0500
Content-Type: text/plain
Parts/Attachments: text/plain (151 lines)

<---- Begin Forwarded Message ---->

Subject: VIP-L: Fwd: tech: new developments in speech synthesis

>Return-Path: <[log in to unmask]>
>X-Sender: [log in to unmask]
>Date: Thu, 01 Apr 1999 19:19:54 +1000
>To: [log in to unmask]
>From: [log in to unmask] (pattist)
>Subject: VIP-L: Fwd: tech: new developments in speech synthesis
>Sender: [log in to unmask]
>Reply-To: [log in to unmask] (pattist)
>
>To: (Recipient list suppressed)
>From: Amy Ruell <[log in to unmask]>
>
>From The New York Times
>
>March 25, 1999
>
>Text-to-Speech Programs With Touchy-Feely Voices
>
>By ANNE EISENBERG
>
>     Like everyone else these days, computers are getting in touch with
>     their feelings. Or at least learning to fake it.
>
>     Computer-generated voices are starting to sound a bit more human.
>     Soon the classic robotic monotones of synthetic speech will have
>     tinges of the sad, the happy, the polite, the warm and even the
>     frightened.
>
>     Manufacturers, of course, are eager for the era of the empathetic
>     computer voice. They need speech synthesizers that sound, if not
>     natural, at least personable enough to attract customers to a range
>     of new devices that talk, like systems that can read e-mail to
>     drivers as they barrel down the highway or can read to the blind.
>
>     Researchers say that warmer voices will soon be available to
>     deliver, for instance, a solicitous "Would you like me to read you
>     your messages?" and even a sharp "Wake up!" if detectors find that
>     a driver is dozing off at the wheel. They will even be able to read
>     a poem by Robert Frost from a CD-ROM encyclopedia with a touch of
>     poignancy as well as clarity.
>
>     "We are not there yet, but we are getting closer," said Andre
>     Schenk, director of linguistic technology development at Lernout &
>     Hauspie in Ieper, Belgium. "Computers themselves cannot understand
>     the text, of course, they cannot comprehend what they are saying,
>     but we can improve the quality of the voice. And we can mark the
>     text for them, saying this part should be spoken happily, this part
>     sadly." Lernout & Hauspie makes text-to-speech programs that take
>     typed text and convert it into speech.
>
>     The text-to-speech products with improved, more touchy-feely
>     voices, expected in about a year, are a result of decades of
>     research. To create such voices, companies had to identify the
>     subtleties in vocal stress that people associate with emotions like
>     anger, fear, joy and despair.
>
>     "I was interested in how emotion altered speech, and in writing
>     instructions for a voice synthesizer that included some of those
>     changes," said Janet Cahn, a researcher in the field of emotion and
>     computer-generated language who last month completed her doctorate
>     at the Massachusetts Institute of Technology's Media Laboratory.
>     Dr. Cahn has been experimenting with enlivening computer-generated
>     voices since the late 80's, when she did her master's thesis on
>     "expressive synthesized speech."
>
>     Dr. Cahn wrote instructions for a voice synthesizer to read
>     sentences like "I told you not to call me at the office,"
>     accounting first for the effect that stressing certain words has on
>     the meaning: "I TOLD you not to call me at the office," for
>     example, and "I told you not to call me at the OFFICE."
>
>     Then she wrote program code that allows the voice synthesizer to
>     express emotion. "When people are sad, research suggests they talk
>     more slowly, so to convey the emotion 'sad,' I tried for a more
>     relaxed, almost slack voice," Dr. Cahn explained. She also adjusted
>     the articulation. "Angry speakers tend to articulate very
>     precisely, but sad speakers have very imprecise, almost slurred,
>     articulation."
>
>     Dr. Cahn varied the settings on the speech synthesizer, adjusting
>     the pitch range, putting in pauses and varying the speech rate and
>     voice quality for her sample sentences. Then she tried the
>     sentences out on M.I.T. students and asked them to identify the
>     emotions portrayed. "They got the emotions right about half the
>     time -- that's roughly what people can do with human speech in
>     tests," she said. Some of the sample sentences can be heard on her
>     Web page. A later project, in which lines from an Abbott and
>     Costello routine or from "Waiting for Godot" are rendered with one
>     of six flavors of emotions -- impatient, plaintive, disdainful,
>     distraught, annoyed or cordial -- can be heard at the Computer
>     Museum in Boston.
>
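
A rough, purely illustrative sketch of the kind of emotion-to-parameter
mapping the article describes: an emotion label selects relative settings
for speech rate, pitch range, pausing and articulation precision. The
parameter names and numbers below are assumptions for illustration, not
Dr. Cahn's instructions or any vendor's actual synthesizer controls.

    # Illustrative only: assumed emotion presets, expressed as multipliers
    # relative to a neutral reading of the same sentence.
    EMOTION_PRESETS = {
        "neutral": {"rate": 1.00, "pitch_range": 1.0, "pause_scale": 1.0, "articulation": 1.0},
        "sad":     {"rate": 0.80, "pitch_range": 0.7, "pause_scale": 1.4, "articulation": 0.7},
        "angry":   {"rate": 1.10, "pitch_range": 1.3, "pause_scale": 0.8, "articulation": 1.3},
        "happy":   {"rate": 1.05, "pitch_range": 1.4, "pause_scale": 0.9, "articulation": 1.0},
    }

    def markup_for(text, emotion="neutral"):
        """Pair a sentence with the settings for the requested emotion,
        as a plain-text control string a synthesizer front end could parse."""
        preset = EMOTION_PRESETS[emotion]
        controls = " ".join(f"{k}={v}" for k, v in sorted(preset.items()))
        return f"[{controls}] {text}"

    if __name__ == "__main__":
        print(markup_for("I told you not to call me at the office.", "sad"))
        print(markup_for("I told you not to call me at the office.", "angry"))
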
>     Researchers are seeking ways to expand the range of
>     computer-generated speech. "Normal, sitting-at-the-desk speech is
>     not applicable to many practical purposes," said Prof. Iain R.
>     Murray, a lecturer in applied computing at the University of Dundee
>     in Scotland. "For instance, you might need a very spirited warning
>     if a missile is coming right toward you." Dr. Murray is playing
>     with the way voices sound when people are in what he called
>     "excited states -- terrified, for instance, or drunk."
>
>     There is even something in the future of speech generation for
>     fast-talking New Yorkers. At International Business Machines, Salim
>     Roukos, manager of conversational systems at the company's Thomas
>     J. Watson Research Center in Yorktown Heights, N.Y., is working
>     with machines that carry on conversations with people who want to
>     do a specific task, like trading stock. One project will take into
>     account how fast people talk to the speech synthesizer and respond
>     accordingly.
>
>     "Right now, people interact with computers mainly by pushing
>     buttons," Dr. Roukos said. "We want them to just talk to the
>     machine as to a human, so we need a machine that will adapt to
>     what's going on in the conversation." He added: "People who speak
>     fast prefer a machine that speaks fast. That's one of the reasons
>     I'm designing this stuff. I like faster machines, too."
>
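
A similarly rough sketch of the rate-matching idea: estimate how fast the
caller is speaking from the recognized transcript and its duration, then
scale the synthesizer's speaking rate toward it. The function names, the
baseline of 160 words per minute and the clamping bounds are assumptions
for illustration, not details of the IBM project.

    # Illustrative only: adapt the output speaking rate to the user's rate.
    def words_per_minute(transcript: str, duration_seconds: float) -> float:
        """Estimate the caller's speaking rate from a transcript and timing."""
        words = len(transcript.split())
        return 60.0 * words / duration_seconds if duration_seconds > 0 else 0.0

    def reply_rate_scale(user_wpm: float, base_wpm: float = 160.0) -> float:
        """Scale the synthesizer's rate toward the user's, within safe bounds."""
        scale = user_wpm / base_wpm
        return max(0.8, min(1.3, scale))  # clamp so the output stays intelligible

    if __name__ == "__main__":
        wpm = words_per_minute("buy two hundred shares of the index fund", 2.5)
        print(f"caller speaks at about {wpm:.0f} wpm; reply rate scale {reply_rate_scale(wpm):.2f}")
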
>     Just how realistic are these computer voices going to become?
>     Joseph Olive, who has worked on text-to-speech conversion since
>     1970 at Bell Laboratories, now Lucent Technologies, in Murray Hill,
>     N.J., said the problems were difficult ones that would require much
>     more research. But the desire to solve them runs deep, he added.
>
>     In 1974, Dr. Olive wrote an opera scored for soprano and computer.
>     "I was working on speech synthesis and transformed some of the work
>     into singing," he explained. In his opera, a scientist teaches a
>     computer how to speak with feeling. The computer falls in love with
>     her, so the scientist, who cannot cope with that, disassembles the
>     machine.
>
>     The main theme of the opera, Dr. Olive said, is the desire to have
>     computers not just speak but speak with feeling. "But for right
>     now, though," he said, "I have my hands full transmitting the
>     accurate meaning behind the message."
>
>Regards,
>Steve Pattison,
>[log in to unmask]
>
>** vip-l is sponsored by Blind Citizens Australia and
> administered by Tim Noonan
>



<----  End Forwarded Message  ---->
