<---- Begin Forwarded Message ---->
Subject: VIP-L: Fwd: tech: new developments in speech synthesis
>Return-Path: <[log in to unmask]>
>X-Sender: [log in to unmask]
>Date: Thu, 01 Apr 1999 19:19:54 +1000
>To: [log in to unmask]
>From: [log in to unmask] (pattist)
>Subject: VIP-L: Fwd: tech: new developments in speech synthesis
>Sender: [log in to unmask]
>Reply-To: [log in to unmask] (pattist)
>
>To: (Recipient list suppressed)
>From: Amy Ruell <[log in to unmask]>
>
>From The New York Times
>
>March 25, 1999
>
>Text-to-Speech Programs With Touchy-Feely Voices
>
>By ANNE EISENBERG
>
> Like everyone else these days, computers are getting in touch with
> their feelings. Or at least learning to fake it.
>
> Computer-generated voices are starting to sound a bit more human.
> Soon the classic robotic monotones of synthetic speech will have
> tinges of the sad, the happy, the polite, the warm and even the
> frightened.
>
> Manufacturers, of course, are eager for the era of the empathetic
> computer voice. They need speech synthesizers that sound, if not
> natural, at least personable enough to attract customers to a range
> of new devices that talk, like systems that can read e-mail to
> drivers as they barrel down the highway or can read to the blind.
>
> Researchers say that warmer voices will soon be available to
> deliver, for instance, a solicitous "Would you like me to read you
> your messages?" and even a sharp "Wake up!" if detectors find that
> a driver is dozing off at the wheel. They will even be able to read
> a poem by Robert Frost from a CD-ROM encyclopedia with a touch of
> poignancy as well as clarity.
>
> "We are not there yet, but we are getting closer," said Andre
> Schenk, director of linguistic technology development at Lernout &
> Hauspie in Ieper, Belgium. "Computers themselves cannot understand
> the text, of course, they cannot comprehend what they are saying,
> but we can improve the quality of the voice. And we can mark the
> text for them, saying this part should be spoken happily, this part
> sadly." Lernout & Hauspie makes text-to-speech programs that take
> typed text and convert it into speech.
>
> The text-to-speech products with improved, more touchy-feely
> voices, expected in about a year, are a result of decades of
> research. To create such voices, companies had to identify the
> subtleties in vocal stress that people associate with emotions like
> anger, fear, joy and despair.
>
> "I was interested in how emotion altered speech, and in writing
> instructions for a voice synthesizer that included some of those
> changes," said Janet Cahn, a researcher in the field of emotion and
> computer-generated language who last month completed her doctorate
> at the Massachusetts Institute of Technology's Media Laboratory.
> Dr. Cahn has been experimenting with enlivening computer-generated
> voices since the late 80's, when she did her master's thesis on
> "expressive synthesized speech."
>
> Dr. Cahn wrote instructions for a voice synthesizer to read
> sentences like "I told you not to call me at the office,"
> accounting first for the effect that stressing certain words has on
> the meaning, since the same sentence conveys a different message
> depending on which word is stressed.
>
> Then she wrote program code that allows the voice synthesizer to
> express emotion. "When people are sad, research suggests they talk
> more slowly, so to convey the emotion 'sad,' I tried for a more
> relaxed, almost slack voice," Dr. Cahn explained. She also adjusted
> the articulation. "Angry speakers tend to articulate very
> precisely, but sad speakers have very imprecise, almost slurred,
> articulation."
>
> Dr. Cahn varied the settings on the speech synthesizer, adjusting
> the pitch range, putting in pauses and varying the speech rate and
> voice quality for her sample sentences. Then she tried the
> sentences out on M.I.T. students and asked them to identify the
> emotions portrayed. "They got the emotions right about half the
> time -- that's roughly what people can do with human speech in
> tests," she said. Some of the sample sentences can be heard on her
> Web page. A later project, in which lines from an Abbott and
> Costello routine or from "Waiting for Godot" are rendered with one
> of six flavors of emotions -- impatient, plaintive, disdainful,
> distraught, annoyed or cordial -- can be heard at the Computer
> Museum in Boston.
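The adjustments Dr. Cahn describes amount to mapping each emotion onto a handful of prosody settings. A minimal sketch of that idea follows; the parameter names and every numeric value here are invented for illustration and are not her published settings:

```python
# Illustrative sketch of emotion-to-prosody mapping for a hypothetical
# text-to-speech engine. All names and numbers are assumptions made up
# for this example, not values from Dr. Cahn's work.

# Baseline ("neutral") prosody settings.
NEUTRAL = {
    "rate_wpm": 180,        # speaking rate, words per minute
    "pitch_range_hz": 60,   # spread around the base pitch
    "articulation": 1.0,    # 1.0 = precise, lower = slack/slurred
    "pause_scale": 1.0,     # multiplier on pause durations
}

# Per-emotion adjustments, following the tendencies the article
# reports: sad speech is slower, with a slack, slurred articulation
# and longer pauses; angry speech is articulated very precisely.
EMOTIONS = {
    "sad":   {"rate_wpm": -40, "pitch_range_hz": -20,
              "articulation": 0.7, "pause_scale": 1.5},
    "angry": {"rate_wpm": +20, "pitch_range_hz": +30,
              "articulation": 1.2, "pause_scale": 0.8},
}

def prosody_for(emotion):
    """Return neutral settings adjusted for the given emotion.

    Additive keys (rate, pitch range) get offsets; the remaining
    keys (articulation, pause_scale) are treated as multipliers.
    Unknown emotions fall back to the neutral baseline.
    """
    settings = dict(NEUTRAL)
    for key, value in EMOTIONS.get(emotion, {}).items():
        if key in ("articulation", "pause_scale"):
            settings[key] = round(NEUTRAL[key] * value, 2)
        else:
            settings[key] = NEUTRAL[key] + value
    return settings

print(prosody_for("sad"))    # slower, narrower pitch, slurred
print(prosody_for("angry"))  # faster, wider pitch, precise
```

In a real system these settings would then be handed to the synthesizer, much as Dr. Cahn varied pitch range, pauses, rate and voice quality on her hardware.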
>
> Researchers are seeking ways to expand the range of
> computer-generated speech. "Normal, sitting-at-the-desk speech is
> not applicable to many practical purposes," said Prof. Iain R.
> Murray, a lecturer in applied computing at the University of Dundee
> in Scotland. "For instance, you might need a very spirited warning
> if a missile is coming right toward you." Dr. Murray is playing
> with the way voices sound when people are in what he called
> "excited states -- terrified, for instance, or drunk."
>
> There is even something in the future of speech generation for
> fast-talking New Yorkers. At International Business Machines, Salim
> Roukos, manager of conversational systems at the company's Thomas
> J. Watson Research Center in Yorktown Heights, N.Y., is working
> with machines that carry on conversations with people who want to
> do a specific task, like trading stock. One project will take into
> account how fast people talk to the speech synthesizer and respond
> accordingly.
>
> "Right now, people interact with computers mainly by pushing
> buttons," Dr. Roukos said. "We want them to just talk to the
> machine as to a human, so we need a machine that will adapt to
> what's going on in the conversation." He added: "People who speak
> fast prefer a machine that speaks fast. That's one of the reasons
> I'm designing this stuff. I like faster machines, too."
>
> Just how realistic are these computer voices going to become?
> Joseph Olive, who has worked on text-to-speech conversion since
> 1970 at Bell Laboratories, now Lucent Technologies, in Murray Hill,
> N.J., said the problems were difficult ones that would require much
> more research. But the desire to solve them runs deep, he added.
>
> In 1974, Dr. Olive wrote an opera scored for soprano and computer.
> "I was working on speech synthesis and transformed some of the work
> into singing," he explained. In his opera, a scientist teaches a
> computer how to speak with feeling. The computer falls in love with
> her, so the scientist, who cannot cope with that, disassembles the
> machine.
>
> The main theme of the opera, Dr. Olive said, is the desire to have
> computers not just speak but speak with feeling. "But for right
> now, though," he said, "I have my hands full transmitting the
> accurate meaning behind the message."
>
>Regards,
>Steve Pattison,
>[log in to unmask]
>
>** vip-l is sponsored by Blind Citizens Australia and
> administered by Tim Noonan
>
<---- End Forwarded Message ---->