VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

Subject:
From:
Kelly Pierce <[log in to unmask]>
Reply To:
Kelly Pierce <[log in to unmask]>
Date:
Thu, 2 May 2002 06:41:40 -0500
Here's a great summary of all the text-to-MP3 audio programs.  It's kind
of cool that sighted people are finding the spoken text we use helpful in
their lives.

Kelly


The New York Times


May 2, 2002

STATE OF THE ART

Hearing Text, Not Tunes, on Your MP3 Player

By DAVID POGUE

TALKING computers are nothing new. Computers in sci-fi movies have been
chatting away for years (think HAL in "2001"), the physicist Stephen
Hawking uses a voice synthesizer to communicate, and the occasional nerd
still thinks it's cute to record a computer-generated answering-machine
greeting.

Otherwise, though, computer speech has been thoroughly ignored by the
average consumer - which seems odd, considering that every Macintosh and
Windows PC comes with built-in software that reads back text. On the Mac,
you can choose from 18 computerized voices (every conceivable variation
of male, female and alien) to read back documents in Word, AppleWorks,
America Online and other programs. In Windows, a utility program called
Narrator can read aloud menus and dialog boxes.

A quick search at www.downloads.com, furthermore, unearths dozens of free
and shareware programs that mine the same territory. Some are specialized
programs designed to pronounce each word as you type it, or to give voice
to the typed exchanges in your instant-message chats. Most, though, are
simply designed to read back text on the screen.

For most people, the question is: Why? Sure, listening to documents read
aloud sometimes helps you catch mistakes a traditional proofreading pass
might miss. Blind computer users, children learning to read, and people
learning English may benefit, too. Then, too, text-reading programs make
it possible for you to "listen to articles on the Web while fixing your
lunch," as one software company cheerfully puts it. Still, for mainstream
consumers, these aren't what you would call desperately needed functions.

But the winds of change are blowing in the field known as text-to-speech.
New Windows programs with names like iSpeak, TextAloud MP3 and
Text-to-Audio do more than simply read your text out loud: they can also
turn it into the high-quality compact sound recordings known as MP3
files.

Teenagers ignited the MP3 craze by converting their favorite pop-music
CD's into MP3 files that play back on portable music players. What makes
these new speech programs remarkable is that they open up the same kind
of freedom to the over-20 set. They let you listen to your documents -
e-mail, Web pages, reports, manuals, electronic books, or anything else
you can type or download - as you commute, work out or work outside.

Of course, commuters and joggers have been listening to Books on Tape for
years, and companies like Audible.com create what you might call Books on
MP3. These products are expensive, however, and your listening is
restricted to other people's stuff.

Using your PC to record your own material has a drawback, though: you
won't be listening to the voice of a professional actor. (Listening to
James Earl Jones read your e-mail to you would certainly be a rush, but
might be out of your price range.) In fact, you won't even be listening
to a human being. When you listen to the old Apple and Microsoft voices,
"lifelike" isn't the adjective that springs to mind. In charitable
moments, you might describe them as sounding like drunken Scandinavian
robots.

Fortunately, a white knight has emerged to rescue you from the prospect
of listening to mechanical voices forever: AT&T, which has developed a
set of new, vastly improved voices called Natural Voices. The inflection
isn't always on track - they sometimes produce nonsensical line readings,
as if an actor were auditioning with a script he didn't quite understand
- but you would otherwise swear you were listening to a professional,
blow-dried American newscaster. Only a few words betray a hint of what
you'd call a PC accent.

At the moment, the weakest MP3-enabled text reader is iSpeak, from Fonix
($70 at www.fonix.com). It's supposed to be able to read Microsoft
Outlook 2000 e-mail messages "with the click of a mouse," read text "from
any Web page" and "vocalize" words as you type. Unfortunately, all of
this excitement takes place only within the iSpeak program window. Yes,
it can read text from any Web page, if you copy and paste it into iSpeak
first. A handy iSpeak menu does indeed appear in Outlook, but it just
copies the current message back into the iSpeak window. Sure enough, the
program can speak each word as you type it, provided you're typing into
the iSpeak program itself.

At this point, iSpeak is also the only program that doesn't capitalize on
the AT&T Natural Voices (though it will soon; more on this topic in a
moment). Instead, it uses Fonix's own voices, which are superior to the
stock Microsoft and Apple voices but feature a lot of what voice teachers
call glottal stops. You get the impression that the person doing the
reading keeps thwacking his own Adam's apple.

Text-to-Audio ($50 at premier-programming.com), on the other hand, shows
tremendous promise. It's the only text-to-MP3 program that can import
Microsoft Word files for conversion, not just plain text files. It even
displays these files, formatting intact, and highlights the words as it
reads. Text-to-Audio comes with an MP3 playing program, too, so you can
double-check the resulting sound files before committing them to a music
player.

Unfortunately, Text-to-Audio has more eccentricities than Ross Perot. It
can only recognize text files whose names end with ".tx," rather than the
standard ".txt," which pretty much means you have to rename every file
before you import it. More damaging, though, are the glitches that result
when you choose one of those terrific AT&T voices. For some reason the
program ignores periods, turning every document into a gigantic run-on
sentence. It also treats apostrophes as spaces, pronouncing "don't" as
"donn-tee" and "you'll" as "you L. L." Is there such a thing as remedial
reading classes for computers?

The company blames the AT&T voices for these glitches and says an update
is due this month. Yet TextAloud MP3 (www.nextup.com, $25 with ordinary
voices, or $51 with the AT&T voices), a rival program that can also use
those voices, exhibits no such glitches. It's the undisputed winner in
this three-way Sound Like a Human contest.

Better yet, TextAloud offers a couple of extremely useful features that
feel painfully absent in its competitors. For example, only TextAloud can
speak the words in the windows of everyday programs like word processors,
Web browsers and e-mail programs (you press predefined keystrokes to
start and stop the talking). Furthermore, whenever you highlight text in
any program and then press Ctrl+C, the program offers to sock that text
away on its own internal clipboard. The idea is that as you cruise
through e-mail messages, Web pages and other documents, you can build up
a playlist that you can later listen to, or convert to MP3's en masse.
The program exhibits a few bugs and misspellings, and it still can't
import (for conversion to MP3) anything beyond plain text files or text
you've copied, but it's nonetheless the program to beat.

None of these MP3-capable text readers are especially user-friendly. The
first time you fire one up, you're likely to stare blankly, having no
hint how to begin. Furthermore, after you create a few MP3 files, finding
them on your hard drive and getting them onto your portable MP3 player is
left up to you.

The ultimate MP3 text reader would bypass this problem by loading its
converted files directly onto the MP3 player. It would accept Word and
PDF files, read text from within your favorite programs, and use AT&T's
voices. As it turns out, all of this is precisely what Fonix says it will
offer in iSpeak 3.0, scheduled for a June release. If the new program
lives up to the company's promises, it should be a doozy.

If you can't wait, TextAloud MP3 is a competent little talker whose
ability to read aloud any open document makes it especially attractive.
In any case, MP3-making text readers open a new world of times and places
in which you can get work or "reading" done, made all the more pleasant
to listen to by AT&T's new voices. It's hard to see a downside to
technological advances like these - except, perhaps, all those
out-of-work Norwegian robots.


VICUG-L is the Visually Impaired Computer User Group List.
To join or leave the list, send a message to
[log in to unmask]  In the body of the message, simply type
"subscribe vicug-l" or "unsubscribe vicug-l" without the quotations.
VICUG-L is archived on the World Wide Web at
http://maelstrom.stjohns.edu/archives/vicug-l.html

