David Poehlman
Thu, 2 May 2002 08:53:08 -0400
----- Original Message -----
From: "Catherine Alfieri"
Sent: Thursday, May 02, 2002 5:18 AM
Subject: CURR: Hearing Text, Not Tunes, on Your MP3 Player

Resources for people who find reading difficult for different

Hearing Text, Not Tunes, on Your MP3 Player

May 2, 2002


TALKING computers are nothing new. Computers in sci-fi
movies have been chatting away for years (think HAL in
"2001"), the physicist Stephen Hawking uses a voice
synthesizer to communicate, and the occasional nerd still
thinks it's cute to record a computer-generated
answering-machine greeting.

Otherwise, though, computer speech has been thoroughly
ignored by the average consumer - which seems odd,
considering that every Macintosh and Windows PC comes with
built-in software that reads back text. On the Mac, you can
choose from 18 computerized voices (every conceivable
variation of male, female and alien) to read back documents
in Word, AppleWorks, America Online and other programs. In
Windows, a utility program called Narrator can read aloud
menus and dialog boxes.

A quick search, furthermore, unearths
dozens of free and shareware programs that mine the same
territory. Some are specialized programs designed to
pronounce each word as you type it, or to give voice to the
typed exchanges in your instant-message chats. Most,
though, are simply designed to read back text on the

For most people, the question is: Why? Sure, listening to
documents read aloud sometimes helps you catch mistakes a
traditional proofreading pass might miss. Blind computer
users, children learning to read, and people learning
English may benefit, too. Then, too, text-reading programs
make it possible for you to "listen to articles on the Web
while fixing your lunch," as one software company
cheerfully puts it. Still, for mainstream consumers, these
aren't what you would call desperately needed functions.

But the winds of change are blowing in the field known as
text-to-speech. New Windows programs with names like
iSpeak, TextAloud MP3 and Text-to-Audio do more than simply
read your text out loud: they can also turn it into the
high-quality compact sound recordings known as MP3 files.

Teenagers ignited the MP3 craze by converting their
favorite pop-music CD's into MP3 files that play back on
portable music players. What makes these new speech
programs remarkable is that they open up the same kind of
freedom to the over-20 set. They let you listen to your
documents - e-mail, Web pages, reports, manuals, electronic
books, or anything else you can type or download - as you
commute, work out or work outside.

Of course, commuters and joggers have been listening to
Books on Tape for years, and some companies
create what you might call Books on MP3. These products are
expensive, however, and your listening is restricted to
other people's stuff.

Using your PC to record your own material has a drawback,
though: you won't be listening to the voice of a
professional actor. (Listening to James Earl Jones read
your e-mail to you would certainly be a rush, but might be
out of your price range.) In fact, you won't even be
listening to a human being. When you listen to the old
Apple and Microsoft voices, "lifelike" isn't the adjective
that springs to mind. In charitable moments, you might
describe them as sounding like drunken Scandinavian robots.
Fortunately, a white knight has emerged to rescue you from
the prospect of listening to mechanical voices forever:
AT&T, which has developed a set of new, vastly improved
voices called Natural Voices. The inflection isn't always
on track - they sometimes produce nonsensical line
readings, as if an actor were auditioning with a script he
didn't quite understand - but you would otherwise swear you
were listening to a professional, blow-dried American
newscaster. Only a few words betray a hint of what you'd
call a PC accent.

At the moment, the weakest MP3-enabled text reader is
iSpeak, from Fonix ($70). It's supposed to
be able to read Microsoft Outlook 2000 e-mail messages
"with the click of a mouse," read text "from any Web page"
and "vocalize" words as you type. Unfortunately, all of
this excitement takes place only within the iSpeak program
window. Yes, it can read text from any Web page, if you
copy and paste it into iSpeak first. A handy iSpeak menu
does indeed appear in Outlook, but it just copies the
current message back into the iSpeak window. Sure enough,
the program can speak each word as you type it, provided
you're typing into the iSpeak program itself.

At this point, iSpeak is also the only program that doesn't
capitalize on the AT&T Natural Voices (though it will soon;
more on this topic in a moment). Instead, it uses Fonix's
own voices, which are superior to the stock Microsoft and
Apple voices but feature a lot of what voice teachers call
glottal stops. You get the impression that the person doing
the reading keeps thwacking his own Adam's apple.

Text-to-Audio ($50), on the
other hand, shows tremendous promise. It's the only
text-to-MP3 program that can import Microsoft Word files
for conversion, not just plain text files. It even displays
these files, formatting intact, and highlights the words as
it reads. Text-to-Audio comes with an MP3 playing program,
too, so you can double-check the resulting sound files
before committing them to a music player.

Unfortunately, Text-to-Audio has more eccentricities than
Ross Perot. It can only recognize text files whose names
end with ".tx," rather than the standard ".txt," which
pretty much means you have to rename every file before you
import it. More damaging, though, are the glitches that
result when you choose one of those terrific AT&T voices.
For some reason the program ignores periods, turning every
document into a gigantic run-on sentence. It also treats
apostrophes as spaces, pronouncing "don't" as "donn-tee"
and "you'll" as "you L. L." Is there such a thing as remedial
reading classes for computers?

The company blames the AT&T voices for these glitches and
says an update is due this month. Yet TextAloud MP3
($25 with ordinary voices, or $51 with the
AT&T voices), a rival program that can also use those
voices, exhibits no such glitches. It's the undisputed
winner in this three-way Sound Like a Human contest.

Better yet, TextAloud offers a couple of extremely useful
features that feel painfully absent in its competitors. For
example, only TextAloud can speak the words in the windows
of everyday programs like word processors, Web browsers and
e-mail programs (you press predefined keystrokes to start
and stop the talking). Furthermore, whenever you highlight
text in any program and then press Ctrl+C, the program
offers to sock that text away on its own internal
clipboard. The idea is that as you cruise through e-mail
messages, Web pages and other documents, you can build up a
playlist that you can later listen to, or convert to MP3's
en masse. The program exhibits a few bugs and misspellings,
and it still can't import (for conversion to MP3) anything
beyond plain text files or text you've copied, but it's
nonetheless the program to beat.

None of these MP3-capable text readers are especially
user-friendly. The first time you fire one up, you're
likely to stare blankly, having no hint how to begin.
Furthermore, after you create a few MP3 files, finding them
on your hard drive and getting them onto your portable MP3
player is left up to you.

The ultimate MP3 text reader would bypass this problem by
loading its converted files directly onto the MP3 player.
It would accept Word and PDF files, read text from within
your favorite programs, and use AT&T's voices. As it turns
out, all of this is precisely what Fonix says it will offer
in iSpeak 3.0, scheduled for a June release. If the new
program lives up to the company's promises, it should be a

If you can't wait, TextAloud MP3 is a competent little
talker whose ability to read aloud any open document makes
it especially attractive. In any case, MP3-making text
readers open a new world of times and places in which you
can get work or "reading" done, made all the more pleasant
to listen to by AT&T's new voices. It's hard to see a
downside to technological advances like these - except,
perhaps, all those out-of-work Norwegian robots.

Copyright 2002 The New York Times Company

