VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

Subject:
From:
George Cassell <[log in to unmask]>
Reply To:
George Cassell <[log in to unmask]>
Date:
Mon, 20 Feb 2006 18:08:21 -0800
Content-Type:
text/plain
Parts/Attachments:
text/plain (131 lines)
MSDN (Microsoft Developer Network) Blog
Monday, February 20, 2006

Vista's Speech Capabilities

By SYSK 63

The next version of Windows (Vista) will have state-of-the-art speech
technology built right in. WinFX will have a powerful API that lets your
users speak to your apps and your apps speak to your users.

At the last PDC (2005), Phillip Schmid, Robert Brown and Steve Chang gave a
great talk, "Ten Amazing Ways to Speech-Enable Your Application", available
at:

http://microsoft.sitestream.com/PDC05/PRS/PRSL03_files/Default.htm

Below are some key points from the talk.

Vista will ship with speech recognizers for 8 languages.

The Vista shell is speech enabled, i.e. you can drive it without using a
mouse or keyboard.  If you can see it on the screen, you can say it!
Anything you can do with a keyboard or mouse, you can do by voice!

Dictation is built into the OS, i.e. any application that has a text field
can take dictation.  Yes, no code needed -- your application is
automatically dictation enabled!

The System.Speech API is now part of WinFX.  Why use it?  To add speech
functionality beyond "what you see, you can say".  E.g. you can speech
enable deeply nested menus...

Speech Synthesizer Example:
using System.Speech.Synthesis;

SpeechSynthesizer synthesizer = new SpeechSynthesizer();

// To speak
synthesizer.SpeakText("Your sentence goes here");

// To send the output of synthesizer to .wav file
synthesizer.SetOutputToWaveFile("YOUR FILE PATH HERE");
synthesizer.SpeakText("Your sentence goes here");

To customize speech recognition, do the following:
// 1.  Create SpeechRecognizer instance (normally, once per application)
using System.Speech.Recognition;

SpeechRecognizer speechRecognizer = new SpeechRecognizer();

// 2.  Create Grammar instance
Grammar phoneGrammer = new Grammar("YOUR GRAMMAR FILE HERE");

The grammar file is an xml file with words the speech recognizer should
understand, and their mapped actions; e.g. in a provisioning application,
you might have the following commands:

<rule id="PhoneCommands" scope="public">
     <one-of>
         <item> purchase new phone
             <tag>synthesizerAction="PurchaseNewPhone"</tag> </item>
         <item> reuse existing phone
             <tag>synthesizerAction="ReusePhone"</tag> </item>
     </one-of>
</rule>
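
Before loading a grammar like this, it can help to list which phrase maps to
which action.  The following is only a sketch for checking a rule by eye: it
uses System.Xml.Linq rather than the speech API, and the GrammarInspector
class and ReadCommands method are hypothetical names, not part of WinFX.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of System.Speech): lists phrase -> tag pairs.
public static class GrammarInspector
{
    // Returns the spoken phrase and tag text for every <item> in a <rule>.
    public static Dictionary<string, string> ReadCommands(string ruleXml)
    {
        XElement rule = XElement.Parse(ruleXml);
        var commands = new Dictionary<string, string>();
        foreach (XElement item in rule.Descendants("item"))
        {
            // The phrase is the item's own text, excluding the <tag> child.
            string phrase = string.Concat(
                item.Nodes().OfType<XText>().Select(t => t.Value)).Trim();
            string tag = ((string)item.Element("tag") ?? string.Empty).Trim();
            commands[phrase] = tag;
        }
        return commands;
    }

    public static void Main()
    {
        const string ruleXml = @"<rule id=""PhoneCommands"" scope=""public"">
  <one-of>
    <item> purchase new phone
      <tag>synthesizerAction=""PurchaseNewPhone""</tag> </item>
    <item> reuse existing phone
      <tag>synthesizerAction=""ReusePhone""</tag> </item>
  </one-of>
</rule>";
        foreach (KeyValuePair<string, string> pair in ReadCommands(ruleXml))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

A misspelled action name in a tag shows up immediately in this listing,
which is easier than discovering it when the event handler never fires.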

// 3.  Load grammar into recognizer
// note: many grammars can be loaded at the same time
speechRecognizer.LoadGrammar(phoneGrammer);

// 4.  Subscribe to SpeechRecognized event
phoneGrammer.SpeechRecognized += new
EventHandler<RecognitionEventArgs>(PhoneGrammer_SpeechRecognized);

void PhoneGrammer_SpeechRecognized(object sender, RecognitionEventArgs e)
{
    switch((string) e.Result.Semantics["synthesizerAction"].Value)
    {
        case "PurchaseNewPhone":
            // TODO: Show new phone purchase form
            break;
        case "ReusePhone":
            // TODO: Show existing phone re-purposing form
            break;
    }
}
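
The dispatch logic in the handler can be tried without a microphone by
pulling it out of the event handler.  The sketch below is one possible way
to do that, using a lookup table instead of the switch; the
PhoneCommandDispatcher class and its methods are hypothetical names, and
the returned strings merely stand in for the forms mentioned in the TODOs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps a recognized action name to the code to run.
public static class PhoneCommandDispatcher
{
    // Stand-ins for showing the two forms from the handler's TODO comments.
    public static string ShowPurchaseForm() { return "new phone purchase form"; }
    public static string ShowReuseForm() { return "existing phone re-purposing form"; }

    private static readonly Dictionary<string, Func<string>> Actions =
        new Dictionary<string, Func<string>>
        {
            { "PurchaseNewPhone", ShowPurchaseForm },
            { "ReusePhone", ShowReuseForm },
        };

    // Returns a description of the form to show, or null for an unknown action.
    public static string Dispatch(string action)
    {
        Func<string> handler;
        if (action != null && Actions.TryGetValue(action, out handler))
        {
            return handler();
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(Dispatch("PurchaseNewPhone"));
        Console.WriteLine(Dispatch("ReusePhone"));
        Console.WriteLine(Dispatch("SomethingElse") ?? "(unrecognized action)");
    }
}
```

Inside the real handler, the value read from e.Result.Semantics would be
passed to Dispatch, and a null result is a natural place to log an
unexpected action name.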


Now, how does Microsoft Speech Server relate to Vista's speech
functionality?
Microsoft Speech Server is about speech enabling your applications over the
phone.

First, let's set the stage.  For those who are not familiar with this
technology, Microsoft Speech Server acts as a digital data-to-voice
translator:

    - It interprets voice commands or data from a user and digitizes them
    - It offers the digitized information as XML to a web application for
manipulation
    - It takes digital information from a web application and
'vocalizes'/'reads' it to a user.

The possibilities range widely: sales support (e.g. searching for customer
phone numbers or addresses over the phone with your voice); having MapPoint
directions to your customer's location read to you while on the road;
commerce sites that let you check the status of an online order, get an
ETA, and even change the shipping address; and recording a message for a
person and having it sent as an email attachment via Exchange.  Not to
mention unprecedented support for developers creating web applications that
are easier for the visually impaired to access.

The future version of Speech Server will use the same API as the one exposed
in Vista, extending the reach of your .NET applications to the telephone.

The SDK is available now, and can be used with VS 2005!

Published Wednesday, February 15, 2006 7:00 AM by irenak

http://blogs.msdn.com/irenak/archive/2006/02/15/532530.aspx


VICUG-L is the Visually Impaired Computer User Group List.
To join or leave the list, send a message to
[log in to unmask]  In the body of the message, simply type
"subscribe vicug-l" or "unsubscribe vicug-l" without the quotations.
 VICUG-L is archived on the World Wide Web at
http://listserv.icors.org/archives/vicug-l.html
