VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

Subject:
From:
"M. J. P. Senk" <[log in to unmask]>
Reply To:
VICUG-L: Visually Impaired Computer Users' Group List
Date:
Fri, 22 May 1998 00:42:50 -0400
Inspired by descriptions of new technology in the recent post about Web
Speak and the Productivity Works, I was soon surfing www.labyrinten.se.
This Swedish company has a beta copy of the DAISY digital book software
and www.daisy.org reports that a portable player based on CD technology
called the Plextalk is being manufactured.  So far, I have not been able
to locate any material to read, but titles apparently are available in
Japan.

Here is a description of this technology which may someday enable us to
download talking books over the Internet.

--- from www.daisy.org ---

                    DAISY - Digital Talking Book System

                           A system presentation

        This material presents a radically new way to record, store,
   distribute and read talking books for print impaired individuals, i.e.
         people who have difficulties reading printed information.

    The system is built upon digital technology, using standard personal
   computers and components as its hardware platform. The system is based
   on a general concept called "Digital Audio-based Information System",
                           or "DAISY" for short.

   The new talking book system is being developed by a project funded by
   the DAISY Consortium, an international group of talking book producers
   such as libraries and institutions. Software and hardware development
     is carried out in collaboration with several commercial partners.

                                 Definition

   The term "talking book" traditionally refers to a recording of a human
   narrator's voice, reading out printed information from a book or
    other printed publication. Talking books are normally produced under
    special agreements with the publishers, and are only intended to be
   read by blind or other print impaired individuals. The typical talking
     book is lent to the readers, not given away or sold. The user (the
   reader of the talking book) gains access to the printed information by
      listening to the recording, using a suitable device and means to
                           control the playback.

     A talking book is in this regard not the same thing as a "book on
       tape", which is a recording of a book produced and distributed
     commercially by a publisher. These books can be bought and read by
                                  anyone.

   There are also "electronic books" these days. These are not narrated,
      but consist of electronic text that can be read by using e.g. a
                             personal computer.

                            The main design goal

     While developing the data structure and data handling methods for
                  DAISY, the main objectives have been:

    To make the system capable of conveying the printed material's
   logical structure to the talking book reader, and to allow the user to
   use that structure to navigate and read the material. This means that
   the data representing the recording of the book should be possible to
      organise and structure, so that it fully resembles the original.

     To give the reader the same - or better - speed and flexibility in
    accessing the book's information as a fully able user has when
   reading a printed book. This means fast, arbitrary access to any part
                             of the recording.

       The system should be capable of storing the talking book data
   efficiently. The goal should be that any talking book recording (even
        with a length of up to 50 hours or more) could be stored and
   distributed on a single mass storage medium, such as a CD-ROM disc. To
       achieve this, audio data compression technology must be used.

    The system should be independent of distribution media to be able to
       adapt to new distribution and storage technology as it develops.
    Furthermore, the system should not be fixed to any particular method
   for digital audio data compression, since these technologies will also
   be evolving in the future. In short, the system must be "future-safe",
    since precious recording efforts must be preserved, even though the
            field of digital technology is far from mature yet.

                              Structured audio

   The key to meeting the design goals lies in the system's ability to
   intelligently and automatically divide the voice data representing the
       talking book recording into manageable blocks of speech, each
     representing a small, separate unit of information in the printed
                                 original.

     To carry out this task, the system uses an advanced voice analysis
     technology while recording the narrator's voice. The incoming
     stream of digitised voice data is in this process broken down into
      segments, based on the flow of the speech. For example, when the
    narrator makes a slight pause to indicate the end of a sentence, the
      recording system can use this pause to identify a unit, which is
                         referred to as a "phrase".

   Even though a recorded phrase often corresponds to a sentence of text,
   this does not need to be the case. The recording system can be set up
    to detect anything from single words to whole paragraphs as phrases.
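
   The text does not say how the voice analysis works, so the
   following is only a minimal sketch of pause-based segmentation
   under stated assumptions. The function name, the amplitude
   threshold and the minimum pause length are all hypothetical; a
   real recorder would work on sampled audio with far longer pauses.

```python
def split_into_phrases(samples, threshold=0.05, min_pause=4):
    """Return (start, end) index pairs, end exclusive, one per phrase.

    A phrase ends once `min_pause` consecutive samples stay below
    `threshold`. Both parameters are illustrative assumptions.
    """
    phrases = []
    start = None   # index where the current phrase began
    quiet = 0      # consecutive quiet samples seen inside a phrase
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                # The phrase ended where the pause began.
                phrases.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Recording ended mid-phrase; drop any trailing quiet samples.
        phrases.append((start, len(samples) - quiet))
    return phrases


signal = [0, 0, 0.5, 0.6, 0, 0, 0, 0, 0.4, 0.3, 0.2, 0]
print(split_into_phrases(signal, min_pause=3))
```

   Raising min_pause in this sketch merges short pauses into one
   longer phrase, which mirrors the idea that the recorder can be set
   up to detect anything from single words to whole paragraphs.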

      The structuring process is carried out automatically during the
   recording of the talking book. The playback equipment does not need to
    do any such processing, but can instead make use of the structure to
           allow structured access to the talking book material.

   The concept of the "phrase" is central to the system, since it is the
     smallest informational unit the user can access. The system's
   automatic phrase division capability is the main structuring principle
   for the talking book. The digital talking book is not stored as just a
      stream of digitised voice data, but as a database of small data
   objects. This concept is what makes the DAISY system unique, and it is
     also what makes DAISY so well suited for reading of talking books.

    Structured digital audio technology offers new and efficient methods
        for recording and editing in the production stage of a book.
    Structured audio can be stored or distributed efficiently, either on
       mass storage medium or via network transmission. Most of all,
   structured audio means that the reader can access the information in a
   truly efficient way. By this technology, the talking book can become a
                          modern information tool.

                             Structured access

   The DAISY book is normally structured to resemble the printed original
   as closely as possible. The main navigational index is typically based
     on the book's Table of Contents (ToC), containing a number of
   sections such as chapters, sub-chapters etc. The phrases that contain
       the narrated speech in the "audio database" are organised into
     sections that correspond to the different entries in the ToC. This
   structuring is done as a part of the recording process, and means that
     the reader can easily and quickly navigate the book by moving from
                       section to section in the ToC.

    As the book's Table of Contents is navigated, the headings are
   announced by the system playing the relevant audio - i.e., the phrases
    that correspond to the particular section heading. The reader is in
   other words moving through a "talking table of contents". As headings
         are announced, the playback position is also moved to the
     corresponding location in the recording. When the user starts the
         playback, the narration begins from the selected section.
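
   As an illustration only, navigating a talking table of contents
   could be modelled as below. The Section fields and the
   TocNavigator class are hypothetical names chosen for this sketch
   and are not taken from the DAISY data format.

```python
from dataclasses import dataclass


@dataclass
class Section:
    heading: str        # heading text (hypothetical field)
    level: int          # 1 = chapter, 2 = sub-chapter, and so on
    first_phrase: int   # index of the phrase that narrates the heading


class TocNavigator:
    """Moves between headings and tracks the playback position."""

    def __init__(self, sections):
        self.sections = sections
        self.current = 0

    def move(self, step):
        """Move by `step` headings, staying inside the ToC."""
        last = len(self.sections) - 1
        self.current = max(0, min(last, self.current + step))
        return self.sections[self.current]

    def playback_position(self):
        """Phrase index where narration would start."""
        return self.sections[self.current].first_phrase


toc = [
    Section("Preface", 1, 0),
    Section("Chapter 1", 1, 12),
    Section("1.1 Background", 2, 15),
]
nav = TocNavigator(toc)
print(nav.move(1).heading, nav.playback_position())
```

   A player built this way would play the phrases of the selected
   heading as the announcement, then continue from the same position
   when the user starts playback.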

     Structured access does not stop with sections and phrases, though.
   Each phrase of the audio database has a unique identity which defines
    both its sequential placement in the talking book recording and also
   which section it belongs to. A phrase can also have other attributes,
     e.g. to identify it to be the first phrase on a new page, a phrase
   with a link to another phrase, a phrase with a footnote and so on. By
      making use of these attributes, the talking book material can be
    navigated and read in many and powerful ways. Talking book access in
      the future can therefore be similar to hypermedia or multimedia
                                  access.
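
   One way to picture the phrase records described above is a small
   in-memory "audio database". This is a sketch under assumptions:
   the field names and the attribute labels ("page_start",
   "footnote", "heading") are invented for illustration and do not
   come from the DAISY specification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Phrase:
    phrase_id: int                  # sequential place in the recording
    section: int                    # index of the owning ToC section
    attributes: set = field(default_factory=set)
    link_to: Optional[int] = None   # id of a linked phrase, if any


def with_attribute(phrases, name):
    """All phrases carrying a given attribute, in recording order."""
    return [p for p in phrases if name in p.attributes]


def follow_link(phrases, phrase):
    """Return the phrase that `phrase` links to, or None."""
    if phrase.link_to is None:
        return None
    by_id = {p.phrase_id: p for p in phrases}
    return by_id.get(phrase.link_to)


book = [
    Phrase(0, 0, {"heading"}),
    Phrase(1, 0, {"page_start"}),
    Phrase(2, 0, {"footnote"}, link_to=4),
    Phrase(3, 1, {"heading", "page_start"}),
    Phrase(4, 1),
]
print([p.phrase_id for p in with_attribute(book, "page_start")])
```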

                          The transition to DAISY

     A "tape transfer" module in the DAISY recording system allows old
   talking book material - stored on master tapes - to be transferred and
    converted to the new digital format. The transfer process is highly
    automated and can currently work at twice the normal playback speed.

    In the transfer process, the same voice analysis methods are used as
    in the ordinary ("live") recording process. The audio information is
    automatically converted to DAISY format as it is transferred to the
   recording software. To further automate the transfer process, standard
    index tones on the tape can be identified and used to automatically
                    break the material up into sections.
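
   A rough sketch of how detected index tones might be used to group
   phrases into sections follows. It assumes that phrase start times
   and tone positions are already known in seconds; the function name
   and this representation are hypothetical.

```python
from bisect import bisect_right


def assign_sections(phrase_starts, tone_times):
    """Map each phrase start time (seconds) to a section number.

    Section 0 holds everything before the first index tone; each
    tone starts a new section.
    """
    tones = sorted(tone_times)
    return [bisect_right(tones, t) for t in phrase_starts]


print(assign_sections([0.0, 4.2, 9.8, 15.1, 21.0], [9.0, 20.0]))
```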

     After the transfer, the operator can use the system's editing
   tools to clean up the material, insert text for the section headings,
      adjust the hierarchy of the talking book's Table of Contents
   etc. Page breaks and other attributes can also be defined as a part of
     the editing work. The editing is efficient and flexible with full
                              audio feedback.

     The recording system also has the ability to create analogue tape
     versions of a DAISY book by means of a "Tape Manager" module. This
      way, a producer will be able to provide books in both DAISY and
   analogue tape format at a reasonable cost. This ability will be useful
    during the transition from the traditional talking book to the DAISY
                                   book.

                       Based around an open standard

    The DAISY talking book system is built around the DAISY Data Format,
        an open storage format specification for digital audio-based
              applications, published by the DAISY Consortium.

   The DAISY data format specification is available to anyone interested
     in making their own implementation of it, e.g. to create their own
                recording software or DAISY playback device.

   The DAISY data format is suggested by the DAISY Consortium to become a
    commonly accepted standard for digital talking books and structured
   audio management amongst producers and libraries worldwide. This would
    secure interlending of digital talking books and also help to reduce
                   costs for development of new systems.

   The DAISY data format is advanced enough for making of highly complex
    talking book material, and also for creating new kinds of electronic
     publications making use of digital audio. The format specification
   allows for other data types - such as text, graphics etc. - to be
     stored together with the voice data, linked together on the phrase
                    level for synchronised presentation.

    Even though the data format is advanced, it is also well suited for
   storing materials of less complex nature than textbooks, cookery books
      etc. All kinds of publications - ranging from poetry or leisure
    literature to religious texts - can actually benefit from using the
      DAISY format. The readers will use one type of access device and
   reading methods for all kinds of books, even though the structure and
                    complexity of the material may vary.

                           Standard PC technology

     The current talking book system developed by the DAISY Consortium
      consists of software running on standard IBM-compatible PCs with
   multimedia capabilities (CD-ROM and audio input/output hardware). The
   operating systems used are Microsoft Windows 95 and Windows NT. The PC
    offers a generally available, low cost, open hardware platform, well
       suited for running both DAISY recording and playback software.

         As the system is based on the DAISY Data Format, the DAISY
   Consortium's system is just one of many possible implementations.
    Other platforms than the PC might be introduced for recording and/or
                          playing back DAISY books.

     The system uses CD-ROM technology today, since it is currently the
   most cost-effective medium for storage of large amounts of data. Other
   types of media may come into the picture in the future, as they become
                  commercially or practically attractive.

   As the DAISY data format as such is independent of storage medium, it
   is fully up to each producer to choose the storage/distribution medium
    or distribution channel. However, to allow for interlending between
       libraries, a common medium is of great help, and so the DAISY
             Consortium currently recommends CD-ROM technology.

   If and when a new storage medium or distribution channel is introduced
     amongst talking book producers, there will be a need for a general
      acceptance of the new technology. Devices used for talking book
      reading must be equipped with appropriate hardware to be able to
                           access the new medium.

                          Efficient audio storage

    By performing sophisticated audio data compression, large amounts of
    voice data may be stored on a single medium. As an example, a single
     CD-ROM can hold up to 50 hours of recorded narration, allowing for
   large books to be stored conveniently on a single disk. Even at 20-30
   hours per CD, the sound quality of the voice is very good - as good as
    or better than a good analogue tape recording. The audio quality may
        be further increased for books with shorter playback length.
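
   The trade-off between playing time and bit rate can be checked
   with simple arithmetic. The sketch below assumes a CD-ROM capacity
   of 650 million bytes, a figure not given in the text, and ignores
   any space taken by the index and file system.

```python
CD_ROM_BYTES = 650 * 10**6   # assumed capacity, not stated in the source


def average_bitrate(hours, capacity_bytes=CD_ROM_BYTES):
    """Average bit rate in bit/s that exactly fills the medium."""
    return capacity_bytes * 8 / (hours * 3600)


def hours_at(bitrate, capacity_bytes=CD_ROM_BYTES):
    """Playing time in hours that fits at a given bit rate (bit/s)."""
    return capacity_bytes * 8 / bitrate / 3600


for h in (50, 30, 20):
    print(f"{h} h -> {average_bitrate(h) / 1000:.1f} kbit/s")
```

   Under that assumption, 50 hours on one disc leaves roughly 29
   kbit/s for the audio, while 20 hours leaves roughly 72 kbit/s,
   which is consistent with the remark that shorter books can be
   stored at higher quality.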

   If the producer so wishes, it is fully possible to store more than one
   publication on each medium. As even more compact audio storage will be
    made available in the future, this will allow for a whole collection
                 of titles to be stored on a single medium.

     The DAISY data format is not dependent on a particular audio data
   compression technology, since it needs to be future-safe. The field is
    evolving rapidly, and new technologies can be accepted, as they are
   becoming available. However, the DAISY data format specifies a limited
   number of data formats that must be supported as a minimum capability
        by DAISY-compliant equipment. These data formats include PCM
      (uncompressed audio data), ADPCM and MPEG-2 layer 2 compression.
   Several different sampling frequencies and bit-rates are supported, to
       allow for a good balance between data size and audio quality.

                    A new kind of production environment

      DAISY books are recorded and edited using a dedicated recording
   software package. The recording software runs on a modern, powerful PC
                     running Windows 95 or Windows NT.

   The recording is made directly from microphone to a data medium, e.g.
   a hard disk, a magneto-optical (MO) disk or the like. A narrator with
    moderate computer skills should be capable of handling all, or most
    of, the recording, editing and production process without assistance
                   from a studio or computer technician.

       The audio processing equipment in the recording studio can be
   virtually the same as for analogue talking book recording. What is new
   is the PC instead of the tape recorder. The narrator or operator will
    also have a new kind of remote control for the PC, and may also make
                 use of a normal PC keyboard for the work.

    The structured approach used in DAISY means that a recording project
    can be carried out in a non-linear fashion. The book can be narrated
   in any order, and the recording can be done section per section. It is
      possible to define the structure of the recording project before
   filling it with voice data, which means that the narrator can navigate
      the recording by using a Table of Contents on the screen of the
                               recording PC.

      The recording system is flexible enough to allow also for a more
       linear method of production. The narrator can then just start
    recording and create new sections while reading along. The sections
   can be named and organised at a later stage to resemble the structure
               of the original book's Table of Contents.

       Mistakes made during narration can very easily and quickly be
      corrected by using the keyboard or the remote control. Since the
   recording is done in a phrase-based fashion, it is very easy to "punch
      in" at the right place to correct mistakes or to add more data.

    The narrator can place different attributes on certain phrases, such
    as for indication of page break, beginning of a new paragraph and so
    on. This can be done by pressing a button on the remote control or a
        key on the keyboard while reading. If so desired, the phrase
   attributes may also be set or edited later, when the recorded data has
                            been stored to disk.

       The recording system offers highly powerful ways of inserting,
    deleting, copying and moving recorded data. Working with structured
     audio is rather similar to working with text in a word processor.

                          DAISY reading equipment

   The same DAISY book can be read on several platforms. Today, there is
   playback software available for standard multimedia PCs, as well as
                       purpose-built reading devices.

    The PC has the advantage of being useful for many other things than
   just DAISY book reading, and offers advanced reading features such as
     text searching, making of notes etc. The disadvantage of the PC is
   that it still is rather expensive, at least if it is only going to be
   used for talking book reading. Reading books on a personal computer is
   also hardly going to be the preferred way of reading for all users.

   The advantage of a dedicated DAISY player is that it can be made light
     and small in comparison with a typical PC, and also that it can be
    produced at a lower cost. Such a player allows for a wider range of
   reading situations. Its limitations lie in difficulty in expanding the
     hardware, e.g. by adding reading devices for new storage media, as
    well as in its lack of keyboard and screen, which in turn means that
                 advanced reading features can not be used.

   Thanks to the DAISY data format, however, a CD-ROM containing a DAISY
     book can be read on both platforms. The same book can therefore be
     moved between different devices to allow for reading in different
     ways. Advanced users, such as students, may have access to both a
        personal computer and a purpose-built DAISY playback device.

                    Dramatically improved reader access

       Usually, one talking book is stored on one distribution medium,
     typically a CD-ROM disc. The disc is distributed to the reader by
   postal services. The disc is inserted into the CD-ROM device of the PC
    or dedicated playback machine. The title of the book or books on the
       CD can be determined by a special command, or be automatically
                                 presented.

   The user navigates and reads the DAISY book by using its talking Table
    of Contents as the main navigational index. The ToC for the talking
    book normally resembles the printed material's ToC, though this
                        is fully up to the producer.

      By navigating amongst the section headings in the ToC, the user
   selects the desired playback position. The process of finding a place
    in the book is thus similar to when reading a printed book. In fact,
   it is even simpler and quicker, since the talking book reader does not
     have to flip the pages of the book to the correct page, but merely
         selects the desired heading and starts the voice playback.

     The response of the system is fast. Typically, using a standard PC
    with a CD-ROM gives random access times to sections of less than one
    second. Dedicated playback devices may have the same or even better
                                performance.

      When a PC is used for reading, the user has the ability to move
    directly to a particular section by searching for a text string that
   appears in the heading. The typical DAISY book has a Table of Contents
    in electronic text format that complements the speech, and this ToC
       can be used both for on-screen presentation and for searching.
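
   A heading search of this kind can be very small. The following is
   a minimal sketch that assumes the electronic ToC is available as a
   plain list of heading strings; the function name and the sample
   headings are made up for illustration.

```python
def find_headings(headings, query):
    """Indexes of headings containing `query`, ignoring case."""
    q = query.casefold()
    return [i for i, h in enumerate(headings) if q in h.casefold()]


toc = ["Preface", "Chapter 1: Soups", "Chapter 2: Bread", "Index"]
print(find_headings(toc, "chapter"))
```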

    When in a section, the user may browse the material much in the same
   way as a printed book may be skim-read. The user may move the playback
    position forwards and backwards in the material by phrase, by phrase
    group or by page, depending on to which degree the talking book has
                       been indexed by the producer.

    If the DAISY book has been page-indexed, the talking book reader may
   instantly move the playback position to the phrase that represents the
     first text of a new page in the original text. This way, the DAISY
    reader can flip pages and search for particular pages in relation to
                           the printed original.
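
   Page navigation could be sketched as follows, assuming the
   producer's page index is available as a mapping from page number
   to the index of the first phrase on that page. Both function names
   and the mapping are hypothetical.

```python
from bisect import bisect_right


def go_to_page(page_starts, page):
    """Phrase index for `page`, or None if the page is not indexed."""
    return page_starts.get(page)


def next_page_start(page_starts, position):
    """First page-start phrase index after `position`, or None."""
    starts = sorted(page_starts.values())
    i = bisect_right(starts, position)
    return starts[i] if i < len(starts) else None


pages = {1: 0, 2: 14, 3: 31}   # page number -> first phrase index
print(go_to_page(pages, 2), next_page_start(pages, 5))
```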

       The system can move very quickly between speech units within a
   section. By only listening to the first few moments of each phrase, it
       is an easy process to find a place in the recording by simply
   listening for it. Of course, the reader can also listen to the talking
   book as a continuous narration. When the playback is started, it always
   starts at the beginning of a phrase. The speech can be instantaneously
                           stopped at any point.

                           Playback voice control

    The narration of a talking book can be done differently depending on
   who is narrating as well as the nature and purpose of the talking book
   material. The DAISY system offers several ways to change the speed of
   the recording, without changing the pitch of the voice. Such features
    are highly desirable by many advanced users, such as students. These
   features can of course also be turned off to let the reader listen to
    the unmodified version of the narrator's voice. There are even
     possibilities to slow down the narration relative to the original,
       again without changing the pitch of the voice. This may be of
       assistance to some users, e.g. when reading advanced material.

    The ability to control playback speed is not an aspect of the DAISY
    book as such, though the use of structured audio makes it relatively
    easy to create these kinds of features in the reading software or a
    playback device. Both the DAISY playback software and the dedicated
    DAISY player that exist today support "Intelligent Time Compression"
   or ITC for short. With ITC, the playback speed can be varied from 75%
     of the original up to 300% speed, i.e. the recording can be played
      back up to 3 times faster than it was recorded. Naturally, high
     playback speed means that the speech will be harder to understand,
               even if the pitch of the voice is not changed.
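
   The effect of the speed setting on listening time is plain
   arithmetic. The sketch below only illustrates the 75% to 300%
   range quoted above; it says nothing about how ITC itself processes
   the audio, and the function name is hypothetical.

```python
MIN_SPEED, MAX_SPEED = 75, 300   # percent, the range quoted in the text


def listening_time(recorded_minutes, speed_percent):
    """Minutes needed to hear a recording at a given playback speed."""
    if not MIN_SPEED <= speed_percent <= MAX_SPEED:
        raise ValueError("speed must be between 75 and 300 percent")
    return recorded_minutes * 100 / speed_percent


for speed in (75, 100, 300):
    print(speed, listening_time(600, speed))
```

   For a ten-hour (600 minute) recording this gives 800 minutes at
   75%, 600 minutes at 100% and 200 minutes at 300%.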

                         Other features for reading

     Every phrase in the talking book - that is, the block of recorded
   audio corresponding to something like a sentence in the printed book -
    has a unique identification in the "audio database" that makes up a
      DAISY book. This identity defines the placement of the phrase in
    relation to the other phrases. It also identifies which section each
                             phrase belongs to.

      This database-style approach allows for simple creation of links
      between phrases and user data, as for example notes made by the
    student in relation to a narrated textbook. Also the reader can mark
      parts of special interest in the book, and then use these marked
     phrases as a user-defined index for browsing of the talking book.

    Bookmarks may be placed in the material at any phrase location. By a
     simple command, the system can be instructed to move to a desired
   bookmark. The references to the book's voice data take up very little
   storage space, and they may therefore easily be stored on a local hard
    disk on the playback system, or even be transferred to diskette for
                             use on another PC.

     A special bookmark is automatically placed by the system when the
    continuous playback is stopped. This bookmark is saved with the user
    data and the system can use it to automatically restore the reading
             position the next time the talking book is opened.

     The user data is automatically stored on disk and is linked to the
     currently opened book. The next time the same book is loaded, the
             system automatically loads the relevant user data.
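
   Since bookmarks are only references to phrases, storing them
   separately from the book is cheap. The following sketch assumes a
   hypothetical layout of one small JSON file per book identifier;
   the text does not describe how the actual playback software
   stores user data.

```python
import json
from pathlib import Path


def save_user_data(folder, book_id, bookmarks, resume_at):
    """Write bookmarks and the resume position for one book."""
    path = Path(folder) / f"{book_id}.json"
    data = {"bookmarks": bookmarks, "resume_at": resume_at}
    path.write_text(json.dumps(data))
    return path


def load_user_data(folder, book_id):
    """Load user data for a book, or defaults if none was saved."""
    path = Path(folder) / f"{book_id}.json"
    if not path.exists():
        return {"bookmarks": [], "resume_at": 0}
    return json.loads(path.read_text())
```

   On opening a book, a player following this sketch would call
   load_user_data and move the playback position to the phrase stored
   under "resume_at".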

        Dedicated playback devices can not normally offer all those
   sophisticated features. However, they typically offer page search and
     bookmark handling. They also keep track of user data for the DAISY
                              books last read.

       Future versions of the playback software may include even more
      features to make up the perfect environment e.g. for students or
    dyslexic readers. For example, by storing the original book's
    text in electronic format along with the audio, the user may have a
   speech synthesiser or Braille display connected to the system and use
   it to e.g. spell out names in the text. With this set-up, the user can
     also search for certain text strings that occur in the book, which
             will add yet another access method to the system.

   The ability to present text on screen in full synchronisation with the
   narrated audio may be of great assistance to for example readers with
   dyslexia. However, even though the DAISY data format allows for this,
     dual-media books will cost more to produce than pure audio books.
   Also, most legal systems have copyright legislation that will make it
       hard or impossible to obtain the original book's text in
                     electronic format for publishing.

                        Flexibility in distribution

   The system has been designed to be as future-safe as possible. Thus, it
    is not fixed on any particular storage medium for the recorded data.
       The producer can choose a suitable medium for storage of master
     recordings, and then produce the distribution media when necessary.
   If the same media type is used both for master and distribution, DAISY
   books for lending can be created by a simple and fast copying process.

    Any storage media that can store ordinary data files under operating
     system control may be used as information carrier for the talking
   book. Furthermore, any information distribution channel that can offer
   file transfer under a commonly supported control protocol can be used
                      for distribution of DAISY books.

     The amount of data used by a DAISY book is rather big, at least in
   comparison with electronic text, which means that network transmission
      will only be feasible if a lot of bandwidth can be used at a low
   price. However, as networks such as the Internet are rapidly evolving,
    it is likely that electronic distribution can be introduced into the
                 DAISY concept in a not too distant future.

      For some years to come, it is highly likely that most producers
       related to the DAISY Consortium will use CD-ROM disks as their
    preferred distribution media. The producer would then typically use
      CD-R technology to create the disks. This technology is now very
   cost-effective and the production can be done at a lower cost than for
    analogue cassettes. Since a single CD-ROM can replace a large number
     of cassette tapes, the reduction in size and weight will bring the
          costs for transportation and storage down dramatically.

    As soon as new mass storage media become available and economically
      favourable, they might be integrated into the system. When this
        happens, the user's reading equipment only needs to be
    complemented with a device for reading the media in question. On the
    production side, it will be a rather trivial matter to convert from
   one type of media to another - data can just be copied between them at
                         no loss of audio quality.

     An example of a new storage medium is DVD-ROM, which is a rapidly
    evolving standard in the PC market today. The DVD-ROM disc can store
    several times more data than a CD-ROM, which will allow for even the
     largest talking books to be distributed on a single DVD-ROM disc.
     Alternatively, the sound quality of the typical DAISY book can be
      improved by using data compression technology that gives better
                  quality but demands more storage space.

     However, before the DVD-ROM technology can become useful for DAISY
      books, it needs to be negotiated amongst all parts of the DAISY
   Consortium and its allies that the new medium should be supported. All
   playback devices must be equipped with compatible hardware to be able
                         to access the new medium.

    The DVD-ROM standard must be mature enough to be a safe alternative
   for the future - different devices and discs must be fully compatible.
    There will also be some time before DVD-ROM devices are shipped as a
   cheap, standard device with multimedia PCs, as is the case for CD-ROM
   today. There also remains the problem of how to produce DVD-ROM disks in
    a cost-effective manner. DVD-R devices currently exist or are being
   developed, and as soon as the price drops to a reasonable level, they
                           can be taken into use.


      This page is maintained by [log in to unmask] (March 25, 1998)
