AudioMath: Speaking Mathematics with MathML

1
AudioMath: Speaking Mathematics with MathML
  • Helder Filipe Ferreira hfilipe_at_fe.up.pt
  • Laboratory of Speech Processing,
    Electro-acoustics, Signals and Instrumentation
  • Laboratory of Signals and Systems
  • Faculty of Engineering, University of Porto,
    Portugal
  • Second European Workshop on MathML & Scientific
    e-Contents
  • 16 to 18 September 2004, Kuopio, Finland

2
This Presentation
  • Introduction
  • Audio Rendering of MathML
  • Mathematical Audio Rendering Tools Overview
  • AudioMath
  • Final conclusions and remarks

3
This Presentation
  • Introduction
  • Preliminaries
  • Background on publication of mathematical
    e-contents
  • Web Accessibility and Mathematics
  • Audio Rendering of MathML
  • Mathematical Audio Rendering Tools Overview
  • AudioMath
  • Final conclusions and remarks

4
Preliminaries (1/2)
  • Provocative questions
  • How can a blind person overcome the difficulty of
    reading on-line documents with mathematical
    expressions?
  • Why hasn't this been completely solved yet?
  • Is it not necessary? Is it not easy?
  • These questions are only the tip of the iceberg
    of a big accessibility problem on the Internet.
    It concerns technical, scientific or even simple
    documents containing math expressions.

5
Preliminaries (2/2)
  • One possible solution
  • The use of Text-to-Speech (TTS) technology to
    create audio versions of the mathematical
    contents.
  • The audio medium is accessible, and
    general-purpose TTS engines are available
  • In principle, math can be spoken aloud
  • This leads to several questions on how to do it
  • How to read math?
  • What are the cognitive problems behind it?
  • What technique should be used to code the math
    information?

7
Background on publication of mathematical
e-contents (1/4)
  • The publication of scientific documents
    containing mathematical formulae is extremely
    demanding.
  • The appearance of the TeX system by Donald Knuth
    solved the majority of problems with printed
    documents.
  • Then WYSIWYG editors appeared, and subsequently
    markup languages for the Internet, such as HTML.
  • However, HTML per se doesn't allow the use of a
    mathematical description language directly in
    the document.
  • So, other solutions were envisioned

8
Background on publication of mathematical
e-contents (2/4)
  • ...
  • (X)HTML + Images (jpg, gif, png, SVG): math
    expressions as images (raster or vector types).
    Not accessible.
  • HTML + Symbol Fonts / (X)HTML + CSS: use tables
    to structure information. No semantic meaning of
    the math expression. Ex.: the TtH translator.
  • Applets: using applets to generate mathematical
    expressions. A slow and inaccessible process.
    Ex.: WebEQ.
  • Word / RTF / PDF / Postscript / TeX / LaTeX: the
    HTML documents produced from these formats
    represent math expressions in the form of images
    (not accessible). Ex.: TeX4ht.

9
Background on publication of mathematical
e-contents (3/4)
  • ...
  • MathML (Mathematical Markup Language): it's one
    of the most accessible solutions
  • Allows the visualization of math expressions on
    the web
  • Dynamic and interactive contents
  • Publishing technical information in electronic
    format
  • Exchanging math data between applications
  • Interpretation of math expressions in non-visual
    media
  • Existence of tools that allow conversion between
    LaTeX/TeX and MathML

10
Background on publication of mathematical
e-contents (4/4)
  • Why use MathML?
  • Developed by the W3C and becoming a standard.
  • Its rapidly growing use by several relevant
    organizations associated with the teaching and
    learning of mathematical contents, as well as the
    involvement of software houses.
  • Emergence of editors and applications that create
    and manipulate MathML documents.
  • Existence of conversion tools for the main
    publishing formats.
  • The fact that it is a markup language allows its
    parsing, interpretation and conversion to other
    formats, and consequently higher accessibility,
    portability and platform independence.
  • This presentation will focus on MathML as the
    supporting technology for speaking mathematics.

12
Web Accessibility and Mathematics (1/3)
  • "Though non-technical publications are now
    readily available to blind people the degree of
    access diminishes as the amount of technical
    information increases. ... It is also true to
    say that once the technical documents have been
    acquired their reading is a struggle and far from
    a pleasurable experience." Fitzpatrick &
    Monaghan, BULAG 1999.

[Figure: access to information by blind people
versus amount of technical information]

13
Web Accessibility and Mathematics (2/3)
  • "How hard can it be to communicate about math on
    the Internet? The truth is, it's a fairly
    difficult task." The Math Forum,
    http://mathforum.org/typesetting
  • Nowadays the accessibility of technical and
    scientific documents online is very limited.
  • Elements that can make up a web document
  • Text
  • Images: graphics or pictures
  • Math expressions
  • Applets
  • Scripts
  • Multimedia (Flash, Shockwave, ...)

14
Web Accessibility and Mathematics (3/3)
  • Are the currently available technical
    representations of mathematical formulae
    accessible on the web?
  • Images
  • Applets
  • SVG
  • MathML
  • (X)HTML + CSS
  • How can blind people read online documents
    containing mathematical expressions, then?
  • Usually they can't... except for a few cases to
    be seen later (topic Mathematical Audio Rendering
    Tools Overview).

Stars indicate the degree of accessibility,
effectiveness and manipulation we can achieve
with the technique.

15
This Presentation
  • Introduction
  • Audio Rendering of MathML
  • Speaking Mathematics
  • Which MathML markup to use?
  • Interpretation of MathML tag set elements and
    attributes
  • Math Formulae Navigation
  • Mathematical Audio Rendering Tools Overview
  • AudioMath
  • Final conclusions and remarks

16
Speaking Mathematics (1/4)
  • It's known that people read text letter-by-letter
    beginning at or near the leftmost symbol in a
    text line and often engage in backward scans to
    retrieve information that was previously read.
  • But reading mathematics is very different from
    reading simple text.
  • To start with, mathematics is a two-dimensional
    written language. Almost all other ordinary
    languages are primarily spoken and only later
    written, and both in a one-dimensional form (in a
    clearly defined sequence).

17
Speaking Mathematics (2/4)
  • For a non-visually impaired person, understanding
    a mathematical formula requires repeated scans,
    skipping over secondary portions.
  • Example
  • 1st scan: it's a square root
  • Another scan: it has a fraction in it
  • Another scan: ab/c in numerator and dxe in
    denominator
  • However, this can be a very complex task for
    blind people. Therefore studies should be done to
    understand how we should provide proper access
    to math expressions.

18
Speaking Mathematics (3/4)
  • The few studies we know of that can give us some
    clues are
  • Research being done by Arthur Karshmer
  • PhD thesis by Stevens (related to math prosody)
  • PhD thesis by Raman (related to audio rendering
    of LaTeX)
  • MathSpeak project (related to Nemeth Code)
  • AudioMath project
  • Problems, far from solved, in speaking
    mathematics include
  • Navigation within a mathematical formula
  • A complete study of mathematical prosody
  • A near-total lack of studies on the cognitive
    issues of reading mathematics aloud to humans

19
Speaking Mathematics (4/4)
  • "No standard protocol exists for articulating
    mathematical expressions as it does for
    articulating the words of an English sentence."
    The MathSpeak Project.

21
Which MathML markup to use? (1/7)
  • The representation of a math expression is
    perceived through two distinct but associated
    concepts
  • Visual structure or notation
  • Ex.: a/b, ab⁻¹
  • MathML Presentation Markup

22
Which MathML markup to use? (2/7)
  • and
  • The meaning that it represents
  • Ex.: a divided by b
  • MathML Content Markup
  • The relationship between notation (Presentation
    Markup) and meaning (Content Markup) is not
    one-to-one.
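
The contrast between notation and meaning can be sketched with the two standard MathML encodings of a/b. The snippet below is only an illustration in Python (the `tags` helper is a hypothetical name, not part of any tool described in this presentation): Presentation Markup records that something is drawn as a fraction, while Content Markup names the operation being applied.

```python
import xml.etree.ElementTree as ET

# Notation: "a over b", drawn as a fraction.
PRESENTATION = "<mfrac><mi>a</mi><mi>b</mi></mfrac>"
# Meaning: the division operator applied to a and b.
CONTENT = "<apply><divide/><ci>a</ci><ci>b</ci></apply>"


def tags(markup):
    """List the element names of a fragment in document order."""
    return [element.tag for element in ET.fromstring(markup).iter()]


presentation_tags = tags(PRESENTATION)  # ['mfrac', 'mi', 'mi']
content_tags = tags(CONTENT)            # ['apply', 'divide', 'ci', 'ci']
```

Only the content version carries an explicit `divide`; a reader of the presentation version has to infer it from the layout.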

23
Which MathML markup to use? (3/7)
  • Presentation Markup
  • Notation of a math expression
  • Amaya and Mozilla browsers have native support
  • About 30 elements and 50 attributes
  • Ambiguous on the semantics
  • Not the best choice for audio rendering, although
    adapting it is possible
  • Transforming Presentation markup into Content
    markup is not advised; however, the OpenMath
    group has been working on such stylesheets.

24
Which MathML markup to use? (4/7)
  • Content Markup
  • Meaning of a math expression
  • Netscape and Internet Explorer support this
  • About 150 elements and 12 attributes
  • Mostly used to transfer MathML between
    applications
  • Ambiguous in notation
  • Best to use for audio rendering
  • Can be converted into Presentation markup
  • Limited in the tag set (only covers the basic
    algebra, arithmetic, logic, set theory, ... )

25
Which MathML markup to use? (5/7)
  • It's obvious that Content markup would be better
    suited to audio rendering...
  • However, there are 2 major open issues
  • MathML content markup only covers basic math. Its
    operator dictionary and support are not very
    extensive. (OpenMath seems to be doing a better
    job on that.)
  • And, so far, most MathML published online uses
    Presentation Markup (authors use WYSIWYG
    editors).
  • Therefore, MathML Presentation Markup is for the
    moment the best choice.

26
Which MathML markup to use? (6/7)
  • However, take the following example: a symbol A
    carrying the indexes 2 and 3.
  • In MathML Presentation Markup the indexes 2 and
    3 will be coded as subscript and
    superscript.
  • This gives no indication of whether the
    expression refers to the cube of the element A2
    or to the permutations of 3 elements taken 2 at a
    time.
  • So, the downside is that MathML Presentation
    Markup requires considerably greater effort
    in the interpretation of mathematical
    expressions.
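
A minimal sketch of that ambiguity, in Python for illustration only: the same `msubsup` element yields both readings, and nothing in the markup says which one the author meant. The function name and the exact spoken wording are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Presentation Markup for a base A with subscript 2 and superscript 3.
MARKUP = "<msubsup><mi>A</mi><mn>2</mn><mn>3</mn></msubsup>"


def candidate_readings(markup):
    """Return the readings a renderer would have to choose between."""
    base, sub, sup = (child.text for child in ET.fromstring(markup))
    return [
        f"{base} sub {sub}, to the power of {sup}",
        f"permutations of {sup} elements taken {sub} at a time",
    ]


readings = candidate_readings(MARKUP)
```

Choosing between the two entries requires context outside the element itself, which is the extra interpretation effort described above.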

27
Which MathML markup to use? (7/7)
  • Conclusions
  • An application that does MathML Audio Rendering
    should support both Content and Presentation
    Markup.
  • Starting with Presentation Markup, we need to
    develop some kind of interpretation that allows
    us to construct the corresponding Content
    Markup.

29
Interpretation of MathML tag set elements and
attributes (1/8)
  • Not all the elements and attributes need to be
    processed for audio rendering...
  • Styles and visual attributes do not always
    provide extra information about the expression,
    and they rarely enhance the audio description.
  • However, if Presentation Markup is used, some
    style attributes might be important to
    disambiguate meaning.

30
Interpretation of MathML tag set elements and
attributes (2/8)
  • Comments on Presentation Markup elements and
    attributes for audio rendering (1/5)
  • Special attributes
  • class, style, id, xref, xlink:href, other:
    usually not used.
  • Token element attributes
  • mathbackground, mathsize, fontsize, fontweight,
    fontfamily, color: usually not used.
  • mathcolor: it might be interesting to know the
    color of the symbol.
  • mathvariant: style variants are important.
  • Italic might indicate a function
  • Bold might indicate a vector
  • Fraktur might indicate a Lie algebra

31
Interpretation of MathML tag set elements and
attributes (3/8)
  • Comments on Presentation Markup elements and
    attributes for audio rendering (2/5)
  • Elements
  • maction: not used in general; however, it might
    be interesting to know the actions if we want to
    give access to formula manipulation and
    interaction.
  • maligngroup, malignmark: not used.
  • menclose: important to identify the operator.
  • merror: not used. The MathML should be checked
    before the audio rendering.
  • mfenced: important to know the delimiters.

32
Interpretation of MathML tag set elements and
attributes (4/8)
  • Comments on Presentation Markup elements and
    attributes for audio rendering (3/5)
  • mfrac: used.
  • linethickness: important to identify whether it's
    a fraction or a combinatorial number.
  • mglyph: used.
  • alt: important to speak out the description.
  • mi, mn: important.
  • mlabeledtr: used to speak out the table label.
  • mmultiscripts: used, but its attributes aren't
    needed.
  • mo: important.
  • movablelimits, fence, separator, accent: might
    be used to perceive the operator's behavior.
  • mover: used.
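
As an illustration of the `linethickness` remark, the sketch below (Python, for illustration only) reads an `mfrac` as a fraction unless its line thickness is zero, in which case it is read as a combinatorial number. The function name, the accepted zero spellings and the spoken wording are assumptions made for this example, and it only handles single-token numerators and denominators.

```python
import xml.etree.ElementTree as ET

# Assumption: these attribute values all mean "no fraction bar is drawn".
ZERO_THICKNESS = {"0", "0pt", "0px", "0em"}


def speak_mfrac(markup):
    """Read an mfrac whose two children are simple tokens."""
    frac = ET.fromstring(markup)
    top, bottom = (child.text for child in frac)
    if frac.get("linethickness") in ZERO_THICKNESS:
        # No bar: treat the stacked pair as a binomial coefficient.
        return f"{top} choose {bottom}"
    return f"{top} over {bottom}"


fraction = speak_mfrac("<mfrac><mi>a</mi><mi>b</mi></mfrac>")
binomial = speak_mfrac('<mfrac linethickness="0"><mi>n</mi><mi>k</mi></mfrac>')
```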

33
Interpretation of MathML tag set elements and
attributes (5/8)
  • Comments on Presentation Markup elements and
    attributes for audio rendering (4/5)
  • mpadded: not needed.
  • mphantom: all the contents inside should be
    ignored.
  • mprescripts: used.
  • mroot: used.
  • mrow: important to give clues about
    subexpressions (where to place pauses, for
    instance).
  • ms: used.
  • mspace: not needed.
  • msqrt: used.

34
Interpretation of MathML tag set elements and
attributes (6/8)
  • Comments on Presentation Markup elements and
    attributes for audio rendering (5/5)
  • mstyle: used.
  • displaystyle: needed because it can change some
    behaviors of <mo>.
  • msub, msubsup, msup: all used.
  • mtable, mtd, mtr: used, but attributes not
    needed.
  • mtext: used.
  • munder, munderover: used.
  • none: not needed.

35
Interpretation of MathML tag set elements and
attributes (7/8)
  • Comments on Content Markup elements and
    attributes for audio rendering
  • Special attributes
  • class, style, id, xref, xlink:href, other:
    usually not used.
  • encoding: needed for interpretation.
  • definitionURL: this seems to be the W3C's escape
    route for the lack of a bigger list of operators.
    But there is no standard for what can be found at
    the URL, so it's unpredictable how it should be
    rendered in audio.
  • Elements
  • annotation, annotation-xml, semantics: might be
    used if the application has the capability to
    process other types of markup or languages.
  • apply: important to delimit the operator's
    scope.
  • Operator elements: needed for interpretation.

36
Interpretation of MathML tag set elements and
attributes (8/8)
  • Comments on MathML special characters for audio
    rendering
  • &ApplyFunction;
  • Ex.: f(x)
  • Without &ApplyFunction; it renders "f open
    parenthesis x close parenthesis".
  • With &ApplyFunction; it renders "f of x".
  • &InvisibleTimes;
  • Ex.: xy
  • Without &InvisibleTimes; it renders "x y".
  • With &InvisibleTimes; it renders "x times y".
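
The effect of the two invisible operators can be sketched as follows. This is a Python illustration, not AudioMath's code: the input is assumed to be an already flattened list of token strings, and the rule that parentheses stay silent after a function application is an assumption chosen to reproduce the readings listed above.

```python
APPLY_FUNCTION = "\u2061"   # the character behind &ApplyFunction;
INVISIBLE_TIMES = "\u2062"  # the character behind &InvisibleTimes;

SPOKEN = {
    INVISIBLE_TIMES: "times",
    "(": "open parenthesis",
    ")": "close parenthesis",
}


def speak(tokens):
    """Turn a flat token list into a spoken string."""
    words = []
    applying = False  # True while inside the argument of an applied function
    for token in tokens:
        if token == APPLY_FUNCTION:
            applying = True
            words.append("of")
        elif token in ("(", ")") and applying:
            # The parentheses only delimit the argument, so they stay silent.
            applying = token == "("
        else:
            words.append(SPOKEN.get(token, token))
    return " ".join(words)


plain_call = speak(["f", "(", "x", ")"])
marked_call = speak(["f", APPLY_FUNCTION, "(", "x", ")"])
plain_product = speak(["x", "y"])
marked_product = speak(["x", INVISIBLE_TIMES, "y"])
```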

38
Math Formulae Navigation (1/4)
  • Reminder: the more complex the expression, the
    more difficult it is to read it or to understand
    a spoken version of it.
  • Therefore some kind of navigation has to be
    provided.
  • AudioMath hypothesis (under study)
  • Use of content layers controlled by the user
  • Controlled by keyboard

39
Math Formulae Navigation (2/4)
  • Example
  • Level 0: This is a fraction.
  • Level 1: Fraction with numerator minus b plus
    minus and denominator 2 times a.
  • Level 2: numerator minus b plus minus
  • Level 2.1: minus b
  • Level 2.2: square root of b squared minus
  • Level 3: denominator 2 times a

40
Math Formulae Navigation (3/4)
  • Interaction with the user (proposal)
  • Some kind of navigation by keyboard.
  • There are 4 directions in the navigation
  • Up arrow: climb one level
  • Down arrow: go down one level
  • Left arrow: get the next selection inside a level
  • Right arrow: go one selection back inside that
    level
  • Special keys
  • Enter: read the level
  • Esc: go back home (level 0)
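
The proposed key scheme can be sketched as a cursor moving over a tree of spoken descriptions. This Python sketch is an illustration under stated assumptions, not the AudioMath implementation: the `Node` and `Navigator` names, the example formula (a + b) / (2c), and the choice to stop at the first or last sibling instead of wrapping around are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str                        # what is spoken for this layer
    children: list = field(default_factory=list)


class Navigator:
    def __init__(self, root):
        self.root = root
        self.path = []  # child indexes from the root down to the current node

    def _node_at(self, path):
        node = self.root
        for index in path:
            node = node.children[index]
        return node

    def press(self, key):
        """Move according to the key, then return the text to speak."""
        here = self._node_at(self.path)
        if key == "esc":                       # back home (level 0)
            self.path = []
        elif key == "up" and self.path:        # climb one level
            self.path.pop()
        elif key == "down" and here.children:  # go down one level
            self.path.append(0)
        elif key in ("left", "right") and self.path:
            siblings = self._node_at(self.path[:-1]).children
            step = 1 if key == "left" else -1  # left = next, right = back
            moved = self.path[-1] + step
            self.path[-1] = min(max(moved, 0), len(siblings) - 1)
        # "enter" moves nowhere and simply reads the current level.
        return self._node_at(self.path).text


FORMULA = Node("This is a fraction.", [
    Node("numerator a plus b"),
    Node("denominator 2 times c", [Node("2"), Node("c")]),
])
nav = Navigator(FORMULA)
spoken = [nav.press(k) for k in ("enter", "down", "left", "down", "esc")]
```

Keeping the position as a list of child indexes makes "up" and "esc" trivial and avoids parent pointers in the tree.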

41
Math Formulae Navigation (4/4)
  • Questions remaining unanswered
  • How many layers/levels should there be?
  • Who defines the layers? The user agent, the
    external plug-in or the application that reads
    the math document?
  • What should be the level of detail on the layers?
  • Will this greatly improve how well the audio
    rendering is understood?
  • Will the users adapt to this?

42
This Presentation
  • Introduction
  • Audio Rendering of MathML
  • Mathematical Audio Rendering Tools Overview
  • ASTER
  • Introduction
  • Brief analysis on its audio rendering of LaTeX
  • MathPlayer 2.0
  • Introduction
  • Brief analysis on its audio rendering of MathML
  • AudioMath
  • Final conclusions and remarks

43
ASTER: Introduction
  • ASTER (Audio System for Technical Readings) is an
    application that accepts TeX notation as input
    and produces audio rendering as output.
  • Developed by T.V. Raman in 1994 during his PhD.
  • Uses an Emacs front-end (Linux).
  • ASTER has 3 main components
  • LaTeX parser: creates an internal representation
    that is easier for the program to manipulate.
  • AFL (Audio Formatting Language): used to render
    the parsed text using speech and other sounds.
  • Browser: used to help the audio rendering.

45
ASTER: Brief analysis on its audio rendering of
LaTeX (1/2)
  • Demos from the ASTER site:
    http://www.cs.cornell.edu/Info/People/raman/aster/demo.html

46
ASTER: Brief analysis on its audio rendering of
LaTeX (2/2)
  • A few comments
  • ASTER was a breakthrough. T.V. Raman's work is
    considered a bible in mathematics audio
    rendering.
  • It supports a large number of mathematical
    formulae in LaTeX.
  • No math formulae navigation support, so it gets
    complicated with complex math expressions.
    However, ASTER uses a variable substitution
    process to address this problem, and complex
    expressions can be divided into subexpressions at
    the user's request.

48
MathPlayer 2.0: Introduction
  • MathPlayer is a mathematics display engine for
    Microsoft's Internet Explorer 6.0, developed by
    Design Science.
  • Takes MathML Presentation markup as input and
    produces visual rendering (versions 1.0 and 2.0)
    and audio rendering (only in version 2.0,
    released in 2004) as output.
  • Download:
    http://www.dessci.com/en/products/mathplayer/download.htm

50
MathPlayer 2.0: Brief analysis on its audio
rendering of MathML (1/2)
  • Demos created using MathType, Microsoft Word,
    MathPlayer and the Microsoft Sam TTS engine
    (demomathplayer.htm)

51
MathPlayer 2.0: Brief analysis on its audio
rendering of MathML (2/2)
  • A few comments
  • The table problem: why not detect that it's a
    matrix?
  • The lack of keywords such as "begin <operator>"
    and "end <operator>" can result in ambiguous
    readings.
  • No math formulae navigation support. Once again,
    it gets complicated with complex math
    expressions.
  • No usermode options provided.

52
This Presentation
  • Introduction
  • Audio Rendering of MathML
  • Mathematical Audio Rendering Tools Overview
  • AudioMath
  • The AudioMath Project
  • AudioMath Architecture
  • The AudioMath Process
  • Database of Vocabulary and Speech Rules
  • Current Status and Future Work
  • Final conclusions and remarks

53
The AudioMath Project (1/4)
  • Initiative of the Laboratory of Speech
    Processing, Signals and Instrumentation of the
    Faculty of Engineering, University of Porto,
    Portugal.
  • This laboratory is dedicated to the research and
    development of Voice User Interfaces and Speech
    Technology for Accessibility Solutions.
  • Aims
  • To build a tool (AudioMath) that operates either
    standalone or integrated in a speech interface
    (TTS, text-to-speech) and that performs
  • Mathematics Audio Rendering
  • Parsing, interpretation and conversion of MathML
    into plain text format
  • Generation of the appropriate prosodic contour
    for reading the math formula's text
  • An intra-formula browsing device (navigation)
  • Recognition and conversion of any other text or
    markup elements not directly understandable by
    the TTS engine (numerals, abbreviations, ...)

54
The AudioMath Project (2/4)
  • The rationale is
  • Once the formula is described in MathML or an
    equivalent technique, the basis is there to
    create the textual description of the
    mathematical formula and to read the resulting
    text with the best perceptual and communication
    effectiveness.
  • AudioMath is an ActiveX dynamic link library
    (dll) that can be used by any program through
    internal calls.
  • Developed in Perl for Windows 9x/Me/2K/XP.
    However, its main code can also be used in
    Linux/Unix (since it's in Perl...).

55
The AudioMath Project (3/4)
  • AudioMath's main applications
  • Reading technical and scientific documents
    online in an accessible way, with particular
    benefit for visually impaired people.
  • Teaching or learning how to read mathematical
    formulae.
  • Enhancing general accessibility to computer-based
    applications, when applied to a TTS engine.

56
The AudioMath Project (4/4)

58
AudioMath Architecture (1/2)
  • AudioMath has been built with a modular,
    extensible and configurable architecture.
  • AudioMath contains 6 major conversion modules
  • Numerals (conversion of several types of numeric
    forms)
  • Abbreviations (conversion of abbreviations in a
    text)
  • Acronyms (conversion of acronyms in the document)
  • Network References (conversion of IPs, emails and
    URI/URLs)
  • Mathematical (conversion of MathML expressions)
  • Auto-Discovery (the brain of the operation, which
    recognizes or identifies elements in the document
    and calls the respective conversion modules)

59
AudioMath Architecture (2/2)

60
This Presentation
  • Introduction
  • Audio Rendering of MathML
  • Mathematical Audio Rendering Tools Overview
  • AudioMath
  • The AudioMath Project
  • AudioMath Architecture
  • The AudioMath Process
  • Text Analysis
  • Parsing and Interpreting MathML
  • Converting and Speaking Mathematical Contents
  • Database of Vocabulary and Speech Rules
  • Current Status and Future Work
  • Final conclusions and remarks

61
Text Analysis (1/3)
  • There are several types of text in a document
  • Acronym
  • (ex.: UN: United Nations)
  • Abbreviation
  • (ex.: EQ.: equation; nº: number)
  • Numeral
  • (ex.: 1.2 1,3 XV 23º 1,333...)
  • Web Reference
  • (ex.: hfilipe_at_fe.up.pt)
  • Math expression
  • (ex.: <math><mi>x</mi><mo></mo><mn>3</mn></math>)
  • Special Unicode character or a math glyph.
  • (ex.: &eacute; represents é)

62
Text Analysis (2/3)
  • Steps to follow
  • 1. In the case of European Portuguese, convert
    all the Unicode into Latin1.
  • 2. Auto-discovery process, based on regular
    expressions, that recognizes the types of
    elements (numerals, acronyms, ...)
  • 3. Calls to the modules that convert the
    recognized elements into a full plaintext form.
  • 4. Go to 2 and repeat until everything is
    converted.
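
Steps 2 to 4 can be sketched as a loop over (pattern, converter) pairs. This is a Python illustration under stated assumptions, not the project's Perl code: the rule table, the single acronym entry and the digit-by-digit reading of numerals are hypothetical simplifications, and step 1 (the Latin1 conversion) is left out.

```python
import re

ACRONYMS = {"UN": "United Nations"}  # hypothetical one-entry database
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def convert_acronym(match):
    return ACRONYMS[match.group(0)]


def convert_numeral(match):
    # Simplification: read the numeral digit by digit.
    return " ".join(DIGITS[int(digit)] for digit in match.group(0))


# Step 2: each recognizable element type has its own regular expression.
RULES = [
    (re.compile(r"\b(?:" + "|".join(map(re.escape, ACRONYMS)) + r")\b"),
     convert_acronym),
    (re.compile(r"\d+"), convert_numeral),
]


def to_plain_text(text):
    changed = True
    while changed:  # step 4: repeat until nothing is left to convert
        changed = False
        for pattern, convert in RULES:
            # Step 3: hand the first recognized element to its module.
            text, count = pattern.subn(convert, text, count=1)
            changed = changed or count > 0
    return text


result = to_plain_text("The UN has 15 members")
```

The loop terminates because every converter emits plain words that none of the patterns match again.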

63
Text Analysis (3/3)
  • To speed up the process the document should be
    divided into blocks of text, splitting the MathML
    markup from the rest of the text.
  • Text processing is strongly based on regular
    expressions and databases (for acronyms and
    abbreviations).
  • Ex if (n m/(?(?\-\--\)?0-9)?\.,
    ?0-9\/igs) / it's a percentage number /
  • Dictionaries and databases were included inside
    the code as hash tables (more speed but less
    flexibility in updates).
  • Supports usermode options.
  • Ex.: the user prefers to hear the decimal part of
    decimal numbers spelled out
  • 1.25: "one point twenty five" or "one point two
    five"
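
The usermode option for decimals can be sketched as a single flag. The Python below is an illustration only: `speak_decimal` and its `spell_out_decimals` parameter are hypothetical names, English words stand in for the European Portuguese output, and numbers are limited to a whole part below 100 for brevity.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def small_number(n):
    """Words for 0 <= n <= 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + (" " + ONES[ones] if ones else "")


def speak_decimal(text, spell_out_decimals=False):
    whole, fraction = text.split(".")
    # Leading zeros or long decimal parts are always read digit by digit,
    # otherwise "1.05" would wrongly be read as "one point five".
    if spell_out_decimals or fraction.startswith("0") or len(fraction) > 2:
        fraction_words = " ".join(ONES[int(digit)] for digit in fraction)
    else:
        fraction_words = small_number(int(fraction))
    return f"{small_number(int(whole))} point {fraction_words}"


grouped = speak_decimal("1.25")
spelled = speak_decimal("1.25", spell_out_decimals=True)
```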

65
Parsing and Interpreting MathML (1/3)
  • MathML code is parsed using the XML::Parser
    module, which acts as a SAX-type (event-based)
    parser, supporting the ISO-8859-1 (Latin-1)
    encoding and discarding XML namespaces.
  • AudioMath works with MathML Presentation Markup,
    so relatively more effort and computation
    are needed in the interpretation of the
    mathematical expressions.
  • Currently, in AudioMath, interpreting MathML is
  • A process of raising flags each time a starting
    or ending tag is detected. This keeps track of
    the history of the markup and allows information
    to be retrieved, making it possible to understand
    the structure of the math expression and convert
    it.
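
The flag-raising idea can be sketched with any event-based parser. The Python below uses the standard `xml.sax` module as a stand-in for the Perl parser mentioned above; the `FlagHandler` class and the `trace` helper are hypothetical names, and recording the open-tag history as a slash-separated string is a choice made for this example.

```python
import xml.sax


class FlagHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.open_tags = []  # flags currently raised, outermost first
        self.events = []     # (markup history, token text) pairs

    def startElement(self, name, attrs):
        self.open_tags.append(name)  # raise a flag on a start tag

    def endElement(self, name):
        self.open_tags.pop()         # lower it on the matching end tag

    def characters(self, content):
        if content.strip():
            history = "/".join(self.open_tags)
            self.events.append((history, content.strip()))


def trace(markup):
    """Parse a MathML string and return each token with its tag history."""
    handler = FlagHandler()
    xml.sax.parseString(markup.encode("utf-8"), handler)
    return handler.events


events = trace(
    "<math><msqrt><mfrac><mi>a</mi><mi>b</mi></mfrac></msqrt></math>"
)
```

Each token arrives together with the list of elements that enclose it, which is the information needed to say, for instance, that `a` sits inside a fraction that is itself under a square root.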

66
Parsing and Interpreting MathML (2/3)
  • As the expression is discovered, the conversion
    process takes place by calling several
    algorithms as well as Unicode and MathML
    dictionaries.
  • AudioMath uses 2 kinds of dictionaries
  • MathML entities to Unicode
  • Ex.: &InvisibleTimes; is converted to U+02062
  • Unicode to European Portuguese full plaintext
    form
  • Ex.: U+02062 is converted into the Portuguese
    word "vezes" (times).
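
The two-stage lookup can be sketched with two small tables. This Python fragment is an illustration, not the project's Perl hash tables: only the InvisibleTimes entry comes from the slide, while the "+" entry ("mais", Portuguese for plus) and the `speak_tokens` helper are assumptions added for the example.

```python
import re

# Stage 1: MathML entity name to Unicode character.
ENTITY_TO_UNICODE = {"InvisibleTimes": "\u2062"}
# Stage 2: Unicode character to European Portuguese plain text.
UNICODE_TO_WORDS = {"\u2062": "vezes", "+": "mais"}

ENTITY = re.compile(r"&(\w+);")


def speak_tokens(tokens):
    """Convert a list of tokens, resolving entities through both tables."""
    words = []
    for token in tokens:
        match = ENTITY.fullmatch(token)
        if match:
            token = ENTITY_TO_UNICODE[match.group(1)]
        # Tokens without a dictionary entry are passed through unchanged.
        words.append(UNICODE_TO_WORDS.get(token, token))
    return " ".join(words)


product = speak_tokens(["x", "&InvisibleTimes;", "y"])
```

Splitting the lookup in two means the second table is the only part that depends on the spoken language.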

67
Parsing and Interpreting MathML (3/3)
  • The conversion to full plaintext form is done
    according to a database of vocabulary and speech
    rules.
  • (To be seen later.)
  • Once again, the bigger the expression, the more
    complex the interpretation and conversion process
    becomes. Navigational mechanisms to browse the
    math expression should be provided (work in
    progress).
  • This browsing must happen both externally
    (already seen in the topic Math Formulae
    Navigation) and internally (to infer the meaning
    of the mathematical contents).

69
Converting and Speaking Mathematical Contents (1/2)
  • Automatically speaking mathematical contents
    involves, besides the non-trivial issues of text
    generation and phrasing, generating the prosody
    to impose on the synthetic speech.
  • For example, consider two possible readings of
    the expression shown alongside:
  • "Square root of a squared plus b squared, end of
    radicand."
  • "Square root of power base a exponent two, end of
    power, plus power base b exponent two, end of
    power, end of radicand."
  • Which of these forms is more correct, less
    ambiguous and more efficient?
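
To make the trade-off concrete, both readings can be generated from one small expression tree. This is a minimal sketch, not AudioMath's conversion algorithm: the node classes, the function names and the wording rules are assumptions chosen so that the output matches the two readings quoted above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Var:
    name: str


@dataclass
class Power:
    base: "Expr"
    exponent: int


@dataclass
class Sum:
    left: "Expr"
    right: "Expr"


@dataclass
class Sqrt:
    radicand: "Expr"


Expr = Union[Var, Power, Sum, Sqrt]
NUMBER_WORDS = {2: "two", 3: "three"}  # tiny illustrative vocabulary


def read(expr: Expr, verbose: bool) -> str:
    """Return a lower-case reading of the expression tree."""
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, Power):
        base = read(expr.base, verbose)
        word = NUMBER_WORDS.get(expr.exponent, str(expr.exponent))
        if verbose:
            # Explicit form: every structure is opened and closed by name.
            return f"power base {base} exponent {word}, end of power"
        return f"{base} squared" if expr.exponent == 2 else f"{base} to the power {word}"
    if isinstance(expr, Sum):
        joiner = ", plus " if verbose else " plus "
        return read(expr.left, verbose) + joiner + read(expr.right, verbose)
    if isinstance(expr, Sqrt):
        return f"square root of {read(expr.radicand, verbose)}, end of radicand"
    raise TypeError(f"unsupported node: {expr!r}")


def speak(expr: Expr, verbose: bool = False) -> str:
    """Capitalize the first letter of the generated reading."""
    text = read(expr, verbose)
    return text[0].upper() + text[1:]


HYPOTENUSE = Sqrt(Sum(Power(Var("a"), 2), Power(Var("b"), 2)))
```

The short form is quicker to listen to but relies on convention and prosody to mark where the radicand ends; the explicit form is unambiguous at the cost of length.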

70
Converting and Speaking Mathematical Contents (2/2)
  • Try this experiment: read a math expression in
    a monotone to someone who is not looking at it,
    and ask for a written version after the
    dictation.
  • Result: if one is not careful enough, the
    expressions all sound much alike and quite
    ambiguous, so identifying them is difficult.
  • The solution must rely on formal ways of text
    generation that preserve the structural
    information of the formula, combined with the
    right prosodic information (f0 contour,
    pauses, ...).

71
This Presentation
  • Introduction
  • Audio Rendering of MathML
  • Mathematical Audio Rendering Tools Overview
  • AudioMath
  • The AudioMath Project
  • AudioMath Architecture
  • The AudioMath Process
  • Database of Vocabulary and Speech Rules
  • Current Status and Future Work
  • Final conclusions and remarks

72
Database of Vocabulary and Speech Rules (1/5)
  • One of AudioMath's current tasks is the
    definition of a database of vocabulary and speech
    rules for several subsets of math formulae.
  • These rules and the intonation are applied at
    conversion time by tagging the text with prosodic
    marks that instruct the TTS engine to produce
    the required pauses and f0 modulations.
  • A math corpus is being defined and categorized by
    operation type: algebra, trigonometry,
    integrals, ...
  • Each type has its structure analyzed and defined
    in a formal plaintext way.
  • Different readings of the same formula are spoken
    and analyzed for prosody and for how well they
    are perceived.
  • Pitch patterns and pauses are inferred from
    speech.
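
One possible way to tag text with prosodic marks is to turn pause markers into SSML break elements. The slides do not say which markup AudioMath's TTS engine accepts, so the use of SSML, the pause durations and the function name to_ssml are all assumptions made for this sketch.

```python
from xml.sax.saxutils import escape

# Assumed durations; the slides give no timing values.
PAUSE_MS = {"(pause)": 300, "(double pause)": 600}


def to_ssml(reading: str) -> str:
    """Replace textual pause markers with SSML <break/> elements."""
    text = escape(reading)  # protect &, < and > in the spoken text
    for marker, milliseconds in PAUSE_MS.items():
        text = text.replace(marker, f'<break time="{milliseconds}ms"/>')
    return f"<speak>{text}</speak>"
```

For example, to_ssml("Square root of (pause) a squared") yields a speak element in which the marker has become a 300 ms break. Pitch movements could be tagged in a similar way with the SSML prosody element.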

73
Database of Vocabulary and Speech Rules (2/5)
  • Examples of vocabularies:
  • Square root of (pause) a squared plus b squared
    (pause) end of radicand
  • Fraction (pause) on top (pause) minus b plus
    minus square root of (pause) b squared minus four
    times a times c (pause) end of radicand (double
    pause) on bottom two times a (pause) end of
    fraction
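
For reference, the two readings above appear to correspond to the following expressions. This is a reconstruction from the spoken forms, since the slide images are not part of this transcript:

```latex
\[
  \sqrt{a^{2} + b^{2}}
\]
\[
  \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
```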
74
Database of Vocabulary and Speech Rules (3/5)
  • Speech waveform example

75
Database of Vocabulary and Speech Rules (4/5)
  • Prosody conclusions inferred from speech
    waveforms (1/2)
  • There are 2 distinct types of pauses:
  • Large pause
  • Ex.: Square root of (pause) a squared plus b
    squared (pause) end of radicand
  • Short pause (optional)
  • Ex.: Square root of (pause) a squared (short
    pause) plus b squared (pause) end of radicand

76
Database of Vocabulary and Speech Rules (5/5)
  • Prosody conclusions inferred from speech
    waveforms (2/2)
  • The speaker's intonation shows rising and
    falling movements of f0 that classify the
    boundaries introduced by the pauses:
  • Rising tone: used when a lower hierarchical
    level starts. Ex.: "root of"
  • Falling tone: used when a level ends. Ex.: "b
    squared"
  • Falling and rising tone: used to mark the
    smaller separating pause. Ex.: "a squared"
  • Emphatic falling tone: used at the end of the
    expression. Ex.: "end of radicand"
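
These observations can be written down as a small annotated data structure. The sketch is purely illustrative: the Segment type, the label strings and the bracket notation are assumptions, and only the pairing of tones with phrases comes from the slide.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    text: str
    tone: str                   # f0 movement at the end of the segment
    pause_after: Optional[str]  # "large", "short" or None


# The running example, annotated with the tones listed on the slide.
SEGMENTS = [
    Segment("Square root of", "rising", "large"),
    Segment("a squared", "falling-rising", "short"),
    Segment("plus b squared", "falling", "large"),
    Segment("end of radicand", "emphatic falling", None),
]


def annotate(segments) -> str:
    """Render the segments as one line of text with prosodic labels."""
    parts = []
    for segment in segments:
        parts.append(f"[{segment.tone}] {segment.text}")
        if segment.pause_after:
            parts.append(f"({segment.pause_after} pause)")
    return " ".join(parts)
```

A representation like this keeps the wording, the pause type and the pitch movement together, so a later stage could translate each field into whatever marks a given TTS engine understands.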

78
Current Status and Future Work (1/3)
  • Current Status (1/2)
  • Unicode support (MathML and Unicode dictionaries
    for European Portuguese)
  • MathML Presentation Markup tags supported:
  • Token elements
  • <mo>, <mi>, <mn>, <mtext>, <mspace>, <ms>,
    <mglyph>
  • General Layout Schemata
  • <mrow>, <msqrt>, <mroot>, <mfrac>, <mstyle>,
    <merror>, <mphantom>, <mpadded>
  • Tables and Matrices
  • <mlabeledtr>, <maligngroup>, <malignmark>
  • Enlivening Expressions
  • <maction>, <none>
  • MathML Presentation Markup tags partially
    supported:
  • <msup>, <mfenced>, <menclose>, <mtable>, <mtr>,
    <mtd>

79
Current Status and Future Work (2/3)
  • Current Status (2/2)
  • AudioMath also detects and converts:
  • Numerals: cardinals, ordinals, decimals, Roman
    numerals, percentages, dates, times, scales,
    sports results, fractions, currency, powers,
    telephone numbers and engineering formats.
  • Abbreviations: social, currency, chemistry and
    physics (in the database).
  • Acronyms: several (in the database).
  • Network references: e-mail addresses, URLs/URIs
    and IP addresses.
  • Browser for functionality testing and TTS
    integration.
  • A few studies on mathematical prosody and on a
    database of speech rules for mathematics.
  • User-mode support (rendering preferences).
  • User evaluation is performed across the several
    iterations of AudioMath's development.
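
Detecting such tokens before converting them is often done with regular expressions. The patterns below are deliberately simplified assumptions and are not AudioMath's actual rules; for instance, the IP pattern does not check that each number is at most 255.

```python
import re

# Simplified, hypothetical patterns, tried in order.
PATTERNS = [
    ("email", re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")),
    ("url", re.compile(r"^https?://\S+$")),
    ("ip", re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")),
    ("percentage", re.compile(r"^\d+(\.\d+)?%$")),
]


def classify(token: str) -> str:
    """Return the first matching category, or 'word' if none matches."""
    for label, pattern in PATTERNS:
        if pattern.match(token):
            return label
    return "word"
```

Once a token is classified, a category-specific converter can expand it into words, for example reading an IP address number by number.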

80
Current Status and Future Work (3/3)
  • Future Work
  • Complete the support for MathML Presentation
    Markup.
  • Add support for MathML Content Markup.
  • Learn more about how to read mathematical
    formulae.
  • Develop modules that support HTML, XHTML, SSML
    and others.
  • Provide mechanisms for navigating inside
    mathematical formulae (eventually a special audio
    browser).
  • Add support for new languages (English,
    French, ...).
  • Extend the study of the prosody of reading
    mathematical formulae.
  • "We can only see a short distance ahead, but we
    can see plenty there that needs to be done."
    Alan Turing

81
This Presentation
  • Introduction
  • Audio Rendering of MathML
  • Mathematical Audio Rendering Tools Overview
  • AudioMath
  • Final conclusions and remarks
  • Final conclusions
  • Appendixes
  • Suggested Readings
  • Web References
  • Knowledge Requirements

82
Final Conclusions (1/2)
  • About speaking mathematics
  • Reading mathematics is very different from
    reading ordinary text.
  • Navigational mechanisms should be provided.
  • More studies on prosody are needed.
  • There are still no standards on how to read
    math.
  • About using MathML to speak mathematics
  • MathML is becoming a standard.
  • Content markup should be better suited to audio
    rendering; however, Presentation markup is more
    widely used.
  • If Presentation markup is used, more computation
    is needed to interpret it and transform it for
    audio rendering.
  • Not all elements and attributes of MathML need to
    be used in audio rendering.

83
Final Conclusions (2/2)
  • About AudioMath
  • It's an accessibility tool (audio rendering of
    MathML and of numerals, abbreviations, acronyms
    and network references).
  • Uses special dictionaries to adapt to several
    situations: MathML entities and Unicode entities.
  • Built on a modular, extensible and configurable
    architecture.
  • User-mode options are supported.
  • The AudioMath project studies mathematical
    prosody and builds a database of speech rules.

85
Suggested Readings (1/2)
  • AudioMath related
  • Ferreira, Helder et al. Enhancing the
    Accessibility of Mathematics for Blind People:
    The AudioMath Project. ICCHP04.
  • Ferreira, Helder et al. Audio Rendering of
    Mathematical Formulae using MathML and AudioMath.
    UI4ALL04.
  • Ferreira, Helder. Contribute to the automatic
    reading of scientific documents (Portuguese
    version only). Final year project, 2003.
  • Other publications
  • Gillian Douglas et al. Cognitive Analysis of
    Equation Reading: Application to the Development
    of the Math Genie. ICCHP04.
  • Fitzpatrick, D. et al. Multi-modal Mathematics:
    Conveying Math Using Synthetic Speech and Speech
    Recognition. ICCHP04.

86
Suggested Readings (2/2)
  • Stöger, B. et al. Mathematical Working
    Environment for the Blind: Motivation and Basic
    Ideas. ICCHP04.
  • Karshmer, A. et al. How well can we read
    equations to blind mathematics students: some
    answers from psychology. HCII03.
  • Rotard, M. et al. Access to Mathematical
    Expressions in MathML for the Blind. HCII03.
  • Stevens, R. Principles for the Design of Auditory
    Interfaces to Present Complex Information to
    Blind People. PhD thesis, 1996.
  • Raman, T.V. Audio System For Technical Readings.
    PhD dissertation, 1994.

88
Web References (1/2)
  • MathML, W3C
  • http://www.w3c.org/Math/
  • Accessible Mathematics
  • http://www.latrobe.edu.au/webaccess/maths.html
  • Guidelines for Topic Specific Accessibility
    (including Mathematics)
  • http://ncam.wgbh.org/salt/guidelines/sec11.html
  • Mathematical Access for Technology and Science
    for Visually Disabled People
  • http://www.cs.york.ac.uk/maths/

89
Web References (2/2)
  • Math Speak Project
  • http://www.rit.edu/easi/easisem/talkmath.htm
  • Math: Computerized, Spoken and Braille
  • http://www.rit.edu/%7Eeasi/math.htm
  • Mathematics for Computer Generated Spoken
    Documents
  • http://www.cs.cornell.edu/Info/People/raman/aster/demo.html
  • The AudioMath Project
  • http://lpf-esi.fe.up.pt/audiomath

91
Knowledge Requirements
  • To develop any application that speaks MathML,
    you might need:
  • Knowledge of XML and related technologies
  • Concepts in text processing, finite-state
    machines (FSM) and regular expressions
  • Vocabulary for the math expressions
  • TTS concepts
  • A prosody database
  • Lots of patience and hard work!

92
The end
  • Thank you for your attention!
  • Any questions?