External Tools Not Only for ArabTeX Documents

1
External Tools Not Only for ArabTeX Documents
  • Karel Mokry, Otakar Smrz
  • Faculty of Mathematics and Physics
  • Charles University in Prague

2
which include
  • ArabCode: nontrivial conversion between encoding
    standards of Arabic script
  • ArabSpell: rule-driven spelling system suited
    especially for vocalized Arabic encoded in
    ArabTeX notation
  • acolor.sty: package for control over coloring in
    ArabTeX and LaTeX typesetting systems

3
ArabTeX encoding concept
  • Lower ASCII, human-readable, rather phonetic
  • Algorithmic determination of several phenomena of
    Arabic script
  • Evaluation of context, parametric interpretation
  • Contemporary and historical orthography
  • <iqra' h_a_dA an-na.s.sa bi-intibAhiN>
  • versus
  • Aiqora>o haA lnaSa binotibaAhK

4
Ordinary graphemic approach
  • Unicode / Unicode Transformation Format (UTF)
    with great descriptive scope
  • U+0639 / 0xD8 0xB9 (Arabic ayn)
  • 0000 0110 0011 1001 / 1101 1000 1011 1001
  • U+004C / 0x4C (Latin L)
  • 0000 0000 0100 1100 / 0100 1100
  • Windows CP 1256, ISO 8859-6, ASMO 449 etc.
  • Buckwalter Transliteration using lower ASCII
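
The byte and bit patterns above can be reproduced with a short Python sketch. It relies only on the standard library codecs; the helper name `bits` is ours, and the legacy code-page bytes are simply whatever Python's bundled `cp1256` and `iso8859_6` tables produce.

```python
def bits(data: bytes) -> str:
    """Render bytes as space-separated 4-bit groups, as on the slide."""
    raw = "".join(f"{byte:08b}" for byte in data)
    return " ".join(raw[i:i + 4] for i in range(0, len(raw), 4))

ayn = "\u0639"      # ARABIC LETTER AIN
latin_l = "\u004c"  # LATIN CAPITAL LETTER L

for ch in (ayn, latin_l):
    code_unit = ch.encode("utf-16-be")  # the 16-bit code point value
    utf8 = ch.encode("utf-8")           # the UTF-8 byte sequence
    print(f"U+{ord(ch):04X}", bits(code_unit), "/", bits(utf8))

# The same letter in two legacy single-byte encodings
for codec in ("cp1256", "iso8859_6"):
    print(codec, ayn.encode(codec).hex())
```

The sketch shows why Unicode and UTF count as the graphemic approach: each letter is a fixed code, and only the byte-level packaging differs between encodings.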

5
ArabCode solution
  • Set of subroutines and scripts in Perl
  • Complex: ArabTeX to UTF / Unicode
  • Documented: between Unicode and UTF
  • Quite easy: among UTF / Unicode, Windows, ISO, ASMO,
    Buckwalter, etc.
  • Currently: ArabTeX to Windows, and among Windows, UTF,
    ISO, ASMO, Buckwalter
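
As an illustration of the "quite easy" class of conversions, here is a minimal Python sketch of a character-for-character mapping between Buckwalter transliteration and Unicode. It is not ArabCode itself (which is written in Perl); the table is deliberately partial and the function names are hypothetical.

```python
# Partial Buckwalter table: ASCII symbol -> Unicode character.
BUCKWALTER = {
    "'": "\u0621",  # hamza
    "A": "\u0627",  # alif
    "b": "\u0628",  # ba
    "t": "\u062a",  # ta
    "E": "\u0639",  # ayn
    "k": "\u0643",  # kaf
    "l": "\u0644",  # lam
    "m": "\u0645",  # mim
    "n": "\u0646",  # nun
    "a": "\u064e",  # fatha
    "u": "\u064f",  # damma
    "i": "\u0650",  # kasra
    "~": "\u0651",  # shadda
    "o": "\u0652",  # sukun
}
REVERSE = {uni: ascii_ for ascii_, uni in BUCKWALTER.items()}

def buckwalter_to_unicode(text: str) -> str:
    """Map each symbol independently; unknown symbols pass through."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in text)

def unicode_to_buckwalter(text: str) -> str:
    """Inverse mapping, again one character at a time."""
    return "".join(REVERSE.get(ch, ch) for ch in text)

word = buckwalter_to_unicode("kitAb")
print(word, word.encode("cp1256").hex())  # Unicode string, then Windows bytes
print(unicode_to_buckwalter(word))
```

Because every symbol maps to exactly one character regardless of its neighbours, no context needs to be evaluated. That is what separates these conversions from the ArabTeX case.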

6
ArabCode method
  • Considering the problem of converting ArabTeX to
    UTF / Unicode
  • Present
      • Regular expressions: system tool, fast and safe
      • Rules hard-wired in the code: hard to maintain,
        inflexible
  • Future
      • Finite-state transducer: most adequate, but use of
        own implementation may slow computation down
      • External grammar: clear and extensible rules
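
The contrast between hard-wired rules and an external rule set can be sketched in Python. This is an assumption-laden toy, not the ArabCode implementation: the rule table covers only a handful of ArabTeX notations, the names `RULES`, `compile_rules` and `convert` are hypothetical, and none of the context evaluation that makes the real conversion complex is attempted.

```python
import re

# Toy rule table, ArabTeX notation -> Unicode. It is plain data, so it
# could equally be loaded from an external file instead of the source code.
RULES = {
    "_d": "\u0630",  # dhal
    "^s": "\u0634",  # shin
    ".s": "\u0635",  # sad
    "b": "\u0628",   # ba
    "d": "\u062f",   # dal
    "r": "\u0631",   # ra
    "s": "\u0633",   # sin
    "h": "\u0647",   # ha
    "a": "\u064e",   # fatha
    "u": "\u064f",   # damma
    "i": "\u0650",   # kasra
}

def compile_rules(rules):
    """Build one alternation, longest notation first, so a longer
    notation is never shadowed by a shorter one that is its prefix."""
    keys = sorted(rules, key=len, reverse=True)
    return re.compile("|".join(re.escape(key) for key in keys))

def convert(text, rules=RULES):
    """Rewrite every recognized notation; other text is left untouched."""
    pattern = compile_rules(rules)
    return pattern.sub(lambda match: rules[match.group(0)], text)

print(convert("_dahaba"))
```

Adding a letter means adding one table entry rather than editing the matching code, which is the maintainability argument made for the external grammar.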

7
ArabSpell motivation
  • Spell-checking of entries of human-edited lexical
    database
  • Supervision over misuse of notation, document
    consistency requirement
  • Trial-and-error way of teaching it
  • One version already applied to educational
    purpose documents and a book of Arabic proverbs

8
ArabSpell novel concept
  • Separation of the definition of the language and
    the response from the spell-checking engine
  • Right Linear Grammar and convenient syntax
  • source <code> <text> target <text>
  • Nondeterministic Finite Automaton and its
    construction from the grammar
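
The construction of a nondeterministic automaton from a right linear grammar can be sketched as follows. This is a generic textbook construction in Python, not ArabSpell's own code or syntax: the grammar encoding (a dict of `(terminal, next_nonterminal)` pairs, with `None` meaning the rule ends) and the toy syllable grammar over the classes `C` and `V` are assumptions made for illustration.

```python
FINAL = "<final>"  # extra accepting state for rules of the form A -> t

def build_nfa(grammar):
    """Each rule A -> t B becomes a transition A --t--> B;
    each rule A -> t becomes a transition A --t--> FINAL."""
    transitions = {}
    for lhs, productions in grammar.items():
        for terminal, rhs in productions:
            target = FINAL if rhs is None else rhs
            transitions.setdefault((lhs, terminal), set()).add(target)
    return transitions

def accepts(grammar, start, tokens):
    """Simulate the NFA by tracking the set of reachable states."""
    transitions = build_nfa(grammar)
    states = {start}
    for token in tokens:
        states = {nxt for state in states
                  for nxt in transitions.get((state, token), ())}
        if not states:
            return False
    return FINAL in states

# Toy grammar: a word is a sequence of CV or CVC syllables.
SYLLABLES = {
    "syllable": [("C", "nucleus")],
    "nucleus": [("V", "syllable"), ("V", "coda"), ("V", None)],
    "coda": [("C", "syllable"), ("C", None)],
}

print(accepts(SYLLABLES, "syllable", "CVCCV"))
print(accepts(SYLLABLES, "syllable", "CVCC"))
```

Nonterminals map directly onto automaton states, which is why the grammar can stay separate from the engine that runs it.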

[Diagram: automaton fragment with labels t, x, e, source, target, <code>]

9
Grammar of Arabic syllable
  • Nonterm generative rules
  • syllable < "Unruly input!" >
    CVCemptysyllable CVCempty
    Cending
  • Cluster definition rules
  • C <> <'> <b> <t> <_t> <^g> <.h> <_h> <d>
    <_d> <r> <z> <s> <^s> <.s> <.d> <.t> <.z> <`>
    <.g> <f> <q> <k> <l> <m> <n> <h> <w> <y>
  • V <> <a> <i> <u> <A> <I> <U> <>

10
continuation
  • <_a> < "Dagger 'alif occurred." >
    <aa> < "Use <A> instead!" >
    <iy> < "Use <I> instead!" >
    <uw> < "Use <U> instead!" >
  • ending < "Invalid ending?" > <uN> <iN>
    <aN> <aNY> <Y> <> <aNA> <UA> <aW> <aWA>
    < "Silent 'alif enforced." >
  • empty <> <> see Cempty above
  • Multi-functionality of the <> operator
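
The idea of attaching a response to a piece of notation can be sketched in Python using the four vowel messages quoted in the grammar. This is a naive substring scan under our own assumptions, not the grammar-driven automaton: it ignores syllable context (for instance, it would also flag `iy` where `y` is a genuine consonant), and the names `RESPONSES` and `check` are hypothetical.

```python
import re

# Notation -> response, taken from the vowel rules of the grammar.
RESPONSES = [
    ("_a", "Dagger 'alif occurred."),
    ("aa", "Use <A> instead!"),
    ("iy", "Use <I> instead!"),
    ("uw", "Use <U> instead!"),
]

def check(word):
    """Return (position, notation, response) for every occurrence found."""
    reports = []
    for notation, response in RESPONSES:
        for match in re.finditer(re.escape(notation), word):
            reports.append((match.start(), notation, response))
    return sorted(reports)

for position, notation, response in check("kitaab"):
    print(position, notation, response)
```

Keeping the messages in a table separate from the scanning loop mirrors, in miniature, the separation of language definition and response from the spell-checking engine.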

11
ArabSpell features
  • Clusters enable eminent network optimization
  • Spelling < Perl subroutines > extend the class
    of languages beyond regular ones
  • Bracket matching, word repetition
  • Control over long-distance dependencies
  • Easy counting, e.g. word and sentence length
  • Reports in different language versions
  • Detailed yet flexible grammar for Arabic, models
    of other formalizable languages
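
Two of the checks named above, bracket matching and word repetition, show why procedural hooks reach beyond regular languages: arbitrarily nested brackets need a stack, which a finite automaton does not have. The sketch below is in Python rather than the Perl subroutines ArabSpell uses, and both function names are hypothetical.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())

def brackets_balanced(text):
    """Check nesting with a stack of currently open brackets."""
    stack = []
    for ch in text:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack

def repeated_words(words):
    """Return the indices of words identical to their predecessor."""
    return [i for i in range(1, len(words)) if words[i] == words[i - 1]]

print(brackets_balanced("(a [b] c)"))
print(repeated_words("read this this text".split()))
```

Counting word or sentence length fits the same pattern: the hook keeps a little state of its own while the automaton handles the regular part of the language.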

12
Using acolor.sty
  • Typesetting Arabic script in color with ArabTeX
  • Text marking, hide-and-check of diacritics
  • Primers, textbooks, educational purposes
  • Coloring commands combined with original ArabTeX
    vocalization control
  • No modification of the input data themselves

13
for any diacritics
  • \coldia{red} \fullvocalize \accentshigh
  • \nocolshadda \colother{blue} \vocalize
  • \nocolall \colhamza{green} \vocalize

14
for other marking
  • \nocolall \colbeginning{blue} \novocalize
  • \nocolall \colshadda{white} \novocalize
  • \colisolated{red} \vocalize \accentslow

15
Acknowledgement
  • Arabic script displays in this presentation were
    typeset using the ArabTeX package for TeX and
    LaTeX by Prof. Dr. Klaus Lagally of the
    University of Stuttgart. The existence of this
    system was the principal inspiration for our work.