Components of Multilingual Software development using Unicode

Slides: 20
Provided by: sha8156

1
Components of Multilingual Software development
using Unicode

2
Introduction
  • Unicode is the only character encoding standard
    that covers almost all characters in use today.
    All major operating systems and programming
    languages support Unicode or will support it in
    the near future. This presentation discusses
    Microsoft-based technologies for building
    multilingual applications.

3
Main Components
  • Unicode
  • Uniscribe
  • OpenType fonts

4
Basic flow

5
Unicode
  • The Unicode Standard is a character set for data
    interchange in plain text format. It contains no
    attributes regarding language, display format or
    rendering.
  • In order to avoid duplication of characters,
    Unicode encodes text by script, not by language.
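
To make "by script, not by language" concrete, here is a minimal Python sketch using the standard `unicodedata` module. The two sample words and the `code_points` helper are illustrative assumptions, not part of the original slides. It shows that an Arabic word and an Urdu word reuse the same code point for the letter beh.

```python
import unicodedata

# Sample words chosen for illustration only.
words = {
    "Arabic": "\u0628\u0627\u0628",  # bab
    "Urdu": "\u0628\u0627\u062A",    # baat
}


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


beh = "\u0628"
print(unicodedata.name(beh))  # the character name carries the script, not a language

for language, word in words.items():
    # Both languages share U+0628; there is no separate "Urdu beh".
    print(language, code_points(word), beh in word)
```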

6
Uniscribe
  • A rendering process is needed for displaying
    Unicode-based characters of complex scripts. This
    process maps characters to glyphs.
  • Options for processing complex scripts:
      • Text functions
      • Edit controls
      • Rich edit controls

7
OpenType Fonts
  • Glyphs
      • Encoded, Non-encoded
      • Simple, Ligature, Mark
  • OpenType Layout Services (OTLS) for Arabic
    script
      • Shaping features
      • Positioning features

8
OpenType Fonts (cont.)
  • Layout features within OpenType fonts are
    organized by script and language, allowing a
    single font to support multiple writing systems,
    even within the same script.
  • All features are applied to glyphs, not to
    characters.
  • When a character is typed, its character code
    is stored unchanged, and the glyphs displayed are
    chosen according to the layout features.
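
The separation between stored characters and displayed glyphs can be sketched as follows. This is a toy model under stated assumptions: the glyph names, the `CMAP` and `RLIG` tables, and the `to_glyphs` function are hypothetical and do not correspond to any real font or to the Uniscribe API.

```python
# Hypothetical character-to-glyph map; a real font defines its own glyph IDs.
CMAP = {"\u0644": "lam", "\u0627": "alef", "\u0645": "meem"}

# Hypothetical required-ligature table: a glyph pair replaced by one glyph.
RLIG = {("lam", "alef"): "lam_alef"}


def to_glyphs(text):
    """Map characters to glyphs, then apply ligature substitution on glyphs."""
    glyphs = [CMAP[ch] for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in RLIG:
            out.append(RLIG[pair])  # two glyphs become one ligature glyph
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


text = "\u0644\u0627"      # what the user typed: lam, alef
glyphs = to_glyphs(text)   # what gets drawn
print(len(text), glyphs)   # the text still holds two characters
```

The substitution works only on the glyph list, so searching, editing, and saving continue to see the original characters.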

9
Encoded Glyphs
  • A glyph for any character in Unicode.
  • One glyph may be mapped to more than one code
    point.
  • For Arabic-script fonts, only characters in the
    Arabic block (U+0600 to U+06FF) should be
    encoded. The Arabic presentation forms contain
    contextual shapes of characters and ligatures,
    which are handled instead by the shaping and
    positioning features.
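
One way to see why presentation forms need not be encoded separately: Unicode compatibility normalization (NFKC) folds them back onto letters in the Arabic block. The sketch below uses Python's standard `unicodedata` module; the `in_arabic_block` helper is an illustrative name, not something from the slides.

```python
import unicodedata

ARABIC_BLOCK = range(0x0600, 0x0700)  # U+0600 to U+06FF


def in_arabic_block(ch):
    """True if the character lies in the basic Arabic block."""
    return ord(ch) in ARABIC_BLOCK


beh_initial = "\uFE91"  # a presentation form: initial shape of beh
lam_alef = "\uFEFB"     # a presentation form: lam-alef ligature

for ch in (beh_initial, lam_alef):
    folded = unicodedata.normalize("NFKC", ch)
    print(
        unicodedata.name(ch),
        in_arabic_block(ch),
        [f"U+{ord(c):04X}" for c in folded],
    )
```

Both presentation forms lie outside the Arabic block, and both reduce to plain letters inside it, so a font can encode only the plain letters and leave shapes and ligatures to its layout features.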

10
Non-Encoded Glyphs
  • Contextual shapes of all desired characters.
  • Ligatures

11
Other Glyphs
  • Simple glyphs
      • Glyphs for single characters, such as letters
        and digits.
  • Ligature glyphs
      • Multi-character glyphs, normally a combination
        of characters.
  • Mark glyphs
      • Zero-width glyphs such as airab (diacritical
        marks).
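
As a rough illustration of the simple versus mark distinction, the sketch below uses the Unicode general category from Python's `unicodedata` module as a stand-in for a font's own glyph classes. That substitution is an assumption made for the example; a real font records glyph classes in its own tables. The `is_mark` and `split_base_and_marks` helpers are hypothetical names.

```python
import unicodedata


def is_mark(ch):
    """Treat any character in a Unicode 'M' category as a mark."""
    return unicodedata.category(ch).startswith("M")


def split_base_and_marks(text):
    """Group each base character with the marks that follow it."""
    clusters = []
    for ch in text:
        if is_mark(ch) and clusters:
            clusters[-1] += ch  # a mark attaches to the preceding base
        else:
            clusters.append(ch)
    return clusters


text = "\u0628\u064E\u062A\u0651\u064E"  # beh+fatha, teh+shadda+fatha
clusters = split_base_and_marks(text)
print(len(text), len(clusters))

# Marks also carry a nonzero combining class, unlike base letters.
print(unicodedata.combining("\u064E"), unicodedata.combining("\u0628"))
```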

12
About OTLS
  • A font contains support for some scripts, each
    script contains support for some languages, and
    OTLS features are defined for each language.
  • Every script contains a default language entry.
    The features defined for this entry are used for
    any language of that script that has no entry of
    its own.
  • This lets a single font support languages that
    share the same glyphs but behave differently.
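
The script, language, and default-language hierarchy described above can be sketched as a nested lookup. The `FONT_LAYOUT` table and the `features_for` function are hypothetical, and the feature lists are invented for illustration; only the general shape (script, then language, with a default fallback) comes from the slides.

```python
# Hypothetical layout data for one font: script tag -> language tag -> features.
FONT_LAYOUT = {
    "arab": {
        "dflt": ["ccmp", "isol", "init", "medi", "fina", "rlig", "mark"],
        "URD ": ["ccmp", "isol", "init", "medi", "fina", "rlig", "mark", "mkmk"],
    },
    "latn": {
        "dflt": ["kern"],
    },
}


def features_for(script, language):
    """Return the features for a language, falling back to the script default."""
    languages = FONT_LAYOUT.get(script)
    if languages is None:
        return []  # the font does not support this script at all
    return languages.get(language, languages.get("dflt", []))


print(features_for("arab", "URD "))  # Urdu has its own entry
print(features_for("arab", "FAR "))  # no entry, so the default is used
print(features_for("cyrl", "dflt"))  # unsupported script
```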

13
Shaping features
  • ccmp (Character composition and decomposition)
  • isol (Isolated character form substitution)
  • fina (Final character form substitution)

14
Shaping features (cont.)
  • medi (Medial character form substitution)
  • init (Initial character form substitution)
  • rlig (Required ligature substitution)
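
How a shaping engine might choose among isol, init, medi, and fina can be sketched from the joining behavior of neighboring letters. This is a simplified toy under stated assumptions: the `DUAL` and `RIGHT` sets cover only a handful of letters, anything else is treated as non-joining, and the `forms` function is a hypothetical name rather than any real API.

```python
# Letters that join on both sides (a small illustrative subset).
DUAL = set("\u0628\u062A\u0633\u0639\u0643\u0644\u0645\u0646\u0647\u064A")
# Letters that join only to the preceding letter (alef, dal, reh, waw).
RIGHT = set("\u0627\u062F\u0631\u0648")
JOINING = DUAL | RIGHT


def forms(word):
    """Return the positional feature tag chosen for each letter of word."""
    result = []
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # A letter connects backward only if the previous letter joins forward.
        joins_prev = prev in DUAL and ch in JOINING
        # A letter connects forward only if it is dual-joining itself.
        joins_next = ch in DUAL and nxt in JOINING
        if joins_prev and joins_next:
            result.append("medi")
        elif joins_prev:
            result.append("fina")
        elif joins_next:
            result.append("init")
        else:
            result.append("isol")
    return result


kitab = "\u0643\u062A\u0627\u0628"  # kaf, teh, alef, beh
print(forms(kitab))
```

In this example the final beh comes out isolated because the alef before it does not join forward.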

15
Positioning features
  • curs (Cursive positioning)
  • kern (Kerning)
  • mark (Mark to base positioning)
  • mkmk (Mark to mark positioning)
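
Mark positioning can be pictured as lining up an anchor point on the mark with an anchor point on the glyph beneath it. The sketch below is a toy with invented glyph names and coordinates in font units; the tables and the `mark_offset` and `stack_offset` functions are hypothetical and simplify what a real font stores.

```python
# Hypothetical attachment points on base glyphs, by mark class.
BASE_ANCHORS = {"beh": {"above": (300, 600), "below": (300, -250)}}

# Hypothetical marks: (class, anchor point on the mark itself).
MARK_ANCHORS = {
    "fatha": ("above", (100, 0)),
    "shadda": ("above", (100, 0)),
    "kasra": ("below", (100, 200)),
}

# Hypothetical points on a mark where a second mark may sit (mark to mark).
MARK_STACK_ANCHORS = {"shadda": (100, 250)}


def mark_offset(base, mark):
    """Mark to base: shift the mark so its anchor meets the base anchor."""
    mark_class, (mx, my) = MARK_ANCHORS[mark]
    bx, by = BASE_ANCHORS[base][mark_class]
    return (bx - mx, by - my)


def stack_offset(lower_pos, lower_mark, upper_mark):
    """Mark to mark: place upper_mark on the anchor of an already placed mark."""
    ax, ay = MARK_STACK_ANCHORS[lower_mark]
    _, (mx, my) = MARK_ANCHORS[upper_mark]
    return (lower_pos[0] + ax - mx, lower_pos[1] + ay - my)


shadda_pos = mark_offset("beh", "shadda")
fatha_pos = stack_offset(shadda_pos, "shadda", "fatha")
print(shadda_pos, fatha_pos, mark_offset("beh", "kasra"))
```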

16
Other features of OpenType
  • OpenType fonts can be embedded directly in a
    document:
      • Microsoft Word documents
      • HTML documents

17
Related technologies
  • Pango
      • An open-source framework for the layout and
        rendering of internationalized text for GTK
        and GNOME.
      • It supports OpenType fonts.

18
Related technologies (cont.)
  • Pango
