1
Dynamic Glyph Generation
  • Based on variable length encoding schema

Yap Cheah Shen, eForth Technology
Glyph Typesetting Workshop, Kyoto, 29 Nov 2003
2
Outline of Presentation
  • Morpheme: Latin vs. Han
  • Latin text encoding
  • Missing characters in Chinese text
  • Solution
  • Implementation details
  • Glyph decomposition database
  • Topological conversion of strokes
  • Automatic frame calculation
  • Integrating into existing OS
  • Other issues

3
Morpheme: Latin vs. Han
  • A morpheme is the smallest meaningful unit in a
    language.
  • For Latin text, it is the word.
  • For Chinese text, it is the Hanzi or Kanji.
  • Because morphemes represent real-world ideas, they
    keep changing over time.
  • Morphemes form an open set.

4
Latin Text Encoding
  • The letters of the alphabet form a fixed set of
    symbols.
  • All words can be represented as sequences of
    letters.
  • Letters are the ideal encoding units for Latin
    text, e.g., ASCII.
  • No missing-word encoding problem.

5
Missing Characters in Chinese Text
  • Not all existing Hanzi are encoded.
  • Hanzi form an open set: theoretically,
    historically, and practically.
  • Existing encoding schemas rest on wrong
    assumptions and designs.
  • Unending loop of assigning code points, OS
    updates, new fonts, and new input method tables.
    Industry is happy (users suffer).

6
Solution-1
  • Parts or components as encoding units.
  • ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
  • Most characters can be represented by a finite
    set of basic parts.
  • Strokes are used to construct rarely used parts
    (thousands of parts appear only once or twice).

7
Solution-2
  • A closed set of basic parts and strokes as
    encoding units.
  • 3 joining operators: horizontal, vertical, and
    enclosing.
  • 1 shielding operator for hiding a stroke.
  • Prefix notation allows recursive composition.
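
The recursive composition described above can be sketched as a small prefix-notation parser. This is a minimal illustration, not the presenter's implementation: it assumes the Unicode Ideographic Description Characters ⿰, ⿱ and ⿴ stand in for the three joining operators, it omits the shielding operator, and the names `Node`, `parse` and `OPERATORS` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Assumption: Unicode Ideographic Description Characters stand in for the
# three joining operators named on the slide. The shielding operator is omitted.
OPERATORS = {
    "\u2FF0": "horizontal",  # ⿰ left part, then right part
    "\u2FF1": "vertical",    # ⿱ top part, then bottom part
    "\u2FF4": "enclosing",   # ⿴ outer part, then inner part
}


@dataclass
class Node:
    """One joining operator applied to two sub-expressions."""
    op: str
    first: "Tree"
    second: "Tree"


Tree = Union[str, Node]  # a leaf is a single part or stroke character


def _parse(seq: str, pos: int) -> Tuple[Tree, int]:
    """Parse one expression starting at pos; return it and the next position."""
    if pos >= len(seq):
        raise ValueError("sequence ended before the expression was complete")
    ch = seq[pos]
    if ch in OPERATORS:
        # Prefix notation: the operator comes first, then its two operands,
        # each of which may itself be an operator expression.
        first, pos = _parse(seq, pos + 1)
        second, pos = _parse(seq, pos)
        return Node(OPERATORS[ch], first, second), pos
    return ch, pos + 1


def parse(seq: str) -> Tree:
    """Parse a complete glyph description sequence into a tree."""
    tree, end = _parse(seq, 0)
    if end != len(seq):
        raise ValueError("trailing characters after a complete expression")
    return tree


# 萌 described as: vertical(艹, horizontal(日, 月))
tree = parse("\u2FF1艹\u2FF0日月")
```

Because each operator takes exactly two operands, no brackets are needed: the nesting is recovered from the order of the characters alone.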

8
Solution-3
  • Ordinary CJK fixed-length encoding schema: a
    numeric value as the character code.
  • Input method table
      • Converts input keystrokes to character codes.
  • Static font file
      • Glyph data is pre-designed.
      • Glyph data is accessed by character code.
  • Text file
      • A sequence of character codes.

9
Solution-4
  • Additional features of a variable-length encoding
    CJK environment.
  • Input
      • Characters can be sorted and filtered by parts.
      • Compatible with any existing input method.
  • Display
      • The font file stores commonly used characters
        and parts.
      • Glyphs are generated on the fly from a glyph
        descriptive sequence.
  • Storage and data exchange
      • Compatible with Unicode.
      • Ideographic description sequences.

10
Dynamic Glyph Generator
  • Input
      • Various types of variable-length descriptive
        character code sequences:
      • ??? of Academia Sinica
      • ??? of CBETA
      • Unicode ideographic description characters
  • Output (display, print)
      • TrueType-compatible outline
      • Rasterized bitmap
      • Macromedia Flash, SVG
  • The task: a layout problem, fitting a
    1-dimensional sequence into a 2-dimensional square.
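
To make the layout task concrete, here is a minimal sketch that assigns a rectangular frame to every part of an already-parsed description. It is an assumption-laden illustration rather than the generator's algorithm: trees are written as nested tuples `(op, first, second)` with hypothetical operator codes `"H"`, `"V"` and `"E"`, space is always split in equal halves, the enclosed part gets a centred half-size box, and the y axis points downward.

```python
def layout(tree, frame=(0.0, 0.0, 1.0, 1.0)):
    """Return a list of (part, (x, y, width, height)) for every leaf.

    tree  -- a single-character string (leaf) or a tuple (op, first, second)
    frame -- the box available to this subtree, in unit-square coordinates
    """
    x, y, w, h = frame
    if isinstance(tree, str):
        return [(tree, frame)]
    op, first, second = tree
    if op == "H":    # horizontal join: left half, right half
        f1, f2 = (x, y, w / 2, h), (x + w / 2, y, w / 2, h)
    elif op == "V":  # vertical join: top half, bottom half
        f1, f2 = (x, y, w, h / 2), (x, y + h / 2, w, h / 2)
    elif op == "E":  # enclosing: outer part keeps the frame, inner part is centred
        f1, f2 = frame, (x + w / 4, y + h / 4, w / 2, h / 2)
    else:
        raise ValueError(f"unknown operator: {op!r}")
    return layout(first, f1) + layout(second, f2)


# 萌 as vertical(艹, horizontal(日, 月))
frames = layout(("V", "艹", ("H", "日", "月")))
```

Equal halves are the simplest possible rule and usually look wrong for real characters, which is why the proportion of each part needs to be estimated separately.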

11
Implementation-1
  • The system consists of 3 major parts:
  • Glyph decomposition database
      • Courtesy of Prof. Hsieh from Academia Sinica,
        Taiwan. http://www.sinica.edu.tw/cdp/
  • Outline of strokes and components
      • Beijing ZhongYi Co., professional outline font
        vendor. http://www.zhongyicts.com.cn/
  • The eForth system, putting everything together:
    hardware-software co-engineering.

12
Implementation-2
  • Glyph decomposition database
      • All CJK glyphs defined by Unicode 4.0, about
        71000 in total.
      • 549 basic parts; stroke sequences are preserved.
      • 3996 parts in total.
      • Total part frequency: 165122.
  • Accumulated frequency (share of the total):
      • Top 50 parts: 51389 (31%)
      • Top 200 parts: 87381 (53%)
      • Top 1000 parts: 129393 (78%)
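
The last figure in each row appears to be the cumulative share in percent of the total part frequency. A quick check using only the numbers on the slide (the variable names are illustrative):

```python
TOTAL_FREQUENCY = 165122

# Cumulative frequency of the N most frequent parts, as given on the slide.
cumulative = {50: 51389, 200: 87381, 1000: 129393}

# Share of all part occurrences covered by the top N parts, in whole percent.
shares = {
    top_n: round(100 * count / TOTAL_FREQUENCY)
    for top_n, count in cumulative.items()
}

for top_n, percent in shares.items():
    print(f"Top {top_n}: {percent}%")
```

The rounded shares match the slide: a quarter of the 3996 parts cover roughly 78% of all part occurrences.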

13
Implementation-3
  • Strokes are described as an outline with a
    skeletal line.
  • Both the outline and the skeletal line are
    quadratic Bézier curves.
  • Outline points are recalculated according to the
    scaled skeletal line.
  • Result
      • Stroke data is highly reusable.
      • Stroke weights are adjustable.
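
One way to picture why a skeleton makes stroke data reusable: scale only the skeletal curve to fit the target frame, then rebuild the outline around it at the desired weight. The sketch below is an assumption about the general idea, not the system's actual method. It approximates the outline with a polygon offset along the curve normal instead of producing Bézier outline segments, and the function names are hypothetical.

```python
import math


def bezier_point(p0, p1, p2, t):
    """Point on a quadratic Bézier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    return (u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0],
            u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1])


def bezier_tangent(p0, p1, p2, t):
    """First derivative of the quadratic Bézier curve at parameter t."""
    u = 1.0 - t
    return (2 * u * (p1[0] - p0[0]) + 2 * t * (p2[0] - p1[0]),
            2 * u * (p1[1] - p0[1]) + 2 * t * (p2[1] - p1[1]))


def scale_skeleton(skeleton, sx, sy):
    """Scale the three skeleton control points independently in x and y."""
    return [(x * sx, y * sy) for x, y in skeleton]


def outline_polygon(skeleton, weight, samples=8):
    """Closed polygon of the given stroke weight around the skeleton."""
    p0, p1, p2 = skeleton
    left, right = [], []
    for i in range(samples + 1):
        t = i / samples
        x, y = bezier_point(p0, p1, p2, t)
        dx, dy = bezier_tangent(p0, p1, p2, t)
        length = math.hypot(dx, dy)
        if length == 0.0:
            raise ValueError("degenerate skeleton: zero-length tangent")
        nx, ny = -dy / length, dx / length  # unit normal to the curve
        half = weight / 2.0
        left.append((x + nx * half, y + ny * half))
        right.append((x - nx * half, y - ny * half))
    return left + right[::-1]


# The same stroke stretched to three times its width keeps its weight.
stroke = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
wide = outline_polygon(scale_skeleton(stroke, 3.0, 1.0), weight=2.0, samples=2)
```

Scaling an outline directly would also scale its thickness; recomputing the outline from the scaled skeleton keeps the weight constant and lets it be changed independently.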

14
Implementation-4
  • Automatic frame calculation
      • An algorithm estimates the complexity of each
        part to decide its proportion in the resulting
        glyph.
      • ? ?25, ? 70 , roughly.
      • ? ? 55, ? 40, roughly.
  • Result
      • Clear glyph descriptive expressions
      • Search engine friendly
      • Human readable
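
The slide does not say how complexity is measured. As a hedged sketch, suppose it is a single number per part (stroke count would be one plausible choice) and that the frame is divided in proportion to it, with a floor so that a very simple part never becomes unreadably thin. The function name, the linear rule and the `min_share` value are all assumptions.

```python
def split_proportions(complexity_a, complexity_b, min_share=0.25):
    """Divide one dimension of a frame between two joined parts.

    Returns (share_a, share_b), which always sum to 1. Each share is
    proportional to the part's complexity but never below min_share.
    """
    if not 0.0 < min_share <= 0.5:
        raise ValueError("min_share must be in (0, 0.5]")
    total = complexity_a + complexity_b
    if complexity_a < 0 or complexity_b < 0 or total == 0:
        raise ValueError("complexities must be non-negative and not both zero")
    share_a = complexity_a / total
    # Clamp so that neither part falls below the minimum share.
    share_a = min(max(share_a, min_share), 1.0 - min_share)
    return share_a, 1.0 - share_a


# A 3-stroke part joined with a 9-stroke part gets a quarter of the frame.
narrow, wide = split_proportions(3, 9)
```

The shares returned here would replace the fixed halves in a simple layout pass, giving the more complex part more room.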

15
Integrating into existing OS/GUI
  • String manipulation library
      • Number of characters: -1 for each operator, +1
        for each character.
      • Character width
  • Graphic sub-system
      • Drawing a text line (e.g. ExtTextOut)
  • Text handling widgets
      • Awareness of glyph expressions for caret,
        selection, and delete/backspace.
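
The counting rule and the caret requirement can be illustrated together. The sketch assumes every operator is binary and uses three Unicode Ideographic Description Characters as stand-ins; `count_characters` and `split_glyph_units` are hypothetical names, not part of any existing string library.

```python
# Assumption: all joining operators are binary, represented here by ⿰, ⿱, ⿴.
BINARY_OPERATORS = {"\u2FF0", "\u2FF1", "\u2FF4"}


def count_characters(text):
    """Visible character count: -1 per operator, +1 per other character."""
    return sum(-1 if ch in BINARY_OPERATORS else 1 for ch in text)


def split_glyph_units(text):
    """Split text into units that a caret should treat as one character."""
    units = []
    start = 0
    needed = 1  # operands still required to complete the current unit
    for i, ch in enumerate(text):
        # An operator fills one operand slot but opens two new ones.
        needed += 1 if ch in BINARY_OPERATORS else -1
        if needed == 0:
            units.append(text[start:i + 1])
            start, needed = i + 1, 1
    if start != len(text):
        raise ValueError("incomplete glyph expression at end of text")
    return units


text = "A\u2FF1艹\u2FF0日月B"
visible = count_characters(text)   # 7 code points, 3 visible characters
units = split_glyph_units(text)    # caret stops fall between these units
```

A backspace handler built on `split_glyph_units` would delete the whole last unit rather than a single code point, so a generated glyph is never left half-described.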

16
Other Issues
  • Quality of the glyph
      • Trade-off with space: more part outlines,
        better quality.
  • Speed of generation
      • No problem for an IBM PC, since glyph
        generation is rare.
      • For handheld devices, hardware acceleration is
        recommended.

17
Examples
  • ? Vertical combination
  • ? Horizontal combination
  • ? enclosing
  • hide
  • ? ??? or ?????
  • ??? ?-5 hide 5th stroke
  • ?? ?-5
  • ?-4 U+20009

18
Thank You