Creating Interfaces: Localization

Provided by: jeanine2

1
Creating Interfaces: Localization
  • Language and other issues
  • character codes
  • Homework: preparation for future topics

2
Finish presentations
  • Everyone post constructive comments on at least 2
    other projects.
  • (Note: catch up on other postings.)

3
Many, interconnected issues
  • Create web site for use in several specific
    'local' places.
  • Create multiple web sites, each for use in
    specific place.
  • Do either in an efficient, effective manner, so
    that any underlying common content does not need
    to be duplicated (and commonality diluted).
  • Develop tools (networking s/w, standards, etc.)
    that promote Web as "global, interoperable tool
    of communication"
  • www.w3c.org

4
Localization
  • not just language
  • language is not just character code
  • UCS (Universal Character Set) and Unicode: many,
    many related standards address encoding
    issues.
  • dates
  • local date and also way to express 'western' date
  • time
  • money
  • position on and flow across page
  • acceptable images, photography, icons
  • ?
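
To make the date and money bullets concrete, here is a minimal sketch in Python. The CONVENTIONS table and both function names are hypothetical and hand-written for illustration; a real site would take such rules from a locale database rather than from a table like this. The sketch only changes how the same date and the same number are written; it does not convert between currencies.

```python
from datetime import date

# Hypothetical, hand-written conventions for three places (illustration only).
CONVENTIONS = {
    "US": {"date": "%m/%d/%Y", "thousands": ",", "decimal": ".", "money": "${amount}"},
    "GB": {"date": "%d/%m/%Y", "thousands": ",", "decimal": ".", "money": "£{amount}"},
    "DE": {"date": "%d.%m.%Y", "thousands": ".", "decimal": ",", "money": "{amount} €"},
}


def format_date(day, place):
    """Write the same calendar day the way the given place expects it."""
    return day.strftime(CONVENTIONS[place]["date"])


def format_money(amount, place):
    """Write the same number with local separators and currency symbol."""
    rules = CONVENTIONS[place]
    whole, fraction = f"{amount:,.2f}".split(".")
    number = whole.replace(",", rules["thousands"]) + rules["decimal"] + fraction
    return rules["money"].format(amount=number)


if __name__ == "__main__":
    day = date(2024, 3, 5)
    for place in CONVENTIONS:
        print(place, format_date(day, place), format_money(1234.5, place))
```

Note that 03/05/2024 (US) and 05/03/2024 (GB) are the same day, which is why a site for several places cannot simply print one numeric date.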

5
Character code
  • Note: European languages plus several other
    'small' alphabets are easily handled.
  • We/I (typical monolingual American) can hardly
    appreciate the challenge
  • two Chinese (kanji) character sets: modern
    (China) and traditional (Taiwan and most of the
    Chinese diaspora)
  • 'ruby' symbols 'over' ideographs

6
  • http://www.cs.tut.fi/~jkorpela/chars.html#code
  • character repertoire: A set of distinct
    characters.
  • character code: A mapping, often presented in
    tabular form, which defines a one-to-one
    correspondence between characters in a character
    repertoire and a set of nonnegative integers.
  • character encoding: A method (algorithm) for
    presenting characters in digital form by mapping
    sequences of code numbers of characters into
    sequences of octets. In the simplest case, each
    character is mapped to an integer in the range
    0-255 according to a character code and these are
    used as such as octets. Naturally, this only
    works for character repertoires with at most 256
    characters. For larger sets, more complicated
    encodings are needed. Encodings have names, which
    can be registered.
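
The three definitions above can be tried out in a few lines of Python. This is only a sketch: the four-character repertoire and the helper names octets and encodable are made up for illustration, and Unicode code numbers (what ord() returns) stand in for "a character code".

```python
# Character repertoire: a set of distinct characters (a tiny, made-up one).
repertoire = {"A", "é", "щ", "中"}

# Character code: each character corresponds to one nonnegative integer.
# ord() gives the Unicode code number of a character.
codes = {ch: ord(ch) for ch in sorted(repertoire)}


def octets(text, encoding):
    """Character encoding: turn code numbers into a sequence of octets."""
    return list(text.encode(encoding))


def encodable(text, encoding):
    """Report whether every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(codes)
    print(octets("é", "iso-8859-1"))     # simplest case: one octet per character
    print(octets("é", "utf-8"))          # same character, two octets
    print(octets("中", "utf-8"))         # three octets
    print(encodable("中", "iso-8859-1"))  # False: only 256 characters fit
```

The last line shows the "at most 256 characters" limit: a one-octet-per-character encoding simply has no place for the ideograph.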

7
charset
  • Using the terms just defined, the charset
    attribute in an HTML meta tag means encoding
  • <meta http-equiv="Content-Type"
    content="text/html; charset=utf-8" />
  • <meta http-equiv="Content-Type"
    content="text/html; charset=ISO-8859-1" />
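
Why the declared charset matters can be sketched in Python: the same octets read under two different encodings give different text. The helper name charset_from_content and its fallback default are assumptions made for this example, not part of any standard API.

```python
def charset_from_content(content, default="iso-8859-1"):
    """Pull the charset out of a value such as 'text/html; charset=utf-8'.

    The default used when no charset is given is an assumption of this sketch.
    """
    for part in content.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset":
            return value.strip()
    return default


# What a server would send for a page containing the word below, stored as UTF-8.
page_text = "café"
sent_octets = page_text.encode("utf-8")

right = sent_octets.decode(charset_from_content("text/html; charset=utf-8"))
wrong = sent_octets.decode(charset_from_content("text/html; charset=ISO-8859-1"))

if __name__ == "__main__":
    print(right)  # the original word
    print(wrong)  # the last letter turns into two wrong characters
```

Decoding with the wrong charset does not fail; it quietly produces garbage, which is why the declaration has to match how the page was actually saved.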

8
Language
  • Attribute of html tag
  • MAY be used by browsers (spell-check,
    hyphenation, speech synthesizers), search
    engines, other tools.
  • See two-letter codes
  • www.w3c.org/WAI/ER/IG/ert/iso639.htm
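
A tool that wants to use the language attribute first has to read it. A minimal sketch with Python's standard html.parser module follows; the class LangFinder, the function page_language and the sample page are invented for this example.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Remember the lang attribute of the first html tag seen."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def page_language(markup):
    """Return the two-letter code declared on the html tag, or None."""
    finder = LangFinder()
    finder.feed(markup)
    return finder.lang


# Made-up sample page declaring Finnish ("fi").
SAMPLE_PAGE = '<html lang="fi"><head><title>Esimerkki</title></head><body></body></html>'

if __name__ == "__main__":
    print(page_language(SAMPLE_PAGE))
```

A page with no lang attribute yields None, which is the case where a spell-checker or speech synthesizer would have to guess.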

9
more
  • A glyph is a presentation of a particular shape
    which a character may have when rendered or
    displayed.
  • speak of same glyph in italic, bold, etc.
  • A repertoire of glyphs comprises a font. In a
    more technical sense, as the implementation of a
    font, a font is a numbered set of glyphs. The
    numbers correspond to code positions of the
    characters (presented by the glyphs). Thus, a
    font in that sense is character code dependent.
    An expression like "Unicode font" refers to such
    issues and does not imply that the font contains
    glyphs for all Unicode characters.

10
Examples
  • ASCII is a character repertoire, code and
    encoding. Note: confusion about 7- vs 8-bit ASCII
  • ISO Latin 1, alias ISO 8859-1, defines a
    repertoire, code and encoding of which ASCII is a
    subset. ISO 8859-n is a family of many encodings,
    distinguished by the n. ISO 8859-5 handles
    Cyrillic.
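
The subset relationship, and the way the n changes the meaning of an octet, can be checked with Python's built-in codecs. This is a sketch; the helper names are invented for illustration.

```python
def same_octets_everywhere(text, encodings):
    """True if text encodes to identical octets in every listed encoding."""
    return len({text.encode(name) for name in encodings}) == 1


def read_octet(value, encoding):
    """Decode a single octet under the given encoding."""
    return bytes([value]).decode(encoding)


def is_valid_ascii(value):
    """7-bit ASCII only defines octets 0-127."""
    try:
        bytes([value]).decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    family = ["ascii", "iso-8859-1", "iso-8859-5"]
    print(same_octets_everywhere("Localization", family))  # ASCII is a shared subset
    print(read_octet(0xE9, "iso-8859-1"))  # a Latin letter
    print(read_octet(0xE9, "iso-8859-5"))  # a Cyrillic letter, same octet
    print(is_valid_ascii(0xE9))            # False: outside 7-bit ASCII
```

The same octet, 0xE9, is "é" in ISO 8859-1 and "щ" in ISO 8859-5, so the octets alone do not say which text was meant.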

11
Unicode
  • provides a unique number for every character,
    no matter what the platform, no matter what the
    program, no matter what the language. This is
    the goal.
  • The Unicode Standard has been adopted by such
    industry leaders as Apple, HP, IBM, JustSystem,
    Microsoft, Oracle, SAP, Sun, Sybase, Unisys and
    many others. Unicode is required by modern
    standards such as XML, Java, ECMAScript
    (JavaScript), LDAP, CORBA 3.0, WML, etc., and is
    the official way to implement ISO/IEC 10646.
  • It is supported in many operating systems, all
    modern browsers, and many other products. The
    emergence of the Unicode Standard, and the
    availability of tools supporting it, are among
    the most significant recent global software
    technology trends.
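
The "unique number for every character" idea can be seen with Python's standard unicodedata module. A small sketch follows; the function name describe and the choice of sample characters are just for illustration.

```python
import unicodedata


def describe(ch):
    """Return the Unicode number and official name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # Latin, accented Latin, Cyrillic, Chinese/Japanese ideograph, Korean.
    for ch in "Aéщ中한":
        print(ch, describe(ch))
```

Each character keeps the same number whatever encoding is later used to store it; UTF-8 and UTF-16 differ only in how that number is turned into octets.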

12
Note
  • Unicode goal is universal coverage
  • Unicode is product of a consortium of 'mostly US
    companies'.
  • Some controversy over its treatment of some
    scripts
  • e.g. combining certain kanji characters (Han
    unification)

13
Unicode consortium
  • Go to
    http://www.unicode.org/unicode/standard/WhatIsUnicode.html
  • Examine the Translations on the left. See what
    language characters do not appear on your
    computer.
  • Select one and go to Display Problems to see if
    you can fix it.

14
XML progress
  • XML 1.0 to XML 1.1
  • Issue: complaint that the new standard had
    features to suit IBM
  • The IBM-specific problem that XML 1.1 aims to fix
    has to do with a special character that
    designates to IBM mainframe systems the end of a
    line of text. XML 1.0 chokes on that character,
    but version 1.1 would recognize it.
  • ZDNet News:
    http://zdnet.com.com/2100-1104-962392.html

15
Techniques
  • One web site / screen: provide options to go to
    different pages
  • use symbols/icons that are meaningful to audience
  • tricky. Flags may not be appropriate.
  • use images containing text in the specific
    language
  • risky choice: hope that computer/platform/browser
    has character encoding and font to display
    language
  • poor choice: use English word for other language.
  • http://www.lionbridge.com/ Example of
    company/site supporting 'global reach'.

16
quiz
  • What is the word in that language for:
  • Spanish
  • Chinese (Mandarin? Hainese?)
  • Korean
  • Japanese
  • Hebrew
  • Russian
  • French
  • Finnish
  • Arabic (Classical?, ?)
  • Hindi (Urdu?, ?)
  • What is the direction of text? What is the
    format for dates? Time? Money? Relevant cultural
    issues?
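
For the text-direction question, Unicode assigns each character a direction class, which Python exposes through unicodedata.bidirectional. The sketch below guesses a string's base direction from its first strongly directional character. This is a simplified heuristic and not the full Unicode bidirectional algorithm; the function name and the left-to-right fallback are assumptions of the example.

```python
import unicodedata

# Direction classes for strongly right-to-left characters
# (R: Hebrew and similar scripts, AL: Arabic letters).
RTL_CLASSES = {"R", "AL"}


def base_direction(text):
    """Guess 'rtl' or 'ltr' from the first strongly directional character."""
    for ch in text:
        direction_class = unicodedata.bidirectional(ch)
        if direction_class in RTL_CLASSES:
            return "rtl"
        if direction_class == "L":
            return "ltr"
    return "ltr"  # assumption: fall back to left-to-right


SAMPLES = {
    "English greeting": "hello",
    "Japanese greeting": "こんにちは",
    "Hebrew greeting": "\u05e9\u05dc\u05d5\u05dd",
    "Arabic greeting": "\u0645\u0631\u062d\u0628\u0627",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, base_direction(text))
```

The Hebrew and Arabic samples are written as escapes so the source file itself stays left-to-right in any editor.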

17
Homework
  • Next: Accessibility discussion, exercises
  • Prepare:
  • Download Instant Saxon standalone translator for
    XML and XSLT.
  • Download simulators for small screen browsers
    (cell phones)
  • Nokia Mobile Internet Toolkit. Need to register
    (no cost).
  • OpenWave.
  • For (regular phone) speech recognition/synthesis,
    register with studio.tellme.com