Title: Globalization Improvements in Microsoft .NET Framework v2.0
1Globalization Improvements in Microsoft .NET Framework v2.0
- Achim Ruopp
- International Program Manager
- Microsoft Corporation
2Agenda
- Custom Cultures
- Replacement Cultures
- Supplemental Cultures
- Defining and Using Custom Cultures
- Improved Unicode Standard Support
- International Domain Names
- Normalization
- Supplementary and Combining Characters
- Unicode Character Data Information
- New Calendars
- Miscellaneous Improvements
3Visual Studio and .NET Framework Versions
4Custom Cultures
- Culture = locale in .NET Framework
- Users require the ability to create new cultures
- Replacement Cultures
- Customizing existing cultures using the same culture name
- Supplemental Cultures
- Customizing existing cultures using a new culture name, e.g. ja-JP-MyCompany or en-US-24hour
- New combination of language and location
- es-US for Spanish in the US
- Support for minority languages
- Sorbian in Germany: wen-DE, hsb-DE, dsb-DE
5Custom Cultures in v1.0/v1.1
- New combination of language and location
- Set CurrentCulture and CurrentUICulture to different values
- Example: es-US (Spanish-United States)
- Set CurrentThread.CurrentUICulture to es (for resources)
- Set CurrentThread.CurrentCulture to en-US (for formatting)
- Disadvantages
- Date text is English, unless DateTimeFormatInfo is overridden
- Override CultureInfo to create custom culture
- Existing custom culture sample on GotDotNet: http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=a193b952-2e44-45ed-811d-c1fabf2f6e8a
- Disadvantages
- Developers have to create their own implementations
- Custom culture not equal to built-in cultures
- Application domain limitation: custom culture must be recreated across app domains
- Some APIs create CultureInfo objects internally, which cannot be overridden
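The first v1.0/v1.1 workaround above (separate UI and formatting cultures) can be sketched roughly as follows. The class and method names are hypothetical; only the Thread and CultureInfo members are framework APIs.

```csharp
using System;
using System.Globalization;
using System.Threading;

// Hypothetical helper illustrating the "es for resources, en-US for formatting" split.
public static class SplitCultureSketch
{
    public static void Apply()
    {
        // Resource lookup (ResourceManager) follows CurrentUICulture.
        Thread.CurrentThread.CurrentUICulture = new CultureInfo("es");
        // Number, currency and date formatting follow CurrentCulture.
        Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
    }

    public static void Main()
    {
        Apply();
        Console.WriteLine(Thread.CurrentThread.CurrentUICulture.Name);
        Console.WriteLine(Thread.CurrentThread.CurrentCulture.Name);
        // The long date pattern contains day and month names, which come
        // from en-US here. This is the disadvantage the slide points out.
        Console.WriteLine(DateTime.Now.ToString("D"));
    }
}
```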
6Custom Cultures in v2.0
- Goals
- Fix the disadvantages from v1.0/v1.1
- Make custom cultures equal to built-in cultures
- Make usage easy
- Allow easy deployment of custom cultures
- Allow custom cultures to be serialized into LDML (Locale Data Markup Language) for portability
- LDML is described in Unicode Technical Standard #35
- http://www.unicode.org/reports/tr35/
7Custom Cultures in v2.0
- System.Globalization.CultureAndRegionInfoBuilder
- New class to create and manage custom cultures
- Need to add a reference to sysglobl.dll to use it
- Must be an administrator to register a custom culture
- Custom cultures are equivalent to built-in cultures
- Listed by CultureInfo.GetCultures(CultureTypes.AllCultures)
- Cannot define new sort orders, calendars or character encodings
- Can reuse all the existing ones
- Cannot define a new LocaleID (LCID) for interfacing with native Win32 code
- UI designer only available as a sample
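As a rough illustration of custom cultures being listed alongside built-in ones, the sketch below enumerates all cultures and picks out those flagged as user-defined. The class and method names are hypothetical. On a machine with no registered custom cultures, the count is simply zero.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: lists cultures flagged as user-defined custom cultures.
public static class ListCulturesSketch
{
    public static int CountCustomCultures()
    {
        int count = 0;
        foreach (CultureInfo ci in CultureInfo.GetCultures(CultureTypes.AllCultures))
        {
            // CultureTypes is a flags enumeration, so test the bit.
            if ((ci.CultureTypes & CultureTypes.UserCustomCulture) != 0)
            {
                Console.WriteLine("{0}: {1}", ci.Name, ci.EnglishName);
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine("Custom cultures found: {0}", CountCustomCultures());
    }
}
```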
8Custom Cultures in v2.0: Replacement Cultures
- Culture will have the same LCID as the culture it replaces
- Culture cannot change collation info
- Culture must include the default calendar from the replaced culture in the available calendars list
- Does not have to be default
- Use culture-invariant formatting for communicating data to other machines!
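Because a replacement culture can change formatting under an unchanged culture name, data exchanged between machines is safer when formatted with the invariant culture. A minimal sketch, with hypothetical helper names ToWire and FromWire:

```csharp
using System;
using System.Globalization;
using System.Threading;

// Hypothetical helpers for culture-independent data exchange.
public static class InvariantWireFormat
{
    public static string ToWire(double value)
    {
        // "R" requests a round-trippable representation.
        return value.ToString("R", CultureInfo.InvariantCulture);
    }

    public static double FromWire(string text)
    {
        return double.Parse(text, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Thread.CurrentThread.CurrentCulture = new CultureInfo("de-DE");
        double price = 1234.5;
        Console.WriteLine(price.ToString()); // culture-sensitive: uses the de-DE decimal comma
        Console.WriteLine(ToWire(price));    // culture-invariant: 1234.5
        Console.WriteLine(FromWire(ToWire(price)) == price);
    }
}
```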
9Personalize A Culture: 24 Hour Format
- CultureInfo ciUS = new CultureInfo("en-US", false);
- RegionInfo riUS = new RegionInfo("US");
- CultureAndRegionInfoBuilder carib = new CultureAndRegionInfoBuilder("en-US", CultureAndRegionModifiers.Replacement);
- carib.LoadDataFromCultureInfo(ciUS);
- carib.LoadDataFromRegionInfo(riUS);
- carib.GregorianDateTimeFormat.ShortTimePattern = "HH:mm:ss";
- // Register to deploy on this machine: need to be admin
- carib.Register();
- // Optional: save to an LDML file to deploy to other machines
- carib.Save("e:\\tmp\\myculture.ldml");
- // Instantiate a new CultureInfo object
- CultureInfo ci = new CultureInfo("en-US");
10Create A New Culture: Spanish in the U.S.
- CultureInfo ciES = new CultureInfo("es-ES");
- CultureInfo ciUS = new CultureInfo("en-US");
- RegionInfo riUS = new RegionInfo("US");
- CultureAndRegionInfoBuilder carib = new CultureAndRegionInfoBuilder("es-US", CultureAndRegionModifiers.None);
- carib.LoadDataFromCultureInfo(ciES);
- carib.LoadDataFromRegionInfo(riUS);
- // Set the currency symbol and DateTimeFormat information
- carib.NumberFormat.CurrencySymbol = ciUS.NumberFormat.CurrencySymbol;
- carib.GregorianDateTimeFormat = ciUS.DateTimeFormat;
- carib.GregorianDateTimeFormat.DayNames = ciES.DateTimeFormat.DayNames;
- ...
- // Register the data for deployment on this machine
- carib.Register();
- // Instantiate a new CultureInfo from our new data
- CultureInfo ci = new CultureInfo("es-US");
11Developing and Deploying a Custom Culture
12Multilingual Web Addresses: International Domain Names
- New Standard (IDN, RFCs 3454, 3490, 3491, 3492)
- http://www.ietf.org/html.charters/idn-charter.html
- Enables non-ASCII domain names
- www.loréal.com
- www.???.com
- IDN mapping class containing IDN conversion APIs
- System.Globalization.IdnMapping
- GetAscii and GetUnicode provide host name conversion between Unicode and Punycode
- Browser plug-ins available for Internet Explorer
- Clarification
- This support does not cover Internationalized Resource Identifiers (IRIs, RFC 3987)
- Microsoft Web Platform is already using UTF-8, but needs more work to be compliant
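The GetAscii/GetUnicode round trip might look like the sketch below. The host name bücher.example is an illustrative assumption, not taken from the slides, and the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical wrapper around System.Globalization.IdnMapping.
public static class IdnSketch
{
    public static string ToAscii(string unicodeHost)
    {
        return new IdnMapping().GetAscii(unicodeHost);
    }

    public static string ToUnicode(string asciiHost)
    {
        return new IdnMapping().GetUnicode(asciiHost);
    }

    public static void Main()
    {
        string unicodeHost = "b\u00FCcher.example"; // "bücher.example"
        string asciiHost = ToAscii(unicodeHost);
        Console.WriteLine(asciiHost);                           // xn--bcher-kva.example
        Console.WriteLine(ToUnicode(asciiHost) == unicodeHost); // True
    }
}
```

Only the non-ASCII label is rewritten with the xn-- prefix. Plain ASCII labels such as "example" pass through unchanged.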
13International Domain Names
14Normalization
- Support for all 4 normalization forms (C, D, KC, KD)
- õĥµ¨ (U+00F5 U+0068 U+0302 U+00B5 U+00A8): LATIN SMALL LETTER O WITH TILDE, LATIN SMALL LETTER H, COMBINING CIRCUMFLEX ACCENT, MICRO SIGN, DIAERESIS
- FormC: õĥµ¨ (U+00F5 U+0125 U+00B5 U+00A8)
- FormD: õĥµ¨ (U+006F U+0303 U+0068 U+0302 U+00B5 U+00A8)
- FormKC: õĥμ ¨ (U+00F5 U+0125 U+03BC U+0020 U+0308)
- FormKD: õĥμ ¨ (U+006F U+0303 U+0068 U+0302 U+03BC U+0020 U+0308)
- Normalize strings
- public string Normalize(System.Text.NormalizationForm normalizationForm)
- Verify normalization or normalization form
- public bool IsNormalized(System.Text.NormalizationForm normalizationForm)
- Implementation conforms to sample in Unicode Standard Annex #15
- http://www.unicode.org/reports/tr15/
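The slide's sample string can be run through all four forms with a short sketch like this one. CodePoints is a hypothetical helper that prints UTF-16 code units in U+XXXX notation.

```csharp
using System;
using System.Text;

// Hypothetical demo of String.Normalize and String.IsNormalized.
public static class NormalizationSketch
{
    // Renders each UTF-16 code unit of s as U+XXXX, separated by spaces.
    public static string CodePoints(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.AppendFormat("U+{0:X4}", (int)c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // o-tilde, h, combining circumflex, micro sign, diaeresis
        string input = "\u00F5\u0068\u0302\u00B5\u00A8";
        NormalizationForm[] forms = new NormalizationForm[] {
            NormalizationForm.FormC, NormalizationForm.FormD,
            NormalizationForm.FormKC, NormalizationForm.FormKD };
        foreach (NormalizationForm form in forms)
        {
            Console.WriteLine("{0,-6} {1}", form, CodePoints(input.Normalize(form)));
        }
        // The input contains h + combining circumflex, which FormC composes,
        // so the input itself is not in FormC.
        Console.WriteLine(input.IsNormalized(NormalizationForm.FormC)); // False
    }
}
```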
15Supplementary and Combining Characters
- v1.0, v1.1
- Methods in StringInfo class to parse characters and walk text elements
- v2.0
- Additional methods on StringInfo to find surrogate pairs/combining characters in substrings
- public string SubstringByTextElements(int startingTextElement, int lengthInTextElements)
- New property to determine length of strings containing surrogate pairs/combining characters
- public int LengthInTextElements { get; }
- Char.IsLetter recognizes surrogate pairs in strings
- New methods in System.Char class to detect high and low surrogate code points and surrogate pairs
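A small sketch of the v2.0 additions, using an assumed sample string: "A", the supplementary character U+20000 (stored as a surrogate pair), and "e" followed by a combining acute accent. The class name and the Sample constant are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical demo of text-element and surrogate helpers.
public static class TextElementSketch
{
    // 'A' + U+20000 (two UTF-16 code units) + 'e' + combining acute accent
    public const string Sample = "A\U00020000e\u0301";

    public static void Main()
    {
        StringInfo info = new StringInfo(Sample);
        Console.WriteLine(Sample.Length);             // 5 UTF-16 code units
        Console.WriteLine(info.LengthInTextElements); // 3 text elements

        // Extract the second text element; it is the whole surrogate pair.
        string second = info.SubstringByTextElements(1, 1);
        Console.WriteLine(second.Length);             // 2

        Console.WriteLine(Char.IsHighSurrogate(Sample[1]));  // True
        Console.WriteLine(Char.IsLowSurrogate(Sample[2]));   // True
        Console.WriteLine(Char.IsSurrogatePair(Sample, 1));  // True
        // The (string, index) overload looks at the full pair.
        Console.WriteLine(Char.IsLetter(Sample, 1));
    }
}
```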
16Updates to encodings
- Now built into the Base Class Library (BCL)
- Improved performance
- More flexibility
- Consistent results across supported platforms
- Encoding enumeration API
- UTF-32 support (little endian and big endian)
- Use UTF-16 or UTF-8 for better performance and memory efficiency
- UTF-16 big endian support
- Encoding/decoding fallbacks
- Allow reacting to errors in encoding to and from Unicode in different ways
- Exception
- Replacement
- Best fit
- Custom
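The replacement and exception fallbacks can be sketched as below. The choice of us-ascii as the target encoding and the sample text are assumptions for illustration. Best-fit and custom fallbacks are not shown.

```csharp
using System;
using System.Text;

// Hypothetical demo of encoder fallbacks on a lossy target encoding.
public static class FallbackSketch
{
    // Encodes to ASCII, substituting "?" for unencodable characters.
    public static string EncodeWithReplacement(string text)
    {
        Encoding ascii = Encoding.GetEncoding("us-ascii",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        byte[] bytes = ascii.GetBytes(text);
        return ascii.GetString(bytes);
    }

    // Returns true if strict encoding rejects the text.
    public static bool ThrowsOnUnencodable(string text)
    {
        Encoding strict = Encoding.GetEncoding("us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
            return false;
        }
        catch (EncoderFallbackException)
        {
            return true;
        }
    }

    public static void Main()
    {
        string text = "caf\u00E9"; // "café"
        Console.WriteLine(EncodeWithReplacement(text)); // caf?
        Console.WriteLine(ThrowsOnUnencodable(text));   // True
    }
}
```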
17Unicode Character Data Information
- New System.Globalization.CharUnicodeInfo class
- GetUnicodeCategory
- Categories could be used in regular expressions in v1.0/v1.1 but not easily retrieved
- Numeric character information
- GetDecimalDigitValue
- GetDigitValue
- e.g. U+2160 Roman Numeral One Ⅰ = 1
- GetNumericValue
- e.g. U+00BC Vulgar Fraction One Quarter ¼ = 0.25
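A brief sketch of the CharUnicodeInfo lookups. The sample characters beyond the slide's two (superscript two and Arabic-Indic digit three) are assumptions for illustration. In the Unicode character database U+2160 carries a numeric value only, so this sketch reads it through GetNumericValue.

```csharp
using System;
using System.Globalization;

// Hypothetical demo of System.Globalization.CharUnicodeInfo.
public static class CharInfoSketch
{
    public static void Main()
    {
        CultureInfo inv = CultureInfo.InvariantCulture;

        // Category lookup without going through regular expressions.
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory('\u2160')); // LetterNumber
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory('\u00BC')); // OtherNumber

        // Numeric value: covers fractions and numerals such as Roman Numeral One.
        Console.WriteLine(CharUnicodeInfo.GetNumericValue('\u2160').ToString(inv)); // 1
        Console.WriteLine(CharUnicodeInfo.GetNumericValue('\u00BC').ToString(inv)); // 0.25

        // Digit value: covers digits in special forms, e.g. superscript two.
        Console.WriteLine(CharUnicodeInfo.GetDigitValue('\u00B2'));        // 2

        // Decimal digit value: only true decimal digits of a script.
        Console.WriteLine(CharUnicodeInfo.GetDecimalDigitValue('\u0663')); // 3
        Console.WriteLine(CharUnicodeInfo.GetDecimalDigitValue('\u00B2')); // -1
    }
}
```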
18New Calendars
- Full calendars
- Can be set as default calendar for CultureInfo
- Enable parsing and formatting of dates
- New UmAlQura calendar
- Implementation of Hijri calendar
- Conversion calendars
- Enable date conversions between regional calendars
- Jalaali calendar
- Calendar used in Farsi communities
- East Asian Lunisolar calendars
- PRC
- Taiwan
- Japan
- Korea
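Date conversion with the new calendar classes might be sketched as follows, assuming the released class names UmAlQuraCalendar and ChineseLunisolarCalendar. The sample date is arbitrary and the helper names are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical demo: reading one Gregorian date through other calendars.
public static class CalendarSketch
{
    public static int UmAlQuraYear(DateTime date)
    {
        return new UmAlQuraCalendar().GetYear(date);
    }

    public static int ChineseLunisolarYear(DateTime date)
    {
        return new ChineseLunisolarCalendar().GetYear(date);
    }

    public static void Main()
    {
        DateTime date = new DateTime(2005, 6, 1); // Gregorian 1 June 2005

        Calendar umAlQura = new UmAlQuraCalendar();
        Console.WriteLine("Um Al Qura: {0}/{1}/{2}",
            umAlQura.GetYear(date), umAlQura.GetMonth(date), umAlQura.GetDayOfMonth(date));

        Calendar chinese = new ChineseLunisolarCalendar();
        Console.WriteLine("Chinese lunisolar: {0}/{1}/{2}",
            chinese.GetYear(date), chinese.GetMonth(date), chinese.GetDayOfMonth(date));
    }
}
```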
19Miscellaneous Improvements
- Windows data derived cultures
- Windows XP SP2 added 25 new locales
- .NET Framework v2.0 can emulate cultures for these
- Information about text direction
- LineOrientation/IsRightToLeft properties in TextInfo
- RFC 3066(bis)-compliant language tags
- IetfLanguageTag property of CultureInfo
- GetCultureByIetfLanguageTag function in CultureInfo
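A short sketch of the text direction and language tag members. It uses CultureInfo.GetCultureInfoByIetfLanguageTag, which is the method name as I understand it in the released framework; the slide's shorter name may reflect a pre-release build. The class name is hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical demo of TextInfo.IsRightToLeft and IETF language tags.
public static class MiscCultureSketch
{
    public static bool IsRightToLeft(string cultureName)
    {
        return new CultureInfo(cultureName).TextInfo.IsRightToLeft;
    }

    public static void Main()
    {
        Console.WriteLine(IsRightToLeft("ar-SA")); // True
        Console.WriteLine(IsRightToLeft("en-US")); // False

        CultureInfo enUS = new CultureInfo("en-US");
        Console.WriteLine(enUS.IetfLanguageTag);   // en-US

        CultureInfo fromTag = CultureInfo.GetCultureInfoByIetfLanguageTag("en-US");
        Console.WriteLine(fromTag.Name);           // en-US
    }
}
```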
20Miscellaneous Improvements
- Shortest day names
- ShortestDayNames in DateTimeFormatInfo
- Enables display of compact date strings/calendars
- Information about command line support
- GetConsoleFallbackUICulture
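Both members can be tried with a sketch like the one below. The invariant culture is used for the day names so the output does not depend on installed culture data. The ar-SA console fallback is printed without an expected value because it depends on the platform. The class name is hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical demo of ShortestDayNames and GetConsoleFallbackUICulture.
public static class CompactNamesSketch
{
    public static string[] InvariantShortestDayNames()
    {
        return CultureInfo.InvariantCulture.DateTimeFormat.ShortestDayNames;
    }

    public static void Main()
    {
        // Compact header row for a calendar display.
        Console.WriteLine(string.Join(" ", InvariantShortestDayNames())); // Su Mo Tu We Th Fr Sa

        // Cultures whose script the console may not render well can name
        // a fallback UI culture for console applications.
        CultureInfo arabic = new CultureInfo("ar-SA");
        Console.WriteLine(arabic.GetConsoleFallbackUICulture().Name);
    }
}
```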
21References
- Visual Studio 2005/.NET Framework v2.0
- http://lab.msdn.microsoft.com/vs2005/
- LDML specification
- http://www.unicode.org/reports/tr35/
- Internet Explorer plugins for IDN support
- http://support.microsoft.com/default.aspx?scid=kb;en-us;842848
- Microsoft Globaldev website
- http://www.microsoft.com/globaldev/
- Newsgroup
- news:microsoft.public.dotnet.internationalization
- W3C IDN/IRI article
- http://www.w3.org/International/articles/idn-and-iri/
22References - continued
- Blogs
- http://blogs.msdn.com/michkap
- http://blogs.msdn.com/AchimR
- http://www.dasblonde.net/
- http://blogs.msdn.com/BCLTeam