Title: IDN TLD Policy Issues
1. IDN TLD Policy Issues
Session III: Internet Domain Administration
Generic TLD Implementations of IDN
APT-ITU Joint Workshops on ENUM and IDN, Brunei Darussalam
Mohamed Sharil Tarmizi
Senior Advisor, MCMC Malaysia
Chairman, GAC, ICANN
Former ICANN IDN Committee member
Views expressed are my personal views only
2. Glossary: A little bit of history
- gTLDs
- Generic Top Level Domain Names
- e.g. .com, .net, .org, .edu, .gov, .mil, .int
- New gTLDs
- e.g. .biz, .museum, .aero, .info; more to come?
In the Domain Name System (DNS) naming of
computers there is a hierarchy of names. The root
of system is unnamed. There are a set of what are
called "top-level domain names" (TLDs). These are
the generic TLDs (EDU, COM, NET, ORG, GOV, MIL,
and INT), and the two letter country codes from
ISO-3166. It is extremely unlikely that any other
TLDs will be created.
Jon Postel, RFC 1591
3. Glossary: A little bit of history (cont.)
- Internationalised Domain Names (IDN)
- vs
- Multilingual domain names (MDN)
- Are we talking about languages or characters ?
- OR
- Both ?
4. IDN refers to?
- A domain name in which one or more characters fall outside the historical LDH subset used in the DNS: Latin letters (a-z), digits (0-9) and hyphen
- Associated with Unicode (ISO 10646)-based labels
- Major transition from 37 characters (26 letters, 10 digits and the hyphen) to many thousands of possible Unicode code points
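The LDH versus IDN distinction above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the helper names and the example labels are assumptions, and the conversion relies on Python's built-in "idna" codec, which follows the older IDNA 2003 rules.

```python
import string

# The LDH repertoire: letters a-z, digits 0-9 and the hyphen.
LDH = set(string.ascii_lowercase + string.digits + "-")


def is_ldh_label(label: str) -> bool:
    """Return True if the label uses only LDH characters (case-insensitive)."""
    return bool(label) and all(ch.isascii() and ch.lower() in LDH for ch in label)


def to_ascii_label(label: str) -> str:
    """Encode one label to its ASCII-compatible form using Python's built-in
    'idna' codec (IDNA 2003 rules). LDH labels pass through unchanged."""
    return label.encode("idna").decode("ascii")


# Hypothetical example labels: one plain LDH label, one with a non-ASCII letter.
for label in ["air", "b\u00fccher"]:
    kind = "LDH" if is_ldh_label(label) else "IDN"
    print(label, kind, to_ascii_label(label))
```

The second label comes out as `xn--bcher-kva`: the DNS itself still carries only LDH characters, and the Unicode form exists at the application layer.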
5. Demand for Multilingual Internet?
- The number of Internet users in Asia-Pacific is growing
- How many speak English or recognise a, b, c?
- A consequence of the worldwide Internet boom is that many users potentially do not understand ASCII
- ASCII-only domain names create a digital divide
- Native speakers of languages not expressed in ASCII are disadvantaged
- E.g. Arabic, Bhutanese, Chinese, Hindi, Japanese, Korean, Nepali, Tamil, Thai and others who use non-ASCII scripts
6. What's in a character?
Number One in Arabic ?
Alif in Arabic ?
Alif in Jawi?
Number One in English ?
Some other meaning in some other character that I
do not know ?
Code point U+0627
Code point U+0661
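The two code points quoted on this slide can be looked up with Python's standard `unicodedata` module. The sketch below is an illustration only; the ASCII digit U+0031 is added as an assumed stand-in for "number one in English".

```python
import unicodedata

# U+0627 and U+0661 come from the slide; U+0031 (ASCII "1") is added for comparison.
for cp in (0x0627, 0x0661, 0x0031):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))

# U+0627 is a letter (category Lo), while U+0661 and U+0031 are decimal
# digits (category Nd) that both carry the numeric value 1.
print(unicodedata.digit(chr(0x0661)), unicodedata.digit(chr(0x0031)))
```

Glyphs that look alike to a reader are therefore distinct characters to the software, and the code point alone says nothing about which language the writer had in mind.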
7. Addressing
Domain names are basically distinguishable Internet names with a classified (hierarchical) structure. Do we go by meaning (and if so, in what language?) or by symbol?
air.net.my
??? . ???.my
8. Desires
- Internet to be internationalised to reduce the digital divide
- Internet to be more accessible to users of non-English character sets
- Technology should be harmonised
- Need to achieve global technical standards
- Effective global competition
9. IDN in a ccTLD
- ?????.com.bn OR ?????.??.bn OR ?????.??.??
- ?????.com.id OR ?????.??.id OR ?????.??.??
- ?????.com.my OR ?????.??.my OR ?????.??.??
- Jawi script: Malay language written in an Arabic-based script
- Who has jurisdiction?
- What characters to register or use?
- Which language table?
- Any conflicts?
10. Potential Confusion?
- Language or Script?
- A language is a way that humans interact
- A script is the written form of a language
- Many written languages share the same script
- Some written languages use more than one script and are shared across several regions
- Example: ??.com
- Is this in Chinese, Japanese or Korean?
Borrowed from J. Seng
11. Potential Confusion?
- Name and Identifier
- A name is a word or phrase that constitutes the distinctive designation of a person or thing
- An identifier is a string of characters that uniquely identifies a person or thing
- Example
- Mohamed Sharil is a Name
- mohamedsharil is an Identifier
- Is mohamedsharil.net a Name or an Identifier?
- What about ????? ????.net ?
12. IDN in a gTLD
- mohamedsharil.net OR mohamedsharil.rangkaian?
- ????? ????.net OR ????? ???? .?? OR ????? ???? .???????
- Is my name in Jawi or Arabic?
- Who has jurisdiction over Jawi/Arabic .net?
- Should there even be a .net in Arabic?
- If .net means network, can I use the Malay translation, which is .rangkaian, or must I use the Arabic script equivalent?
- What if it is in Chinese as well?
13. IDN in a gTLD (cont.)
- What about...
- ????? ????.??? OR ????? mohamed.???
- Should a mix of two different scripts be allowed?
- Who will have jurisdiction?
- What if there is a different script in the second level?
- What if it is some other combination?
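Whether a name mixes scripts, either within one label or across levels, is something a registry could check mechanically. The sketch below is a rough illustration under stated assumptions: Python's standard library does not expose the Unicode Script property, so the first word of each character's Unicode name stands in for the script, and the function names and sample labels are hypothetical.

```python
import unicodedata

COMMON = set("0123456789-")  # ASCII digits and hyphen are treated as script-neutral


def scripts_in_label(label: str) -> set:
    """Approximate the set of scripts used in one label.

    Heuristic: take the first word of the Unicode character name
    (e.g. 'LATIN', 'ARABIC', 'CJK'). This is not the official Script property.
    """
    scripts = set()
    for ch in label:
        if ch in COMMON:
            continue
        name = unicodedata.name(ch, "")
        scripts.add(name.replace("-", " ").split()[0] if name else "UNKNOWN")
    return scripts


def is_mixed_script(label: str) -> bool:
    """True if a single label draws on more than one script."""
    return len(scripts_in_label(label)) > 1


def scripts_per_label(domain: str) -> list:
    """Check a full name one label at a time, from left to right."""
    return [scripts_in_label(label) for label in domain.split(".")]


arabic = "\u0645\u062d\u0645\u062f"       # hypothetical Arabic-script label
print(scripts_in_label("mohamed"))         # Latin only
print(scripts_in_label(arabic))            # Arabic only
print(is_mixed_script(arabic + "sharil"))  # two scripts inside one label
print(scripts_per_label(arabic + ".net"))  # different script at each level
```

The check only reports which scripts appear. Whether a mixed label or a mixed-level name should be allowed remains the policy question raised on the slide.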
14. Issues
- Possible cyber-squatting?
- Same word in different languages?
- Do you still follow RFC 1591?
- If so, who owns/hosts the top level domain?
- UDRP: WIPO has an IDN process
- How to do IDN UDRP? Conflict of decisions?
15. Issues
- Does your language/script have Unicode code points?
- Harmonisation of Unicode tables?
- Trademarks?
- Social, culture, religion
- Country, ethnic group, society?
- Sovereignty?
- Names with special significance or sensitivity?
- Religious, racial, ethnic etc.
- Reserved names?
16. It is not just a technical or intellectual property issue
- Cultural issue
- Different interpretation of words
- Offensive in some cultures and permitted in others?
- Language
- Who has custody over any language? It transcends national boundaries
- Over 5000 written languages in the world
- About 200 nations in the world
- Major language groups: English, French, Spanish, Chinese, Arabic, Russian, major African languages, Hindi, Tamil, etc.
17. Implications for other initiatives
- Principles of competition, market access, consumer protection, and intellectual property protection.
- IDNs in the DNS: at the second level and below, under existing TLDs?
- New IDN top-level domains, or internationalized top-level domains?
- Who should have it? Sponsored, unsponsored, or be at the ccTLDs or national governments?
- Conflicting registrations due to similar character sets, backward compatibility, or special requirements of local languages.
18. Some Suggestions
- Understand the technical limitations
- Script vs. Language
- Name vs. Identifier
- Internationalization vs. Localization
- Per-label basis
- Understand what users want
- Script vs. Language
- Name vs. Identifier
- Internationalization vs. Localization
Borrowed from J. Seng
19. Some thoughts
- IDN.ccTLD is difficult enough
- IDN.IDN is a lot tougher
- Stability of the Internet must be maintained
- Some developed countries have already completed their solutions (e.g. for CJK)
- But so far, these are deployed at the ccTLD level
- Those who need IDN tend to be from developing countries
- Local language technical community needs to be more active
- Local people themselves will need to find a way to tackle the issue
20. Some further thoughts
- Harmonise Unicode tables?
- A lot to learn from those who have done it...
- BIGGER ISSUE
- Who should have control over IDN gTLD?
- Necessary to have a multifaceted, multi-stakeholder approach: engineers, linguists and more
- Cross-boundary coordination efforts are essential
- Start with IDN.ccTLD first and learn from experience before going to IDN.IDN
21. To complicate life further?
- Arabic (Arabic)
- Arabic (Persian)
- Armenian
- Bengali
- Cyrillic (Russian)
- Devanagari (Hindi)
- Georgian
- Greek
- Gujarati
- Gurmukhi
- Han (Chinese)
- Hangul
- Hebrew
- Hiragana ?????
- Khmer
- Malayalam
- Syriac
- Tamil
- Thai
ASCII/English Unicode
Adapted from Unicode Consortium
22. A Preliminary Framework for non-ASCII TLDs: Brief Explanation of the Six Categories
1. Semantic association with Geographic Units: A TLD string that to a typical reader would be clearly linked to a recognized geographic unit, as is the case with the existing ASCII ccTLDs.
2. Semantic association with Languages: A TLD string that to a typical reader would be clearly linked to the name of a language. For example, the Arabic word for "Arabic."
3. Semantic association with Cultural Groups or Ethnicities: A TLD string that to a typical reader would be clearly linked to a cultural group or ethnicity that is not defined by recognized national boundaries. For example, the Kurdish or Swahili peoples.
ICANN IDN Committee
23. A Preliminary Framework for non-ASCII TLDs: Brief Explanation of the Six Categories (2)
4. Semantic association with Existing Sponsored TLDs: A non-ASCII TLD string that to a typical reader would be clearly linked to an existing ASCII sponsored TLD.
5. Semantic association with Existing Unsponsored TLDs: A non-ASCII TLD string that to a typical reader would be clearly linked to the existing unsponsored ASCII gTLDs, such as .com, .net, .org, .info, .biz, or .name.
6. Everything else: In this category, we mean to include every word, abbreviation or other string that is not semantically associated with one of the previous five categories.
ICANN IDN Committee
24. A Preliminary Framework for non-ASCII TLDs: Summary Diagrammatic View
[Diagram: preliminary potential non-ASCII TLDs, grouped into categories from "Semantically Associated With Geographic Units" to "Everything Else"]
ICANN IDN Committee
25.
- ????? ?????
- Terima kasih (thank you)
- sharil_at_cmc.gov.my
- www.mcmc.gov.my
- www.gac.icann.org
- www.icann.org
- www.itu.int
- www.minc.org