CVS libidn/doc/specifications


From: libidn-commit
Subject: CVS libidn/doc/specifications
Date: Sat, 3 Dec 2005 11:40:44 +0100

Update of /home/cvs/libidn/doc/specifications
In directory dopio:/tmp/cvs-serv23428

Added Files:
        draft-iab-idn-nextsteps-00.txt 
Log Message:
Add.


--- /home/cvs/libidn/doc/specifications/draft-iab-idn-nextsteps-00.txt  2005/12/03 10:40:44     NONE
+++ /home/cvs/libidn/doc/specifications/draft-iab-idn-nextsteps-00.txt  2005/12/03 10:40:44     1.1




Network Working Group                                         J. Klensin
Internet-Draft
Expires: June 5, 2006                                       P. Faltstrom
                                                                     IAB
                                                        December 2, 2005


  Review and Recommendations for Internationalized Domain Names (IDN)
                     draft-iab-idn-nextsteps-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on June 5, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This note describes issues raised by the deployment and use of
   Internationalized Domain Names.  It describes problems that arise
   both when names are registered and when those names are used in the
   DNS.  It recommends that the IETF update the IDN-related RFCs,
   proposes a framework to be followed in doing so, and summarizes and
   identifies some work that is required outside the IETF.  In
   particular, it proposes that some changes be investigated for the



Klensin & Faltstrom       Expires June 5, 2006                  [Page 1]

Internet-Draft              Framework for IDN              December 2005


   IDNA standard and its supporting tables, based on experience gained
   since those standards were completed.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1.  Status of this Document and its Recommendations  . . . . .  4
     1.2.  The IDNA Standard  . . . . . . . . . . . . . . . . . . . .  4
     1.3.  Unicode Documents  . . . . . . . . . . . . . . . . . . . .  5
     1.4.  Definitions  . . . . . . . . . . . . . . . . . . . . . . .  5
       1.4.1.  language . . . . . . . . . . . . . . . . . . . . . . .  6
       1.4.2.  script . . . . . . . . . . . . . . . . . . . . . . . .  6
       1.4.3.  multilingual . . . . . . . . . . . . . . . . . . . . .  6
       1.4.4.  localization . . . . . . . . . . . . . . . . . . . . .  6
       1.4.5.  internationalization . . . . . . . . . . . . . . . . .  6
     1.5.  Statements and Guidelines  . . . . . . . . . . . . . . . .  7
       1.5.1.  IESG Statement . . . . . . . . . . . . . . . . . . . .  7
       1.5.2.  ICANN statements . . . . . . . . . . . . . . . . . . .  7
   2.  Problem  . . . . . . . . . . . . . . . . . . . . . . . . . . .  9
     2.1.  Examples of issues . . . . . . . . . . . . . . . . . . . .  9
       2.1.1.  Language specific character matching . . . . . . . . .  9
       2.1.2.  Multiple scripts . . . . . . . . . . . . . . . . . . .  9
       2.1.3.  Normalization and Character Mappings . . . . . . . . . 10
       2.1.4.  URL on a bus . . . . . . . . . . . . . . . . . . . . . 11
       2.1.5.  Bidirectional text . . . . . . . . . . . . . . . . . . 12
       2.1.6.  Confusable Character Issues  . . . . . . . . . . . . . 12
       2.1.7.  The IESG Statement and IDNA issues . . . . . . . . . . 13
       2.1.8.  Versions of Unicode  . . . . . . . . . . . . . . . . . 14
   3.  Conclusions from the IAB IDN ad-hoc committee  . . . . . . . . 15
     3.1.  Issues within the scope of the IETF  . . . . . . . . . . . 15
       3.1.1.  Review of IDNA . . . . . . . . . . . . . . . . . . . . 15
       3.1.2.  Non-DNS and Above-DNS Internationalization
               Approaches . . . . . . . . . . . . . . . . . . . . . . 16
       3.1.3.  Security issues, certificates, etc.  . . . . . . . . . 16
       3.1.4.  Non US-ASCII in local part of email addresses  . . . . 17
       3.1.5.  Use of the Unicode Character Set in the IETF . . . . . 18
     3.2.  Issues that fall within the purview of ICANN . . . . . . . 18
       3.2.1.  Dispute resolution . . . . . . . . . . . . . . . . . . 18
       3.2.2.  Policy at registries . . . . . . . . . . . . . . . . . 18
       3.2.3.  IDN TLDs . . . . . . . . . . . . . . . . . . . . . . . 18
   4.  Specific Recommendations for Action  . . . . . . . . . . . . . 19
     4.1.  Reduction of permitted character list  . . . . . . . . . . 19
     4.2.  Elimination of all non-language characters . . . . . . . . 19
     4.3.  Elimination of word-separation punctuation . . . . . . . . 19
     4.4.  Updating to new versions of Unicode  . . . . . . . . . . . 20
     4.5.  Combining Characters and Character Components  . . . . . . 20
     4.6.  Role and Uses of the DNS . . . . . . . . . . . . . . . . . 20
   5.  Security Considerations  . . . . . . . . . . . . . . . . . . . 21
   6.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
     7.1.  Normative References . . . . . . . . . . . . . . . . . . . 21
     7.2.  Non-normative References . . . . . . . . . . . . . . . . . 22
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24
   Intellectual Property and Copyright Statements . . . . . . . . . . 25


1.  Introduction

1.1.  Status of this Document and its Recommendations

   This document reviews the IDN landscape from an IETF perspective and
   presents the recommendations and conclusions of an IAB-convened ad
   hoc committee charged with reviewing IDN issues and the path forward
   (see Section 6).  Its recommendations are addressed to the IETF or,
   in a few cases, to other bodies.  They identify topics to be
   examined and actions to be taken if those bodies, after their own
   examination, consider those actions appropriate.

   Neither the IAB nor the members of the ad hoc committee have yet
   reached consensus that this document is ready for final publication.
   However, the IAB concluded that it was appropriate to expose it, as a
   working draft, for community comment and feedback.  Such comments
   should be sent to address@hidden (relevant material will be forwarded to
   the IAB IDN Ad Hoc committee).

1.2.  The IDNA Standard

   During 2002, the IETF created the following RFCs that, together,
   define IDNs:

   RFC 3454 Preparation of Internationalized Strings ("stringprep")
      [RFC3454].
      Stringprep is a generic mechanism for taking a Unicode string and
      converting it into a canonical format.  Stringprep itself is just
      a collection of rules, tables, and operations.  Any protocol or
      algorithm that uses it must define a "stringprep profile", which
      specifies which of those rules are applied, how, and with which
      characteristics.

   RFC 3490 Internationalizing Domain Names in Applications (IDNA)
      [RFC3490].
      IDNA is the base specification in this group.  It specifies that
      Nameprep is used as the stringprep profile for domain names, and
      that Punycode is the encoding mechanism used to generate an
      ASCII-compatible ("ACE") form of the name.  It also applies some
      additional conversions and character filtering that are not part
      of Nameprep.

   RFC 3491 Nameprep: A Stringprep Profile for Internationalized Domain
      Names (IDN) [RFC3491].
      Nameprep is one such profile.  It is designed to meet the specific
      needs of IDNs and, in particular, to support case-folding for
      scripts that support what are traditionally known as upper and
      lower case forms of the same letters.  The result of the nameprep
      algorithm is a string containing a subset of the Unicode Character
      set, normalized and case folded so that case insensitive
      comparison can be made.

   RFC 3492 Punycode: A Bootstring encoding of Unicode for
      Internationalized Domain Names in Applications (IDNA) [RFC3492].
      Punycode is a mechanism for encoding a Unicode string in ASCII
      characters.  The characters used are the same subset of
      characters that are allowed in the hostname definition of DNS,
      i.e., the "letter, digit, and hyphen" characters, sometimes known
      as "LDH".
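
   As an illustration of how the four specifications above fit
   together, the following Python sketch processes a single label.  It
   is not a conforming implementation: it applies only the mapping and
   normalization steps of Nameprep (stringprep Tables B.1 and B.2
   followed by NFKC) and omits the prohibited-character and
   bidirectional checks.  The function names nameprep_map and
   to_ace_label are hypothetical and chosen for this example; the
   "stringprep" module, the "punycode" and "idna" codecs, and
   unicodedata.ucd_3_2_0 are part of the Python standard library.

```python
import stringprep
from unicodedata import ucd_3_2_0 as unicode32  # Unicode 3.2 data, as used by RFC 3454

ACE_PREFIX = "xn--"  # ACE prefix defined in RFC 3490


def nameprep_map(label: str) -> str:
    """Apply only the mapping and normalization steps of Nameprep (a sketch)."""
    mapped = []
    for ch in label:
        if stringprep.in_table_b1(ch):
            continue  # Table B.1: characters commonly mapped to nothing
        mapped.append(stringprep.map_table_b2(ch))  # Table B.2: case folding
    return unicode32.normalize("NFKC", "".join(mapped))


def to_ace_label(label: str) -> str:
    """Convert one label to its ASCII-compatible (ACE) form (a sketch)."""
    prepared = nameprep_map(label)
    if prepared.isascii():
        return prepared  # nothing to encode
    # Punycode output uses only letters, digits, and hyphen (LDH).
    return ACE_PREFIX + prepared.encode("punycode").decode("ascii")


if __name__ == "__main__":
    label = "B\u00fccher"  # "B", U+00FC (u with diaeresis), "cher"
    print(nameprep_map(label))                   # case-folded Unicode form
    print(to_ace_label(label))                   # xn--bcher-kva
    print(label.encode("idna").decode("ascii"))  # same result from the built-in codec
```

   For this label the sketch and Python's built-in "idna" codec agree.
   For labels that contain prohibited characters or mixed-direction
   text the built-in codec raises an error, while this sketch does not.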

1.3.  Unicode Documents

   Unicode is used as the base, and defining, character set for IDN.
   Unicode is standardized by the Unicode Consortium, and synchronized
   with ISO to create ISO/IEC 10646 [ISO10646].  At the time the RFCs
   mentioned earlier were created, Unicode was at version 3.2.  For
   reasons explained later, the RFCs explicitly use Unicode version 3.2
   [Unicode32] and no other version (see Section 2.1.8).
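
   The practical effect of pinning to Unicode 3.2 can be sketched in
   Python, whose standard library carries a copy of the Unicode 3.2
   database (unicodedata.ucd_3_2_0) for this purpose.  The helper name
   assigned_in_unicode_32 is hypothetical.  The example character,
   U+1E9E LATIN CAPITAL LETTER SHARP S, was added to Unicode in version
   5.1 and is therefore an unassigned code point from the point of view
   of the stringprep tables.

```python
import stringprep
import unicodedata
from unicodedata import ucd_3_2_0 as unicode32  # frozen Unicode 3.2 database

CAPITAL_SHARP_S = "\u1e9e"  # added to Unicode in version 5.1


def assigned_in_unicode_32(ch: str) -> bool:
    """True if the code point is assigned in Unicode 3.2.

    stringprep Table A.1 lists the code points unassigned in Unicode 3.2.
    """
    return not stringprep.in_table_a1(ch)


if __name__ == "__main__":
    print(unicode32.unidata_version)                # 3.2.0
    print(assigned_in_unicode_32("a"))              # True
    print(assigned_in_unicode_32(CAPITAL_SHARP_S))  # False
    print(unicode32.category(CAPITAL_SHARP_S))      # Cn (unassigned in 3.2)
    print(unicodedata.category(CAPITAL_SHARP_S))    # Lu (current database)
```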

   Unicode is a very large and complex character set.  (The term
   "character set" or "charset" is used in a way that is peculiar to the
   IETF and may not be the same as the usage in other bodies and
   contexts.)  The Unicode Standard and related documents are created
   and maintained by the Unicode Technical Committee (UTC), one of the
   committees of the Unicode Consortium.

   The Consortium first published The Unicode Standard [Unicode10] in
   1991, and continues to develop standards based on that original work.
   Unicode is developed in conjunction with the International
   Organization for Standardization, and it shares its character
   repertoire with ISO/IEC 10646.  Unicode and ISO/IEC 10646 function
   equivalently as character encodings, but The Unicode Standard
   contains much more information for implementers, covering -- in depth
   -- topics such as bitwise encoding, collation, and rendering.  The
   Unicode Standard enumerates a multitude of character properties,
   including those needed for supporting bidirectional text.  The two
   standards do use slightly different terminology.

1.4.  Definitions

   The following terms and their meanings are critical to understanding
   IDNs and the rest of this document.  These terms are derived from
   [RFC3536], which contains additional discussion of some of them.

1.4.1.  language

   A language is a way that humans interact.  The use of language occurs
   in many forms, the most common of which are speech, writing, and
   signing.

   Some languages have a close relationship between the written and
   spoken forms, while others have a looser relationship.  RFC 3066
   [RFC3066] discusses languages in more detail and provides identifiers
   for languages for use in Internet protocols.  Note that computer
   languages are explicitly excluded from this definition.

1.4.2.  script

   A set of graphic characters used for the written form of one or more
   languages.  This definition is the one used in [ISO10646].

   Examples of scripts are Latin, Cyrillic, Greek, Arabic, and Han (the
   ideographs used in writing Chinese, Japanese, and Korean).  RFC 2277
   [RFC2277] discusses scripts in detail.

1.4.3.  multilingual

   The term "multilingual" has many widely-varying definitions and thus
   is not recommended for use in standards.  Some of the definitions
   relate to the ability to handle international characters; other
   definitions relate to the ability to handle multiple charsets; and
   still others relate to the ability to handle multiple languages.

1.4.4.  localization

   The process of adapting an internationalized application platform or
   application to a specific cultural environment.  In localization, the
   same semantics are preserved while the syntax or presentation forms
   may be changed.

   Localization is the act of tailoring an application for a different
   language or script or culture.  Some internationalized applications
   can handle a wide variety of languages.  Typical users only
   understand a small number of languages, so the program must be
   tailored to interact with users in just the languages they know.

1.4.5.  internationalization

   In the IETF, "internationalization" means to add or improve the
   handling of non-ASCII text in a protocol.

   Many protocols that handle text only handle one script (often, a
   subset of the characters used in writing English text), or leave the
   question of what character set is used up to local guesswork (which
   leads, of course, to interoperability problems).  Adding non-ASCII
   text to such a protocol allows the protocol to handle more scripts,
   with the intention of being able to include all of the scripts that
   are useful in the world.  It should be noted that many English words
   cannot be written in ASCII, various mythologies notwithstanding.

1.5.  Statements and Guidelines

   When the IDN RFCs were published, IESG and ICANN made statements that
   were intended to guide deployment and future work.  In recent months,
   ICANN has updated its statement and others have also made
   contributions.

1.5.1.  IESG Statement

   The IESG made a statement on IDNA
   (http://www.ietf.org/IESG/STATEMENTS/IDNstatement.txt):

       IDNA, through its requirement of Nameprep [RFC3491], uses
       equivalence tables that are based only on the characters
       themselves; no attention is paid to the intended language (if any)
       for the domain name. However, for many domain names, the intended
       language of one or more parts of the domain name actually does
       matter to the users.

       Similarly, many names cannot be presented and used without
       ambiguity unless the scripts to which their characters belong are
       known. In both cases, this additional information should be of
       concern to the registry.

   The statement is longer than this, but these paragraphs are the
   important ones.  The rest of the statement consists of explanations
   and examples.
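
   The script concern in the quoted statement can be illustrated with a
   short Python sketch.  It compares two labels that may look alike
   when rendered but contain characters from different scripts.  The
   labels are invented for this example, and the helper
   script_prefixes is hypothetical: it uses the first word of each
   Unicode character name as a rough stand-in for the script, because
   the Python standard library does not expose the Unicode Script
   property directly.

```python
import unicodedata

LATIN_LABEL = "example"
MIXED_LABEL = "ex\u0430mple"  # third character is U+0430 CYRILLIC SMALL LETTER A


def script_prefixes(label: str) -> set:
    """Rough heuristic: first word of each character name (e.g. 'LATIN')."""
    return {unicodedata.name(ch).split()[0] for ch in label}


if __name__ == "__main__":
    print(LATIN_LABEL == MIXED_LABEL)            # False: different code points
    print(sorted(script_prefixes(LATIN_LABEL)))  # ['LATIN']
    print(sorted(script_prefixes(MIXED_LABEL)))  # ['CYRILLIC', 'LATIN']
    # The character-based IDNA processing accepts both labels; only the
    # mixed one needs an ACE form, since it contains a non-ASCII character.
    print(MIXED_LABEL.encode("idna"))
```

   Because the two labels differ at the code point level, they map to
   different names in the DNS, which is why the statement asks
   registries to take script information into account.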

1.5.2.  ICANN statements

1.5.2.1.  Initial ICANN Guidelines

   Soon after the IDNA standard was adopted, ICANN produced an initial
   version of "IDN Guidelines", which appears at
   http://www.icann.org/general/idn-guidelines-20jun03.htm.  There are
   some additional notes and comments, but the key guideline text is:

   Guidelines
   1.  Top-level domain registries that implement internationalized
   domain name capabilities will do so in strict compliance with the
   technical requirements described in RFCs 3490, 3491, and 3492
   (collectively, the "IDN standards").

   2.  In implementing the IDN standards, top-level domain registries

[1002 lines skipped]



