Re: two (bugs? misfeatures?) in libidn

help-libidn

From:	Simon Josefsson
Subject:	Re: two (bugs? misfeatures?) in libidn
Date:	Thu, 16 Aug 2012 22:04:03 +0200
User-agent:	Gnus/5.130006 (Ma Gnus v0.6) Emacs/23.3 (gnu/linux)

Jon Nelson <address@hidden> writes:

> On Thu, Aug 2, 2012 at 3:21 PM, Simon Josefsson <address@hidden> wrote:
>> Jon Nelson <address@hidden> writes:
>>
>>> I've encountered two bugs or misfeatures in libidn:
>>
>> Hi!  Thanks for your report.
>>
>>> 1. given an idna-encoded input, it is possible to generate invalid
>>> UTF-8 output (as defined by RFC3629). The UTF-8 is invalid because
>>> codepoints above 0x10FFFF are used.
>>>
>>> See http://tools.ietf.org/html/rfc3629
>>
>> Can you be more concrete, what inputs does this happen for and what
>> output would you expect?  An example would help illustrate the problem.
>
> Example:   echo xn--1234xxxxxxxxxx | idn -u --debug

Thank you.  Interestingly, the Punycode decoder from RFC 3492 happily
decodes the string to values above U+10FFFF.  I can't see anything in
RFC 3492 (Punycode) or RFC 3490 (IDNA ToUnicode) that requires checking
for code points above U+10FFFF, or that says where such a check should
be done.  Arguably, the final conversion from UCS-4 to UTF-8 should
trigger an error in libidn, but by then the damage is already done:
ToUnicode has returned a sequence of illegal code points.  So it seems
ToUnicode should perform this check somewhere, but from reading RFC 3492
and RFC 3490 I can't find a suitable place for it.  Thoughts?
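
To make the question concrete, here is a minimal sketch of the kind of
range check under discussion, applied to the decoder's output before any
UTF-8 conversion.  It is written in Python for brevity and is not libidn
code: the names first_invalid_code_point and check_decoded are
hypothetical, and where such a check belongs in ToUnicode is exactly the
open question.  The limits used are the ones RFC 3629 gives for UTF-8:
nothing above U+10FFFF and no surrogates (U+D800..U+DFFF).

```python
MAX_CODE_POINT = 0x10FFFF          # upper limit allowed by RFC 3629
SURROGATES = range(0xD800, 0xE000)  # U+D800..U+DFFF, not encodable in UTF-8


def first_invalid_code_point(code_points):
    """Return the index of the first value that is not a Unicode scalar
    value, or None if every value is acceptable."""
    for index, cp in enumerate(code_points):
        if cp < 0 or cp > MAX_CODE_POINT or cp in SURROGATES:
            return index
    return None


def check_decoded(code_points):
    """Hypothetical post-decode step: reject out-of-range values, then
    convert the remaining code points to UTF-8 bytes."""
    bad = first_invalid_code_point(code_points)
    if bad is not None:
        raise ValueError(
            f"value {code_points[bad]:#x} at index {bad} "
            "is not a Unicode scalar value"
        )
    return "".join(chr(cp) for cp in code_points).encode("utf-8")
```

With a check like this, a decoded sequence containing 0x110000 would be
rejected before the caller ever sees it, instead of failing (or not) in
the later UCS-4 to UTF-8 step.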

/Simon
