Re: Question regarding incomplete UTF-8 arguments.

help-libidn

From:	Simon Josefsson
Subject:	Re: Question regarding incomplete UTF-8 arguments.
Date:	Wed, 05 Jun 2013 23:06:46 +0200
User-agent:	Gnus/5.130006 (Ma Gnus v0.6) Emacs/24.3 (gnu/linux)

Tetsuo Handa <address@hidden> writes:

> Hello.
>
> idna_to_unicode_8z8z from "info libidn" says:
...
>        Convert possibly ACE encoded domain name in UTF-8 format into a
>        UTF-8 string.  The domain name may contain several labels,
>        separated by dots.  The output buffer must be deallocated by the
>        caller.
>   
>        *Return value:* Returns `IDNA_SUCCESS' on success, or error code.
>
> According to http://sourceforge.net/mailarchive/message.php?msg_id=30509057 ,
> it is a bug in the GNU libidn library that an incomplete "zero-terminated
> UTF-8 string." argument leads to a read overrun.
...
>       char *src = strdup("address@hidden");

That is not a valid UTF-8 string.  The documentation says the function
only operates on valid UTF-8 strings.  This is a known issue, see TODO:

  - Reject invalid Unicode data.

This means input has to be sanitized as valid UTF-8 before being passed
on to libidn functions.
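
As a rough sketch of what such a check could look like (this is not
libidn code; the name is_valid_utf8 is hypothetical and the rules follow
my reading of RFC 3629), an application could run something like this
over its input first.  It rejects truncated sequences, stray
continuation bytes, overlong forms, surrogates and values above
U+10FFFF, and it never reads past the terminating zero byte:

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libidn: returns true if the
   zero-terminated string s is well-formed UTF-8. */
bool is_valid_utf8(const char *s)
{
    /* Smallest code point allowed for 0, 1, 2 and 3 continuation bytes;
       anything below that is an overlong encoding. */
    static const unsigned long min_cp[] = { 0x0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *) s;

    while (*p) {
        unsigned long cp;
        int len; /* number of continuation bytes expected */
        int i;

        if (*p < 0x80) {
            cp = *p;
            len = 0;
        } else if ((*p & 0xE0) == 0xC0) {
            cp = *p & 0x1F;
            len = 1;
        } else if ((*p & 0xF0) == 0xE0) {
            cp = *p & 0x0F;
            len = 2;
        } else if ((*p & 0xF8) == 0xF0) {
            cp = *p & 0x07;
            len = 3;
        } else {
            return false; /* stray continuation byte or invalid lead byte */
        }
        p++;

        for (i = 0; i < len; i++, p++) {
            /* The zero terminator fails this test as well, so an
               incomplete sequence at the end of the string is caught
               here without reading beyond the buffer. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min_cp[len])
            return false; /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false; /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false; /* outside the Unicode range */
    }
    return true;
}
```

The application would call this on the string and refuse the input when
it returns false, before handing anything to idna_to_unicode_8z8z.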

As the TODO suggests, it would be nicer if libidn rejected invalid
strings instead of doing bad things.

Maybe the manual should contain more warnings around this so that
application writers don't miss it.

/Simon
