Re: printf %d $'"\xff' returns random values in UTF-8
From: Stephane Chazelas
Subject: Re: printf %d $'"\xff' returns random values in UTF-8
Date: Sun, 17 Sep 2017 11:26:02 +0100
User-agent: Mutt/1.5.24 (2015-08-30)
2017-09-17 11:01:00 +0100, Stephane Chazelas:
[...]
> wchar_t wc;
> - size_t mblength, slen;
> + int mblength;
[...]
> + mblength = mbtowc (&wc, garglist->word->word+1, slen);
> + if (mblength > 0)
> + ch = wc;
[...]
Actually, "wc" should probably be initialised to 0 to cover
cases where the string contains only state-switching sequences
in a stateful encoding (in which case mbtowc may return their
length but not set "wc", as there's no character in there). I've
not tested it, and sane systems would not have locales with
such charsets anyway, so it's mostly an academic consideration.
So:
diff --git a/builtins/printf.def b/builtins/printf.def
index 3d374ff..7a840bb 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -1244,19 +1244,17 @@ asciicode ()
 {
   register intmax_t ch;
 #if defined (HANDLE_MULTIBYTE)
-  wchar_t wc;
-  size_t mblength, slen;
+  wchar_t wc = 0;
+  int mblength;
+  size_t slen;
 #endif
   DECLARE_MBSTATE;
 #if defined (HANDLE_MULTIBYTE)
   slen = strlen (garglist->word->word+1);
-  mblength = MBLEN (garglist->word->word+1, slen);
-  if (mblength > 1)
-    {
-      mblength = mbtowc (&wc, garglist->word->word+1, slen);
-      ch = wc; /* XXX */
-    }
+  mblength = mbtowc (&wc, garglist->word->word+1, slen);
+  if (mblength > 0)
+    ch = wc;
   else
 #endif
     ch = (unsigned char)garglist->word->word[1];
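For anyone wanting to try the pattern outside of bash, here is a
minimal standalone sketch of what the patch does. The helper name
first_char_code is hypothetical (it is not in printf.def), and it is
only an illustration of the idea, not the actual bash code: keep the
mbtowc return value in an int so that a -1 failure is seen as negative,
initialise "wc" to 0, and fall back to the first byte when nothing was
decoded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of bash: return the code of the first
   character of S in the current locale, or the value of its first byte
   when S does not start with a valid multibyte character.  */
static intmax_t
first_char_code (const char *s)
{
  wchar_t wc = 0;   /* stays 0 if mbtowc consumes bytes without setting it */
  int mblength;     /* int, not size_t: a -1 failure must stay negative */
  size_t slen;

  assert (s != NULL);
  slen = strlen (s);

  (void) mbtowc (NULL, NULL, 0);   /* reset any shift state */
  mblength = mbtowc (&wc, s, slen);
  if (mblength > 0)
    return wc;

  /* Invalid or empty input: use the raw byte instead.  */
  return (unsigned char) s[0];
}
```

With a size_t, a failure return of -1 would convert to a very large
positive value and pass a "> 1" test, which is how an unset "wc" could
end up being printed. The result for a byte such as 0xff depends on the
locale and on the C library, so I'd only rely on the ASCII and
empty-string cases behaving the same everywhere.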
--
Stephane