From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v4 07/13] qapi: force a UTF-8 locale for running Python
Date: Mon, 15 Jan 2018 11:15:01 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 01/15/2018 11:02 AM, Daniel P. Berrange wrote:
> Python2 did not validate locale correctness when reading input data, so
> would happily read UTF-8 data in non-UTF-8 locales. Python3 is strict so
> if you try to read UTF-8 data in the C locale, it will raise an error
> for any UTF-8 bytes that aren't representable in 7-bit ascii encoding.

Urgh, that sounds like a Python bug. The C locale is defined by POSIX to
be 8-bit clean (i.e., a superset of ASCII with 256 characters, not strict
ASCII with only 128 characters and 128 bytes that form encoding errors).
But that doesn't change the fact that we have to work around Python's
braindead misinterpretation of reality.
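
A minimal Python sketch of the difference between strict 7-bit decoding,
an 8-bit-clean decode, and UTF-8. The sample bytes and the try_decode
helper are made up for illustration; they are not taken from the QAPI
schema or scripts. latin-1 stands in here for "any 8-bit-clean single-byte
charset", which is an assumption about what an 8-bit-clean C locale would
look like from Python's side:

```python
# Illustration only: sample bytes are hypothetical, not the schema contents.
data = "\u00e9".encode("utf-8")  # b'\xc3\xa9'; 0xc3 is the byte in the traceback


def try_decode(raw, codec):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None


# Strict 7-bit ASCII rejects 0xc3 because it is outside range(128).
ascii_result = try_decode(data, "ascii")

# An 8-bit-clean single-byte codec accepts every byte value, although the
# result is two unrelated characters rather than the intended one.
latin1_result = try_decode(data, "latin-1")

# Decoding as UTF-8 recovers the original single character.
utf8_result = try_decode(data, "utf-8")
```

Whichever locale ends up being forced, a script that passes an explicit
encoding when decoding does not depend on the locale for this step.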

> e.g.
> 
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 54: ordinal not in range(128)
> Traceback (most recent call last):
>   File "/tmp/qemu-test/src/scripts/qapi-commands.py", line 317, in <module>
>     schema = QAPISchema(input_file)
>   File "/tmp/qemu-test/src/scripts/qapi.py", line 1468, in __init__
>     parser = QAPISchemaParser(open(fname, 'r'))
>   File "/tmp/qemu-test/src/scripts/qapi.py", line 301, in __init__
>     previously_included)
>   File "/tmp/qemu-test/src/scripts/qapi.py", line 348, in _include
>     exprs_include = QAPISchemaParser(fobj, previously_included, info)
>   File "/tmp/qemu-test/src/scripts/qapi.py", line 271, in __init__
>     self.src = fp.read()
>   File "/usr/lib64/python3.5/encodings/ascii.py", line 26, in decode
>     return codecs.ascii_decode(input, self.errors)[0]
> 
> Many distros support a new C.UTF-8 locale that is like the C locale,
> but with UTF-8 instead of 7-bit ASCII. That is not entirely portable
> though, so this patch instead forces the en_US.UTF-8 locale, which
> is pretty similar but more widely available.
> 
> We set LANG, rather than only LC_CTYPE, since generated source ought
> to be independant of all of the user's locale settings.

s/independant/independent/

LANG is the lowest-priority setting: if the user has explicitly set
LC_CTYPE or LC_ALL, those settings override what is in LANG.
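
The lookup order can be sketched in Python as follows. effective_ctype is
a hypothetical helper written for illustration, not part of QEMU or the
Python standard library; it only models the environment lookup (LC_ALL,
then LC_CTYPE, then LANG, with empty values treated as unset) and does not
call setlocale():

```python
def effective_ctype(env):
    """Return the locale name that would govern LC_CTYPE for this environment.

    Precedence is LC_ALL, then LC_CTYPE, then LANG. Returns None when none
    of them is set, meaning the implementation's default locale applies.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:  # unset or empty: fall through to the next variable
            return value
    return None


# A user environment that already sets LC_ALL defeats a forced LANG.
user_env = {"LC_ALL": "C"}
forced = dict(user_env, LANG="en_US.UTF-8")
result_with_lc_all = effective_ctype(forced)

# With no higher-priority variable set, the forced LANG takes effect.
result_lang_only = effective_ctype({"LANG": "en_US.UTF-8"})
```

Here result_with_lc_all is "C" even though LANG was set to en_US.UTF-8,
while result_lang_only is "en_US.UTF-8".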

> 
> This patch only forces UTF-8 for QAPI scripts, since that is the one
> showing the immediate error under Python3 with C locale, but potentially
> we ought to force this for all python scripts used in the build process.
> 
> Signed-off-by: Daniel P. Berrange <address@hidden>
> ---
>  Makefile | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index d86ecd2dd4..fde91cc42d 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -17,6 +17,8 @@ ifneq ($(wildcard config-host.mak),)
>  all:
>  include config-host.mak
>  
> +PYTHON_UTF8 = LANG=en_US.UTF-8 $(PYTHON)

I'm worried that this is not reproducible when a user explicitly sets
other locale environment variables (LC_CTYPE or LC_ALL) that have higher
priority than LANG.

> +
>  git-submodule-update:
>  
>  .PHONY: git-submodule-update
> @@ -471,17 +473,17 @@ qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
>  
>  qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
>  $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
> -     $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
> +     $(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-types.py \

But once we agree on the right override to stuff into PYTHON_UTF8, the
rest of the patch converting invocations to PYTHON_UTF8 makes sense.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org