From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v4 07/13] qapi: force a UTF-8 locale for running Python
Date: Mon, 15 Jan 2018 11:15:01 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 01/15/2018 11:02 AM, Daniel P. Berrange wrote:
> Python2 did not validate locale correctness when reading input data, so
> would happily read UTF-8 data in non-UTF-8 locales. Python3 is strict so
> if you try to read UTF-8 data in the C locale, it will raise an error
> for any UTF-8 bytes that aren't representable in 7-bit ascii encoding.
Urgh, that sounds like a Python bug. The C locale is defined by POSIX to
be 8-bit clean (i.e. a superset of ASCII with 256 characters, not strict
ASCII with only 128 characters and 128 bytes that form encoding errors).
But that doesn't change the fact that we have to work around Python's
braindead misinterpretation of reality.
> e.g.
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 54:
> ordinal not in range(128)
> Traceback (most recent call last):
> File "/tmp/qemu-test/src/scripts/qapi-commands.py", line 317, in <module>
> schema = QAPISchema(input_file)
> File "/tmp/qemu-test/src/scripts/qapi.py", line 1468, in __init__
> parser = QAPISchemaParser(open(fname, 'r'))
> File "/tmp/qemu-test/src/scripts/qapi.py", line 301, in __init__
> previously_included)
> File "/tmp/qemu-test/src/scripts/qapi.py", line 348, in _include
> exprs_include = QAPISchemaParser(fobj, previously_included, info)
> File "/tmp/qemu-test/src/scripts/qapi.py", line 271, in __init__
> self.src = fp.read()
> File "/usr/lib64/python3.5/encodings/ascii.py", line 26, in decode
> return codecs.ascii_decode(input, self.errors)[0]
>
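For reference, the failure in the traceback above can be reproduced without
touching the locale at all, by decoding UTF-8 bytes with an explicitly chosen
codec. This is only a minimal sketch: the sample string and the read_text()
and try_read() helpers are made up for illustration and are not part of
qapi.py.

```python
import io

# Hypothetical sample: "é" encodes to the two bytes 0xc3 0xa9 in UTF-8,
# so the data contains the same lead byte (0xc3) the traceback complains about.
SAMPLE = "Andr\u00e9".encode("utf-8")


def read_text(raw, encoding):
    """Decode raw bytes the way a text-mode file object would."""
    with io.TextIOWrapper(io.BytesIO(raw), encoding=encoding) as fp:
        return fp.read()


def try_read(raw, encoding):
    """Return the decoded text, or the name of the exception raised."""
    try:
        return read_text(raw, encoding)
    except UnicodeDecodeError as exc:
        return type(exc).__name__
```

try_read(SAMPLE, "ascii") takes the UnicodeDecodeError path, while
try_read(SAMPLE, "utf-8") returns the text. The traceback shows the ascii
codec being selected for open(fname, 'r'), which corresponds to the first
case.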
> Many distros support a new C.UTF-8 locale that is like the C locale,
> but with UTF-8 instead of 7-bit ASCII. That is not entirely portable
> though, so this patch instead forces the en_US.UTF-8 locale, which
> is pretty similar but more widely available.
>
> We set LANG, rather than only LC_CTYPE, since generated source ought
> to be independant of all of the user's locale settings.
s/independant/independent/
LANG is the lowest-priority setting: if the user has explicitly set
LC_CTYPE or LC_ALL, those settings override whatever is in LANG.
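To spell out the precedence, here is a rough Python sketch of how the
effective locale for one category is resolved from the environment. The
effective_locale() helper is hypothetical and only models the lookup order
(LC_ALL, then the category variable, then LANG); it does not call setlocale().

```python
def effective_locale(env, category="LC_CTYPE"):
    """Return the locale name that applies to `category` given `env`.

    Lookup order: LC_ALL, then the category's own variable, then LANG.
    A variable set to the empty string is treated as unset.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    # Assumption: fall back to "C" when nothing is set. POSIX leaves the
    # default locale implementation-defined.
    return "C"
```

With env = {"LANG": "en_US.UTF-8", "LC_CTYPE": "C"} this returns "C", so a
LANG= prefix on the command line does not help a user who already exports
LC_CTYPE or LC_ALL.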
>
> This patch only forces UTF-8 for QAPI scripts, since that is the one
> showing the immediate error under Python3 with C locale, but potentially
> we ought to force this for all python scripts used in the build process.
>
> Signed-off-by: Daniel P. Berrange <address@hidden>
> ---
> Makefile | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index d86ecd2dd4..fde91cc42d 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -17,6 +17,8 @@ ifneq ($(wildcard config-host.mak),)
> all:
> include config-host.mak
>
> +PYTHON_UTF8 = LANG=en_US.UTF-8 $(PYTHON)
I'm worried that this is not reproducible for a user who explicitly sets
other locale environment variables with higher priority than LANG
(LC_CTYPE or LC_ALL).
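One candidate for a more robust override, sketched in Python rather than
Makefile syntax, is to set LC_ALL in the child's environment, since it
outranks both LC_CTYPE and LANG. This is an illustration only and not
something this thread has settled on: forced_locale_env() is a hypothetical
helper, and it assumes the en_US.UTF-8 locale is actually installed on the
build host.

```python
import os


def forced_locale_env(base_env=None, locale_name="en_US.UTF-8"):
    """Return a copy of base_env with LC_ALL set to locale_name.

    LC_ALL takes priority over every other LC_* variable and over LANG,
    so inherited user settings cannot change the effective locale.
    The input mapping is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LC_ALL"] = locale_name
    return env
```

The returned dict can be passed as the env= argument of subprocess.run()
when launching a generator script, so the script sees the same locale
regardless of what the user exported.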
> +
> git-submodule-update:
>
> .PHONY: git-submodule-update
> @@ -471,17 +473,17 @@ qapi-py = $(SRC_PATH)/scripts/qapi.py
> $(SRC_PATH)/scripts/ordereddict.py
>
> qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
> $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
> - $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
> + $(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-types.py \
But once we agree on the right override to stuff into PYTHON_UTF8, the
rest of the patch converting invocations to PYTHON_UTF8 makes sense.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org