From: Eric Blake
Subject: Re: [Qemu-devel] qapi escape-too-big test doesn't work if LANG=C ?
Date: Mon, 19 Mar 2018 10:20:44 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 03/19/2018 06:20 AM, Daniel P. Berrangé wrote:
> On Mon, Mar 19, 2018 at 10:37:12AM +0000, Peter Maydell wrote:
>> I recently tweaked my build scripts to run with LANG=C (trying to suppress gcc's irritating habit of using smartquotes rather than plain old ''). This seems to result in an error running the qapi-schema/escape-too-big test:
>>
>> PYTHONPATH=/home/petmay01/linaro/qemu-for-merges/scripts python3 -B /home/petmay01/linaro/qemu-for-merges/tests/qapi-schema/test-qapi.py /home/petmay01/linaro/qemu-for-merges/tests/qapi-schema/escape-too-big.json >tests/qapi-schema/escape-too-big.test.out 2>tests/qapi-schema/escape-too-big.test.err; echo $? >tests/qapi-schema/escape-too-big.test.exit
>> 1c1,10
>> < tests/qapi-schema/escape-too-big.json:3:14: For now, \u escape only supports non-zero values up to \u007f
>> ---
>> > Traceback (most recent call last):
>> >   File "tests/qapi-schema/test-qapi.py", line 64, in <module>
>> >     schema = QAPISchema(sys.argv[1])
>> >   File "scripts/qapi/common.py", line 1492, in __init__
>> >     parser = QAPISchemaParser(open(fname, 'r'))
>> >   File "scripts/qapi/common.py", line 264, in __init__
>> >     self.src = fp.read()
>> >   File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
>> >     return codecs.ascii_decode(input, self.errors)[0]
>> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 61: ordinal not in range(128)
>> /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:927: recipe for target 'check-tests/qapi-schema/escape-too-big.json' failed
>
> So your "C" locale will be non-UTF-8, except on OS-X where the "C" locale is UTF-8 by default. Unfortunately, while POSIX expects the "C" locale to be 8-bit clean, Python by default will reject any characters outside the 7-bit range with its "ascii" codec. So this is ultimately a python bug, but there's little we can do about that given how widely deployed the bug is.

Yep, the bug is in python. And we've already worked around it elsewhere in qemu; see commit d4e5ec877.
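
For reference, the failure mode in the traceback can be reproduced without touching the locale at all. This is only a minimal sketch, not code from qemu: the sample line is made up, and the one assumption is that the schema file is stored as UTF-8, so that é is the two bytes 0xc3 0xa9.

```python
# Hypothetical sample line, not the real contents of escape-too-big.json.
text = "# comment with é"
raw = text.encode("utf-8")  # é is stored as the bytes 0xc3 0xa9


def try_ascii(data):
    """Decode with the 'ascii' codec; return the text or the error raised."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as err:
        return err


result = try_ascii(raw)
print(result)
```

The error object records the offset of the first offending byte in `result.start`, which is how the traceback arrives at "position 61" for the real file.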
> To work around this problem in other applications, what I have done is add the following to Makefiles before invoking python:
>
>   LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8
>
> The LC_ALL= bit is needed because if the user has set LC_ALL themselves it will override LANG and all other LC_* variables. Setting LANG=C is not strictly needed, as LC_CTYPE will override it.
>
> CC'ing Eric since he was involved in the discussions about this bug in other libvirt related apps.

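
The same override can be expressed when the Python child is launched from another script rather than from a Makefile. A rough sketch, assuming the en_US.UTF-8 locale is installed on the host; the helper name `python_child_env` is hypothetical and not part of qemu or libvirt.

```python
import os


def python_child_env(base=None, ctype="en_US.UTF-8"):
    """Mirror `LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8` for a child process."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = ""       # empty counts as unset, so it stops overriding
    env["LANG"] = "C"        # fallback for categories not set explicitly
    env["LC_CTYPE"] = ctype  # only the character encoding is changed
    return env


# A user-supplied LC_ALL=C would otherwise win over LC_CTYPE.
env = python_child_env({"LC_ALL": "C", "PATH": "/usr/bin"})
print(env)
```

The resulting dictionary can be passed as the `env=` argument of `subprocess.run()`.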
So the question is how to fix the test to not trip the python bugs. And the only non-7-bit-clean byte in that file is in a comment, so maybe the simplest stupid thing that would work for now is to rewrite the comment to drop the é and instead use pure ASCII. A nicer fix would be teaching scripts/qapi/common.py to read input in an 8-bit-clean manner regardless of locale, but that's more invasive (and I'd rather have Markus' help on that front), so for 2.12, I'll just do the obvious bug fix of avoiding the problem by skipping the unfortunate comment.
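
As a sketch of what the more invasive option could look like (this is not the actual change to scripts/qapi/common.py), the reader can name the encoding itself instead of inheriting it from the locale. It assumes schema files are UTF-8 and Python 3; the built-in open() in Python 2 has no `encoding` argument, so io.open() would be needed there. The name `read_schema` and the sample contents are hypothetical.

```python
import os
import tempfile


def read_schema(fname):
    """Read a schema file as UTF-8 regardless of the current locale."""
    with open(fname, "r", encoding="utf-8") as fp:
        return fp.read()


# Demonstration on a throwaway file with a non-ASCII byte in a comment.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    with open(path, "wb") as fp:
        fp.write("# é\n{ 'command': 'x' }\n".encode("utf-8"))
    src = read_schema(path)

print(src)
```

Because the codec is named explicitly, the read no longer depends on LANG, LC_ALL or LC_CTYPE.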

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org