qemu-devel

Re: [Qemu-devel] qapi escape-too-big test doesn't work if LANG=C ?


From: Eric Blake
Subject: Re: [Qemu-devel] qapi escape-too-big test doesn't work if LANG=C ?
Date: Mon, 19 Mar 2018 10:20:44 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 03/19/2018 06:20 AM, Daniel P. Berrangé wrote:
On Mon, Mar 19, 2018 at 10:37:12AM +0000, Peter Maydell wrote:
I recently tweaked my build scripts to run with LANG=C (trying
to suppress gcc's irritating habit of using smartquotes rather
than plain old ''). This seems to result in an error running
the qapi-schema/escape-too-big test:

PYTHONPATH=/home/petmay01/linaro/qemu-for-merges/scripts python3 -B
/home/petmay01/linaro/qemu-for-merges/tests/qapi-schema/test-qapi.py
/home/petmay01/linaro/qemu-for-merges/tests/qapi-schema/escape-too-big.json
tests/qapi-schema/escape-too-big.test.out
2>tests/qapi-schema/escape-too-big.test.err; echo $?
tests/qapi-schema/escape-too-big.test.exit
1c1,10
< tests/qapi-schema/escape-too-big.json:3:14: For now, \u escape only supports non-zero values up to \u007f
---
Traceback (most recent call last):
   File "tests/qapi-schema/test-qapi.py", line 64, in <module>
     schema = QAPISchema(sys.argv[1])
   File "scripts/qapi/common.py", line 1492, in __init__
     parser = QAPISchemaParser(open(fname, 'r'))
   File "scripts/qapi/common.py", line 264, in __init__
     self.src = fp.read()
   File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
     return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 61: ordinal not in range(128)
/home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:927:
recipe for target 'check-tests/qapi-schema/escape-too-big.json' failed

So your "C" locale will be non-UTF-8, except on OS-X where the "C" locale
is UTF-8 by default.

Unfortunately, while POSIX expects the "C" locale to be 8-bit clean,
Python by default will reject any characters outside the 7-bit range
with its "ascii" codec. So this is ultimately a python bug, but there's
little we can do about that given how widely deployed the bug is.
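
For illustration, the same error can be reproduced without changing the locale by decoding UTF-8 bytes with the "ascii" codec explicitly. This is only a minimal sketch: the sample bytes and the helper name try_decode are made up for the example and are not taken from escape-too-big.json or from the qapi scripts.

```python
# UTF-8 encodes "é" as the two bytes 0xc3 0xa9; the first of them is
# the byte the traceback complains about.
data = b"caf\xc3\xa9"


def try_decode(raw, encoding):
    """Return the decoded text, or the UnicodeDecodeError that was raised.

    Hypothetical helper, used only to keep the demonstration short.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as err:
        return err


# With the 7-bit "ascii" codec the decode fails at the 0xc3 byte.
ascii_result = try_decode(data, "ascii")
# With an explicit UTF-8 codec the same bytes decode cleanly.
utf8_result = try_decode(data, "utf-8")

print(repr(ascii_result))
print(repr(utf8_result))
```

When open() is called in text mode without an encoding argument, Python 3 picks the codec from the locale, which is how LANG=C ends up selecting "ascii" in the traceback above.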

Yep, the bug is in python. And we've already worked around it elsewhere in qemu; see commit d4e5ec877.


To work around this problem in other applications what I have done is
add the following to Makefiles before invoking python:

     LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8

The LC_ALL= bit is needed because if the user has set LC_ALL themselves
it will override LANG and all other LC_* variables. Setting LANG=C is
not strictly needed, as LC_CTYPE will override it.
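
If the Python script is launched from another Python program rather than from a Makefile, the same three settings could be applied to the child's environment. The following is a rough sketch under that assumption; utf8_ctype_env is a hypothetical helper name, and it assumes the en_US.UTF-8 locale is actually installed on the build host.

```python
import os


def utf8_ctype_env(base=None):
    """Return a copy of the environment with a UTF-8 LC_CTYPE.

    Hypothetical helper mirroring 'LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8'.
    """
    env = dict(os.environ if base is None else base)
    # An empty LC_ALL is treated as unset, so it no longer overrides
    # LANG and the individual LC_* variables.
    env["LC_ALL"] = ""
    # Messages stay in the plain "C" locale ...
    env["LANG"] = "C"
    # ... while character handling uses UTF-8 (assumes this locale exists).
    env["LC_CTYPE"] = "en_US.UTF-8"
    return env


# Demo with a made-up starting environment where the user set LC_ALL.
demo_env = utf8_ctype_env({"LC_ALL": "C", "PATH": "/usr/bin"})
print(demo_env)
```

The result would then be passed as the env argument of subprocess.run() or a similar call; the input mapping is copied, so the caller's own environment is left unchanged.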

CC'ing Eric since he was involved in the discussions about this bug
in other libvirt related apps.

So the question is how to fix the test so that it does not trip the Python bug. The only non-7-bit-clean byte in that file is in a comment, so maybe the simplest stupid thing that would work for now is to rewrite the comment to drop the é and use pure ASCII instead. A nicer fix would be teaching scripts/qapi/common.py to read input in an 8-bit-clean manner regardless of locale, but that's more invasive (and I'd rather have Markus' help on that front). So for 2.12, I'll just do the obvious bug fix of avoiding the problem by skipping the unfortunate comment.
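
As a rough idea of what the more invasive, locale-independent route might look like (this is a sketch, not the change that will go into scripts/qapi/common.py), the file could be opened with an explicit encoding instead of relying on the locale default. The function name read_schema_text and the sample file contents are made up for illustration, and the sketch assumes the schema files are UTF-8.

```python
import os
import tempfile


def read_schema_text(fname):
    """Read fname as UTF-8 text, independent of LANG / LC_* settings.

    Hypothetical helper; the explicit encoding= argument is what keeps
    Python 3 from falling back to the locale's codec.
    """
    with open(fname, "r", encoding="utf-8") as fp:
        return fp.read()


# Demo: write a small file whose comment contains the UTF-8 bytes for "é",
# then read it back through the helper.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.json")
    with open(path, "wb") as fp:
        fp.write(b"# caf\xc3\xa9\n{ 'command': 'x' }\n")
    text = read_schema_text(path)

print(repr(text))
```

The built-in open() only accepts encoding= on Python 3; code that must also run on Python 2 would need io.open(), which takes the same argument.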

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


