bug-gnu-emacs

bug#25288: 25.1; term, ansi-term, broken output of utf8 text


From: npostavs
Subject: bug#25288: 25.1; term, ansi-term, broken output of utf8 text
Date: Wed, 28 Dec 2016 14:10:30 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)

found 25288 24.5
tags 25288 confirmed
quit

Vjacheslav <fvamail@gmail.com> writes:

> Trying to use this command from terminal running bash:
>
> [fva@localhost ~]$ python -c 'print "ш"*5000'
>
> produces garbage (шшш\321\210шшш) in output. Terminal needs
> reset. Possibly this is a bug which seen in very old linux, (breaks
> multibyte characters on buffer borders).
>
> default-process-coding-system is OK:
>
> default-process-coding-system is a variable defined in ‘C source code’.
> Its value is (utf-8-unix . utf-8-unix)

It looks like the problem is that the process filter function,
term-emulate-terminal, receives the output in chunks of 4096 bytes[1].  The
ш character is encoded as 2 bytes in UTF-8 (\321\210), which means a chunk
boundary can fall between them, leaving half a character in each chunk.
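
To illustrate the splitting outside Emacs, here is a minimal Python 3 sketch.
It is not what term.el does; the 4096-byte chunk size is taken from the
observation above, and the leading "$" is only a hypothetical stand-in for
whatever single byte happens to precede the output and push the 2-byte
characters onto odd offsets.  It compares decoding each chunk on its own with
an incremental decoder that holds back a trailing partial sequence:

```python
import codecs

CHUNK = 4096  # chunk size observed above; assumed here, not read from Emacs


def chunks(data, size=CHUNK):
    """Split a byte string into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


# One ASCII byte ahead of the Cyrillic run puts every 2-byte character at an
# odd offset, so the boundary at byte 4096 falls in the middle of a character.
text = "$" + "ш" * 5000
raw = text.encode("utf-8")
pieces = chunks(raw)

# Decoding each chunk independently: the orphaned lead byte (0xD1, octal \321)
# and continuation byte (0x88, octal \210) cannot be decoded on their own.
naive = "".join(p.decode("utf-8", errors="replace") for p in pieces)

# An incremental decoder keeps an incomplete trailing sequence and prepends it
# to the next chunk, so the text survives intact.
decoder = codecs.getincrementaldecoder("utf-8")()
buffered = "".join(decoder.decode(p) for p in pieces)
buffered += decoder.decode(b"", final=True)

print(pieces[0][-1:], pieces[1][:1])
print(naive == text, buffered == text)
```

The same idea, carrying the undecoded tail over to the next filter call, is
one possible direction for a fix, but that is only a suggestion here.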

Is there a way to tell from Lisp that decoding stopped on an incomplete
multibyte sequence?  I can't see any.


[1]: It's getting bytes rather than characters because in term-exec-1 we
have:

        ;; The process's output contains not just chars but also binary
        ;; escape codes, so we need to see the raw output.  We will have to
        ;; do the decoding by hand on the parts that are made of chars.
        (coding-system-for-read 'binary))





