emacs-pretest-bug

Re: TRAMP copies binary files incorrectly


From: Kenichi Handa
Subject: Re: TRAMP copies binary files incorrectly
Date: Thu, 11 Jan 2007 13:26:47 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.92 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)

In article <address@hidden>, Chris Moore <address@hidden> writes:

> I didn't have uudecode installed on the local machine, so TRAMP was
> using Emacs' Lisp version of uudecode, and using Emacs' write-region
> to save the results to a file.

> tramp.el is careful to bind coding-system-for-write to 'binary when
> writing the region:
>                    (let ((coding-system-for-write 'binary))
>                      (funcall loc-dec (point-min) (point-max))
>                      (write-region (point-min) (point-max) tmpfil))

> but unfortunately that's not enough to stop write-region playing with
> multi-byte characters - and that's probably the real bug.

> The " *tramp tmp*" buffer has coding-system-for-write set to 'binary,
> but also has enable-multibyte-characters set to t.

I think that's the problem.  How is that buffer created?
How were the contents inserted into that buffer?

> write-region uses
> fileio.c's e_write(), and that does the following, copying the
> buffer's value of enable-multibyte-characters into the coding system,
> before using it to write the region:

>       coding->src_multibyte
>         = !NILP (current_buffer->enable_multibyte_characters);

> My question is: should having the coding-system-for-write set to
> 'binary be enough to stop any multi-byte processing being done on
> write, regardless of the value of enable-multibyte-characters?  And if
> so, shouldn't we tell e_write() about it?

In a multibyte buffer, a raw byte, say 0x81, is represented
not by that byte itself but by the byte with 0x20 added,
preceded by the special byte 0x9E (so the byte sequence is
0x9E 0xA1).  This distinguishes it from a normal multibyte
character (e.g. iso-8859-1's 0xC0 (A-grave) is represented
by 0x81 0xC0).  So, the writing process should convert 0x9E
0xA1 back to 0x81.  The flag coding->src_multibyte tells
whether that kind of conversion is necessary.
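
As a rough illustration of the representation described above, here
is a toy model in Python.  It is not Emacs code.  The function names
are hypothetical, and applying the 0x9E prefix rule to every raw byte
in 0x80..0x9F (and passing all other bytes through unchanged) is an
assumption of this sketch; the message itself only gives the 0x81
example.

```python
# Toy model of the raw-byte representation described in the message.
# All names are hypothetical; this only mirrors the 0x9E / +0x20 rule.

PREFIX = 0x9E   # special leading byte named in the message
OFFSET = 0x20   # amount added to the raw byte


def to_multibyte(raw: bytes) -> bytes:
    """Model how raw bytes would be stored in a multibyte buffer."""
    out = bytearray()
    for b in raw:
        if 0x80 <= b <= 0x9F:          # assumed range for the prefix rule
            out += bytes([PREFIX, b + OFFSET])
        else:                          # simplification: stored unchanged
            out.append(b)
    return bytes(out)


def from_multibyte(internal: bytes) -> bytes:
    """Model the conversion back to raw bytes that writing must perform."""
    out = bytearray()
    i = 0
    while i < len(internal):
        if internal[i] == PREFIX and i + 1 < len(internal):
            out.append(internal[i + 1] - OFFSET)
            i += 2
        else:
            out.append(internal[i])
            i += 1
    return bytes(out)


def model_write(internal: bytes, src_multibyte: bool) -> bytes:
    """Model of what a flag like coding->src_multibyte decides on write."""
    return from_multibyte(internal) if src_multibyte else internal
```

In this model, `to_multibyte(b"\x81")` yields the two bytes 0x9E 0xA1,
and `model_write` returns the original 0x81 only when `src_multibyte`
is true; with the flag false, the internal two-byte form would reach
the file unchanged.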

> This patch demonstrates that it is enable-multibyte-characters which
> causes the problem, but I suspect that the bug really needs fixing in
> the C code:

> --- lisp/net/tramp.el 2007-01-11 01:19:46.000000000 +0100
> +++ lisp/net/new/tramp.el     2007-01-11 01:18:59.000000000 +0100
> @@ -3827,6 +3827,7 @@
>                    ;; line from the output here.  Go to point-max,
>                    ;; search backward for tramp_exit_status, delete
>                    ;; between point and point-max if found.
> +                  (set-buffer-multibyte nil)
>                    (let ((coding-system-for-write 'binary))
>                      (funcall loc-dec (point-min) (point-max))
>                      (write-region (point-min) (point-max) tmpfil))

No.  That change runs the function loc-dec in a unibyte
buffer, after (set-buffer-multibyte nil) has converted
"0x9E 0xA1" back to "0x81".  That makes the difference.

But, as Stefan wrote, it is better to call
(set-buffer-multibyte nil) much earlier.

Anyway, it is better to fix the function bound to loc-dec to
work in a multibyte buffer too.  Which function is it?

---
Kenichi Handa
address@hidden



