Re: Fcall_process: wrong conversion
From: Herbert Euler
Subject: Re: Fcall_process: wrong conversion
Date: Tue, 16 May 2006 10:59:10 +0800
This doesn't work. I followed the code, and the reason seems to be as
follows.
You changed the code in hexl.el to:
(let ((coding-system-for-read 'raw-text)
      (coding-system-for-write buffer-file-coding-system)
      (buffer-undo-list t))
  (apply 'call-process-region (point-min) (point-max)
         (expand-file-name hexl-program exec-directory)
         t t nil
         ;; Manually encode the args, otherwise they're encoded using
         ;; coding-system-for-write (i.e. buffer-file-coding-system) which
         ;; may not be what we want (e.g. utf-16 on a non-utf-16 system).
         (mapcar (lambda (s) (encode-coding-string s locale-coding-system))
                 (split-string hexl-options)))
So when call-process is invoked, the value of `coding-system-for-write'
is not nil. In my test it is `utf-16le-with-signature'. The part of
callproc.c that decides the coding system is lines 269 to 300:
if (nargs >= 5)
  {
    int must_encode = 0;
    for (i = 4; i < nargs; i++)
      CHECK_STRING (args[i]);
    for (i = 4; i < nargs; i++)
      if (STRING_MULTIBYTE (args[i]))
        must_encode = 1;
    if (!NILP (Vcoding_system_for_write))
      val = Vcoding_system_for_write;
    else if (! must_encode)
      val = Qnil;
    else
      {
        args2 = (Lisp_Object *) alloca ((nargs + 1) * sizeof *args2);
        args2[0] = Qcall_process;
        for (i = 0; i < nargs; i++) args2[i + 1] = args[i];
        coding_systems = Ffind_operation_coding_system (nargs + 1,
                                                        args2);
        if (CONSP (coding_systems))
          val = XCDR (coding_systems);
        else if (CONSP (Vdefault_process_coding_system))
          val = XCDR (Vdefault_process_coding_system);
        else
          val = Qnil;
      }
    val = coding_inherit_eol_type (val, Qnil);
    setup_coding_system (Fcheck_coding_system (val), &argument_coding);
  }
}
If `Vcoding_system_for_write' is not nil, `val' is set to that
value. So at the last line of this code, the `detector', `decoder',
and `encoder' fields of `argument_coding' are set to the UTF-16
ones, and the CODING_REQUIRE_ENCODING_MASK flag is turned on in
`common_flags' of `argument_coding'. This happens in coding.c,
lines 5042 to 5059:
else if (EQ (coding_type, Qutf_16))
  {
    val = AREF (attrs, coding_attr_utf_16_bom);
    CODING_UTF_16_BOM (coding) = (CONSP (val) ? utf_16_detect_bom
                                  : EQ (val, Qt) ? utf_16_with_bom
                                  : utf_16_without_bom);
    val = AREF (attrs, coding_attr_utf_16_endian);
    CODING_UTF_16_ENDIAN (coding) = (EQ (val, Qbig) ? utf_16_big_endian
                                     : utf_16_little_endian);
    CODING_UTF_16_SURROGATE (coding) = 0;
    coding->detector = detect_coding_utf_16;
    coding->decoder = decode_coding_utf_16;
    coding->encoder = encode_coding_utf_16;
    coding->common_flags
      |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK);
    if (CODING_UTF_16_BOM (coding) == utf_16_detect_bom)
      coding->common_flags |= CODING_REQUIRE_DETECTION_MASK;
  }
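To make the order of the checks in the callproc.c excerpt easier to
see, here is a minimal sketch in plain C. It is not the Emacs code:
`choose_argument_coding' and its parameters are hypothetical names,
coding systems are modeled as strings, and NULL stands for Qnil.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the decision made in Fcall_process.
   Coding systems are modeled as C strings; NULL plays the role of
   Qnil.  OPERATION_DEFAULT stands for whatever the per-operation or
   default process coding system lookup would return.  */
const char *
choose_argument_coding (const char *coding_system_for_write,
                        int must_encode,
                        const char *operation_default)
{
  if (coding_system_for_write != NULL)
    /* A non-nil binding wins, whatever the arguments look like.  */
    return coding_system_for_write;
  if (!must_encode)
    /* Every argument is unibyte: no conversion is requested.  */
    return NULL;
  /* Otherwise fall back to the looked-up setting.  */
  return operation_default;
}
```

With `coding_system_for_write' set to "utf-16le-with-signature", the
function returns that string even when `must_encode' is 0, which is
the situation created by the `let' binding in hexl.el.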
Now go back to lines 410 to 427 of callproc.c:
if (nargs > 4)
  {
    register int i;
    struct gcpro gcpro1, gcpro2, gcpro3;
    GCPRO3 (infile, buffer, current_dir);
    argument_coding.dst_multibyte = 0;
    for (i = 4; i < nargs; i++)
      {
        argument_coding.src_multibyte = STRING_MULTIBYTE (args[i]);
        if (CODING_REQUIRE_ENCODING (&argument_coding))
          /* We must encode this argument. */
          args[i] = encode_coding_string (&argument_coding, args[i], 1);
        new_argv[i - 3] = SDATA (args[i]);
      }
    UNGCPRO;
    new_argv[nargs - 3] = 0;
  }
`CODING_REQUIRE_ENCODING' tests the following (lines 491 to 496,
coding.h):
/* Return 1 if the coding context CODING requires code conversion on
encoding. */
#define CODING_REQUIRE_ENCODING(coding) \
((coding)->src_multibyte \
|| (coding)->common_flags & CODING_REQUIRE_ENCODING_MASK \
|| (coding)->mode & CODING_MODE_SELECTIVE_DISPLAY)
Although `argument_coding.src_multibyte' may be 0,
`argument_coding.common_flags & CODING_REQUIRE_ENCODING_MASK' must be
non-zero in this case. So `CODING_REQUIRE_ENCODING
(&argument_coding)' will return true.
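The same point as a small stand-alone sketch. The struct, the mask
values, and `unibyte_argument_is_encoded' are assumptions made for
illustration only; the real definitions are in coding.h. The macro
body follows the one quoted above, with extra parentheses around the
bit tests.

```c
#include <assert.h>

/* Hypothetical mask values; the real ones are defined in coding.h.  */
#define CODING_REQUIRE_ENCODING_MASK 0x01
#define CODING_MODE_SELECTIVE_DISPLAY 0x02

/* Model of only the three fields that the macro reads.  */
struct coding_model
{
  int src_multibyte;
  int common_flags;
  int mode;
};

/* Same logic as the macro quoted from coding.h.  */
#define CODING_REQUIRE_ENCODING(coding)                             \
  ((coding)->src_multibyte                                          \
   || ((coding)->common_flags & CODING_REQUIRE_ENCODING_MASK)       \
   || ((coding)->mode & CODING_MODE_SELECTIVE_DISPLAY))

/* Return 1 if a unibyte argument (src_multibyte == 0) would still be
   encoded when the coding system has set COMMON_FLAGS, else 0.  */
int
unibyte_argument_is_encoded (int common_flags)
{
  struct coding_model c = { 0, common_flags, 0 };
  return CODING_REQUIRE_ENCODING (&c) ? 1 : 0;
}
```

Calling `unibyte_argument_is_encoded (CODING_REQUIRE_ENCODING_MASK)'
gives 1, and calling it with 0 gives 0, so in this model the flag
alone is enough to force the conversion of an already encoded
argument.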
As a result, encoding the arguments beforehand with
`encode-coding-string', as your change does, has no effect on the
conversion done by `call-process': the arguments are encoded again.
Perhaps we should not bind `coding-system-for-write' in the `let'
special form in such cases.
And there is another problem: if `locale-coding-system' is UTF-16, is
it correct to add the prefix "\377\376" or "\376\377" to every command
argument? If not, the current code of `call-process' is wrong, since
it always adds the prefix.
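To show what such an argument looks like as bytes, here is a minimal
sketch. `encode_ascii_utf16le_with_bom' is a hypothetical helper that
handles ASCII input only; it is not how Emacs encodes strings.

```c
#include <assert.h>
#include <stddef.h>

/* Encode the ASCII string SRC as UTF-16LE into DST, preceded by the
   byte order mark FF FE (written "\377\376" in octal).  Return the
   number of bytes written.  DST must have room for
   2 + 2 * strlen (SRC) bytes.  Hypothetical helper, ASCII only.  */
size_t
encode_ascii_utf16le_with_bom (const char *src, unsigned char *dst)
{
  size_t n = 0;
  dst[n++] = 0xFF;
  dst[n++] = 0xFE;
  for (; *src != '\0'; src++)
    {
      dst[n++] = (unsigned char) *src;  /* low byte first */
      dst[n++] = 0x00;                  /* high byte is zero for ASCII */
    }
  return n;
}
```

For the input "a" this writes the four bytes FF FE 61 00. Since each
entry of argv is a NUL-terminated C string, a program that receives
such an argument sees it end at the first zero byte, so only the
prefix and the first character arrive.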
Regards,
Guanpeng Xu
From: Stefan Monnier <address@hidden>
To: "Herbert Euler" <address@hidden>
CC: address@hidden
Subject: Re: Fcall_process: wrong conversion
Date: Mon, 15 May 2006 12:06:48 -0400
> - Create a file contains UTF-16 text, either UTF-16BE or UTF-16LE
> is OK. For example, create a file contains "a" in UTF-16LE as
> its content and name this file with "1".
[...]
> - In case the buffer is encoded with utf-16-le, the content is
> displayed as "a". Type M-x hexl-mode RET, the result is
> \377?: Invalid argument
> displayed in the buffer.
Thanks. I've installed the patch below which should fix the problem.
Please confirm,
Stefan
--- hexl.el 11 avr 2006 12:45:49 -0400 1.103
+++ hexl.el 15 mai 2006 12:02:32 -0400
@@ -704,7 +704,12 @@
(buffer-undo-list t))
(apply 'call-process-region (point-min) (point-max)
(expand-file-name hexl-program exec-directory)
- t t nil (split-string hexl-options))
+ t t nil
+ ;; Manually encode the args, otherwise they're encoded using
+ ;; coding-system-for-write (i.e. buffer-file-coding-system) which
+ ;; may not be what we want (e.g. utf-16 on a non-utf-16 system).
+ (mapcar (lambda (s) (encode-coding-string s locale-coding-system))
+ (split-string hexl-options)))
(if (> (point) (hexl-address-to-marker hexl-max-address))
(hexl-goto-address hexl-max-address))))