Re: dired doesn't work properly with a multibyte locale


From: Kenichi Handa
Subject: Re: dired doesn't work properly with a multibyte locale
Date: Wed, 15 Jan 2003 19:43:55 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

Sorry for the late reply.

In article <address@hidden>, Miles Bader <address@hidden> writes:

> I'm now using a multibyte locale (LANG=ja_JP.eucJP), and dired is
> screwed up: it can't properly find filenames in the directory listing.

> The reason seems to be that dired uses `ls --dired', which encodes the
> positions of filenames as byte-offsets into the ls output.  However, my
> system's `ls' program sees the non-C LANG, and so the `total' line at the
> beginning of the ls output is now a multibyte-encoded word.  Emacs decodes
> this fine, but the number of characters in the decoded word is _not_ the
> same as the number of bytes in the original ls output, so all the offsets
> from --dired are wrong.  [note that if there are multibyte-encoded
> filenames, the offsets will get screwed up further later in the listing]

> It doesn't seem simple to get the byte offset information, so perhaps the
> best thing to do is simply not use --dired if `file-name-coding-system' is
> a multibyte encoding.  That change is simple to make in dired (and I just
> manually set `dired-use-ls-dired' to nil), but I'm not sure how to tell if
> a particular coding system is multibyte or not.  It'd be nice if there was
> a function like `coding-system-multibyte-p'...

Even if we had such a function, it would be very hard to
correct the byte offset information for a multibyte coding
system.
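
To make the mismatch concrete, here is a minimal sketch of how the character count and the byte count of one line can differ. The string "合計 8" is only an assumed stand-in for the localized `total' line, and `demo-total-line-lengths' is a hypothetical name, not part of Emacs.

```elisp
;; Illustration only: "合計 8" is an assumed example of a localized
;; `total' line; `demo-total-line-lengths' is a hypothetical helper.
(defun demo-total-line-lengths ()
  "Return (CHARS BYTES) for a sample `total' line encoded in euc-jp."
  (let* ((decoded "合計 8")
         ;; `encode-coding-string' returns a unibyte string, so its
         ;; length is the number of bytes `ls' would have written.
         (bytes (encode-coding-string decoded 'euc-jp)))
    (list (length decoded)     ; characters seen in the decoded buffer
          (length bytes))))    ; bytes counted by `ls --dired'
```

Each two-byte character makes every later offset from --dired one position too large once the text is decoded, and the error accumulates through the listing.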

Miles Bader <address@hidden> writes:
> On Sat, Jan 11, 2003 at 03:00:12PM -0500, Stefan Monnier wrote:
>>  > It doesn't seem simple to get the byte offset
>>  > information, so perhaps the best thing to do is simply
>>  > not use --dired if `file-name-coding-system' is a
>>  > multibyte encoding.  That change is simple to make in
>>  > dired (and I just manually set `dired-use-ls-dired' to
>>  > nil), but I'm not sure how to tell if a particular
>>  > coding system is multibyte or not.  It'd be nice if
>>  > there was a function like
>>  > `coding-system-multibyte-p'...
>>  
>>  The other solution is to get "ls --dired" output with a "binary"
>>  coding system, then use the byte-offsets to add text-properties, and
>>  then do the decode-coding-region.

Yes.  I think that is the correct fix.

> Won't the decode-coding-region smash all the text-properties?

It does remove all text properties.  But we can preserve the
`dired-filename' text property by decoding one run at a time:
each stretch of text with a uniform `dired-filename' value is
decoded separately, and the property is put back on it
afterwards.  Could you please try the attached patch?  I have
not installed it yet because I don't have such a system at
hand and can't test it.
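
The run-by-run decoding can also be tried outside `insert-directory'. The following is only a sketch under stated assumptions: the `demo-' functions are hypothetical, and the sample listing and its byte offsets (7 to 15) are made up rather than taken from real `ls --dired' output.

```elisp
;; Sketch only: `demo-decode-by-runs' and `demo-run' are hypothetical,
;; and the listing text and byte offsets below are made up.
(defun demo-decode-by-runs (coding prop)
  "Decode the accessible buffer with CODING, one run of PROP at a time.
PROP is put back on every run that carried it before decoding."
  (goto-char (point-min))
  (while (not (eobp))
    (let ((pos (point))
          (val (get-text-property (point) prop)))
      (goto-char (next-single-property-change (point) prop nil (point-max)))
      ;; Point was at the end of the region, so it stays at the end of
      ;; the decoded text; the patch relies on the same behavior.
      (decode-coding-region pos (point) coding)
      (when val
        (put-text-property pos (point) prop t)))))

(defun demo-run ()
  "Return the text that carries `dired-filename' after decoding."
  (with-temp-buffer
    ;; Raw bytes in a multibyte buffer, as if read with `no-conversion'.
    (insert (string-to-multibyte
             (encode-coding-string "合計 8\n日本.txt\n" 'euc-jp)))
    ;; Assumed byte offsets 7..15 for the file name, shifted by one
    ;; because buffer positions start at 1.
    (put-text-property 8 16 'dired-filename t)
    (demo-decode-by-runs 'euc-jp 'dired-filename)
    (let ((start (text-property-any (point-min) (point-max)
                                    'dired-filename t)))
      (buffer-substring-no-properties
       start
       (next-single-property-change start 'dired-filename
                                    nil (point-max))))))
```

Because the property is attached while one buffer position still equals one byte, the offsets are used before decoding can shift them.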

---
Ken'ichi HANDA
address@hidden

2003-01-15  Kenichi Handa  <address@hidden>

        * files.el (insert-directory): Read the output of "ls" with
        no-conversion, and decode it later while preserving the
        `dired-filename' property.

*** files.el.~1.630.~   Wed Jan 15 13:12:22 2003
--- files.el    Wed Jan 15 17:44:45 2003
***************
*** 4017,4028 ****
  
          ;; Read the actual directory using `insert-directory-program'.
          ;; RESULT gets the status code.
!         (let* ((coding-system-for-read
                  (and enable-multibyte-characters
                       (or file-name-coding-system
!                          default-file-name-coding-system)))
!                ;; This is to control encoding the arguments in call-process.
!                (coding-system-for-write coding-system-for-read))
            (setq result
                  (if wildcard
                      ;; Run ls in the directory part of the file pattern
--- 4017,4031 ----
  
          ;; Read the actual directory using `insert-directory-program'.
          ;; RESULT gets the status code.
!         (let* (;; We first read with no-conversion; then, after
!                ;; putting the text property `dired-filename', we decode
!                ;; one run at a time to preserve that property.
!                (coding-system-for-read 'no-conversion)
!                ;; This is to control encoding the arguments in call-process.
!                (coding-system-for-write 
                  (and enable-multibyte-characters
                       (or file-name-coding-system
!                          default-file-name-coding-system))))
            (setq result
                  (if wildcard
                      ;; Run ls in the directory part of the file pattern
***************
*** 4105,4110 ****
--- 4108,4130 ----
              (goto-char end)
              (beginning-of-line)
              (delete-region (point) (progn (forward-line 2) (point)))))
+ 
+         ;; Now decode what was read, if necessary.
+         (let ((coding (or coding-system-for-write
+                           (detect-coding-region beg (point) t)))
+               val pos)
+           (if (not (eq (coding-system-base coding) 'undecided))
+               (save-restriction
+                 (narrow-to-region beg (point))
+                 (goto-char (point-min))
+                 (while (not (eobp))
+                   (setq pos (point)
+                         val (get-text-property (point) 'dired-filename))
+                   (goto-char (next-single-property-change
+                               (point) 'dired-filename nil (point-max)))
+                   (decode-coding-region pos (point) coding)
+                   (if val
+                       (put-text-property pos (point) 'dired-filename t))))))
  
          (if full-directory-p
              ;; Try to insert the amount of free space.



