bug-gnu-emacs

Re: coding-system perfectionism locks user out


From: Dan Jacobson
Subject: Re: coding-system perfectionism locks user out
Date: 10 Feb 2002 09:24:27 +0800
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1

>>>>> "LSD" == Lee Sau Dan <danlee@informatik.uni-freiburg.de> writes:

>>>>> "Dan" == Dan Jacobson <jidanni@deadspam.com> writes:

Dan> I do 
Dan> $ lynx -dump http://www.geocities.com/Tokyo/Pagoda/3847/sapienti/hagfa99b.htm > hagfa99b.txt
Dan> $ emacs -q hagfa99b.txt
Dan> I go into options>mule
Dan> and the choice to set coding system is blanked out... what's
Dan> worse, its keystroke isn't even mentioned in the menu.

LSD> What?  I  tried what you do.  No  problem, except that I  have to tell
LSD> Emacs that  this file is in BIG5  encoding.  You can do  that with C-x
LSD> RET  c chinese-big5  RET  C-x C-f  hagfa99b.txt  RET.  I  see what  is
LSD> expected in traditional Chinese characters.

OK,
1. the file must not already be in a buffer before you do this
command, else C-x C-f just switches to that buffer.

2. that command is not reproducible from C-x ESC ESC

3. If I modify the file a little, I still get challenged [i.e.
   intercepted] by emacs when I try to write the file, and am forced
   to pick a different coding system... it is clear that next time I
   read the file I will have to do the whole rigmarole again... I
   suppose I must use local-variables... but I didn't want to write in
   that file...

4. anyway, I've saved the file after picking one of the other choices,
   and it leaves a #backup#

Dan> Anyway, at this point the user would just see his data
Dan> garbled, with no pointers on what to do next.

LSD> Computers are not very clever.   They can't reliably tell English from
LSD> French.

well all I know is
$ less file
$ more file
etc. all give me what I want to see... emacs however won't cooperate.
[vi and ed won't either for this file... but if I mention that then I
won't seem as victimized, ruining my traditional posting theme :-) ]

Dan> [I then did M-! cat hagfa99b.txt and can then at least see
Dan> what I'm supposed to see [big5 chinese], but then don't
Dan> expect that I can save and then see the file again
Dan> correctly.]

LSD> Something  wrong  with  the  coding  systems setting.   Have  you  M-x
LSD> set-language-environment?   Apparently, your  process-coding-system is
LSD> correct,    because    the    M-!     output    is    decoded    using
LSD> process-coding-system's value.

It seems my describe-coding-system was already chinese-big5

LSD> For reading  a file, Emacs  could only make  a guess about  the coding
LSD> system.  Since the file you gave started with a short section of ASCII
LSD> only text,  and over half  of the file  contents are ASCII  only, what

I think all it takes is one non {big5/ascii} char to cause the frustration.

LSD> would you  expect the smartest  multi-lingual editor to  do?  Remember
LSD> what Knuth says: Computers are good at following instructions, but not
LSD> reading  your mind.  You're  more intelligent  than the  computer, and
LSD> hence you know it's Big5-encoded.  The computer is stupid and fails to
LSD> discover this.  So, it needs your help: C-x RET c chinese-big5 RET.

that solution only lasts for a single visit of the file.

Dan> This brings up the point: if the file is 99% big5, then why
Dan> not allow me to still handle it as 100% big5 if I
Dan> want...

LSD> But the  file you gave  was not 99%  big5.  It's less than  50%.  Over
LSD> half of it is ASCII.  (Well... yes, ASCII is a subset of big5, but I'm
LSD> talking about  big5-only characters here.)   I think Emacs  is correct
LSD> here not to conclude  that the file is in big5.  It  could be in other
LSD> encodings as well.

I think only one character can poison the whole file.

LSD> In your file, most of the lines look like:

LSD>         yong1 央  yong1 氧  yong1 養  yong1 癢  yong1 盎  

LSD> in  which 25  bytes are  ASCII and  10 bytes  are  BIG5-specific.  I'm
LSD> ignoring whitespaces here.  10 out  of (10+25) is just 28.6%.  This is
LSD> a typical line from your file.  So, the actual figure should be around
LSD> this, and  a rough upper  bound is, IMO,  30%.  That's still  far from
LSD> 50%.  How come you claim it's 99% big5?  "big5-specific", I mean.
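The byte count in the quoted paragraph can be rechecked with a short Python sketch. It assumes Python's built-in `big5` codec as a stand-in for the file's encoding, and the helper name `byte_split` is hypothetical. Characters are counted one by one rather than by byte value, because a Big5 trail byte can fall in the ASCII range.

```python
# Sample line quoted in the message above.
line = "yong1 央  yong1 氧  yong1 養  yong1 癢  yong1 盎"


def byte_split(text, encoding="big5"):
    """Return (ascii_bytes, multibyte_bytes) for text, ignoring whitespace.

    Hypothetical helper for illustration only.
    """
    ascii_bytes = 0
    multibyte_bytes = 0
    for ch in text:
        if ch.isspace():
            continue
        encoded = ch.encode(encoding)
        if len(encoded) == 1:
            ascii_bytes += 1          # plain ASCII character
        else:
            multibyte_bytes += len(encoded)  # two bytes per Big5 character
    return ascii_bytes, multibyte_bytes


ascii_bytes, big5_bytes = byte_split(line)
share = big5_bytes / (ascii_bytes + big5_bytes)
print(ascii_bytes, big5_bytes, f"{share:.1%}")
```

For this one line the sketch gives 25 ASCII bytes and 10 Big5-specific bytes, matching the 28.6% figure above. It says nothing about the rest of the file.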


Dan> Why can't emacs be told "I live in big5 land.  

LSD> Have you already set-language-environment?

yes, describe-language-environment says chinese-BIG5

Dan> Sometimes I
Dan> have a giant file with one or two chars in it that cause
Dan> emacs to doubt that it is a big5 file.

LSD> If you're sure it is a big5-encoded file, use C-x RET c ...

Dan> but I can't easily
Dan> because you think you are smarter than me and won't show it to
Dan> me in big5 mode, no matter what buttons I press".

LSD> No, Emacs  thinks it's more stupid  than you.  So, instead  of making a
LSD> wild guess  that a file  containing only 30%  of big5-only bytes  is a
LSD> big5-encoded file,  it behaves conservatively.  And  since Emacs knows
LSD> it's stupid, it  allows you to override its stupid  decision: C-x RET c
LSD> chinese-big5 RET C-x C-f hagfa99b.txt RET.

Anyway, I think even 0.000001% non big5/ascii will mean the user
cannot dream of working comfortably with that file in emacs until the
file is purified.  Thank you for your C-x RET c
chinese-big5 RET C-x C-f hagfa99b.txt RET suggestion... but the user
would need to do that each time he opens the file... and writing the
file is a whole 'nother story... plus if the file is shared amongst a
team, writing a local variables section might not be allowed...

>>>>> "K" == Kenichi Handa <handa@etl.go.jp> writes:

K> The byte sequence `\231' `G' is not what Emacs treats as
K> Big5 code.  That's why Emacs doesn't detect that file as
K> big5.
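The effect described here, a single byte pair outside the accepted Big5 range, can be illustrated outside Emacs. The sketch below is only an analogy: it uses Python's built-in `big5` codec, whose tables need not match what Emacs treats as Big5, and the sample bytes are made up apart from the `\231` `G` pair (octal 231 is 0x99). It shows a strict decode rejecting the whole input, while a lenient decode marks the bad spot and keeps the rest readable.

```python
# Two valid Big5 fragments with the stray pair `\231` `G` between them.
good_before = "yong1 央".encode("big5")
good_after = " yong1 氧".encode("big5")
data = good_before + b"\x99G" + good_after

# Strict decoding: one bad sequence rejects the entire input.
try:
    data.decode("big5")
    strict_ok = True
except UnicodeDecodeError as err:
    strict_ok = False
    print("strict decode failed at byte offset", err.start)

# Lenient decoding: the bad spot becomes U+FFFD, the rest is kept.
lenient = data.decode("big5", errors="replace")
print(lenient)
```

Note that the lenient form is lossy: writing `lenient` back out would not restore the original bytes, which is one reason an editor might hesitate to do this silently.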

OK, but does that mean emacs must be so strict that the file cannot be
used with emacs until it is first purified of all its bad
characters?

This seems dangerous: this means for a big5 user, one corrupted
character would cause an entire file to become unusable with emacs...

Also if set-coding-systems is turned off in the menu, the user feels
he is trapped with no choices on how to remedy the situation.  Some
alternate remedy path should be provided for the menu user... like
gnus has 'resend bounced mail' etc.  mule should have an 'attempt
coding system remedy....' menu item.
-- 
http://www.geocities.com/jidanni/ Taiwan(04)25854780


