bug-gnu-utils



From: John Cowan
Subject: Re: sed error message reports byte position instead of char position when program contains UTF-8
Date: Thu, 16 May 2013 10:21:39 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

Eli Zaretskii scripsit:

> Yes, mostly.  But how do you know what is the encoding of the input
> files?

If you don't know that, you don't know how to interpret regular
expressions against the text of the file, because you don't know what
characters it contains.  Even in seemingly trivial cases, like "sed
s/abc/def/", you have no idea what to do if you don't know whether the
file is ASCII or EBCDIC.  For that matter, even "sed 2p" will not work
correctly if you don't know the encoding of the newline character.
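To make the ASCII/EBCDIC point concrete, here is a minimal Python sketch.
It is only an illustration, not how sed works internally: it assumes
Python's "cp037" codec as a representative EBCDIC code page, and the
helper names count_lines and substitute are hypothetical. It shows that
neither a substitution like s/abc/def/ nor line counting works on the raw
bytes unless the pattern and the newline are encoded the same way as the
file.

```python
# The same text, stored once as ASCII and once as EBCDIC (code page 037,
# chosen here as an assumption; other EBCDIC code pages exist).
text = "abc\nabc\n"
ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")


def count_lines(data: bytes, newline: bytes) -> int:
    """Count newline-terminated lines, given the newline's byte sequence."""
    return data.count(newline)


def substitute(data: bytes, encoding: str, old: str, new: str) -> bytes:
    """Byte-level s/old/new/g, with the pattern encoded as `encoding`."""
    return data.replace(old.encode(encoding), new.encode(encoding))


if __name__ == "__main__":
    print(ascii_bytes.hex(" "))   # bytes of the ASCII file
    print(ebcdic_bytes.hex(" "))  # bytes of the EBCDIC file

    # Assuming ASCII on the EBCDIC file: no newline found, nothing replaced.
    print(count_lines(ebcdic_bytes, b"\n"))
    print(substitute(ebcdic_bytes, "ascii", "abc", "def") == ebcdic_bytes)

    # Using the file's real encoding: lines and pattern are both found.
    print(count_lines(ebcdic_bytes, "\n".encode("cp037")))
    print(substitute(ebcdic_bytes, "cp037", "abc", "def").decode("cp037"))
```

The design point is that the encoding is an explicit parameter of both
operations; whether it comes from the locale or from an option is the
choice discussed next.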

So either you do just use the locale, or sed needs an option to specify
the file encoding (in which case it should also provide the command
line encoding).

-- 
Well, I have news for our current leaders       John Cowan
and the leaders of tomorrow: the Bill of        address@hidden
Rights is not a frivolous luxury, in force      http://www.ccil.org/~cowan
only during times of peace and prosperity.
We don't just push it to the side when the going gets tough.  --Molly Ivins


