Re: CVS and unicode
From: ai26
Subject: Re: CVS and unicode
Date: Sun, 11 Sep 2005 00:22:13 +0200
In a message of Sat, 10 Sep 2005 17:52:09 +0200
Received on Sat, 10 Sep 2005 18:07:17 +0200
Christian Hujer <address@hidden> wrote to address@hidden
>On Saturday, 10 September 2005 16:04, Spiro Trikaliotis wrote:
>> Hello Christian,
>>
>> * On Sat, Sep 10, 2005 at 12:38:19PM +0200 Christian Hujer wrote:
>> > Currently, CVS has extremely tolerant behaviour regarding binary files
>> > which were accidentally added as text files. As long as they do not contain
>> > keywords (like $Id...$), they are extremely likely to still be handled
>> > conveniently.
>>
>> This is true for Unix based systems, but not for systems where CR/LF is
>> the usual line ending. Checking in a binary file from a Windows system,
>> you have very good chances to break it if there is a CR/LF anywhere
>> inside of it.
>>
>> For non-CR/LF machines, checking in binary files without -kb does
>> not do any harm even if there are keywords ($Id$, for example) inside of
>> it. CVS checks them in "as-is" and only expands the keywords on
>> checkout. Thus, if you forgot doing the -kb on checkin, just set the
>> state afterwards with cvs admin and check the file out again.
>>
>> As I said, this is NOT true for CR/LF based systems.
>It's even true for CRLF. The CRLF byte sequences are:
Christian,
you are still missing the point even with Spiro's explanation. A
non-Unix cvs *client* will convert any CR/LF sequence of the sandbox
file into a plain LF in the ,v repository file on checkin. Therefore, a
binary file not checked in with -kb will lose every 0x0D that precedes
a 0x0A. And you can't restore them, since you don't know which 0x0A was
preceded by a 0x0D and which one wasn't. It's a binary file after all.
Of course, if your binary never had 0x0D 0x0A sequences then you are fine
with admin -kb, but you generally can't assume they don't occur.
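
To make the loss concrete, here is a minimal Python sketch of the
round trip. It only models the line-ending conversion described above;
it is not CVS code, and the function names checkin_text_mode and
checkout_text_mode are made up for the illustration.

```python
def checkin_text_mode(sandbox_bytes: bytes) -> bytes:
    """Hypothetical model of a CR/LF client's text-mode checkin:
    every 0x0D 0x0A pair is stored as a single 0x0A."""
    return sandbox_bytes.replace(b"\r\n", b"\n")


def checkout_text_mode(repo_bytes: bytes) -> bytes:
    """Hypothetical model of a CR/LF client's text-mode checkout:
    every 0x0A is written back as 0x0D 0x0A."""
    return repo_bytes.replace(b"\n", b"\r\n")


# A "binary" file with one CR LF pair and one bare LF.
original = bytes([0x01, 0x0D, 0x0A, 0x02, 0x0A, 0x03])
# A different file with two bare LFs in the same positions.
other = bytes([0x01, 0x0A, 0x02, 0x0A, 0x03])

stored = checkin_text_mode(original)
restored = checkout_text_mode(stored)

# Both inputs collapse to the same stored bytes, so nothing in the
# repository records which 0x0A originally had a 0x0D in front of it.
same_in_repo = checkin_text_mode(other) == stored
round_trip_ok = restored == original

print(stored.hex(" "))
print(restored.hex(" "))
print(same_in_repo, round_trip_ok)
```

Two different inputs end up as the same stored bytes, which is why
setting -kb afterwards cannot bring the original file back.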
>ASCII: 0x0D 0x0A.
>UTF-8: 0x0D 0x0A.
>UTF-16 LE: 0x0D 0x00 0x0A 0x00.
>UTF-16 BE: 0x00 0x0D 0x00 0x0A.
>
>CVS will not interfere with any of these.
>The UTF-16LE sequence will be split within the LF char. But since the next
>line will be split at exactly the same point, this is not a problem for line
>diffs.
>
>Also, CVS behaves fine when using CR/LF (though I regard CR/LF as
>deprecated for various other reasons), independently of the encoding (at
>least those encodings discussed here).
>
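
For reference, the byte sequences quoted above can be checked with a
few lines of Python. This is only a sketch of the encodings themselves,
using Python's standard codec names; crlf_bytes is a made-up helper and
nothing here touches CVS.

```python
def crlf_bytes(encoding: str) -> str:
    """Return the bytes of CR LF in the given encoding as spaced hex."""
    return "\r\n".encode(encoding).hex(" ")


# Standard Python codec names for the four encodings in the quote.
encodings = ("ascii", "utf-8", "utf-16-le", "utf-16-be")
table = {name: crlf_bytes(name) for name in encodings}

for name, hex_bytes in table.items():
    print(f"{name}: {hex_bytes}")
```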
Michael