monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] Looking at the code affected in bug 9752 leaves a weird


From: Richard Levitte - VMS Whacker
Subject: [Monotone-devel] Looking at the code affected in bug 9752 leaves a weird taste...
Date: Wed, 18 Aug 2004 02:15:31 +0200 (CEST)

Hi,

I'm currently looking at bug 9752, and it has left me with the feeling
that the code to handle line endings is a hack, and wondering what the
lua hook get_linesep_conv is really needed for (why isn't
get_system_linesep enough?).

>From looking at the lua hook get_linesep_conv, it looks like it's
possible to customize the standard line ending in the database.
However, I think that would be superbly dangerous, when considering
that several databases coming from people upholding different line
ending standards would be synced together...  The thought makes me
cringe.

Then, looking at the function split_into_lines(), it's obvious that it
hasn't been designed with CRLF in mind.  For example, with the line
"foo\r\n", it will end up with the tokens, "foo" and "", interpreted
as two lines.  This is the actual cause of the trouble detected in bug
9752.  Actually, the boost token functions aren't enough to be able to
detect CRLF, LF and CR line endings, so something like
boost::char_separator, but taking a vector of strings instead of a
string of separators is needed.

Unfortunaly, just changing the internals of split_into_lines() isn't
enough.  It really needs to get an external specification of what the
line endings should be on the local system, for each file.  If we
consider the changes that I've made so far, i.e. mostly having
split_into_lines() properly detect CRLF as *one* separator instead of
two, there's still a small difficulty with the following scenario on
systems that uses LF as a separator:

  a file (say, "foo") with the line "foo\n" is changed so the same
  line now reads "foo\r\n".

The calculated manifest will detect that a change has been made,
because it uses read_localized_data(), which won't make any conversion
since I haven't defined the lua hook get_linesep_conv.

The diff, however, will look a bit weird, since the input is split
into lines, using the now changed split_into_lines():

  : ; ~/monotonework/off.net/monotone/monotone.levitte.misc/monotone diff 
  # Old manifest: e61621c74656cb80da9680bf6f9cdc7b15ec0085
  # New manifest: 00ba31c49a263b709e475ce6ac1ab5df644494d2
  # Summary of changes:
  # 
  #   patch foo
  #    from f1d2d2f924e986ac86fdf7b36c94bcdf32beec15
  #      to 855426068ee8939df6bce2c2c4b1e7346532a133
  # 
  --- foo
  +++ foo

It *should* show that the line has changed, as well.  The only way
that I can see getting a correct result is to change
split_into_lines() to take the desired line separator as an argument.

This takes me back to get_linesep_conv.  Why, exactly, whould it be
able to specify what the line separator should be in the database?
Why isn't get_system_linesep enough, especially if it can be given a
file name as argument?

-----
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

-- 
Richard Levitte                         address@hidden
                                        http://richard.levitte.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]