
Re: [Bug-cssc] bug-CSSC post from address@hidden requires approval


From: Joerg Schilling
Subject: Re: [Bug-cssc] bug-CSSC post from address@hidden requires approval
Date: Sun, 08 May 2011 23:13:12 +0200
User-agent: nail 11.22 3/20/05

James Youngman <address@hidden> wrote:

> > In this case, the x flag is probably unimportant and this specific SCO
> > extension is less than 20 years old anyway.
>
> I don't see it like that right now, since somebody actually asked me
> to implement the flag for compatibility with the OpenServer
> implementation.

Wouldn't it be simpler to just chmod +x SCCS/s.file if grep '^^Af x' succeeds?
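
For illustration only, here is roughly what that one-liner amounts to, written as a small Python sketch. The helper names are hypothetical, and it assumes the flag is stored as a line beginning with ^A (byte 0x01) followed by "f x", which is what the grep pattern looks for. Unlike chmod +x, it sets all three execute bits without consulting the umask.

```python
import os
import stat


def has_x_flag(s_file_path):
    """Return True if any line starts with ^A 'f x' (same test as the grep)."""
    with open(s_file_path, "rb") as f:
        for line in f:
            if line.startswith(b"\x01f x"):
                return True
    return False


def mark_executable_if_flagged(s_file_path):
    """Add execute permission to the file when the x flag line is present."""
    if not has_x_flag(s_file_path):
        return False
    mode = os.stat(s_file_path).st_mode
    # Assumption: set u+x, g+x and o+x unconditionally (chmod +x would
    # additionally honor the umask).
    os.chmod(s_file_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return True
```

Calling mark_executable_if_flagged("SCCS/s.file") would then do nothing for history files without the flag.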

> > What is typical and what is small?
>
> My experience is that data modification rates are the most important
> factor, so a smaller but faster-mutating source base can be more at
> risk than a larger, static one.    I've never seen data corruption in
> a source repository, but only in other kinds of data.   For other
> data, I've seen failures from both cksum  (trivially easy to
> demonstrate) and Adler32 (mostly much more reliable than any 16-bit
> system, but I've seen this fail on double-bit flips in faulty network
> hardware).

I am not going to argue that there will be no such double flip problem. I just
did not ever notice one (probably because there are typically small changes
and because the checksum test is run on every SCCS operation) and I myself would
like to introduce a better checksum in case the history format is changed in an
incompatible way.
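
To make the double flip case concrete, here is a small Python sketch. It assumes the classic SCCS checksum can be modelled as the sum of the byte values modulo 2^16; the function names and the sample data are made up for the illustration. Two flips of the same bit position in opposite directions cancel in such a sum, while Adler32 (via the standard zlib module) happens to catch this particular pattern. That says nothing about other corruption patterns.

```python
import zlib


def sum16(data: bytes) -> int:
    """Additive 16-bit checksum: sum of all byte values modulo 2**16."""
    return sum(data) & 0xFFFF


def flip_bit(data: bytes, index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted in the byte at index."""
    out = bytearray(data)
    out[index] ^= 1 << bit
    return bytes(out)


original = b"AB: some delta text\n"
# Bit 0 of 'A' (0x41) goes 1 -> 0, bit 0 of 'B' (0x42) goes 0 -> 1,
# so the byte sum loses 1 and gains 1.
damaged = flip_bit(flip_bit(original, 0, 0), 1, 0)

same_sum16 = sum16(original) == sum16(damaged)
same_adler = zlib.adler32(original) == zlib.adler32(damaged)
```

Here same_sum16 ends up True although the data differs, and same_adler ends up False.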


> > I am sure you are mistaken: The time to unpack any release from a SCCS
> > history file only depends on the size of the history file but not really
> > on the number of deltas. If you have 99999 deltas (which is expected to
> > take more than 200 years to create), then it may be a bit slower but
> > there is no relation to the higher time needed by RCS.
>
> Performance in reading an SCCS file is trivially demonstrated to
> depend linearly on the number of deltas because you have to read the
> whole delta table.  But since the metadata is normally smaller than
> the data, the constant factor is almost always small enough that this
> won't dominate.   However, as we read the file, we need to decide if
> any given I/E/D line means we should use the data lines we're reading.
>   That decision for each control line is either going to be O(1) but
> require O(N) setup (i.e. initialising and checking a bitmap) or be
> O(ln N) (i.e. using some kind of tree).
>
> This is probably overcomplicating the issue though.   Basically with
> the SCCS file format there is no way to get sublinear performance.
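
The bitmap variant described in the quoted text can be sketched in Python as below. This is a simplified model and not the code of any SCCS implementation: it assumes body control lines of the form ^AI n, ^AD n and ^AE n, ignores include and exclude lists, and keeps a data line when the innermost open insert block belongs to a wanted delta and no open delete block does. All names are hypothetical.

```python
CTRL = "\x01"  # control lines start with ^A


def _visible(active, wanted):
    """Decide whether data lines are kept, given the open I/D blocks."""
    if any(kind == "D" and wanted[serial] for kind, serial in active):
        return False  # deleted by a delta that is part of the wanted version
    for kind, serial in reversed(active):
        if kind == "I":
            return bool(wanted[serial])  # innermost insert block decides
    return False


def extract(weave_lines, wanted_serials, max_serial):
    """Single pass over the weave; returns the lines of one version."""
    # O(N) setup: one bitmap slot per delta serial, then O(1) lookups.
    wanted = bytearray(max_serial + 1)
    for serial in wanted_serials:
        wanted[serial] = 1

    active, keep, out = [], False, []
    for line in weave_lines:
        if not line.startswith(CTRL):
            if keep:
                out.append(line)
            continue
        kind, serial = line[1], int(line[2:].split()[0])
        if kind in ("I", "D"):
            active.append((kind, serial))
        elif kind == "E":
            # Close the most recently opened block with this serial.
            for i in range(len(active) - 1, -1, -1):
                if active[i][1] == serial:
                    del active[i]
                    break
        # Recomputed only on control lines; scans the few open blocks.
        keep = _visible(active, wanted)
    return out


# Delta 1 inserts a, b, c; delta 2 replaces b with B.
WEAVE = ["\x01I 1", "a", "\x01D 2", "b", "\x01E 2",
         "\x01I 2", "B", "\x01E 2", "c", "\x01E 1"]
```

With this sample, extract(WEAVE, {1}, 2) gives ["a", "b", "c"] and extract(WEAVE, {1, 2}, 2) gives ["a", "B", "c"]. Either way every line of the weave is read once, which is the linear behaviour under discussion.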

In former times, CPUs were slow and file I/O was suboptimal.
Things look different now.

> > The problem with the SCCS file format is that there have been false and
> > unproven claims from the people behind RCS. Larry McVoy was the first who
> > did tests and he discovered, that RCS is not faster than SCCS. Even today,
> > many people correctly believe (besides the speed) that the SCCS history
> > file format is one of the best formats.
>
> Any problems with the SCCS file format are to do with its file format
> only.   Incorrect claims about RCS are irrelevant.

I cannot prove whether the claims from the RCS people were wrong in the
mid 1980s. Today they are of course wrong.

In the mid 1980s, CPUs were 10000x slower than they are today. If you ran
on an AT&T UNIX with the V7 filesystem, file I/O was also very bad.

On such a machine, RCS may have been a bit faster because it did not read
the whole file. With the BSD filesystem (UFS) or newer filesystems like ZFS,
things are different. There is no longer any difference whether you read
parts of a history file or the whole file if you look at typical
constraints in history files.

With current CPU speeds, it also no longer makes a real difference whether
you need to check more complex data structures for a weave with many deltas or
whether you just read the file.

The fact that you need to apply the deltas in an RCS file step by step, however,
has not changed over time.

> > Did you make tests and do you believe that there is a real performance 
> > problem?
>
> Essentially no, not at the individual file level, because the number
> of deltas is small enough that this isn't going to be a killer.  While
> linearity in the total number of deltas is a fundamental limitation of
> the SCCS file format, it's unlikely to be the bottleneck in terms of
> wall-clock execution time.

So it seems that you concur.


Jörg

-- 
 EMail:address@hidden (home) Jörg Schilling D-13353 Berlin
       address@hidden                (uni)  
       address@hidden (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily


