Re: diff's type used for holding line numbers
From: Andreas Gruenbacher
Subject: Re: diff's type used for holding line numbers
Date: Sun, 5 Apr 2009 18:40:20 +0200
User-agent: KMail/1.9.9
On Sunday, 5 April 2009 18:00:21 Bruno Haible wrote:
> Andreas Gruenbacher wrote about the type used to hold line numbers
> in 'diff':
> > > %td is for ptrdiff_t, not off_t.
> >
> > Exactly, and ptrdiff_t should be machine word size, which is what we want
> > here, right?
>
> No. ptrdiff_t may be too small.
>
> ISO C 99 section 6.5.6.(9) says that ptrdiff_t is the type for the
> difference of two pointers into the *same array*. There is no requirement
> that ptrdiff_t is near the size of available RAM. For example, you can have
> platforms where arrays are limited to 4 GB (or 2 GB) in size, but there is
> 64 GB available RAM, and the user wants to diff two files of size 5 GB,
> each of them consisting mostly of newlines. The line numbers will be >
> 2^32.
Alright. Diff uses arrays for storing the files it compares and for computing
the diff, so it can only compare files whose line numbers fit into ptrdiff_t.
So diff is still safe.
Patch is the weird beast here, though: its Plan B creates an index of line
offsets into the input file and doesn't keep the entire file in memory.
That's crazy, but it seems to work (at least when trying it on small files
with --debug=16).
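To make the idea concrete, here is a minimal sketch of a line-offset index,
assuming a POSIX system with fseeko. It is not patch's actual Plan B code; the
names line_index, build_line_index and fetch_line are made up for illustration.
Note that the index is itself an array, so the number of lines it can hold is
still bounded by memory, but each stored offset is an off_t.

```c
#define _POSIX_C_SOURCE 200809L  /* for fseeko and off_t */
#define _FILE_OFFSET_BITS 64     /* ask for a 64-bit off_t on 32-bit hosts */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical index: offsets[i] is the byte offset where line i+1 starts. */
struct line_index {
    off_t *offsets;
    size_t count;
};

/* Scan fp once and record where each line starts.  The file contents are
   not kept in memory.  Returns 0 on success, -1 if allocation fails. */
static int build_line_index(FILE *fp, struct line_index *idx)
{
    size_t cap = 0;
    off_t pos = 0;
    int c, at_line_start = 1;

    assert(fp != NULL && idx != NULL);
    idx->offsets = NULL;
    idx->count = 0;
    rewind(fp);
    while ((c = getc(fp)) != EOF) {
        if (at_line_start) {
            if (idx->count == cap) {
                size_t new_cap = cap ? cap * 2 : 64;
                off_t *p = realloc(idx->offsets, new_cap * sizeof *p);
                if (p == NULL) {
                    free(idx->offsets);
                    idx->offsets = NULL;
                    idx->count = 0;
                    return -1;
                }
                idx->offsets = p;
                cap = new_cap;
            }
            idx->offsets[idx->count++] = pos;
            at_line_start = 0;
        }
        pos++;
        if (c == '\n')
            at_line_start = 1;
    }
    return 0;
}

/* Read line number `line` (1-based) into buf by seeking to its recorded
   offset.  Returns buf, or NULL if the line does not exist or I/O fails. */
static char *fetch_line(FILE *fp, const struct line_index *idx,
                        size_t line, char *buf, int size)
{
    if (line == 0 || line > idx->count)
        return NULL;
    if (fseeko(fp, idx->offsets[line - 1], SEEK_SET) != 0)
        return NULL;
    return fgets(buf, size, fp);
}
```

A caller would run build_line_index once on the opened input file, then call
fetch_line for whichever lines a hunk needs, and free idx.offsets at the end.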
> off_t is guaranteed to be sufficiently large, because a file cannot
> contain more lines than it contains bytes on disk.
Okay, thanks for explaining!
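For the archive, a small sketch of what that looks like in code, assuming a
POSIX system. It is not taken from diff or patch, and the helper names
count_lines and format_line_number are made up. Since printf has no length
modifier for off_t (%td is for ptrdiff_t only), the value is cast to intmax_t
and printed with %jd.

```c
#define _POSIX_C_SOURCE 200809L  /* for off_t */
#define _FILE_OFFSET_BITS 64     /* ask for a 64-bit off_t on 32-bit hosts */
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>

/* Count newline characters in a stream.  The counter is an off_t because
   a file cannot contain more newlines than it contains bytes. */
static off_t count_lines(FILE *fp)
{
    off_t lines = 0;
    int c;

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;
    return lines;
}

/* Format a line number into buf.  off_t has no printf conversion of its
   own, so widen it to intmax_t and use %jd.  Returns what snprintf returns. */
static int format_line_number(char *buf, size_t size, off_t n)
{
    return snprintf(buf, size, "%jd", (intmax_t) n);
}
```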
Andreas