bug-gnu-emacs

bug#2790: emacs 22.1.1 cannot open 5GB file on 64GB 64-bit GNU/Linux box


From: Stefan Monnier
Subject: bug#2790: emacs 22.1.1 cannot open 5GB file on 64GB 64-bit GNU/Linux box
Date: Sun, 29 Mar 2009 16:10:26 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.91 (gnu/linux)

> The patch below does this:

>> -          || st.st_size > INT_MAX / 4)
>> +          /* Actually, it should test either INT_MAX or LONG_MAX
>> +             depending on which one is used for EMACS_INT.  But in
>> +             any case, in practice, this test is redundant with the
>> +             one above.
>> +             || st.st_size > INT_MAX / 4 */)
>> error ("Maximum buffer size exceeded");

> But what about the commentary immediately preceding the modified code:
>   The calculations below double the file size twice, so check that it
>   can be multiplied by 4 safely.

The patch also adds a comment explaining that this test is actually
redundant in practice (and it will stay redundant as long as our Lisp
integers have at least 2 bits of tag).
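
To spell out the arithmetic behind that claim, here is a minimal sketch.
It assumes a simplified model in which a Lisp integer is a signed machine
word that gives up some bits to a type tag; the function names are made up
for illustration and are not taken from the Emacs sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: largest Lisp integer when `tag_bits` bits of a
   signed word whose maximum is `word_max` are reserved for the tag.  */
int64_t
max_fixnum (int64_t word_max, int tag_bits)
{
  assert (word_max > 0 && tag_bits >= 0 && tag_bits < 62);
  return word_max >> tag_bits;
}

/* Nonzero if any size that fits in a Lisp integer can also be
   multiplied by 4 without exceeding word_max, i.e. if a separate
   "size > word_max / 4" test can never fire on its own.  */
int
times4_check_is_redundant (int64_t word_max, int tag_bits)
{
  return max_fixnum (word_max, tag_bits) <= word_max / 4;
}
```

Under this model, two or more tag bits make the largest Lisp integer no
bigger than a quarter of the word maximum, so the earlier buffer-size test
already covers the multiplication by 4; with a single tag bit it would not.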

> I'm not sure to which calculations it alludes, but if you think they
> are no longer relevant, please remove that part of the comment,
> otherwise we will wonder in a couple of years why the code does not do
> what the comment says it should.

Since I'm not sure either, I kept the comment and added another one
explaining why I removed the check anyway.

> Personally, I would change INT_MAX/4 to LONG_MAX/4, because that does
> TRT on all supported platforms, 32-bit and 64-bit alike (long and int
> are both 32-bit wide on 32-bit machines).  That would avoid too
> radical changes during a pretest, which is a Good Thing, IMO.

In that case I'd rather do the check more directly, e.g.:

    (((EMACS_INT)st.st_size)*4)/4 == st.st_size

But as explained, I'm not convinced the check is needed/useful.
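
For illustration only, a variant of such a direct check can be sketched as
below.  Signed overflow is undefined in ISO C, so this sketch compares
against the maximum divided by 4 instead of multiplying and dividing back;
that is a swapped-in technique, not the expression above.  The function
name and the explicit `emacs_int_max` parameter are assumptions made so the
32-bit and 64-bit cases can be tried side by side.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if file_size can be multiplied by 4 without exceeding
   emacs_int_max.  emacs_int_max stands in for the maximum of whatever
   type EMACS_INT is on the platform (an assumption of this sketch).  */
int
size_fits_times4 (int64_t file_size, int64_t emacs_int_max)
{
  assert (file_size >= 0 && emacs_int_max > 0);
  return file_size <= emacs_int_max / 4;
}
```

With a 5GB file (5368709120 bytes) this returns 0 when `emacs_int_max` is
`INT32_MAX` and 1 when it is `INT64_MAX`.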

>> Note also that when you open large files, it's worthwhile to use
>> find-file-literally to be sure it's opened in unibyte mode;
>> otherwise it gets decoded which takes ages.
> Perhaps the prompt we pop for large file should suggest visiting
> literally as an option.

Yes, that's also what I was thinking.  Together with having different
"large-threshold" values for unibyte and multibyte.

>> Also if the file has many lines (my
>> 800MB file was made up by copying a C file many times, so it had
>> millions of lines), turning off line-number-mode is needed to recover
>> responsiveness when navigating near the end of the buffer.

> Perhaps we should make the default value of line-number-display-limit
> non-nil, at least in 64-bit builds.

Agreed.  We could even do something better:
- do it more efficiently (once computed for a page, it should be able
  to update the count instantly when paging up/down, whereas it seems
  not to always be able to do that).
- when computing really would take a lot of time (e.g. we're far from
  the closest known line position), display ??? and postpone the actual
  computation to some future idle time.
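
A rough sketch of those two ideas, assuming a plain byte array as the
buffer text and a single cached (position, line) pair.  The struct, the
function and the `max_scan` budget are all hypothetical names; this is not
how the redisplay code is written.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_UNKNOWN (-1L)   /* caller would display "???" */

/* Hypothetical cache: the last position whose line number is known.  */
struct line_cache
{
  size_t pos;   /* byte offset into the text */
  long line;    /* 1-based line number at pos */
};

/* Return the line number at `target`, counting newlines only between
   the cached position and the target.  If that distance exceeds
   `max_scan`, return LINE_UNKNOWN and leave the cache untouched so the
   real computation can be postponed (e.g. to idle time).  */
long
line_number_at (const char *text, size_t len, size_t target,
                struct line_cache *cache, size_t max_scan)
{
  size_t from, distance, i;
  long line;

  assert (text != NULL && cache != NULL);
  assert (target <= len && cache->pos <= len);

  from = cache->pos;
  line = cache->line;
  distance = target > from ? target - from : from - target;
  if (distance > max_scan)
    return LINE_UNKNOWN;

  if (target >= from)
    {
      for (i = from; i < target; i++)
        if (text[i] == '\n')
          line++;
    }
  else
    {
      for (i = target; i < from; i++)
        if (text[i] == '\n')
          line--;
    }

  /* Remember the result so the next nearby query is cheap.  */
  cache->pos = target;
  cache->line = line;
  return line;
}
```

Paging up or down by one screen then only scans one screenful of text,
while a jump far away from the cached position yields LINE_UNKNOWN instead
of blocking.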

In any case, large files introduce lots of problems.


        Stefan





