Re: sort: memory exhausted with 50GB file
From: Bob Proulx
Subject: Re: sort: memory exhausted with 50GB file
Date: Fri, 25 Jan 2008 14:23:55 -0700
User-agent: Mutt/1.5.13 (2006-08-11)
Leo Butler wrote:
> -16 -2 -14 -5 1 1 0 0.3080808080808057 0 0.1540404040404028 0.3904338415207971
That should be fine.
> I have a dual processor machine, with each processor being an Intel Core 2
> Duo E6850, rated at 3GHz and cache 4096 kB, with 3.8GB total physical
> memory and 4GB swap space and two partitions on the hdd with 200GB and
> 140GB available space.
Sounds like a very nice machine.
> I am using sort v. 5.2.1 and v. 6.1 & v. 6.9. The former is installed as
> part of the RHEL OS and the latter two were compiled from the source at
> http://ftp.gnu.org/gnu/coreutils/ with the gcc v. 3.4.6 compiler.
All good so far. To nail down two more details, could you provide the
output of these commands?
uname -a
ldd --version | head -n1
file /usr/bin/sort ./sort
The first two will give us the kernel and libc versions. The last will
report whether the binaries are 32-bit or 64-bit.
> When I attempt to sort the file, with a command like
>
> ./sort -S 250M -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n -T /data -T /data2 -o out.sort in.txt
>
> sort rapidly chews up about 40-50% of total physical memory (=1.5-1.9GB) at
> which point the error message 'sort: memory exhausted' appears. This
> appears to be independent of the parameter passed through the -S option.
> ...
> Is this an idiosyncratic problem?
That is very strange. If by idiosyncratic you mean particular to your
system, then probably yes, because I have routinely sorted large files
without problem. But that doesn't mean it isn't a bug.
At 50G the data file is very large compared to your 4G of physical
memory, which means that sort cannot sort it in memory. As a first
pass it will open temporary files and write one sorted chunk after
another, splitting the input file into many sorted chunks. As a
second pass it will merge-sort the sorted chunks together into the
output file.
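The two-pass approach described above can be imitated by hand with split and sort -m; this is just an illustrative sketch on a tiny file (sort does the same thing internally, with chunk sizes governed by -S):

```shell
# Pass 1: cut the input into fixed-size chunks and sort each one.
seq 100 | shuf > in.txt          # sample unsorted input
split -l 25 in.txt chunk.        # split into 25-line chunks: chunk.aa, chunk.ab, ...
for f in chunk.*; do
    sort -n "$f" -o "$f"         # sort each chunk in place
done
# Pass 2: merge the already-sorted chunks into the final output.
sort -n -m chunk.* -o out.sort
rm -f chunk.*
```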
What is the output of this command on your system?
sysctl vm.overcommit_memory
I am asking because by default the Linux kernel overcommits memory and
does not return out-of-memory conditions to the process. Instead the
process (or some other one) is killed by the Linux out-of-memory
killer. But enterprise systems are often configured with overcommit
disabled for reliability reasons, and that appears to be how your
system is configured, because otherwise you wouldn't see an
out-of-memory message from sort. (I always disable overcommit so as
to avoid the out-of-memory killer.)
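For reference, the overcommit policy can be inspected and changed at run time; this is a sketch of the sysctl knob involved (changing it requires root):

```shell
# Query the current policy (0 = heuristic overcommit, 1 = always, 2 = never).
sysctl vm.overcommit_memory
# Disable overcommit; add the setting to /etc/sysctl.conf to make it persist.
sysctl -w vm.overcommit_memory=2
```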
Do you have user process limits active? What is the output of this
command?
ulimit -a
What does free say on your system?
free
> I have read backlogs of the list and people report sort-ing 100GB
> files. Do you have any ideas?
Without doing a lot of debugging I am wondering if your choice of
locale setting is affecting this. I doubt it, because all of the sort
fields are numeric, but since it is easy to test, could you try
sorting with LC_ALL=C and see if that makes a difference?
LC_ALL=C sort -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n -T /data -T /data2 -o out.sort in.txt
Also, could you determine how large the sort process is at the moment
it reports running out of memory? I am wondering if it is at a magic
size such as 2G or 4G, which could provide more insight into the
problem.
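One way to watch the size of a running sort is to poll ps while it runs; this is an illustrative sketch (the exact column keywords can vary slightly between ps implementations; RSS and VSZ are reported in KiB here):

```shell
# Start a sort in the background and poll its memory use until it exits.
seq 1000000 | shuf > big.txt
sort -n big.txt -o big.sort &
pid=$!
while kill -0 "$pid" 2>/dev/null; do
    ps -o rss=,vsz= -p "$pid" || true   # resident and virtual size, KiB
    sleep 0.2
done
wait "$pid"
```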
Bob