bug-coreutils

Re: efficient version of 'sort | uniq -c | sort -n'?


From: Matthew Woehlke
Subject: Re: efficient version of 'sort | uniq -c | sort -n'?
Date: Mon, 21 May 2007 14:03:17 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.10) Gecko/20070221 Thunderbird/1.5.0.10 Mnenhy/0.7.4.0

James Youngman wrote:
> On 5/21/07, Matthew Woehlke <address@hidden> wrote:
>> Is there an efficient implementation of 'sort | uniq -c | sort -n'? I
>> have a 4 GB core file I want to run 'strings' on, and the above is
>> really slow.
>
> I would suggest that the appropriate factorisation would be
>
>     countitems | sort -n
>
> Here, countitems could be "sort" with some options or "uniq" with some
> options...

I thought about that, but /maximum/ efficiency is only achievable by doing everything in one go. Anyway, I think 'countitems' would still be a big improvement; I would add it as a new option, 'sort --unique-with-count' (preferably with a short alias, 'sort -U'), since IMO this is a missing feature of 'sort -u'.
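
To make the 'countitems' idea concrete, here is a rough sketch in Python of counting identical lines in a single pass with a hash table, so that only the (usually much smaller) set of distinct lines has to be sorted afterwards. This is not an existing coreutils feature; the names count_items and sorted_by_count are made up for illustration, and the sketch assumes the set of distinct lines fits in memory.

```python
from collections import Counter


def count_items(lines):
    """Count identical lines in one pass, without sorting the input first."""
    counts = Counter()
    for line in lines:
        # Strip the trailing newline so "foo\n" and "foo" count as the same item.
        counts[line.rstrip("\n")] += 1
    return counts


def sorted_by_count(counts):
    """Return (count, item) pairs in ascending order of count.

    Ties are broken by comparing the items themselves.
    """
    return sorted((n, item) for item, n in counts.items())


# Small in-memory demonstration; a real filter would pass sys.stdin instead.
sample = ["b\n", "a\n", "b\n", "c\n", "b\n", "a\n"]
for n, item in sorted_by_count(count_items(sample)):
    print(f"{n:7d} {item}")
```

The trade-off is memory: the table holds one entry per distinct line, which is fine when the input is highly repetitive but may not be when nearly every line is unique. An external sort, by contrast, can spill to temporary files.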

--
Matthew
When in doubt, duct tape!




