From: Paolo Bonzini
Subject: Re: removing blank lines: "grep ." is really slow
Date: Fri, 16 Apr 2010 09:37:09 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.3
On 04/16/2010 02:04 AM, Ivan wrote:
> I used to use "grep ." for removing blank lines, until I realized how slow it is for large numbers of lines. So I switched to "grep -v '^$'", which is as fast as one would expect (well, not with the grep that comes with Mac OS X 10.5.8 (GNU grep 2.5.1), but this seems to have been fixed sometime between 2.5.1 and 2.6.3).
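(A quick, illustrative way to reproduce the difference Ivan describes; the file name and line count here are made up:

  yes '' | head -n 5000000 > blanks.txt        # 5 million blank lines
  time grep . blanks.txt > /dev/null            # slow under a UTF-8 locale with older greps
  time grep -v '^$' blanks.txt > /dev/null      # fast
)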
True. You'd need to expand the UTF-8 "." pattern into the appropriate character sets; then the faster single-byte character-set matcher can be used. It's on my todo list.
It wouldn't be exactly as fast as your grep -v solution (which is optimal and preferred), however, because it would still check that a character in the line is a valid UTF-8 character. In particular, it would be slow and have false negatives if your document is not UTF-8.
You can also use "LC_ALL=C grep .", which is fast and exactly equivalent to "grep -v '^$'".
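For the record, the variants mentioned in this thread, side by side (the file name is only an example):

  grep . file.txt              # slow in a multibyte locale with older greps; may miss lines that are not valid UTF-8
  grep -v '^$' file.txt        # fast; prints every line that is not completely empty
  LC_ALL=C grep . file.txt     # fast; single-byte matcher, same output as grep -v '^$'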
Paolo