bug-grep

Re: GREP - reg exp to find words ending with .V and .TO


From: Matthew Woehlke
Subject: Re: GREP - reg exp to find words ending with .V and .TO
Date: Thu, 04 Jan 2007 11:11:32 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.9) Gecko/20061206 Thunderbird/1.5.0.9 Mnenhy/0.7.4.0

(Sorry if this double-posts; gmane claims to drop all but the last post when one hasn't been authorized yet, and this one seems to have been lost.)

swiftguy wrote:
Hello Experts,

Im trying to extract the words ending with .V or .TO  from the following
list
[list snipped]

I tried the following options , but im not able to construct the correct
regular expression
$ grep -o '[ A-Z][a-z].V' StockList30.txt
$ grep -o '[ A-Z][a-z].TO' StockList30.txt

Well, no, that will match a space or uppercase letter (but only if your
locale is "C", otherwise be prepared for surprises), followed by a
lowercase letter (same comment), followed by any single character (an
unescaped '.' matches anything, not just a literal dot), followed by "V"
(for the first example; the second will of course match "TO" instead).
If you need to match WORDS, then you want:

$ grep -o '[[:alpha:]]\+\.\(V\|TO\)' StockList30.txt

This matches one or more ('\+') letters(*) ('[[:alpha:]]') followed by '.'
('\.' <-- need to escape because '.' means 'any character'), followed by
either 'V' or 'TO' ('\(V\|TO\)'). The parentheses (which must be escaped
with '\' to be interpreted with special meaning rather than as literal
characters) form a sub-expression. The '|' (same note as the ()'s) means
'or' (you can use as many as needed, e.g. 'V\|TO\|KNR\|XYZVW'). As in
your attempts, '-o' prints only the matched text rather than the whole
line.
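
As a quick sanity check of the expression described above, here is a small sketch run against a few made-up symbols. The sample lines are hypothetical (the real list was snipped), and it assumes GNU grep, since '\+' and '\|' are GNU extensions to basic regular expressions.

```shell
# Hypothetical sample data standing in for the snipped stock list.
sample='ABC.V
XYZ.TO
FOO.N
BAR.TO Corp'

# -o prints only the matched text; LC_ALL=C keeps character matching predictable.
matches=$(printf '%s\n' "$sample" | LC_ALL=C grep -o '[[:alpha:]]\+\.\(V\|TO\)')
printf '%s\n' "$matches"
```

With this input only the three symbols ending in .V or .TO should be printed; the FOO.N line has nothing to match.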

If you need to match whole words, add '\b' (matches at a word boundary,
i.e. the edge between a word character and whitespace, punctuation, or
BOL/EOL) to the start and/or end of the expression as needed. To match
whole lines, add '^' (match BOL) to the beginning and '$' (match EOL) to
the end.
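
To illustrate the difference between the word-boundary and whole-line variants, here is a rough sketch. The input lines are invented for the example, and '\b' is assumed to be available as in GNU grep.

```shell
# Hypothetical input: a bare symbol, a longer token, and a symbol mid-line.
sample='ABC.V
ABC.VX
buy XYZ.TO now'

# \b on both sides: the match must start and end at a word boundary,
# so ABC.VX is skipped because 'V' is followed by another letter.
whole_words=$(printf '%s\n' "$sample" | LC_ALL=C grep -o '\b[[:alpha:]]\+\.\(V\|TO\)\b')

# ^ and $: the entire line must consist of the symbol and nothing else.
whole_lines=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[[:alpha:]]\+\.\(V\|TO\)$')

printf 'words:\n%s\nlines:\n%s\n' "$whole_words" "$whole_lines"
```

The '\b' form still finds XYZ.TO in the middle of a line, while the anchored form only accepts lines that contain the symbol by itself.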

And read 'info grep' for more on regular expressions. :-)

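(* The class name only works inside a bracket expression, i.e.
'[[:alpha:]]'; a bare '[:alpha:]' is just a set containing ':', 'a', 'l',
'p' and 'h', and GNU grep may even reject it with an error.
'[A-Za-z]' *should* be safe if your collating order doesn't stick other
funny characters in there, but, as above, don't expect '[A-Z]' to only
match uppercase unless you set LC_ALL=C.)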
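
A minimal sketch of the locale point, again with made-up input: forcing LC_ALL=C for the one command makes '[A-Z]' cover exactly the ASCII uppercase letters, and '[[:upper:]]' avoids relying on a range at all.

```shell
# Hypothetical input mixing lowercase and uppercase symbols.
sample='abc.V
ABC.V'

# With LC_ALL=C set for this command, [A-Z] matches only ASCII uppercase letters.
range_match=$(printf '%s\n' "$sample" | LC_ALL=C grep -o '[A-Z]\+\.V')

# [[:upper:]] names the class directly instead of depending on collation order.
class_match=$(printf '%s\n' "$sample" | LC_ALL=C grep -o '[[:upper:]]\+\.V')

printf '%s\n%s\n' "$range_match" "$class_match"
```

Setting the variable on the command line like this affects only that grep invocation, not the rest of the shell session.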

--
Matthew
Caution: keep out of reach of adults.




