Re: GREP - reg exp to find words ending with .V and .TO
From: Matthew Woehlke
Subject: Re: GREP - reg exp to find words ending with .V and .TO
Date: Thu, 04 Jan 2007 11:11:32 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.9) Gecko/20061206 Thunderbird/1.5.0.9 Mnenhy/0.7.4.0
(Sorry if this double-posts, gmane claims to drop all but the last post
when one hasn't been authorized yet, and this one seems to have lost.)
swiftguy wrote:
> Hello Experts,
> Im trying to extract the words ending with .V or .TO from the following
> list
> [list snipped]
> I tried the following options , but im not able to construct the correct
> regular expression
> $ grep -o '[ A-Z][a-z].V' StockList30.txt
> $ grep -o '[ A-Z][a-z].TO' StockList30.txt
Well, no, that will match a space or uppercase letter (but only if your
locale is "C", otherwise be prepared for surprises), followed by a
lowercase letter (same comment), followed by any single character (an
unescaped '.' matches anything) and then "V" (for the first example; the
second will of course match "TO" instead). If you need to match WORDS,
then you want:
$ grep -o '[[:alpha:]]\+\.\(V\|TO\)' StockList30.txt
This matches one or more ('\+') letters(*) ('[[:alpha:]]') followed by '.'
('\.' <-- need to escape because '.' means 'any character'), followed by
either 'V' or 'TO' ('\(V\|TO\)'). The parentheses (which must be escaped
with '\' to be interpreted with special meaning rather than as literal
characters) form a sub-expression. The '|' (same note as the ()'s) means
'or' (you can use as many as needed, e.g. 'V\|TO\|KNR\|XYZVW').
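To see the pattern in action, here is a small sketch. The sample lines are made up (the real list was snipped), and it assumes GNU grep, since '\+' and '\|' are GNU extensions to basic regular expressions. Note the doubled brackets: a class name such as [:alpha:] is only recognized inside a bracket expression.

```shell
# Hypothetical sample data standing in for the snipped stock list.
sample='Abc Corp ABC.V
Big Bank BB.TO
Plain Name PLAIN
Xyz Mines XYZ.V'

# -o prints only the matched part of each line, one match per output line.
matches=$(printf '%s\n' "$sample" | grep -o '[[:alpha:]]\+\.\(V\|TO\)')
printf '%s\n' "$matches"
```

The line without a '.V' or '.TO' suffix produces no output, and only the ticker itself is printed for the others.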
If you need to match whole words, add '\b' (matches at a word boundary,
i.e. the edge between a word character and whitespace, punctuation, or
BOL/EOL) to the start and/or end of the expression as needed. To match
whole lines, add '^' (match BOL) to the beginning and '$' (match EOL) to
the end.
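A rough comparison of the three variants, again on invented sample lines and assuming GNU grep ('\b' is a GNU extension):

```shell
# Hypothetical sample data; "XYZ.TOP" is there to show the effect of '\b'.
sample='ABC.V
XYZ.TOP
Ticker DEF.TO here
GHI.TO'

# Unanchored: also picks "XYZ.TO" out of "XYZ.TOP".
loose=$(printf '%s\n' "$sample" | grep -o '[[:alpha:]]\+\.\(V\|TO\)')

# Whole words only: '\b' on both sides rejects "XYZ.TOP".
words=$(printf '%s\n' "$sample" | grep -o '\b[[:alpha:]]\+\.\(V\|TO\)\b')

# Whole lines only: '^' and '$' reject lines with anything else on them.
lines=$(printf '%s\n' "$sample" | grep '^[[:alpha:]]\+\.\(V\|TO\)$')

printf 'loose:\n%s\nwords:\n%s\nlines:\n%s\n' "$loose" "$words" "$lines"
```

Which variant is right depends on whether the list has one ticker per line or tickers embedded in other text.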
And read 'info grep' for more on regular expressions. :-)
(* A bare '[:alpha:]' does not work: a class name is only recognized
inside a bracket expression, so it has to be written '[[:alpha:]]'.
'[A-Za-z]' *should* also be safe if your collating order doesn't stick
other funny characters in there, but, as above, don't expect '[A-Z]' to
only match uppercase unless you set LC_ALL=C.)
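If only uppercase tickers should match, one way to sidestep the collation issue is to force the "C" locale for that single command. A minimal sketch on made-up input, assuming GNU grep:

```shell
# Hypothetical sample data: one lowercase and one uppercase ticker.
sample='abc.V
ABC.V'

# LC_ALL=C applies to this grep invocation only; in the "C" locale
# '[A-Z]' is exactly the 26 ASCII uppercase letters.
upper_only=$(printf '%s\n' "$sample" | LC_ALL=C grep -o '[A-Z]\+\.V')
printf '%s\n' "$upper_only"
```

Without LC_ALL=C the result of '[A-Z]' depends on the locale's collating order, so the lowercase line may or may not match.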
--
Matthew
Caution: keep out of reach of adults.