[Octave-bug-tracker] [bug #59992] regexp: behaviour of \> (end of a word
From: John W. Eaton
Subject: [Octave-bug-tracker] [bug #59992] regexp: behaviour of \> (end of a word) inconsistent with MATLAB
Date: Tue, 2 Feb 2021 13:33:44 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0
Follow-up Comment #1, bug #59992 (project octave):
It looks like Octave is doing what Emacs does:
‘\>’
matches the empty string, but only at the end of a word.
‘\>’ matches at the end of the buffer only if the contents
end with a word-constituent character.
‘\w’
matches any word-constituent character.
The syntax table determines which characters these are.
While Matlab says
expr\>
Matches: The end of a word.
Example: '\w*e\>' matches any words ending with e.
The set of word-constituent characters in both Octave and Matlab appears to be
the set [a-zA-Z_0-9], but I guess Matlab allows an arbitrary character to be
considered the final character of a word? Can it be more than one character?
For example, what do the following expressions do?
[b, e] = regexp ('foo!+bar', '\w+\>')
[b, e] = regexp ('foo?!+bar', 'foo?!\>')
Is there an easy way to get PCRE to work differently here and allow any
character(s) to be treated specially as the end of a word?
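For comparison, the Emacs-style rule quoted above (the empty string at a position preceded by a word-constituent character and not followed by one) can be written with lookarounds. The following is only a sketch in Python's re module, not Octave or PCRE code. The helper name spans, the naive textual substitution of \>, and the use of re.ASCII to restrict \w to [a-zA-Z_0-9] are all assumptions made for illustration. It shows what that rule gives for the two test strings; it says nothing about what Matlab returns.

```python
import re

# Emacs-style end of word: the previous character is a word character
# and the next one is not (or the input ends here).
END_OF_WORD = r"(?<=\w)(?!\w)"


def spans(pattern, text):
    """Return 1-based inclusive (start, end) pairs, similar to [b, e] in Octave.

    Hypothetical helper: it naively substitutes every literal \\> in the
    pattern, which is enough for the two simple patterns below.
    """
    rx = re.compile(pattern.replace(r"\>", END_OF_WORD), re.ASCII)
    return [(m.start() + 1, m.end()) for m in rx.finditer(text)]


# Both "foo" and "bar" end at a word boundary.
print(spans(r"\w+\>", "foo!+bar"))      # [(1, 3), (6, 8)]

# No match: the ? is a quantifier here, and in any case "!" is not a
# word character, so nothing ending in "!" can sit at an end of word.
print(spans(r"foo?!\>", "foo?!+bar"))   # []
```

Under this rule a pattern whose last consumed character is not in [a-zA-Z_0-9] can never be followed by \>, so if Matlab does return a match for such a pattern, its definition of the end of a word must differ.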
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?59992>