bug-gnu-emacs


From: Nathan Trapuzzano
Subject: bug#11216: 23.4; parenthesis matching breaks on certain complex expressions
Date: Tue, 10 Apr 2012 20:18:47 -0400

Here's a complex regular expression that breaks parenthesis matching
(and yes, it is a real regular expression, generated by a real Perl
program).


M[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*H[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*\=[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*N(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c])
 (?<!S\d)(?<!\-\ [@"]\d\ 
[\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])(?:A\)[\/\\]|\*\)[\/\\]A[\=\/\\]?)[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*I[\=\/\\]?[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*D[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[\=\/\\]?(?![\x21\x27\x2a\x2d\x2f\x3d\x41-\x5a\x5c\x61-\x7a\x7c])
 (?<!S\d)(?<!\-\ [@"]\d\ 
[\x80-\xff])(?<!\-[\x80-\xff][\x80-\xff])(?<!\-[\x80-\xff])(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c]\*)(?<![\x27-\x29\x2f\x3d\x41-\x5a\x7c])Q[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*E[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*A[?+]?(?:-(?:[\x01-\x7f]*[\x00\x80-\xff]+))?[\x02-\x19\x22-\x27\x28-\x2e\x30-\x3c\x3e-\x40\x5b\x5d-\x7b\x7d-\xff]*[\/\\]

Matching breaks at the open parenthesis immediately following the
first (?<!S\d). I suspect this is caused by the opening double-quote
about 10 characters later, which may be taken as the start of a string.

I noticed that the behavior of show-paren-mode changes depending on the
major mode. For example, the behavior described above happens in
Fundamental mode, whereas when I switch to Text mode, quotation marks
are ignored. However, switching to Text mode also causes paren matching
to ignore backslashes, so escaped parentheses and brackets are counted
as real ones. I think the best fix would be to make show-paren-mode
customizable so that the user can specify which characters should be
ignored when matching parentheses.
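
To make the two behaviors concrete, here is a minimal Python sketch of a
paren scanner whose string-quote and escape characters are configurable.
It is only a model of the symptom, not how show-paren-mode is actually
implemented; the name find_matching_paren and its parameters are
hypothetical, and it tracks round parentheses only.

```python
def find_matching_paren(text, open_pos, string_quotes='"', escape_chars="\\"):
    """Return the index of the ')' matching the '(' at open_pos, or None.

    string_quotes: characters that open/close a string (parens inside are ignored).
    escape_chars:  characters that protect the single character after them.
    Both are illustrative knobs, not real show-paren-mode options.
    """
    if text[open_pos] != "(":
        raise ValueError("open_pos must point at '('")
    depth = 0
    in_string = None  # the quote character that opened the current string
    i = open_pos
    while i < len(text):
        ch = text[i]
        if ch in escape_chars:
            i += 2  # skip the escape character and the character it protects
            continue
        if in_string:
            if ch == in_string:
                in_string = None
        elif ch in string_quotes:
            in_string = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return None  # no match found (for example, swallowed by an open string)


# Fragment taken from the regexp above: the '"' opens a string that never closes.
sample = r'(?<!\-\ [@"]\d\ [\x80-\xff])(?<!\-[\x80-\xff])'
quotes_and_escapes = find_matching_paren(sample, 0)               # like Fundamental mode
escapes_only = find_matching_paren(sample, 0, string_quotes="")   # the behavior I want

# Fragment with escaped parens: ignoring backslashes matches the wrong ')'.
alt = r'(?:A\)[\/\\]|\*\)[\/\\]A)'
nothing_special = find_matching_paren(alt, 0, string_quotes="", escape_chars="")  # like Text mode
escapes_only_alt = find_matching_paren(alt, 0, string_quotes="")
```

In this model, the first call finds no match because the closing
parenthesis falls inside the unterminated string, and the Text-mode-like
call stops at the escaped \) instead of the real closing parenthesis.
Only the combination "honor backslashes, ignore quotes" pairs the
parentheses the way the regexp intends, which is the kind of per-user
setting I am asking for.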

I've also attached a file containing the regexp in question in case the
long line gets broken up in mail transmission.

Attachment: regexp.txt
Description: Text document

