help-source-highlight

Re: [Help-source-highlight] regexp in javascript.lang (3rd try!)


From: Lorenzo Bettini
Subject: Re: [Help-source-highlight] regexp in javascript.lang (3rd try!)
Date: Sun, 21 Dec 2008 18:39:54 +0100
User-agent: Thunderbird 2.0.0.18 (X11/20081125)

address@hidden wrote:
Last time I suggested an ugly regexp definition for
javascript.lang to avoid matching /* */ comments:

http://lists.gnu.org/archive/html/help-source-highlight/2008-09/msg00000.html

On second thought (or third thought) I don't like this because it
matches cases where there are two division operators in a single
expression, such as:

document.write('<table><tr><td>25% = '+(25/100)+'</td></tr></table>');
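To make the failure concrete, here is a minimal Python sketch (not part of source-highlight). The earlier definition from the linked message is not reproduced in this mail, so the sketch assumes a stand-in: the regexp rule from the proposal below, applied on its own with no division rules in front of it. The name `REGEXP_LITERAL` is hypothetical.

```python
import re

# Python spelling of a regexp-literal rule, used here in isolation.
# Assumption: this stands in for the earlier definition, which is not shown.
REGEXP_LITERAL = re.compile(r"/(\\.|[^*\\/])(\\.|[^\\/])*/[gim]*")

line = "document.write('<table><tr><td>25% = '+(25/100)+'</td></tr></table>');"

# Without a rule that consumes "25/" first, the division slash is taken as
# the start of a regular expression that runs up to the next slash.
match = REGEXP_LITERAL.search(line)
false_positive = match.group(0) if match else None
print(false_positive)  # prints /100)+'</
```

The span from the slash in 25/100 to the slash in </td> would be highlighted as a regular expression, which is the kind of false match described above.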

Here is a proposed javascript.lang to fix these problems.  It does the
following:
* first check if the input matches a comment
* next check if it matches a division operator, which can occur only
after a number, an identifier, or certain symbols
* finally check if it matches a regular expression
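The three checks above can be sketched in Python to see how the ordering plays out. This is only an illustration under stated assumptions, not source-highlight's implementation: it assumes that the rule whose match starts earliest wins and that definition order breaks ties, it spells out the POSIX classes because Python's `re` does not support them, and it uses simplified comment and string patterns as stand-ins for c_comment.lang and c_string.lang. The names `RULES` and `classify` are hypothetical.

```python
import re

# Rules in definition order: comment, division (three forms), regexp, string.
RULES = [
    ("comment", re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)),
    # division after ++, --, ) or ]
    ("division", re.compile(r"(?:\+\+|--|\)|\])\s*/=?(?![*/])")),
    # division after a number
    ("division", re.compile(
        r"(?:0x[0-9A-Fa-f]+|(?:[0-9]*\.)?[0-9]+(?:[eE][+-]?[0-9]+)?)\s*/(?![*/])")),
    # division after an identifier
    ("division", re.compile(r"[A-Za-z$_][A-Za-z0-9$_]*\s*/=?(?![*/])")),
    ("regexp", re.compile(r"/(\\.|[^*\\/])(\\.|[^\\/])*/[gim]*")),
    # simplified stand-in for c_string.lang
    ("string", re.compile(r'"(?:\\.|[^"\\])*"' r"|'(?:\\.|[^'\\])*'")),
]


def classify(src):
    """Return (rule name, matched text) pairs, scanning left to right."""
    tokens = []
    pos = 0
    while pos < len(src):
        best = None
        for name, pattern in RULES:
            m = pattern.search(src, pos)
            # Strict "<" keeps the earlier-defined rule when starts are equal.
            if m and (best is None or m.start() < best[1].start()):
                best = (name, m)
        if best is None:
            break  # no rule matches in the rest of the input
        name, m = best
        tokens.append((name, m.group(0)))
        pos = m.end()
    return tokens


line = "document.write('<table><tr><td>25% = '+(25/100)+'</td></tr></table>');"
for name, text in classify(line):
    print(name, text)
```

On the example line, the sketch reports two string tokens and the division `25/`, and no regexp token: the number rule consumes the slash in 25/100 before the regexp rule can start there.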

Note that it is no longer based on java.lang because the order of
the definitions is important. (Hence, this would not work with
source-highlight 2.10, where the matching algorithm was different, but
it does work with source-highlight 2.11.)

The disadvantages:
* it no longer reuses java.lang
* the division operator definitions are ugly

The advantages:
* it works in all possible cases (I hope)
* it simplifies the regexp definition

What do you think?

include "c_comment.lang"

keyword =
"abstract|break|case|catch|class|const|continue|debugger|default|delete|do|else|enum|export|extends|false|final|finally|for|function|goto|if|implements|in|instanceof|interface|native|new|null|private|protected|prototype|public|return|static|super|switch|synchronized|throw|throws|this|transient|true|try|typeof|var|volatile|while|with"

(symbol,normal,symbol) = `(\+\+|--|\)|\])(\s*)(/=?(?![*/]))`
(number,normal,symbol) =
`(0x[[:xdigit:]]+|(?:[[:digit:]]*\.)?[[:digit:]]+(?:[eE][+-]?[[:digit:]]+)?)(\s*)(/(?![*/]))`
(normal,symbol) = `([[:alpha:]$_][[:alnum:]$_]*\s*)(/=?(?![*/]))`

regexp = '/(\\.|[^*\\/])(\\.|[^\\/])*/[gim]*'

include "number.lang"

include "c_string.lang"

include "symbols.lang"

cbracket = "{|}"

include "function.lang"


Actually, it also works this way, and it reuses most of java.lang (see
the attached file).

What do you think?

cheers
        Lorenzo

--
Lorenzo Bettini, PhD in Computer Science, DI, Univ. Torino
ICQ# lbetto, 16080134     (GNU/Linux User # 158233)
HOME: http://www.lorenzobettini.it MUSIC: http://www.purplesucker.com
http://www.myspace.com/supertrouperabba
BLOGS: http://tronprog.blogspot.com  http://longlivemusic.blogspot.com
http://www.gnu.org/software/src-highlite
http://www.gnu.org/software/gengetopt
http://www.gnu.org/software/gengen http://doublecpp.sourceforge.net

# Javascript lang definition file

# first check if the input matches a comment
include "c_comment.lang"

# next check if it matches a division operator, which can occur only
# after a number, an identifier, or certain symbols
(symbol,normal,symbol) = 
        `(\+\+|--|\)|\])(\s*)(/=?(?![*/]))`
(number,normal,symbol) =
        `(0x[[:xdigit:]]+|(?:[[:digit:]]*\.)?[[:digit:]]+(?:[eE][+-]?[[:digit:]]+)?)(\s*)(/(?![*/]))`
(normal,symbol) = 
        `([[:alpha:]$_][[:alnum:]$_]*\s*)(/=?(?![*/]))`

# finally check if it matches a regular expression 
regexp = '/(\\.|[^*\\/])(\\.|[^\\/])*/[gim]*' 

include "java.lang"

subst keyword = 
"abstract|break|case|catch|class|const|continue|debugger|default|delete|do|else|enum|export|extends|false|final|finally|for|function|goto|if|implements|in|instanceof|interface|native|new|null|private|protected|prototype|public|return|static|super|switch|synchronized|throw|throws|this|transient|true|try|typeof|var|volatile|while|with"
