[Bug classpath/25976] New: Regex tokenizing
From: mike at saxonica dot com
Subject: [Bug classpath/25976] New: Regex tokenizing
Date: 26 Jan 2006 15:22:17 -0000
Consider the following program:
package test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizeTest {
    public static void main(String[] args) {
        String input = "The cat sat on the mat";
        int prevEnd = 0;
        Matcher matcher = Pattern.compile("[\\x20\\n\\r\\t]+").matcher(input);
        while (matcher.find()) {
            System.err.println("Match at " + matcher.start() + ": " +
                    input.substring(prevEnd, matcher.start()));
            prevEnd = matcher.end();
        }
        System.err.println("Remainder: " + input.substring(prevEnd));
    }
}
With the Sun JRE the output is:
Match at 3: The
Match at 7: cat
Match at 11: sat
Match at 14: on
Match at 18: the
Remainder: mat
With GNU Classpath, find() never succeeds, so the only output is:
Remainder: The cat sat on the mat
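
One possible way to narrow this down, offered only as a sketch: run the same find()/start()/end() loop over a few patterns that should all be equivalent for this input, and see which ones fail. The class name TokenizeCheck, the helper tokenize, and the choice of alternative patterns are assumptions made for illustration; nothing in this report identifies which construct is at fault.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original report.
final class TokenizeCheck {

    // Splits input on every match of regex using the same
    // find()/start()/end() loop as the program above, and returns
    // the tokens, including the trailing remainder.
    static List<String> tokenize(String input, String regex) {
        List<String> tokens = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int prevEnd = 0;
        while (matcher.find()) {
            tokens.add(input.substring(prevEnd, matcher.start()));
            prevEnd = matcher.end();
        }
        tokens.add(input.substring(prevEnd));
        return tokens;
    }

    public static void main(String[] args) {
        String input = "The cat sat on the mat";
        // Candidate patterns are guesses at what might matter.
        String[] candidates = {
            "[\\x20\\n\\r\\t]+", // pattern from the report
            "[ \\n\\r\\t]+",     // literal space instead of \x20
            "\\x20+",            // hex escape outside a character class
            " +"                 // plain literal space
        };
        for (String regex : candidates) {
            System.err.println(regex + " -> " + tokenize(input, regex));
        }
    }
}
```

On an implementation that behaves like the Sun JRE, every pattern should yield the same six tokens (The, cat, sat, on, the, mat). Any pattern that instead returns the whole input as a single token would point at the construct worth investigating.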
Michael Kay
--
Summary: Regex tokenizing
Product: classpath
Version: 0.20
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: classpath
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: mike at saxonica dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25976