Re: java.net.URI implementation


From: Giannis Georgalis
Subject: Re: java.net.URI implementation
Date: 11 Feb 2003 02:39:54 +0200
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

Dalibor Topic <address@hidden> writes:

> I doubt adding basic IPv6 parsing to the regexp used
> should pose significant problems.
> 
> > For example the
> > uri : "http://1333.2123.232323.0.9.9~84.1" is not
> > valid, but can be
> > parsed from this regexp.
> 
> You are mixing things up here. That's a valid URI.
> Sun's JDK 1.4.1_01 on linux prints for a trivial test
> program:
 
> Here's my question for you, as you've said you've read
> the URI RFCs: which section of the URI RFC does the
> URI you considered not valid violate?

I'm sorry, I was thinking of the "host" part of the rule (I missed the
"reg_name"). However, this regex accepts any URI that contains spaces
where it shouldn't. To be honest, I only had a quick look at your
implementation, and with the above example I thought that you assigned
that 1333.21... number to the host.
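
To illustrate the host/reg_name distinction, here is a small sketch
against java.net.URI. The class and method names (RegNameDemo, parse,
accepts) are made up for this example. As far as I can tell, the
authority above is kept as a registry-based name, so the host stays
unset, while a space in the authority is rejected:

```java
import java.net.URI;
import java.net.URISyntaxException;

final class RegNameDemo {
    // Hypothetical helper: parse or fail with an unchecked exception.
    static URI parse(String s) {
        try {
            return new URI(s);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: true if java.net.URI accepts the string.
    static boolean accepts(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        URI u = parse("http://1333.2123.232323.0.9.9~84.1");
        // The authority is not a valid server (hostname or IPv4 address),
        // so it should be treated as a reg_name and no host is assigned.
        System.out.println("authority = " + u.getAuthority());
        System.out.println("host      = " + u.getHost());
        // A space is not allowed in the authority component.
        System.out.println("with space accepted? "
                + accepts("http://exa mple.org/"));
    }
}
```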

> That's nice. But it's overkill. 
> 
> You can achieve the same effect by using the regexp to
> separate URI components and doing some post-processing
> (preferably using simple regexps) on the generated
> Strings to ensure they contain only allowed
> characters, to get the port number of hierarchical
> URIs etc.

This regexp only breaks the problem down into smaller parts. And the
post-processing you are talking about is parsing those small parts to
ensure that the URI is valid (and to fill in the host, userinfo,
etc. Strings).
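
A rough sketch of what I mean by the two steps. The splitting pattern
is the one from RFC 2396, Appendix B; the SERVER pattern and the names
UriSplitSketch, split and server are my own simplifications for this
example, not code from the patch. Note that the first step alone
accepts spaces; only the second step can reject them:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UriSplitSketch {
    // Component-splitting regexp from RFC 2396, Appendix B.
    static final Pattern SPLIT = Pattern.compile(
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Assumed, simplified check for a server-based authority:
    // optional userinfo, then host, then optional port.
    static final Pattern SERVER = Pattern.compile(
        "^(?:([^@\\s]*)@)?([A-Za-z0-9.-]+)(?::([0-9]*))?$");

    // Step 1: returns {scheme, authority, path, query, fragment};
    // components that are absent are null.
    static String[] split(String uri) {
        Matcher m = SPLIT.matcher(uri);
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)
        };
    }

    // Step 2: returns {userinfo, host, port}, or null when the
    // authority does not look like a server (reg_name or invalid).
    static String[] server(String authority) {
        Matcher m = SERVER.matcher(authority);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("http://user@example.org:8080/a/b?q=1#top");
        String[] srv = server(parts[1]);
        System.out.println("host = " + srv[1] + ", port = " + srv[2]);
        // The split succeeds despite the space; the second step fails.
        String[] bad = split("http://exa mple.org/a b");
        System.out.println("server part valid? " + (server(bad[1]) != null));
    }
}
```

So the second step is still a parser for each component, just a
smaller one.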

> I could have implemented URI parsing using a parser
> generator, but it seemed to me like the wrong solution
> to the problem: instead of simple regexp and 20 lines,
> you get a compile time dependency on a parser
> generator, x lines for the grammar + y lines for the
> generated code. I think your grammar alone is bigger
> than my parsing code.

Yes, I decided to write the parser by hand. I agree that my initial
idea was not wise. But as you state, your implementation is not
complete, so do not compare it with a proper parser, because it is not
one.

> I would humbly propose using my code and fixing its
> shortcomings, but I can't force anyone to use it ;) I
> know fully well that it is not a full implementation
> of java.net.URI (and I think I've stated that in the
> mail accompanying the patch), but it is a good
> starting point, in my opinion. It's certainly good
> enough to run Saxon 7.3 on kaffe ;)

Given that Classpath's java.util.regex.Matcher and
java.util.regex.Pattern classes are not implemented, I wouldn't want
to use regexps. I can follow the same philosophy, though: break the
initial problem into the smaller parts you are suggesting and go on
from there, so that it will be easy to switch once the regex classes
are implemented and prove more efficient.
No hard feelings ;-)
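
As a sketch of that approach, the same top-level split can be done with
plain String operations and no java.util.regex at all. The class and
method names are hypothetical, and the code is only meant to mirror the
RFC 2396 Appendix B decomposition, not to validate anything:

```java
final class HandSplitSketch {
    // Returns {scheme, authority, path, query, fragment};
    // components that are absent are null (the path is never null).
    static String[] split(String s) {
        String scheme = null, authority = null, query = null, fragment = null;

        // Fragment: everything after the first '#'.
        int hash = s.indexOf('#');
        if (hash >= 0) {
            fragment = s.substring(hash + 1);
            s = s.substring(0, hash);
        }
        // Query: everything after the first '?' that precedes the fragment.
        int q = s.indexOf('?');
        if (q >= 0) {
            query = s.substring(q + 1);
            s = s.substring(0, q);
        }
        // Scheme: a non-empty prefix before the first ':' with no '/' in it.
        int colon = s.indexOf(':');
        int slash = s.indexOf('/');
        if (colon > 0 && (slash < 0 || colon < slash)) {
            scheme = s.substring(0, colon);
            s = s.substring(colon + 1);
        }
        // Authority: introduced by "//" and ended by the next '/'.
        if (s.startsWith("//")) {
            int end = s.indexOf('/', 2);
            if (end < 0) {
                end = s.length();
            }
            authority = s.substring(2, end);
            s = s.substring(end);
        }
        // Whatever is left is the path.
        return new String[] { scheme, authority, s, query, fragment };
    }

    public static void main(String[] args) {
        String[] p = split("http://user@example.org:8080/a/b?q=1#top");
        for (String part : p) {
            System.out.println(part);
        }
    }
}
```

Swapping this method for a regexp-based one later would leave the
per-component parsing untouched.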

-- 
 Object-oriented programming is an exceptionally bad 
idea which could only have originated in California.
    - Edsger Dijkstra (attributed)
