classpath

Re: java.net.URI implementation


From: Stephen Crawley
Subject: Re: java.net.URI implementation
Date: Tue, 11 Feb 2003 03:32:10 +1000

> After some digging in various RFCs I have written a (complete)
> grammar (in BNF) for parsing URIs (I'll append the grammar at the end
> of this message).

While the complete URI grammar looks complex, a URI string typically
doesn't need to be fully parsed.  You only need to fully parse the
components that are requested.
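
To illustrate the idea, here is a minimal sketch, not a proposed
implementation. The class name LazyUri and its accessors are hypothetical.
It assumes the generic component split described in RFC 2396 Appendix B
(scheme, authority, path, query, fragment) and validates nothing. The
constructor only locates the delimiters; the authority is broken down into
host and port the first time either one is requested.

```java
// Hypothetical sketch: split cheaply up front, parse sub-components on demand.
final class LazyUri {
    private final String scheme;     // null if absent
    private final String authority;  // raw text, null if absent
    private final String path;       // raw text, never null
    private final String query;      // null if absent
    private final String fragment;   // null if absent

    // Filled in lazily by parseAuthority().
    private boolean authorityParsed;
    private String host;
    private int port = -1;

    LazyUri(String s) {
        int end = s.length();
        int hash = s.indexOf('#');
        fragment = hash >= 0 ? s.substring(hash + 1) : null;
        if (hash >= 0) end = hash;

        int q = s.indexOf('?');
        if (q >= 0 && q < end) {
            query = s.substring(q + 1, end);
            end = q;
        } else {
            query = null;
        }

        // A scheme is whatever precedes the first ':' as long as no '/'
        // comes before it. No character validation is done here.
        int pos = 0;
        int colon = s.indexOf(':');
        int slash = s.indexOf('/');
        if (colon > 0 && colon < end && (slash < 0 || slash > colon)) {
            scheme = s.substring(0, colon);
            pos = colon + 1;
        } else {
            scheme = null;
        }

        if (s.startsWith("//", pos) && pos + 2 <= end) {
            int pathStart = s.indexOf('/', pos + 2);
            if (pathStart < 0 || pathStart > end) pathStart = end;
            authority = s.substring(pos + 2, pathStart);
            pos = pathStart;
        } else {
            authority = null;
        }
        path = s.substring(pos, end);
    }

    String getScheme()    { return scheme; }
    String getAuthority() { return authority; }
    String getPath()      { return path; }
    String getQuery()     { return query; }
    String getFragment()  { return fragment; }
    String getHost()      { parseAuthority(); return host; }
    int getPort()         { parseAuthority(); return port; }

    // Runs at most once, and only if a caller asks for host or port.
    // Integer.parseInt throws NumberFormatException on a non-numeric port.
    private void parseAuthority() {
        if (authorityParsed) return;
        authorityParsed = true;
        if (authority == null) return;
        String hostPort = authority.substring(authority.lastIndexOf('@') + 1);
        int colon = hostPort.lastIndexOf(':');
        // Ignore colons inside a bracketed literal such as [::1].
        if (colon >= 0 && colon > hostPort.lastIndexOf(']')) {
            host = hostPort.substring(0, colon);
            String p = hostPort.substring(colon + 1);
            if (!p.isEmpty()) port = Integer.parseInt(p);
        } else {
            host = hostPort;
        }
    }
}
```

A caller that only wants getPath() never pays for the host/port parsing.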

Note that the JDK 1.4 spec for the URI(String) constructor states
that its parsing is more relaxed than the BNF in RFC 2396.  How relaxed
it is can only be determined by black box testing against the JDK 1.4
implementation.  If I were doing this, my first step would be to build
some extensive Mauve test cases ...
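
As a rough sketch of what such black box probing could look like, the
snippet below feeds candidate strings to the reference java.net.URI
constructor and records whether each one is accepted. This is plain Java
rather than a Mauve testlet, the class name UriProbe is hypothetical, and
the candidate list is only an assumption about which inputs are worth
trying. Observed results would then be turned into Mauve checks.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical probe: reports which strings the reference URI(String)
// constructor accepts, without assuming what the answers will be.
final class UriProbe {
    static boolean accepts(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Illustrative candidates only; extend with cases drawn from the
        // RFC 2396 grammar and from the documented deviations.
        String[] candidates = {
            "http://example.org/",
            "http://example.org/a b",
            "",
            "file:///tmp/x",
            "http://[::1]/",
            "http://example.org/\u00e9",
        };
        for (int i = 0; i < candidates.length; i++) {
            String c = candidates[i];
            System.out.println((accepts(c) ? "accepted: " : "rejected: ") + "\"" + c + "\"");
        }
    }
}
```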

> So the URI parser can be implemented in either native (c code) or
> java. Implementing it in java, will be quite hard and difficult to
> maintain and keep up with potential URI changes. On the other hand,
> if it is implemented in c, it will be *very* easy to implement and
> maintain as I'll use flex and maximum parsing speed will be
> achieved. Additionally, provided that the URI grammar is very simple,
> bison (yacc) is not needed. It would be easy to implement the URI
> parser in java if jlex is used (that's another option I'm
> considering).

I'd recommend hand building a pure Java parser. That way, the Classpath
build process doesn't depend on an external parser or lexer generator,
and the source code will be easier to understand.

A hand-built parser for a grammar as simple as this should be easy to
implement and maintain, especially considering Sun's documented deviations
from the RFC grammar, and possible undocumented deviations.
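
For a sense of scale, here is a minimal sketch of hand-coding a single
production, assuming the RFC 2396 rule
scheme = alpha *( alpha | digit | "+" | "-" | "." ). The class and method
names are hypothetical, and a real parser would need one such routine per
production plus whatever relaxations the JDK turns out to allow.

```java
// Hypothetical hand-written scanner for the RFC 2396 "scheme" production.
final class SchemeScanner {
    private static boolean isAlpha(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /**
     * Returns the scheme if s starts with "scheme:", otherwise null.
     * The ':' delimiter itself is not part of the returned value.
     */
    static String parseScheme(String s) {
        if (s.length() == 0 || !isAlpha(s.charAt(0))) {
            return null;
        }
        int i = 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (isAlpha(c) || isDigit(c) || c == '+' || c == '-' || c == '.') {
                i++;
            } else {
                break;
            }
        }
        // A scheme only counts if it is terminated by ':'.
        if (i < s.length() && s.charAt(i) == ':') {
            return s.substring(0, i);
        }
        return null;
    }
}
```

Adjusting a routine like this to match an observed deviation is a one-line
change, which is harder to arrange in generated lexer tables.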

Finally, the chance that the RFC URI syntax will change radically is pretty
small, IMO.

-- Steve




