
[cp-patches] Re: Absolute URL parsing bug


From: Per Bothner
Subject: [cp-patches] Re: Absolute URL parsing bug
Date: Tue, 05 Jul 2005 14:54:35 -0700
User-agent: Mozilla Thunderbird 1.0.2-6 (X11/20050513)

Andrew Haley wrote:
The case that we get wrong is this one:

$ java TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' 
'jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl

None of these is a valid URL as documented in the Javadoc. Each one does match the URI specification if you view everything after the scheme (e.g. "file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl") as an 'opaque_part'. There is no concept of a "nested URL" and "jar:file:" is not a valid scheme.
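
A minimal sketch of that reading, using java.net.URI to parse the first argument from the example. The class name OpaqueJarUri and its helper methods are hypothetical, chosen only for illustration; the URI methods themselves are the standard ones.

```java
import java.net.URI;

// Hypothetical demo class: shows how java.net.URI splits a "jar:" string.
final class OpaqueJarUri {
    static final String SPEC =
        "jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl";

    // The scheme is everything before the first ':'.
    static String scheme(String s) {
        return URI.create(s).getScheme();
    }

    // A URI is opaque when its scheme-specific part does not start with '/'.
    static boolean isOpaque(String s) {
        return URI.create(s).isOpaque();
    }

    // For an opaque URI this is the whole remainder after "scheme:".
    static String schemeSpecificPart(String s) {
        return URI.create(s).getSchemeSpecificPart();
    }

    public static void main(String[] args) {
        System.out.println(scheme(SPEC));             // jar
        System.out.println(isOpaque(SPEC));           // true
        System.out.println(schemeSpecificPart(SPEC)); // file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl
    }
}
```

The scheme comes out as plain "jar", and the embedded "file:/..." text is just part of the opaque remainder rather than a second URL.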

 $ gij TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' 
'jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:/ejbjars/ws.jar!/META-INF/wsdl/file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl


So, in that case the context should be ignored.
In fact, it seems like the context is ignored for all "jar:file" and "file:" 
URLs:

We should ignore the context for *all* absolute URLs - i.e. any URL that has a scheme - according to the URI spec. But the Javadoc for URL(URL, String) says "If the scheme component is defined in the given spec and does not match the scheme of the context, then the new URL is created as an absolute URL based on the spec alone." I assume the "and does not match" clause is documenting a hack (bug) in the implementation - which doesn't match the URI spec. (Hacks like this may be one reason why they decided to start from scratch with a new URI class.)

I.e. java TestURLs http:/a/b/c http:d/e
prints:
http:/a/b/d/e
but according to the RFC spec it should be:
http:d/e
The URI.resolve method gets this right.
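
A small sketch of the URI.resolve behaviour, for comparison. The class UriResolveDemo and its resolve helper are hypothetical names; only java.net.URI is real API. The second call uses a spec without a scheme to show that the context is still applied in that case.

```java
import java.net.URI;

// Hypothetical demo class wrapping java.net.URI.resolve(String).
final class UriResolveDemo {
    // Resolve spec against base and return the result as a string.
    static String resolve(String base, String spec) {
        return URI.create(base).resolve(spec).toString();
    }

    public static void main(String[] args) {
        // The spec has its own scheme, so the base is ignored.
        System.out.println(resolve("http:/a/b/c", "http:d/e")); // http:d/e
        // The spec has no scheme, so it is resolved against the base path.
        System.out.println(resolve("http:/a/b/c", "d/e"));      // http:/a/b/d/e
    }
}
```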

A suggested (untested) fix might be something like:

   int colon = spec.indexOf(':');
   int slash = spec.indexOf('/');
   if (colon > 0
       && (colon < slash || slash < 0)
       && (protocol == null
           || protocol.length() <= colon
           || ! protocol.equalsIgnoreCase(spec.substring(0, colon))))
     context = null;
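
The first two conditions in that fragment amount to asking whether the spec carries its own scheme. A minimal standalone sketch of just that test follows. SchemeCheck and hasScheme are hypothetical names, and the check is deliberately looser than the scheme grammar in the URI specification (it does not validate the characters before the colon).

```java
// Hypothetical helper: isolates the "does spec have a scheme?" test.
final class SchemeCheck {
    // True when a ':' occurs at index > 0 and before any '/'.
    static boolean hasScheme(String spec) {
        int colon = spec.indexOf(':');
        int slash = spec.indexOf('/');
        return colon > 0 && (slash < 0 || colon < slash);
    }

    public static void main(String[] args) {
        System.out.println(hasScheme("http:d/e")); // true
        System.out.println(hasScheme("foo"));      // false
        System.out.println(hasScheme("a/b:c"));    // false, the ':' follows a '/'
    }
}
```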

But in the case of a URL that is not qualified with any protocol at
all we need the context:

Er, no.  See the 'TestURLs  http:/a/b/c http:d/e' example.

 $ java TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' foo
jar:file:/ejbjars/ws.jar!/META-INF/wsdl/foo

Classpath gets that right too.  So, the only thing we seem to get
wrong is the parsing of a 'jar:file:' URL when a 'jar:' context is
supplied.

Do you have any reason to believe that "jar:" or "file:" URLs get any special treatment? I see none.
--
        --Per Bothner
address@hidden   http://per.bothner.com/



