Re: LYNX-DEV Request for Assistance
From: Subir Grewal
Subject: Re: LYNX-DEV Request for Assistance
Date: Sun, 15 Dec 1996 10:00:23 -0800 (PST)
Dear address@hidden,
A Lynx user recently sent mail to the lynx-dev list asking for assistance
with the following problem:
On Sat, 14 Dec 1996, David Mischel wrote:
:My name is David Mischel. My e mail address is address@hidden
:I am sightless. I have been using lynx through an internet provider
:called hermes. For 6 months I have been able to access the Washington
:Post without any problems. Apparently they have updated their site. Now,
:although I can still get in, I can no longer access the articles. I can go
:into the various sections, but when I hit on an article and try to call it
:up, I get an error message telling me that it's unable to connect. People
:using another browser can get the articles, so the problem is with the
:compatibility between lynx and the new set up. The internet provider is
:using lynx version 2.6 which I understand is the latest. Someone
:suggested that perhaps the scripts to get to the individual articles were
:too long. I am not a technical expert, so I'm not sure what that means.
:If anyone has any suggestions, I would appreciate them. I can be reached
:by phone at 202-554-8079 or at the e mail address shown above. thanks in
:advance for any assistance.
My rather lengthy analysis of the problem(s) follows.
I tried to trace the problem and seem to have found two separate problems
with the implementation of CGIs on the Washington Post server. The first
is rather straightforward and involves the manner in which your CGIs
perform searches for keywords.
For example, in the following document:
Linkname: WashingtonPost.com: Federal Community News
URL:
http://www.washingtonpost.com/wp-srv/national/longterm/fedcom/causey/causey.htm
activating the following link (i.e. initiating a search for all recent
articles written by Michael Causey):
Linkname: Causey's latest
Filename:
http://www.washingtonpost.com/cgi-bin/search?RELEVANCE_RANK=0&TOTAL_HITLIST=20&DB_NAME=WPlate&ALL=causey%3Abyline
results in the following document:
Linkname: Washington Post: Search Results
URL:
http://www.washingtonpost.com/cgi-bin/search?RELEVANCE_RANK=0&TOTAL_HITLIST=20&DB_NAME=WPlate&ALL=causey%3Abyline
which is fine and dandy, except that all the links in that document are of
the following form:
Linkname: Healthy Incentive to Retire
Filename:
http://www.washingtonpost.com/../wp-srv/WPlate/1996-12/15/110L-121596-idx.html
The ".." throws the server off course when Lynx tries to GET that URL. The
problem seems to lie with the CGI that searches the site: since it resides
in the /cgi-bin directory, it appears to be taking Unix paths (using ".."
to refer to the parent directory) and using them as URLs. That, at least,
is my interpretation of the situation. Kindly see if anything can be done
about this. I wasn't able to find bad URLs of this sort elsewhere on your
site, so I'd say it's only a problem with the manner in which the searches
are implemented, and it should be reasonably straightforward to fix.
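To make the ".." issue concrete, here is a minimal sketch in Python of how
a search script might clean up a link before writing it into its results
page. The helper name strip_dot_segments is hypothetical, and whether the
cleaned-up URL is the one the server actually expects is an assumption on
my part; I have no knowledge of how the search CGI is really written.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def strip_dot_segments(url):
    """Return url with "." and ".." path segments collapsed.

    Hypothetical helper: it only normalises the path part and leaves the
    scheme, host, query and fragment untouched.
    """
    parts = urlsplit(url)
    path = posixpath.normpath(parts.path) if parts.path else "/"
    # normpath drops a trailing slash; put it back so directory URLs survive.
    if parts.path.endswith("/") and not path.endswith("/"):
        path += "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# The link form reported above, with the stray ".." directly after the host.
bad = ("http://www.washingtonpost.com/../wp-srv/WPlate/"
       "1996-12/15/110L-121596-idx.html")
fixed = strip_dot_segments(bad)
print(fixed)
```

A ".." directly under the root has no parent to climb to, so the sketch
simply drops it, which is also what most servers and browsers do when they
are being forgiving.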
The second problem is more complex (relatively) and occurs on the main
page itself (though there are similar problems all over the site). It
involves the "pop-up boxes" that permit users to select one particular
section of the paper. For example in the following document:
Linkname: Welcome to WashingtonPost.com
URL: http://www.washingtonpost.com/
the following link exhibits this problem:
Linkname: Go
Method: POST
Action: http://www.washingtonpost.com/cgi-bin/navigate.py
When this link is activated, the currently selected item in the pop-up box
is sent to the server. The CGI churns away, processes the request, and
returns a code 302 "Moved Temporarily". This is fine, but Lynx, in
compliance with the HTTP 1.0 protocol (which your server advertises itself
as speaking), then proceeds to redirect the POST content after asking the
user whether they wish to do this. Unfortunately, the CGI appears to be
relying on the expressly incorrect behaviour of some user-agents
(browsers), which change POST redirects into GET requests. Lynx 2.6 will
not do this. Ideally, for the functionality you desire, the CGI should
return a code 303, which is new in the draft HTTP 1.1 and designed
especially for this purpose (POST scripts that redirect the user-agent to
a real document). I've excerpted the relevant portions of HTTP 1.0 and
HTTP 1.1 for your information:
-------- HTTP 1.0 <URL:http://ds.internic.net/rfc/rfc1945.txt> ----------
9.3 Redirection 3xx
This class of status code indicates that further action needs to be
taken by the user agent in order to fulfill the request. The action
required may be carried out by the user agent without interaction
with the user if and only if the method used in the subsequent
request is GET or HEAD. A user agent should never automatically
redirect a request more than 5 times, since such redirections usually
indicate an infinite loop.
300 Multiple Choices
This response code is not directly used by HTTP/1.0 applications,
but serves as the default for interpreting the 3xx class of
responses.
The requested resource is available at one or more locations.
Unless it was a HEAD request, the response should include an entity
containing a list of resource characteristics and locations from
which the user or user agent can choose the one most appropriate.
If the server has a preferred choice, it should include the URL in
a Location field; user agents may use this field value for
automatic redirection.
301 Moved Permanently
The requested resource has been assigned a new permanent URL and
any future references to this resource should be done using that
URL. Clients with link editing capabilities should automatically
relink references to the Request-URI to the new reference returned
by the server, where possible.
The new URL must be given by the Location field in the response.
Unless it was a HEAD request, the Entity-Body of the response
should contain a short note with a hyperlink to the new URL.
If the 301 status code is received in response to a request using
the POST method, the user agent must not automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Berners-Lee, et al Informational [Page 34]
RFC 1945 HTTP/1.0 May 1996
Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing user agents will
erroneously change it into a GET request.
302 Moved Temporarily
The requested resource resides temporarily under a different URL.
Since the redirection may be altered on occasion, the client should
continue to use the Request-URI for future requests.
The URL must be given by the Location field in the response. Unless
it was a HEAD request, the Entity-Body of the response should
contain a short note with a hyperlink to the new URI(s).
If the 302 status code is received in response to a request using
the POST method, the user agent must not automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Note: When automatically redirecting a POST request after
receiving a 302 status code, some existing user agents will
erroneously change it into a GET request.
304 Not Modified
If the client has performed a conditional GET request and access is
allowed, but the document has not been modified since the date and
time specified in the If-Modified-Since field, the server must
respond with this status code and not send an Entity-Body to the
client. Header fields contained in the response should only include
information which is relevant to cache managers or which may have
changed independently of the entity's Last-Modified date. Examples
of relevant header fields include: Date, Server, and Expires. A
cache should update its cached entity to reflect any new field
values given in the 304 response.
-------------------------------------------------------------------------
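As a rough illustration of the HTTP 1.0 rules excerpted above, the decision
a user agent faces on a 301 or 302 could be sketched as follows. The
function name plan_redirect and its return convention are hypothetical;
this is not Lynx's actual code, only a reading of the quoted text.

```python
MAX_REDIRECTS = 5  # the excerpt: never redirect automatically more than 5 times


def plan_redirect(status, method, hops_so_far):
    """Decide how to follow a 301/302 under the HTTP 1.0 text quoted above.

    Returns (method_to_use, needs_user_confirmation), or None when the
    response should not be followed automatically at all.
    """
    if status not in (301, 302):
        return None
    if hops_so_far >= MAX_REDIRECTS:
        return None  # probably an infinite loop
    method = method.upper()
    # The method is kept as is. Turning a POST into a GET here is the
    # behaviour the specification calls erroneous.
    needs_confirmation = method not in ("GET", "HEAD")
    return method, needs_confirmation


print(plan_redirect(302, "POST", 0))
print(plan_redirect(302, "GET", 0))
```

Under this reading a redirected POST stays a POST and is only resent once
the user agrees, which matches the behaviour of Lynx 2.6 described earlier
in this message.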
-------- HTTP 1.1 <URL:http://www.w3.org/pub/WWW/Protocols/> ------------
10.3 Redirection 3xx
This class of status code indicates that further action needs to be
taken by the user agent in order to fulfill the request. The action
required MAY be carried out by the user agent without interaction with
the user if and only if the method used in the second request is GET or
HEAD. A user agent SHOULD NOT automatically redirect a request more than
5 times, since such redirections usually indicate an infinite loop.
10.3.1 300 Multiple Choices
The requested resource corresponds to any one of a set of
representations, each with its own specific location, and agent-driven
negotiation information (section 12) is being provided so that the user
(or user agent) can select a preferred representation and redirect its
request to that location.
Unless it was a HEAD request, the response SHOULD include an entity
containing a list of resource characteristics and location(s) from which
the user or user agent can choose the one most appropriate. The entity
format is specified by the media type given in the Content-Type header
field. Depending upon the format and the capabilities of the user agent,
selection of the most appropriate choice may be performed automatically.
However, this specification does not define any standard for such
automatic selection.
If the server has a preferred choice of representation, it SHOULD
include the specific URL for that representation in the Location field;
user agents MAY use the Location field value for automatic redirection.
This response is cachable unless indicated otherwise.
10.3.2 301 Moved Permanently
The requested resource has been assigned a new permanent URI and any
future references to this resource SHOULD be done using one of the
returned URIs. Clients with link editing capabilities SHOULD
automatically re-link references to the Request-URI to one or more of
the new references returned by the server, where possible. This response
is cachable unless indicated otherwise.
If the new URI is a location, its URL SHOULD be given by the Location
field in the response. Unless the request method was HEAD, the entity of
the response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
Fielding, et al [Page 54]
INTERNET-DRAFT HTTP/1.1 Monday, August 12, 1996
If the 301 status code is received in response to a request other than
GET or HEAD, the user agent MUST NOT automatically redirect the request
unless it can be confirmed by the user, since this might change the
conditions under which the request was issued.
Note: When automatically redirecting a POST request after receiving
a 301 status code, some existing HTTP/1.0 user agents will
erroneously change it into a GET request.
10.3.3 302 Moved Temporarily
The requested resource resides temporarily under a different URI. Since
the redirection may be altered on occasion, the client SHOULD continue
to use the Request-URI for future requests. This response is only
cachable if indicated by a Cache-Control or Expires header field.
If the new URI is a location, its URL SHOULD be given by the Location
field in the response. Unless the request method was HEAD, the entity of
the response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
If the 302 status code is received in response to a request other than
GET or HEAD, the user agent MUST NOT automatically redirect the request
unless it can be confirmed by the user, since this might change the
conditions under which the request was issued.
Note: When automatically redirecting a POST request after receiving
a 302 status code, some existing HTTP/1.0 user agents will
erroneously change it into a GET request.
10.3.4 303 See Other
The response to the request can be found under a different URI and
SHOULD be retrieved using a GET method on that resource. This method
exists primarily to allow the output of a POST-activated script to
redirect the user agent to a selected resource. The new URI is not a
substitute reference for the originally requested resource. The 303
response is not cachable, but the response to the second (redirected)
request MAY be cachable.
If the new URI is a location, its URL SHOULD be given by the Location
field in the response. Unless the request method was HEAD, the entity of
the response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
10.3.5 304 Not Modified
If the client has performed a conditional GET request and access is
allowed, but the document has not been modified, the server SHOULD
respond with this status code. The response MUST NOT contain a message-
body.
The response MUST include the following header fields:
o Date
o ETag and/or Content-Location, if the header would have been sent in
a 200 response to the same request
o Expires, Cache-Control, and/or Vary, if the field-value might
differ from that sent in any previous response for the same variant
If the conditional GET used a strong cache validator (see section
13.3.3), the response SHOULD NOT include other entity-headers. Otherwise
(i.e., the conditional GET used a weak validator), the response MUST NOT
include other entity-headers; this prevents inconsistencies between
cached entity-bodies and updated headers.
If a 304 response indicates an entity not currently cached, then the
cache MUST disregard the response and repeat the request without the
conditional.
If a cache uses a received 304 response to update a cache entry, the
cache MUST update the entry to reflect any new field values given in the
response.
The 304 response MUST NOT include a message-body, and thus is always
terminated by the first empty line after the header fields.
10.3.6 305 Use Proxy
The requested resource MUST be accessed through the proxy given by the
Location field. The Location field gives the URL of the proxy. The
recipient is expected to repeat the request via the proxy.
-------------------------------------------------------------------------
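For the server side, a minimal sketch of what a CGI script could print to
send a 303 is given below. It assumes the server honours the CGI "Status:"
header convention; the function name see_other_response and the example
target URL are hypothetical, and I do not know how navigate.py is actually
written.

```python
from html import escape


def see_other_response(location):
    """Build the text a CGI script would print to answer a POST with a 303.

    Includes the short hypertext note with a hyperlink that the draft
    says the response SHOULD contain.
    """
    if "\r" in location or "\n" in location:
        raise ValueError("location must not contain line breaks")
    link = escape(location, quote=True)
    body = ('<html><body><p>See <a href="{0}">{0}</a>.</p></body></html>\n'
            .format(link))
    headers = [
        "Status: 303 See Other",
        "Location: " + location,
        "Content-Type: text/html",
    ]
    # A blank line separates the headers from the entity body.
    return "\r\n".join(headers) + "\r\n\r\n" + body


# Placeholder target; the real script would use the section the user chose.
response = see_other_response("http://www.example.com/section.html")
print(response)
```

A user agent that understands 303 can then fetch the target with a plain
GET and need not ask the user about resending the form contents.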
I apologize for the length of this message, but hopefully it will clarify a
few issues and highlight our concerns better than a short missive would.
I personally think everyone at the Post is doing a wonderful job with the
On-line edition. However, it is important to keep in mind that
visually-impaired users often have no other alternative but reading
(hearing) a digital version of newspapers. This is of course one of the
great promises the medium makes to us. I hope you will be able to correct
the problems I've pointed out. If you'd like any assistance with this, I
would urge you to send a message to address@hidden where the Lynx
Developers will try their best to assist. I am not an expert on HTTP, but
others here are and will be better able to answer detailed technical
questions.
I trust I've correctly identified David's problem. If I haven't, I will
be getting back to you with another note later this week.
Thanks for your time and attention,
Subir Grewal
address@hidden + Lynx 2.6 + PGP + http://www.crl.com/~subir/
Liar, n.:
A lawyer with a roving commission.
-- Ambrose Bierce, "The Devil's Dictionary"