Re: [Savannah-hackers-public] Please disallow www-commits in robots.txt


From: Alfred M. Szmidt
Subject: Re: [Savannah-hackers-public] Please disallow www-commits in robots.txt
Date: Wed, 10 May 2023 14:50:53 -0400

   > You've not explained the actual problem.  What are you trying to
   > solve?

   "it" is the www-commits list, which registers all changes to the www
   directory, including to pages that are not published yet. I suspect most
   of the other *-commits lists deal with source code repositories, which
   are public anyway.

If you wish to disallow access to pages, do not publish them --
www-commits is a public list and the www repository is a public
repository, so any commit is by definition published.  It is no
different from getting bug reports against commits in a software
repository that has not yet had a release.

   If you let crawlers access changes to disallowed directories, you are
   defeating the purpose of robots.txt. What was supposed to be unpublished
   is actually published.

The purpose of robots.txt is to keep crawlers from overloading a web
site; it is not a mechanism for disallowing access to pages.
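As a rough illustration of why robots.txt is advisory rather than a
form of access control: the directives are only consulted client-side
by well-behaved crawlers, the server itself enforces nothing.  The
sketch below uses Python's standard urllib.robotparser; the archive
path, bot name, and URLs are assumptions made up for the example.

    from urllib.robotparser import RobotFileParser

    # Hypothetical robots.txt directives; the archive path is an
    # assumption for illustration only.
    robots_lines = [
        "User-agent: *",
        "Disallow: /archive/html/www-commits/",
    ]

    rp = RobotFileParser()
    rp.parse(robots_lines)

    # Only a cooperating crawler ever asks this question; the web
    # server still serves the page to anyone who requests it.
    print(rp.can_fetch("ExampleBot",
                       "https://lists.gnu.org/archive/html/www-commits/"))  # False
    print(rp.can_fetch("ExampleBot",
                       "https://lists.gnu.org/archive/html/"))              # True

Anyone fetching the URL directly, or a crawler that ignores
robots.txt, still gets the page.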

This still does not explain what the problem is -- "don't let crawlers
crawl" is a request, not a problem statement.  What are you trying to
solve?  That unfinished articles are not seen before they are
published?


