Re: [Savannah-hackers-public] Please disallow www-commits in robots.txt


From: Thérèse Godefroy
Subject: Re: [Savannah-hackers-public] Please disallow www-commits in robots.txt
Date: Wed, 10 May 2023 21:04:07 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0

On 10/05/2023 at 20:50, Alfred M. Szmidt wrote:
    > You've not explained the actual problem.  What are you trying to
    > solve?

    "it" is the www-commits list, which registers all changes to the www
    directory, including to pages that are not published yet. I suspect most
    of the other *-commits lists deal with source code repositories, which
    are public anyway.

If you wish to disallow access to pages, do not publish them --
www-commits is a public list, and the www repository is a public
repository -- any commit is by definition published.  It is no
different from getting bug reports for commits in a software
repository that has not yet had a release.

    If you let crawlers access changes to disallowed directories, you are
    defeating the purpose of robots.txt. What was supposed to stay
    unpublished is effectively published.

The purpose of robots.txt is to avoid overloading a web site; it is
not to disallow access to pages.
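
For reference, the change requested in this thread would presumably
amount to a couple of lines like these in the list archive's
robots.txt (the exact archive path is an assumption):

    User-agent: *
    Disallow: /archive/html/www-commits/

Even then, a Disallow line is only a request addressed to well-behaved
crawlers; it does not restrict access to the pages themselves.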

This still does not explain what the problem is -- "don't let crawlers
crawl" doesn't explain it.  What are you trying to solve?  That
unfinished articles should not become public before they are ready?

Yes, basically. The purpose of the staging area is to work on articles
that are not ready yet.
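
A quick way to see why robots.txt cannot do this: compliance lives
entirely in the client. A polite crawler consults the file before
fetching; an impolite one never looks at it. A minimal sketch using
Python's standard urllib.robotparser, assuming the hypothetical
Disallow entry sketched above:

    from urllib.robotparser import RobotFileParser

    # Hypothetical rules containing the requested entry
    # (the archive path is an assumption).
    rules = [
        "User-agent: *",
        "Disallow: /archive/html/www-commits/",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # A compliant crawler asks first and gets "no"...
    url = "https://lists.gnu.org/archive/html/www-commits/"
    print(parser.can_fetch("SomeBot", url))  # False

    # ...but nothing enforces the answer: urllib.request.urlopen(url)
    # would fetch the page regardless of what robots.txt says.

A crawler that never calls can_fetch simply sees the pages, which is
why a Disallow entry cannot keep draft material private.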


