[Sks-devel] robots.txt, grub-client


From: Jason Harris
Subject: [Sks-devel] robots.txt, grub-client
Date: Sat, 18 Dec 2004 15:52:32 -0500
User-agent: Mutt/1.4.2.1i

Is anyone (else) serving robots.txt from pks and SKS and watching the
User-Agent: headers on incoming requests?  I've noticed a lot of requests
(30 and counting since yesterday afternoon) from grub-client-2.3 to my
pks server, which is wrong because I've been serving robots.txt
containing:

  User-agent: *
  Disallow: /

for quite some time now.  grub[.org] seems to be the newest search engine
that doesn't respect robots.txt, and because it is a distributed system
it is also hard to block.  Still, 64.241.242.18 (sv-fw.looksmart.com) is
the main offender and can be blocked by IP.
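
For anyone who wants to double-check what a compliant crawler should
conclude from the robots.txt quoted above, here is a minimal sketch using
Python's standard urllib.robotparser.  The helper name is_allowed and the
/pks/lookup request path are illustrative assumptions, not something
taken from my server's logs.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt quoted in the message above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    is_allowed is a hypothetical helper name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL; the path is an assumption, only the host and port
    # come from the message.
    url = "http://skylane.kjsl.com:11371/pks/lookup?op=index&search=test"
    print(is_allowed("grub-client-2.3", url))
```

A crawler that honors the file should get a "not allowed" answer for
every path, so any request it makes beyond /robots.txt itself is a
violation.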

Of course, M$ in 65.52.0.0/14 and 207.68.128.0 - 207.68.207.255 and
Yahoo/Inktomi in 66.196.64.0/18 are also blocked by IP because their web
crawlers are overzealous and/or don't respect robots.txt.
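
As a rough illustration of how the addresses and ranges mentioned above
could be turned into a single block list, here is a sketch using Python's
standard ipaddress module.  The names BLOCKED and is_blocked are
hypothetical, and this only shows the matching logic, not how any
particular firewall or keyserver is actually configured.

```python
import ipaddress

# Ranges taken from the message above.  The dashed range is not a single
# CIDR block, so summarize_address_range() splits it into exact prefixes.
BLOCKED = [
    ipaddress.ip_network("64.241.242.18/32"),  # sv-fw.looksmart.com
    ipaddress.ip_network("65.52.0.0/14"),
    ipaddress.ip_network("66.196.64.0/18"),
    *ipaddress.summarize_address_range(
        ipaddress.ip_address("207.68.128.0"),
        ipaddress.ip_address("207.68.207.255"),
    ),
]


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any blocked network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED)


if __name__ == "__main__":
    for candidate in ("64.241.242.18", "207.68.200.1", "192.0.2.1"):
        print(candidate, is_blocked(candidate))
```

The expanded prefixes in BLOCKED can be printed and pasted into whatever
packet filter is in use, which avoids hand-converting the dashed range.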

Most of the grub requests have been for "Host: skylane.kjsl.com:11371"
as well.  The few for "Host: wwwkeys.pgp.net:11371" are understandable
because it is a DNS round robin (RR), of course, but I imagine the
remaining servers in wwwkeys.pgp.net (and other DNS RRs) that don't block
these crawlers will see their bot-induced load eventually rise to
unacceptable levels.

-- 
Jason Harris           |  NIC:  JH329, PGP:  This _is_ PGP-signed, isn't it?
address@hidden _|_ web:  http://keyserver.kjsl.com/~jharris/
          Got photons?   (TM), (C) 2004

Attachment: pgpm8OcEoxEvJ.pgp
Description: PGP signature

