Author terry.reedy
Recipients bernie9998, ezio.melotti, osvenskan, petri.lehtinen, terry.reedy
Date 2011-10-29.01:06:23
Message-id <1319850385.04.0.516165867155.issue13281@psf.upfronthosting.co.za>
Content
Because of the line break, clicking that link gives "Server error 404".
http://www.robotstxt.org/norobots-rfc.txt
works (so please pay attention to formatting). The main page is
http://www.robotstxt.org/robotstxt.html 

The way I read the grammar, 'records' (which start with an agent line) cannot contain blank lines and must be separated by blank lines. Other than that, the suggestion seems reasonable, but it also seems like a feature request. Does test/test_robotparser pass with the patch?
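
To make that reading concrete, here is a minimal sketch of a splitter that treats blank lines strictly as record separators. It is not the robotparser implementation or the patch under discussion; the names `split_records` and `is_well_formed` are hypothetical, and I am assuming that comment-only lines are discarded rather than treated as separators.

```python
def split_records(text):
    """Split robots.txt text into records under the strict reading:
    blank lines separate records and never appear inside one."""
    records, current = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            # Assumption: a comment-only line is not a separator,
            # only a truly blank line closes the current record.
            if not raw.strip() and current:
                records.append(current)
                current = []
            continue
        field, _, value = line.partition(":")
        current.append((field.strip().lower(), value.strip()))
    if current:
        records.append(current)
    return records


def is_well_formed(record):
    """Under this reading, a record must start with an agent line."""
    return bool(record) and record[0][0] == "user-agent"


# Hypothetical input: the last rule is cut off from its agent line
# by a blank line, so it ends up in a record of its own.
sample = (
    "User-agent: a\n"
    "Disallow: /x\n"
    "\n"
    "User-agent: b\n"
    "\n"
    "Disallow: /y\n"
)
records = split_records(sample)
orphans = [r for r in records if not is_well_formed(r)]
```

With this input, the `Disallow: /y` rule lands in `orphans`, which is the situation a more lenient parser would have to decide how to handle.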

I also do not see "Crawl-delay" and "Sitemap" (from whitehouse.gov) in the grammar referenced above. So I wonder if de facto practice has evolved.
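
One way to check how far practice has drifted is to list the field names a given robots.txt uses beyond the ones the draft names explicitly. This is only a sketch: `unlisted_fields` is a hypothetical helper, the sample text is made up, and I am assuming that User-agent, Allow, and Disallow are the only fields the grammar names.

```python
# Assumption: the only field names spelled out in the grammar.
NAMED_FIELDS = {"user-agent", "allow", "disallow"}


def unlisted_fields(text):
    """Return the sorted field names in text that are not in NAMED_FIELDS."""
    found = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        # Split on the first colon only, so URL values stay intact.
        field = line.split(":", 1)[0].strip().lower()
        if field not in NAMED_FIELDS:
            found.add(field)
    return sorted(found)


# Made-up sample, not a copy of any real site's file.
sample = (
    "User-agent: *\n"
    "Crawl-delay: 10\n"
    "Disallow: /private\n"
    "\n"
    "Sitemap: http://www.example.com/sitemap.xml\n"
)
extras = unlisted_fields(sample)
```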

Philip S.: do you have any opinions?
(I am asking you because of your comments on #1437699.)