Message9423
The robotparser module incorrectly handles empty paths
in the Allow/Disallow directives.
According to http://www.robotstxt.org/wc/norobots-rfc.html,
the following rule should be a global *allow*:
User-agent: *
Disallow:
My reading of the RFC is that an empty path is always
a global allow (for both Allow and Disallow
directives), so that the syntax stays backwards
compatible: there was no Allow directive in the
original syntax.
Suggested fix:
robotparser.RuleLine.applies_to() becomes:
def applies_to(self, filename):
    if not self.path:
        self.allowance = 1
    return self.path == "*" or re.match(self.path, filename)
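
To illustrate the effect of the suggested fix, here is a minimal, self-contained sketch. The RuleLine class below is a simplified stand-in written for this example, not the actual robotparser source, and is_allowed() is a hypothetical helper that assumes the first matching rule wins and that URLs matching no rule are allowed.

```python
import re


class RuleLine:
    """Simplified stand-in for robotparser.RuleLine (illustration only)."""

    def __init__(self, path, allowance):
        self.path = path
        self.allowance = allowance  # 1 = Allow, 0 = Disallow

    def applies_to(self, filename):
        # Suggested fix: an empty path turns the rule into a global allow.
        if not self.path:
            self.allowance = 1
        # re.match("", filename) matches any string, so an empty path
        # applies to every filename.
        return self.path == "*" or bool(re.match(self.path, filename))


def is_allowed(rules, filename):
    """Hypothetical helper: the first matching rule decides; default is allow."""
    for rule in rules:
        if rule.applies_to(filename):
            return bool(rule.allowance)
    return True


# "Disallow:" with an empty path should permit everything.
empty_disallow = [RuleLine("", 0)]
# "Disallow: /private" should still block matching paths.
private_disallow = [RuleLine("/private", 0)]

print(is_allowed(empty_disallow, "/any/page.html"))
print(is_allowed(private_disallow, "/private/x"))
print(is_allowed(private_disallow, "/public"))
```

With this sketch, the empty Disallow rule permits every path, while a non-empty Disallow path keeps blocking the URLs it matches.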
Date | User | Action | Args
2007-08-23 13:59:28 | admin | link | issue522898 messages
2007-08-23 13:59:28 | admin | create |