Message 359180
Hi,
Is this ticket still relevant for Python 3.8?
While running some tests with an empty robots.txt file, I noticed that the parser reports every path as allowed, which matches the current draft of the Robots Exclusion Protocol: https://tools.ietf.org/html/draft-koster-rep-00#section-2.2.1
Code:
from urllib import robotparser
robots_url = "file:///tmp/empty.txt"
rp = robotparser.RobotFileParser()
print(robots_url)
rp.set_url(robots_url)
rp.read()
print("fetch /", rp.can_fetch(useragent="*", url="/"))
print("fetch /admin", rp.can_fetch(useragent="*", url="/admin"))
Output:
$ cat /tmp/empty.txt
$ python -V
Python 3.8.1
$ python test_robot3.py
file:///tmp/empty.txt
fetch / True
fetch /admin True
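For comparison, here is a minimal sketch that reproduces the same check without a file on disk, by passing the rules straight to RobotFileParser.parse(). The helper name allowed_paths and the "Disallow: /admin" rule set are only illustrative assumptions, not part of the original test; they are there to show that the parser does block paths once a rule exists.

```python
from urllib import robotparser


def allowed_paths(robots_lines, paths, useragent="*"):
    """Hypothetical helper: parse robots.txt lines and report can_fetch() per path."""
    rp = robotparser.RobotFileParser()
    # parse() takes an iterable of lines, so no URL or file is needed here.
    rp.parse(robots_lines)
    return {path: rp.can_fetch(useragent, path) for path in paths}


# An empty robots.txt: no rules, so every path is reported as allowed.
empty_result = allowed_paths([], ["/", "/admin"])

# Illustrative (assumed) rule set that blocks /admin for all user agents.
restricted_result = allowed_paths(
    ["User-agent: *", "Disallow: /admin"],
    ["/", "/admin"],
)

print("empty:", empty_result)
print("restricted:", restricted_result)
```

With the empty rule list both paths should come back True, as in the output above, while the assumed Disallow rule should turn only /admin into False.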
Date | User | Action | Args
2020-01-02 03:41:13 | gallicrooster | set | recipients: + gallicrooster, terry.reedy, larsfuse
2020-01-02 03:41:13 | gallicrooster | set | messageid: <1577936473.45.0.756125026518.issue35457@roundup.psfhosted.org>
2020-01-02 03:41:13 | gallicrooster | link | issue35457 messages
2020-01-02 03:41:12 | gallicrooster | create |