classification
Title: robotparser crawl_delay and request_rate do not work with no matching entry
Type: behavior
Stage: patch review
Components: Library (Lib)
Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: joseph_myers, orsenthil, remi.lapeyre
Priority: normal
Keywords: patch

Created on 2019-02-06 20:33 by joseph_myers, last changed 2019-02-08 16:07 by remi.lapeyre.

Pull Requests
URL Status Linked
PR 11791 open remi.lapeyre, 2019-02-08 16:05
Messages (2)
msg334982 - Author: Joseph Myers (joseph_myers) Date: 2019-02-06 20:33
RobotFileParser.crawl_delay and RobotFileParser.request_rate raise AttributeError when the robots.txt has no entry matching the given user-agent and no default entry either. According to the documentation, they should return None in that case. E.g.:

>>> from urllib.robotparser import RobotFileParser
>>> parser = RobotFileParser()
>>> parser.parse([])
>>> parser.crawl_delay('example')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/urllib/robotparser.py", line 182, in crawl_delay
    return self.default_entry.delay
AttributeError: 'NoneType' object has no attribute 'delay'
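
Until a fix lands, a caller can work around the traceback above by treating the AttributeError as "no value". The following is only a minimal sketch: the helper names safe_crawl_delay and safe_request_rate are hypothetical and not part of urllib.robotparser. Catching AttributeError this broadly could also hide unrelated bugs, so it is meant as a stopgap.

```python
from urllib.robotparser import RobotFileParser


def safe_crawl_delay(parser, useragent):
    """Return crawl_delay(useragent), or None if the parser has no usable entry.

    Hypothetical helper: on affected versions the lookup raises
    AttributeError when there is no matching entry and no default entry.
    """
    try:
        return parser.crawl_delay(useragent)
    except AttributeError:
        return None


def safe_request_rate(parser, useragent):
    """Same idea as safe_crawl_delay, for request_rate (hypothetical helper)."""
    try:
        return parser.request_rate(useragent)
    except AttributeError:
        return None


parser = RobotFileParser()
parser.parse([])  # empty robots.txt: no entries, no default entry
print(safe_crawl_delay(parser, "example"))
print(safe_request_rate(parser, "example"))
```

Both calls print None, which is the behavior the documentation describes.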
msg335093 - Author: Rémi Lapeyre (remi.lapeyre) Date: 2019-02-08 16:07
Thanks for your report, Joseph. I opened a new PR to fix this.
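
The PR's diff is not shown in this thread, so the following is not its content. It is only a sketch of the kind of guard that would avoid the AttributeError, written as a subclass so it can be tried without patching the standard library. The class name GuardedRobotFileParser and the helper _lookup are hypothetical; entries, default_entry, applies_to, delay and req_rate are the attributes the traceback and the existing module rely on.

```python
from urllib.robotparser import RobotFileParser


class GuardedRobotFileParser(RobotFileParser):
    """Hypothetical subclass that returns None instead of raising."""

    def _lookup(self, useragent, attr):
        # No robots.txt has been read or parsed yet.
        if not self.mtime():
            return None
        # Prefer the first entry that applies to this user-agent.
        for entry in self.entries:
            if entry.applies_to(useragent):
                return getattr(entry, attr)
        # Guard: default_entry is None when there is no "User-agent: *" block.
        if self.default_entry is None:
            return None
        return getattr(self.default_entry, attr)

    def crawl_delay(self, useragent):
        return self._lookup(useragent, "delay")

    def request_rate(self, useragent):
        return self._lookup(useragent, "req_rate")


parser = GuardedRobotFileParser()
parser.parse([])
print(parser.crawl_delay("example"))
print(parser.request_rate("example"))
```

With an empty robots.txt both calls print None. When a default entry exists, its values are returned as before.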
History
Date                 User          Action  Args
2019-02-08 16:07:45  remi.lapeyre  set     nosy: + orsenthil; messages: + msg335093
2019-02-08 16:05:33  remi.lapeyre  set     keywords: + patch; stage: patch review; pull_requests: + pull_request11796
2019-02-06 20:37:38  remi.lapeyre  set     nosy: + remi.lapeyre
2019-02-06 20:33:03  joseph_myers  create