Title: robotparser crawl_delay and request_rate do not work with no matching entry
Type: behavior
Stage: patch review
Components: Library (Lib)
Versions: Python 3.8, Python 3.7, Python 3.6
Assigned To:
Nosy List: joseph_myers, orsenthil, remi.lapeyre
Priority: normal
Keywords: patch

Created on 2019-02-06 20:33 by joseph_myers, last changed 2019-02-08 16:07 by remi.lapeyre.

PR 11791 open remi.lapeyre, 2019-02-08 16:05
msg334982 - Author: Joseph Myers (joseph_myers) Date: 2019-02-06 20:33
RobotFileParser.crawl_delay and RobotFileParser.request_rate raise AttributeError when robots.txt has no entry matching the given user-agent and no default entry. According to the documentation, they should return None in that case. E.g.:

>>> from urllib.robotparser import RobotFileParser
>>> parser = RobotFileParser()
>>> parser.parse([])
>>> parser.crawl_delay('example')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/urllib/", line 182, in crawl_delay
    return self.default_entry.delay
AttributeError: 'NoneType' object has no attribute 'delay'
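
Until a fix lands, callers can guard against the exception themselves. The following is only a minimal sketch of such a workaround, not the change proposed for the library; the helper names safe_crawl_delay and safe_request_rate are hypothetical and exist only for illustration. It assumes the only AttributeError of interest is the one shown in the traceback above.

```python
from urllib.robotparser import RobotFileParser


def safe_crawl_delay(parser, useragent):
    """Hypothetical helper: like crawl_delay(), but returns None instead of raising."""
    try:
        return parser.crawl_delay(useragent)
    except AttributeError:
        # On affected versions there is no matching entry and no default
        # entry, so the lookup on default_entry (which is None) fails.
        return None


def safe_request_rate(parser, useragent):
    """Hypothetical helper: like request_rate(), but returns None instead of raising."""
    try:
        return parser.request_rate(useragent)
    except AttributeError:
        # Same failure mode as crawl_delay() on affected versions.
        return None


parser = RobotFileParser()
parser.parse([])  # empty robots.txt: no entries at all
print(safe_crawl_delay(parser, 'example'))
print(safe_request_rate(parser, 'example'))
```

Both calls print None here, which is the documented result when no entry applies. When a matching entry does exist, the helpers simply pass through whatever the parser returns.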
msg335093 - Author: Rémi Lapeyre (remi.lapeyre) * Date: 2019-02-08 16:07
Thanks for your report, Joseph. I opened a new PR to fix this.
