Message 312711
When processing an ill-formed robots.txt file (such as https://tiny.tobast.fr/robots-file), the RobotFileParser.parse method does not populate the entries or default_entry attributes.
In my opinion, the method should raise an exception when no valid User-agent entry is found in the robots.txt file (or when an invalid User-agent entry is present).
Otherwise, the only way to detect the problem is to check whether default_entry is None, which is not mentioned in the documentation (https://docs.python.org/dev/library/urllib.robotparser.html).
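
To illustrate, here is a minimal sketch of the current behaviour and the undocumented workaround. The malformed content below (rules appearing before any User-agent line) is an assumption made for demonstration; the actual contents of the linked file are not reproduced here:

import urllib.robotparser

# Assumed ill-formed input: rules appear before any User-agent line,
# so parse() silently discards them.
malformed_lines = [
    "Disallow: /private/",
    "Allow: /public/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(malformed_lines)

# Neither attribute is populated, and nothing signals the failure.
print(parser.entries)                # []
print(parser.default_entry is None)  # True -- the undocumented check

# With no entries at all, every URL is reported as fetchable.
print(parser.can_fetch("MyBot", "https://example.com/private/page"))  # True
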
Depending on your opinion on this, I can implement the necessary changes and open a PR on GitHub.
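
For reference, here is a minimal sketch of the proposed behaviour, assuming ValueError is an acceptable exception type; StrictRobotFileParser is a hypothetical name used for illustration only, not an existing class:

import urllib.robotparser

class StrictRobotFileParser(urllib.robotparser.RobotFileParser):
    # Hypothetical subclass: fail loudly instead of silently when
    # parsing yields no usable entry at all.
    def parse(self, lines):
        super().parse(lines)
        if not self.entries and self.default_entry is None:
            raise ValueError("no valid User-agent entry found in robots.txt")
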

Date                | User     | Action | Args
--------------------+----------+--------+------
2018-02-24 10:53:13 | Guinness | set    | recipients: + Guinness
2018-02-24 10:53:13 | Guinness | set    | messageid: <1519469593.78.0.467229070634.issue32936@psf.upfronthosting.co.za>
2018-02-24 10:53:13 | Guinness | link   | issue32936 messages
2018-02-24 10:53:13 | Guinness | create |