
Author Guinness
Recipients Guinness
Date 2018-02-24.10:53:13
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1519469593.78.0.467229070634.issue32936@psf.upfronthosting.co.za>
In-reply-to
Content
When processing an ill-formed robots.txt file (such as https://tiny.tobast.fr/robots-file ), the RobotFileParser.parse method does not populate the entries or default_entry attributes.

In my opinion, the method should raise an exception when no valid User-agent entry is found in the robots.txt file, or when an invalid User-agent entry is present.
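
A rough sketch of what such a check could look like, written here as a wrapper rather than an actual patch to parse(); the helper name parse_strict and the choice of ValueError are assumptions for illustration only:

    from urllib.robotparser import RobotFileParser

    def parse_strict(parser: RobotFileParser, lines):
        # Run the normal parser, then verify that at least one User-agent
        # entry was actually produced; the exception type is illustrative.
        parser.parse(lines)
        if parser.default_entry is None and not parser.entries:
            raise ValueError("no valid User-agent entry found in robots.txt")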

Otherwise, the only available workaround is to check whether default_entry is None, which is not mentioned in the documentation (https://docs.python.org/dev/library/urllib.robotparser.html).
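
For illustration, a minimal sketch of that workaround; the ill-formed content below (a Disallow line with no preceding User-agent line) is a made-up stand-in for the linked file:

    from urllib.robotparser import RobotFileParser

    ill_formed = "Disallow: /private"  # no User-agent line, so no entry is built
    parser = RobotFileParser()
    parser.parse(ill_formed.splitlines())

    # The only observable signal is that the (undocumented) default_entry and
    # entries attributes were never populated.
    if parser.default_entry is None and not parser.entries:
        print("robots.txt is ill-formed: no valid User-agent entry was parsed")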

Depending on your opinion on this, I can implement whatever is necessary and open a PR on GitHub.
History
Date User Action Args
2018-02-24 10:53:13  Guinness  set     recipients: + Guinness
2018-02-24 10:53:13  Guinness  set     messageid: <1519469593.78.0.467229070634.issue32936@psf.upfronthosting.co.za>
2018-02-24 10:53:13  Guinness  link    issue32936 messages
2018-02-24 10:53:13  Guinness  create