Author xtreak
Recipients berker.peksag, gallicrooster, larsfuse, terry.reedy, xtreak
Date 2020-01-02.08:36:55
Message-id <1577954216.19.0.0622182140321.issue35457@roundup.psfhosted.org>
Content
There is a behavior change. parse() sets the modified time, and until the modified time is set, can_fetch() returns False. In Python 2, parse() was called only when the file was non-empty [0], but in Python 3 it is always called, even when the file is empty [1]. The change was made in 1afc1696167547a5fa101c53e5a3ab4717f8852c, which always calls parse(), and then in 122541beceeccce4ef8a9bf739c727ccdcbf2f28 modified() came to be called on every parse(), which sets the modified time so that can_fetch() ends up returning True.

I think the behavior of robotparser for an empty file was undefined, which allowed these changes, and it would be good to have a test for this behavior.
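
A minimal sketch of what such a test could look like, assuming the Python 3 behavior described above. The class name, method names, and the example.com URL are hypothetical and not taken from CPython's test suite; only RobotFileParser, parse(), mtime() and can_fetch() come from urllib.robotparser.

```python
import unittest
from urllib.robotparser import RobotFileParser


class EmptyRobotsTxtTest(unittest.TestCase):
    """Hypothetical test case for the empty robots.txt behavior."""

    def test_can_fetch_before_parse(self):
        parser = RobotFileParser()
        # Nothing has been parsed yet, so the modified time is unset
        # and can_fetch() refuses every URL.
        self.assertEqual(parser.mtime(), 0)
        self.assertFalse(parser.can_fetch("*", "http://example.com/"))

    def test_can_fetch_after_parsing_empty_file(self):
        parser = RobotFileParser()
        # An empty robots.txt yields no lines; parse() still records
        # the modified time.
        parser.parse([])
        self.assertGreater(parser.mtime(), 0)
        # With no rules at all, every URL is allowed.
        self.assertTrue(parser.can_fetch("*", "http://example.com/"))
```

Saved as a module, this can be run with `python -m unittest`. Calling parse([]) directly avoids any network access, so the sketch only covers the parsing path, not read().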

[0] https://github.com/python/cpython/blob/f82e59ac4020a64c262a925230a8eb190b652e87/Lib/robotparser.py#L66-L67
[1] https://github.com/python/cpython/blob/149175c6dfc8455023e4335575f3fe3d606729f9/Lib/urllib/robotparser.py#L69-L70
History
Date                 User    Action  Args
2020-01-02 08:36:56  xtreak  set     recipients: + xtreak, terry.reedy, berker.peksag, larsfuse, gallicrooster
2020-01-02 08:36:56  xtreak  set     messageid: <1577954216.19.0.0622182140321.issue35457@roundup.psfhosted.org>
2020-01-02 08:36:56  xtreak  link    issue35457 messages
2020-01-02 08:36:55  xtreak  create