This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author dualbus
Recipients dualbus
Date 2012-09-02.18:36:03
I found that Wikipedia returns HTTP 403 if the provided user agent is in a specific blacklist.

And since robotparser doesn't provide a mechanism to change the default user agent used by the opener, it becomes unusable for that site (and sites that have a similar policy).

I think the user should have the possibility to set a specific user agent string, to better identify their bot.

I attach a patch that allows the user to change the opener used by RobotFileParser, in case the need for some specific behavior arises.

I also attach a simple example of how it solves the issue, at least with Wikipedia.
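
For readers without the patch applied, here is a minimal sketch of a workaround, not the attached patch itself. It assumes it is acceptable to download robots.txt yourself with an explicit User-Agent and hand the lines to RobotFileParser.parse(), bypassing read(). The helper names fetch_robots_lines and parser_from_lines and the "ExampleBot/0.1" agent string are hypothetical, chosen only for illustration.

```python
import urllib.request
import urllib.robotparser


def fetch_robots_lines(url, user_agent, opener=None):
    """Download a robots.txt file, sending an explicit User-Agent header.

    `opener` is optional; any object with an `open(request)` method works,
    which is the kind of customization the report asks for.
    """
    opener = opener or urllib.request.build_opener()
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with opener.open(request) as response:
        return response.read().decode("utf-8", errors="replace").splitlines()


def parser_from_lines(lines):
    """Build a RobotFileParser from lines that were already downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(lines)  # parse() accepts an iterable of robots.txt lines
    return parser
```

A caller would pass the result of fetch_robots_lines(robots_url, "ExampleBot/0.1") to parser_from_lines() and then call can_fetch() on the returned parser as usual. Whether a given site accepts that agent string is up to the site's own policy.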
Date                 User     Action  Args
2012-09-02 18:36:04  dualbus  set     recipients: + dualbus
2012-09-02 18:36:04  dualbus  set     messageid: <>
2012-09-02 18:36:04  dualbus  link    issue15851 messages
2012-09-02 18:36:03  dualbus  create