Message171711
Robotparser doesn't support two quite important optional parameters from the robots.txt file: Crawl-delay and Request-rate. I have implemented those in the following way:
(Robotparser should be initialized in the usual way:
rp = robotparser.RobotFileParser()
rp.set_url(..)
rp.read()
)
crawl_delay(useragent) - Returns the time in seconds that you need to wait between requests when crawling.
If none is specified, or it doesn't apply to this user agent, returns -1.
request_rate(useragent) - Returns a list in the form [requests, seconds].
If none is specified, or it doesn't apply to this user agent, returns -1.
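The behaviour described above could be sketched roughly as follows. This is not the actual patch, just a minimal standalone illustration of the proposed API: it parses Crawl-delay and Request-rate directives out of robots.txt text grouped under User-agent records, and returns -1 when a directive is missing or does not apply. The class name RobotsExtras and the fallback to the "*" record are my assumptions, not part of the report.

```python
class RobotsExtras:
    """Sketch of the proposed crawl_delay/request_rate API (not the real patch)."""

    def __init__(self, robots_txt):
        self.delays = {}  # user agent -> seconds to wait between requests
        self.rates = {}   # user agent -> [requests, seconds]
        agent = None
        for line in robots_txt.splitlines():
            line = line.split('#', 1)[0].strip()  # strip comments and whitespace
            if ':' not in line:
                continue
            key, value = (part.strip() for part in line.split(':', 1))
            key = key.lower()
            if key == 'user-agent':
                agent = value.lower()
            elif key == 'crawl-delay' and agent is not None:
                self.delays[agent] = int(value)
            elif key == 'request-rate' and agent is not None:
                # Request-rate is written as "requests/seconds", e.g. "3/20"
                requests, seconds = value.split('/', 1)
                self.rates[agent] = [int(requests), int(seconds)]

    def crawl_delay(self, useragent):
        # Seconds to wait, or -1 if unspecified for this agent (falls back to "*")
        return self.delays.get(useragent.lower(), self.delays.get('*', -1))

    def request_rate(self, useragent):
        # [requests, seconds], or -1 if unspecified for this agent
        return self.rates.get(useragent.lower(), self.rates.get('*', -1))
```

For example, parsing "User-agent: *\nCrawl-delay: 5\nRequest-rate: 3/20" and calling crawl_delay("mybot") would yield 5, while a robots.txt with neither directive yields -1 for both methods.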
History:
Date                 | User       | Action | Args
2012-10-01 12:58:25  | XapaJIaMnu | set    | recipients: + XapaJIaMnu
2012-10-01 12:58:25  | XapaJIaMnu | set    | messageid: <1349096305.17.0.983395980337.issue16099@psf.upfronthosting.co.za>
2012-10-01 12:58:25  | XapaJIaMnu | link   | issue16099 messages
2012-10-01 12:58:25  | XapaJIaMnu | create |