Rietveld Code Review Tool

#16099: robotparser doesn't support request rate and crawl delay parameters

Created: 5 years ago by nheart
Modified: 3 years, 5 months ago
Reviewers: berker.peksag
CC: rhettinger, orsenthil, christian.heimes, devnull_psf.upfronthosting.co.za, berkerpeksag, hynek, XapaJIaMnu
Visibility: Public.

Patch Set 1
Patch Set 2 (total comments: 14)
Patch Set 3 (total comments: 8)
Patch Set 4

Patch files:
Doc/library/urllib.robotparser.rst    2 chunks   +24 lines, -1 line     0 comments
Lib/test/test_robotparser.py          17 chunks  +70 lines, -21 lines   0 comments
Lib/urllib/robotparser.py             4 chunks   +35 lines, -0 lines    0 comments
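
As orientation, here is a minimal usage sketch of the API this patch adds, based on the method names and return forms discussed in the review below. The URL and user agent are hypothetical, and the exact return values (for example, None when no matching rule exists) are assumptions rather than something the patch itself confirms:

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")   # hypothetical URL
    rp.read()

    # Proposed additions: per-user-agent Crawl-delay and Request-rate lookups.
    delay = rp.crawl_delay("examplebot")     # e.g. 10, or None if not specified
    rate = rp.request_rate("examplebot")     # e.g. (requests, seconds), or None
    if rate is not None:
        print(delay, rate.requests, rate.seconds)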

Messages

Total messages: 2
berkerpeksag
3 years, 10 months ago #1

http://bugs.python.org/review/16099/diff/6206/Doc/library/urllib.robotparser.rst
File Doc/library/urllib.robotparser.rst (right):

http://bugs.python.org/review/16099/diff/6206/Doc/library/urllib.robotparser.rst#newcode56
Doc/library/urllib.robotparser.rst:56: .. method:: crawl_delay(useragent)
Is crawl_delay used for search engines? ...

berkerpeksag
3 years, 9 months ago #2
The default branch is in feature freeze mode right now. New features like this
one will be committed to the default branch once 3.4 is released.

See http://docs.python.org/devguide/devcycle.html#summary for more information.

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
File Doc/library/urllib.robotparser.rst (right):

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:56: .. versionadded:: 3.5
Please move this directive to the end of the crawl_delay() function. For example:

    .. method:: crawl_delay(useragent)

       The method's doc is here...

       .. versionadded:: 3.5

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:59: Returns the value of the ``Crawl-delay``:
parameter from ``robots.txt``
``Crawl-delay``: -> remove :

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:61: doesn't apply to the *useragent*
specified or the robots.txt entry
``robots.txt``
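
To illustrate the crawl_delay() behaviour the documentation text describes, a small sketch; the robots.txt content and user agent names are invented, and the assumption is that crawl_delay() returns the parsed value for a matching entry and None otherwise:

    import urllib.robotparser

    lines = [
        "User-agent: examplebot",
        "Crawl-delay: 3",
        "Disallow: /tmp",
        "",
        "User-agent: *",
        "Disallow: /private",
    ]

    rp = urllib.robotparser.RobotFileParser()
    rp.modified()    # record that the robots.txt data has been loaded
    rp.parse(lines)

    print(rp.crawl_delay("examplebot"))   # expected: 3
    print(rp.crawl_delay("otherbot"))     # expected: None (the '*' entry sets no Crawl-delay)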

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:64: .. versionadded:: 3.5
Please move this directive to the end of the request_rate() function. For example:

    .. method:: request_rate(useragent)

       The method's doc is here...

       .. versionadded:: 3.5

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:67: Returns the contents of the
``Request-rate``: parameter from ``robots.txt``
``Request-rate``: -> remove :

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:68: in the form of a `namedtuple`_
``(requests, seconds)``. If there is no such parameter
Please wrap this line to 80 chars.

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:68: in the form of a `namedtuple`_
``(requests, seconds)``. If there is no such parameter
You can use :func:`~collections.namedtuple` instead of `namedtuple`_.

http://bugs.python.org/review/16099/diff/10273/Doc/library/urllib.robotparser...
Doc/library/urllib.robotparser.rst:69: or it doesn't apply to the *useragent*
specified or the robots.txt entry
``robots.txt``
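
Similarly, a small sketch of the request_rate() behaviour described above. The ``(requests, seconds)`` named tuple form follows the documentation text under review; the sample robots.txt and user agent names are invented:

    import urllib.robotparser

    lines = [
        "User-agent: examplebot",
        "Request-rate: 3/20",     # 3 requests per 20 seconds
        "",
        "User-agent: *",
        "Disallow: /private",
    ]

    rp = urllib.robotparser.RobotFileParser()
    rp.modified()    # record that the robots.txt data has been loaded
    rp.parse(lines)

    rate = rp.request_rate("examplebot")
    if rate is not None:
        print(rate.requests, rate.seconds)   # expected: 3 20
    print(rp.request_rate("otherbot"))       # expected: None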
