classification
Title: Issues with request rate in robotparser
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7, Python 3.6
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: req_rate is a namedtuple type rather than instance (issue 31325)
Assigned To:
Nosy List: XapaJIaMnu, berker.peksag, serhiy.storchaka
Priority: normal
Keywords:

Created on 2017-10-02 06:14 by serhiy.storchaka, last changed 2017-10-02 10:32 by serhiy.storchaka. This issue is now closed.

Messages (5)
msg303508 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-02 06:14
There are issues in the implementation of request rate support in robotparser.

    req_rate = collections.namedtuple('req_rate',
                                      'requests seconds')
    entry.req_rate = req_rate
    entry.req_rate.requests = int(numbers[0])
    entry.req_rate.seconds = int(numbers[1])

First, a new namedtuple type is created for every entry. This is slow even with the recent namedtuple optimizations, and much slower in 3.6. It also wastes memory, since every entry gets its own type. Second, it is simply wrong: req_rate is set to a namedtuple type instead of a namedtuple instance. Finally, it is unclear why a namedtuple is used here at all; other classes in this module are not namedtuples.
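One way to address the points above is to create the namedtuple type once at module level and store an instance on each entry. The following is only a sketch: `RequestRate`, `Entry`, and `parse_request_rate` are hypothetical names chosen for illustration, and this is not necessarily the fix adopted upstream.

```python
import collections

# Hypothetical name: the type is created once, at import time,
# rather than once per parsed entry.
RequestRate = collections.namedtuple("RequestRate", "requests seconds")


class Entry:
    """Stand-in for a robots.txt entry (illustrative only)."""

    def __init__(self):
        self.req_rate = None


def parse_request_rate(entry, value):
    """Store a RequestRate instance parsed from a 'requests/seconds' string."""
    numbers = value.split("/")
    # Only accept well-formed values such as "3/15".
    if (len(numbers) == 2
            and numbers[0].strip().isdigit()
            and numbers[1].strip().isdigit()):
        entry.req_rate = RequestRate(int(numbers[0]), int(numbers[1]))


first, second = Entry(), Entry()
parse_request_rate(first, "3/15")
parse_request_rate(second, "1/10")
```

Here both entries share a single type, and each `req_rate` is an ordinary immutable tuple instance whose fields are set through the constructor rather than by assigning attributes on the class.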
msg303509 - (view) Author: Nikolay Bogoychev (XapaJIaMnu) Date: 2017-10-02 06:26
Hey Serhiy,

The use of namedtuple was requested specifically at a review, I didn't implement it like this initially: https://bugs.python.org/review/16099/#ps6205

I wasn't aware of the performance implications. Could you please explain the type vs. instance difference in terms of performance (or point me to a resource; a quick search didn't yield anything)? How should I have coded it properly?

Cheers,

Nick
msg303510 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-02 07:02
For the performance of namedtuple type creation, see issue28638 and https://mail.python.org/pipermail/python-dev/2017-July/148592.html. For the difference between types and instances, see https://docs.python.org/3/tutorial/classes.html.
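To make the cost difference concrete, a rough comparison can be run with timeit. This is a minimal sketch, not a benchmark: `Rate`, `make_type`, `make_instance`, and `REPEATS` are hypothetical names, and the absolute numbers depend on the Python version and machine.

```python
import collections
import timeit

# Hypothetical name: a type created once, used for the instance timing.
Rate = collections.namedtuple("Rate", "requests seconds")


def make_type():
    # What the reported code does per entry: build a brand-new class.
    return collections.namedtuple("req_rate", "requests seconds")


def make_instance():
    # What a per-entry value needs: one instance of an existing class.
    return Rate(3, 15)


REPEATS = 200
type_seconds = timeit.timeit(make_type, number=REPEATS)
instance_seconds = timeit.timeit(make_instance, number=REPEATS)

print("type creation:     %.6f s for %d calls" % (type_seconds, REPEATS))
print("instance creation: %.6f s for %d calls" % (instance_seconds, REPEATS))
```

Each call to `make_type` also returns a distinct class object, which is where the extra memory per entry comes from.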
msg303526 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-10-02 10:17
Thanks for the report. Your analysis is correct and this is a duplicate of issue 31325. I'll take care of the PR 3259.
msg303529 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-02 10:32
Oh, I missed this. My report is based on the same comment on news.ycombinator.com.
History
Date User Action Args
2017-10-02 10:32:24 serhiy.storchaka set messages: + msg303529
2017-10-02 10:17:25 berker.peksag set status: open -> closed; superseder: req_rate is a namedtuple type rather than instance; messages: + msg303526; resolution: duplicate; stage: resolved
2017-10-02 07:02:08 serhiy.storchaka set messages: + msg303510
2017-10-02 06:26:17 XapaJIaMnu set messages: + msg303509
2017-10-02 06:14:07 serhiy.storchaka create