classification
Title: req_rate is a namedtuple type rather than instance
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: berker.peksag Nosy List: berker.peksag, gvx, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-09-01 15:15 by gvx, last changed 2017-11-23 23:58 by rhettinger. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 3259 closed gvx, 2017-09-01 15:15
PR 4529 merged berker.peksag, 2017-11-23 19:23
PR 4533 merged python-dev, 2017-11-23 23:40
Messages (10)
msg301127 - (view) Author: Robin (gvx) * Date: 2017-09-01 15:15
> Finally, urllib/robotparser.py appears to contain a bug in the
> following:
>
>   req_rate = collections.namedtuple('req_rate',
>                                    'requests seconds')
>   entry.req_rate = req_rate
>   entry.req_rate.requests = int(numbers[0])
>   entry.req_rate.seconds = int(numbers[1])
> 
> As I read it, this should fail as the req_rate is immutable.

Ref: https://news.ycombinator.com/item?id=15136961

They are mistaken in the nature of the bug, which is that req_rate is
a namedtuple type, rather than an instance. As such, it is actually
mutable, causing the assignments to not fail. Every entry creates a
separate req_rate type, so it all works, but not in the way it should.
msg301134 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-01 16:19
* The named tuple class should begin with a capital letter and be fully self-documenting: "RequestRate".

* The creation of the named tuple class should be done only once, not on every call.  Instead only a new instance should be creating on every call:

     entry.req_rate = req_rate(RequestRate)

* There needs to be a test.

* The docstring should be updated to include the name of the class refer to the term named tuple instead of the namedtuple() factory function:

     - Returns the contents of the ``Request-rate`` parameter from
     - ``robots.txt`` in the form of a :func:`~collections.namedtuple`
     - ``(requests, seconds)``.  If there is no such parameter or it doesn't

     + Returns the contents of the ``Request-rate`` parameter from
     + ``robots.txt`` as a :term:`named tuple` ``RequestRate(requests, seconds)``.
     + If there is no such parameter or it doesn't
msg301148 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-09-01 19:32
Good catch and thank you for turning the bug report in the HN thread to a pull request! I agree with all of Raymond's comments.

I have two more comments:

* Please follow our commit style at https://devguide.python.org/committing/#commit-messages
* We need a NEWS entry. You can find details at https://devguide.python.org/committing/#what-s-new-and-news-entries (we don't need to add an entry in Doc/whatsnew/ so you can safely ignore it)
msg301152 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-01 20:22
There was a typo in my previous message.  The instantiation code should be:

   entry.req_rate = RequestRate(int(numbers[0]), int(numbers[1]))
msg303528 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-02 10:30
What is a reason of making req_rate a named tuple?
msg303550 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-10-02 16:57
> What is a reason of making req_rate a named tuple?

I don't know the original reason but it seems like a normal use of named tuples to make the data more self-describing to help with debugging and also to support field access by name, rr.requests or rr.seconds is clearer than rr[0] or rr[1].
msg303553 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-02 17:12
This is a normal use of named tuples for adding access by name to tuple results. But req_rate never was a tuple. Nobody used rr[0].
msg303561 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-10-02 19:27
There is no rule that something had to be a tuple at some point in its history before becoming a named tuple.  This use seems perfectly reasonable to me.
msg306861 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-11-23 23:40
New changeset 3df02dbc8e197053105f9dffeae40b04ec66766e by Raymond Hettinger (Berker Peksag) in branch 'master':
bpo-31325: Fix usage of namedtuple in RobotFileParser.parse() (#4529)
https://github.com/python/cpython/commit/3df02dbc8e197053105f9dffeae40b04ec66766e
msg306862 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-11-23 23:58
New changeset ff847d1ac7e6a8ee1fb6f8883cfb4aec4b4a9b03 by Raymond Hettinger (Miss Islington (bot)) in branch '3.6':
bpo-31325: Fix usage of namedtuple in RobotFileParser.parse() (GH-4529) (#4533)
https://github.com/python/cpython/commit/ff847d1ac7e6a8ee1fb6f8883cfb4aec4b4a9b03
History
Date User Action Args
2017-11-23 23:58:51rhettingersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2017-11-23 23:58:00rhettingersetmessages: + msg306862
2017-11-23 23:40:40python-devsetpull_requests: + pull_request4469
2017-11-23 23:40:28rhettingersetmessages: + msg306861
2017-11-23 19:23:37berker.peksagsetkeywords: + patch
pull_requests: + pull_request4465
2017-10-02 19:27:06rhettingersetmessages: + msg303561
2017-10-02 17:12:18serhiy.storchakasetmessages: + msg303553
2017-10-02 16:57:25rhettingersetmessages: + msg303550
2017-10-02 10:30:03serhiy.storchakasetmessages: + msg303528
2017-10-02 10:27:28serhiy.storchakasetnosy: + serhiy.storchaka
2017-10-02 10:17:25berker.peksaglinkissue31661 superseder
2017-09-01 20:22:48rhettingersetmessages: + msg301152
2017-09-01 19:32:00berker.peksagsetmessages: + msg301148
stage: patch review
2017-09-01 16:20:04rhettingersetversions: - Python 2.7, Python 3.3, Python 3.4, Python 3.5
2017-09-01 16:19:43rhettingersetassignee: berker.peksag

messages: + msg301134
nosy: + rhettinger, berker.peksag
2017-09-01 15:15:22gvxcreate