classification
Title: collections.UserString.__rmod__() raises NameError
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: jcgoble3, martin.panter, miss-islington, pitrou, r.david.murray, rhettinger, serhiy.storchaka, xiang.zhang
Priority: normal Keywords: patch

Created on 2015-11-18 05:16 by jcgoble3, last changed 2019-05-21 20:27 by miss-islington.

Files
File name Uploaded Description Edit
collections.UserString xiang.zhang, 2015-11-18 06:49 review
collections.UserString1 xiang.zhang, 2015-11-18 07:19 review
userstringerror.py jcgoble3, 2017-05-10 00:35 demonstration of error in action
Pull Requests
URL Status Linked Edit
PR 13326 merged BTaskaya, 2019-05-14 18:41
Messages (21)
msg254830 - (view) Author: Jonathan Goble (jcgoble3) * Date: 2015-11-18 05:16
In an instance of collections.UserString, any call to the __rmod__() method will raise NameError, due to the undefined "args" name in the method.
msg254832 - (view) Author: Jonathan Goble (jcgoble3) * Date: 2015-11-18 06:38
c06b2480766d appears to be the offending changeset.
msg254833 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2015-11-18 06:49
It seems this is a typo. The args should be self.data. And since unicodeobject doesn't support __rmod__, at least I think we should str(format) or remove __rmod__ totally. I attach a patch.
msg254835 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2015-11-18 07:19
Quite sorry for making some mistakes. unicodeobject does get a __radd__ method. It is added by addoperators though I cannot see it in PyNumberMethods. Now I think just fixing the typo is enough.
msg261222 - (view) Author: Jonathan Goble (jcgoble3) * Date: 2016-03-05 19:35
*ping* Can this be reviewed? It's a simple fix to a problem that happens consistently, and it would be nice to get it into the next bugfix release.
msg261223 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-03-05 19:48
I'll apply this shortly.
msg261235 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-03-06 07:08
Would be nice to add some tests. For example: 0 % collections.UserString('').
msg261239 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-03-06 08:14
The UserString1 patch is incorrect as it leads to infinite recursion because the __rmod__ operation only gets called when the other argument doesn't have __mod__.

One possible fix is to coerce the template to a regular str: 

    def __rmod__(self, template):
        return self.__class__(str(template) % self)

That would get the following test to pass:

    class S:
        'strlike class without a __mod__'
        def __init__(self, value):
            self.value = value
        def __str__(self):
            return str(self.value)

    assert S('say %s') % UserString('hello') == 'say hello'

That said, the goal of UserString is to parallel what a regular string would do.  In this case, a TypeError is raised:

    >>> print(S('answer is %s') % 'hello')
    Traceback (most recent call last):
      ...
    TypeError: unsupported operand type(s) for %: 'S' and 'str'

Serhiy, what do you think should be done, coerce to a str or remove the __rmod__ method entirely?
msg261251 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-03-06 14:10
In normal case there is no infinite recursion in collections.UserString1, since UserString.__rmod__(S) falls back to S.__mod__(str) or str.__rmod__(S).

>>> S('say %s') % UserString('hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/collections/__init__.py", line 1156, in __rmod__
    return self.__class__(format % self.data)
TypeError: unsupported operand type(s) for %: 'S' and 'str'

Infinite recursion is possible if we make recursive UserString (us.data = us), but in this case we will get infinite recursion in all operations.

collections.UserString1 LGTM at the same degree as UserString.__le__ (it would be better to handle NotImplemented correctly).

There is yet one consideration. In 2.x UserString simulates both 8-bit and Unicode strings. In 3.x UserString simulates both str and bytes (except decode(), bytes formatting, etc). I think it is worth once to make UserString to support bytes formatting. In this case it would be incorrect to coerce the template to a regular str.
msg261271 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-03-06 20:34
Antoine, do you have any thoughts on this one?   I'm inclined to remove __rmod__ entirely.  That would match what str does and it avoids having to guess what the user meant when writing "A(s) % UserString(t)" when A does not have a __mod__ formatting method.
msg261272 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2016-03-06 20:40
I admit I don't understand what __rmod__ is meant to achieve here, and following str's behaviour sounds like a reasonable decision.
msg261273 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-03-06 20:48
What about such example?

>>> class F:
...     def __init__(self, value):
...         self.value = value
...     def __mod__(self, other):
...         if isinstance(other, str):
...             return self.value % other
...         return NotImplemented
... 
>>> from collections import UserString
>>> F('say %s') % UserString('hello')
'say hello'
msg270106 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-07-10 17:24
This was set to commit review when Raymond was ready to commit it, but subsequent discussion indicates the patch isn't yet ready for commit.  In particular, there ought to be some tests, even if the decision is to remove __rmod__.

The fact that UserString passes its tests with this buggy method indicates that there is a lack somewhere in the general string tests that test_userstring relies on.
msg270139 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-10 23:23
Jonathon: Do you have a use case for __rmod__(), or did you find this bug by code analysis?

UserString.__rmod__() was added as part of Issue 22189. The main reason seems to be so that UserString has all the methods listed by dir(str). The result was that UserString.__rmod__() is documented as existing since 3.5.

Ideally I would agree with removing UserString.__rmod__(), but perhaps it would be safer to just have it return NotImplemented. As long as UserString properly supports all combinations of user_string_a % (user_string_b, user_string_c) etc, whether or how __rmod__() is implemented is an implementation detail. The existance of str.__rmod__() does not seem to be documented. It just seems to be a side-effect of how the C-level nb_remainder slot works.

For Serhiy’s F class, which explicitly only works with a single str argument, I think it is reasonable that using UserString instead would raise TypeError. This is what Python 2’s UserString does.
msg270140 - (view) Author: Jonathan Goble (jcgoble3) * Date: 2016-07-10 23:39
Code analysis, if it can even be called that. I was simply looking through the source of the collections module one day out of curiosity (mainly to see how various things in it were implemented) and this bug jumped out at me as I was reading the code. I do not have a use case for the __rmod__() method.
msg280000 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-03 15:59
Issue28598 may be related.
msg293366 - (view) Author: Jonathan Goble (jcgoble3) * Date: 2017-05-10 00:35
Any decision on this? I recently played around and found a reasonable use case where UserString.__rmod__ does get called; run the attached userstringerror.py to see it in action.

Basically, it seems the idea of UserString is to subclass it, tweak as desired, and use your subclass as necessary. If you then subclass your subclass (e.g. for a portion of your code with a specialized need) and extend __mod__ and __rmod__ in that sub-subclass (calling the parent method with super()), then any case of "subclass instance % sub-subclass instance" results in Python calling the sub-subclass's __rmod__ method directly without trying a __mod__ method. If that method then calls super().__rmod__ (e.g. it just needed to pre-process the data, such as to normalize it, before the formatting operation), then UserString.__rmod__ will be called and result in the NameError.
msg293378 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-10 04:53
Unless there are objections, I think the wisest course is to remove __rmod__ entirely (refuse the temptation to guess at what the user would expect the semantics to be).
msg293398 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-10 10:29
I think any solution is better than keeping a bug in the current code.
msg293451 - (view) Author: Jonathan Goble (jcgoble3) * Date: 2017-05-10 19:38
I would prefer to keep __rmod__ and fix the bug, given that the use case I described above would otherwise create an inconsistency in subclasses, which would be able to easily extend __mod__ by calling super(), but would be forced to fully implement __rmod__ themselves.

That said, I will defer to those who are more experienced here.
msg343091 - (view) Author: miss-islington (miss-islington) Date: 2019-05-21 20:27
New changeset 7abf8c60819d5749e6225b371df51a9c5f1ea8e9 by Miss Islington (bot) (Batuhan Taşkaya) in branch 'master':
bpo-25652: Fix __rmod__ of UserString (GH-13326)
https://github.com/python/cpython/commit/7abf8c60819d5749e6225b371df51a9c5f1ea8e9
History
Date User Action Args
2019-05-21 20:27:42miss-islingtonsetnosy: + miss-islington
messages: + msg343091
2019-05-14 18:41:50BTaskayasetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request13238
2017-05-10 19:38:53jcgoble3setmessages: + msg293451
2017-05-10 10:29:50serhiy.storchakasetmessages: + msg293398
2017-05-10 04:53:16rhettingersetmessages: + msg293378
2017-05-10 00:35:19jcgoble3setfiles: + userstringerror.py

messages: + msg293366
versions: + Python 3.7, - Python 3.5
2016-11-03 15:59:47serhiy.storchakasetmessages: + msg280000
2016-07-10 23:39:23jcgoble3setmessages: + msg270140
2016-07-10 23:23:34martin.pantersetnosy: + martin.panter
messages: + msg270139
2016-07-10 17:24:01r.david.murraysetnosy: + r.david.murray

messages: + msg270106
stage: commit review -> needs patch
2016-03-12 07:56:09serhiy.storchakasetassignee: serhiy.storchaka ->
2016-03-06 20:48:11serhiy.storchakasetmessages: + msg261273
2016-03-06 20:40:53pitrousetmessages: + msg261272
2016-03-06 20:34:19rhettingersetnosy: + pitrou
messages: + msg261271
2016-03-06 14:10:09serhiy.storchakasetmessages: + msg261251
2016-03-06 08:14:13rhettingersetassignee: rhettinger -> serhiy.storchaka
messages: + msg261239
2016-03-06 07:08:32serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg261235
2016-03-05 19:48:14rhettingersetmessages: + msg261223
stage: commit review
2016-03-05 19:35:17jcgoble3setmessages: + msg261222
versions: + Python 3.6
2015-11-18 16:23:02rhettingersetassignee: rhettinger
2015-11-18 07:19:37xiang.zhangsetfiles: + collections.UserString1

messages: + msg254835
2015-11-18 06:49:03xiang.zhangsetfiles: + collections.UserString
nosy: + xiang.zhang
messages: + msg254833

2015-11-18 06:38:56jcgoble3setmessages: + msg254832
2015-11-18 05:16:09jcgoble3create