classification
Title: collections.UserString encode method returns a string
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Mariatta, cheryl.sabella, dfortunov, mblahay, rhettinger, trey, xtreak
Priority: normal Keywords: easy, patch

Created on 2019-04-09 23:38 by trey, last changed 2019-08-28 05:00 by rhettinger. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 13138 merged dfortunov, 2019-05-06 21:19
PR 15557 merged miss-islington, 2019-08-28 04:38
Messages (9)
msg339818 - (view) Author: Trey Hunner (trey) * Date: 2019-04-09 23:38
It looks like the encode method for UserString incorrectly wraps its return value in a str call.

```
>>> from collections import UserString
>>> UserString("hello").encode('utf-8') == b'hello'
False
>>> UserString("hello").encode('utf-8')
"b'hello'"
>>> type(UserString("hello").encode('utf-8'))
<class 'collections.UserString'>
```
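The odd `"b'hello'"` result is consistent with the bytes object being passed back through the `UserString` constructor, which converts any non-string argument with `str()`. The following is a minimal sketch of that effect only, not the library's `encode` code; the variable names are illustrative.

```python
from collections import UserString

# Encode a plain str to get a real bytes object.
raw = "hello".encode("utf-8")

# Wrapping bytes in UserString stores str(raw), i.e. the repr-like text
# of the bytes object, rather than the bytes themselves.
wrapped = UserString(raw)

print(type(raw).__name__)   # bytes
print(repr(wrapped.data))   # the stringified form of the bytes object
```

This runs the same way whether or not the interpreter includes the fix, because it only exercises the constructor.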
msg339824 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-04-10 05:22
Trey, would you like to submit a PR to fix this?  (Be sure to add a test case).
msg341550 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2019-05-06 15:55
I think this is an easy issue. The relevant code is at https://github.com/python/cpython/blob/cec01849f142ea96731b4725975b89d3af757656/Lib/collections/__init__.py#L1210 where the encoded result has to be fixed. Trey, if you haven't started working on it I think it's a good first issue for sprints.

Here is a simple unittest patch that fails on master. Additional tests could cover the cases where both encoding and errors are given and where both are omitted, so that all three code paths in the function are exercised.

```
diff --git a/Lib/test/test_userstring.py b/Lib/test/test_userstring.py
index 71528223d3..81a4908dbd 100644
--- a/Lib/test/test_userstring.py
+++ b/Lib/test/test_userstring.py
@@ -39,6 +39,11 @@ class UserStringTest(
         # we don't fix the arguments, because UserString can't cope with it
         getattr(object, methodname)(*args)

+    def test_encode(self):
+        data = UserString("hello")
+        self.assertEqual(data.encode(encoding='utf-8'), b'hello')

 if __name__ == "__main__":
     unittest.main()
```
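The additional coverage mentioned above could look roughly like the sketch below, which compares `UserString.encode` against `str.encode` for three call shapes: no arguments, encoding only, and encoding plus errors. The helper name `encode_matches_str` and the sample text are assumptions made for illustration, and the comparison is only expected to hold on an interpreter that includes the fix.

```python
from collections import UserString


def encode_matches_str(text, *args):
    """Return True if UserString(text).encode(*args) equals text.encode(*args)."""
    return UserString(text).encode(*args) == text.encode(*args)


# One entry per call shape: no arguments, encoding only, encoding and errors.
call_shapes = [
    (),
    ("utf-8",),
    ("ascii", "replace"),
]

# Non-ASCII sample text so the errors handler actually matters.
results = [encode_matches_str("h\u00e9llo", *args) for args in call_shapes]
print(results)
```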
msg341563 - (view) Author: Daniel Fortunov (dfortunov) * Date: 2019-05-06 16:55
I'll pick this up in the PyCon US 2019 sprint this afternoon.
msg341593 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-06 18:38
I will pick this one up.
msg341594 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-06 18:42
My mistake, dfortunov is already working on this one.
msg341649 - (view) Author: Daniel Fortunov (dfortunov) * Date: 2019-05-06 21:23
PR submitted here:
https://github.com/python/cpython/pull/13138

Rather than adding three separate tests for the different code paths, I chose to collapse the three code paths into one by surfacing the underlying str.encode() defaults in the method signature of UserString.encode(), which takes it down to a one-line implementation.
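A minimal sketch of that approach, written as a subclass so it can be tried on any interpreter: the class name `PatchedUserString` is hypothetical, and the merged change in the PR may differ in detail. The defaults `'utf-8'` and `'strict'` are the documented defaults of `str.encode()`.

```python
from collections import UserString


class PatchedUserString(UserString):
    """Hypothetical subclass illustrating the one-line encode() approach."""

    def encode(self, encoding="utf-8", errors="strict"):
        # Delegate straight to str.encode() and return its bytes unchanged,
        # instead of wrapping the result in the class again.
        return self.data.encode(encoding, errors)


encoded = PatchedUserString("hello").encode()
print(type(encoded).__name__)
```

Because the defaults live in the signature, calls with no arguments, with an encoding only, and with both encoding and errors all go through the same line.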

@xtreak: Thanks for the super-helpful triage and failing test case!
msg350653 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-28 04:38
New changeset 2a16eea71f56c2d8f38c295c8ce71a9a9a140aff by Raymond Hettinger (Daniel Fortunov) in branch 'master':
bpo-36582: Make collections.UserString.encode() return bytes, not str (GH-13138)
https://github.com/python/cpython/commit/2a16eea71f56c2d8f38c295c8ce71a9a9a140aff
msg350654 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-28 04:59
New changeset 2cb82d2a88710b0af10b9d9721a9710ecc037e72 by Raymond Hettinger (Miss Islington (bot)) in branch '3.8':
bpo-36582: Make collections.UserString.encode() return bytes, not str (GH-13138) (GH-15557)
https://github.com/python/cpython/commit/2cb82d2a88710b0af10b9d9721a9710ecc037e72
History
Date User Action Args
2019-08-28 05:00:29 rhettinger set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-08-28 04:59:57 rhettinger set messages: + msg350654
2019-08-28 04:38:55 rhettinger set versions: + Python 3.9, - Python 3.7
2019-08-28 04:38:24 miss-islington set pull_requests: + pull_request15230
2019-08-28 04:38:13 rhettinger set messages: + msg350653
2019-05-06 21:23:42 dfortunov set messages: + msg341649
2019-05-06 21:19:49 dfortunov set keywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request13051
2019-05-06 18:42:31 mblahay set messages: + msg341594
2019-05-06 18:38:51 mblahay set nosy: + mblahay
messages: + msg341593
2019-05-06 16:55:11 dfortunov set nosy: + dfortunov
messages: + msg341563
2019-05-06 15:56:32 serhiy.storchaka set keywords: + easy
stage: needs patch
2019-05-06 15:55:10 xtreak set nosy: + xtreak, cheryl.sabella, Mariatta
messages: + msg341550
2019-04-10 05:22:31 rhettinger set assignee: rhettinger
type: behavior
messages: + msg339824
versions: + Python 3.8
2019-04-09 23:51:04 xtreak set nosy: + rhettinger
2019-04-09 23:38:53 trey create