Title: collections.UserString encode method returns a string
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.8, Python 3.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Mariatta, cheryl.sabella, dfortunov, mblahay, rhettinger, trey, xtreak
Priority: normal Keywords: easy, patch

Created on 2019-04-09 23:38 by trey, last changed 2019-05-06 21:23 by dfortunov.

Pull Requests
URL Status Linked Edit
PR 13138 open dfortunov, 2019-05-06 21:19
Messages (7)
msg339818 - (view) Author: Trey Hunner (trey) * Date: 2019-04-09 23:38
It looks like the encode method for UserString incorrectly wraps its return value in a str call.

>>> from collections import UserString
>>> UserString("hello").encode('utf-8') == b'hello'
False
>>> UserString("hello").encode('utf-8')
"b'hello'"
>>> type(UserString("hello").encode('utf-8'))
<class 'collections.UserString'>
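
For readers on an interpreter that no longer shows this behavior, here is a minimal sketch that reproduces it. WrappingUserString is a hypothetical stand-in, not the library code: it assumes the encoded bytes are passed back through the class constructor, which converts any non-string argument with str().

```python
from collections import UserString


class WrappingUserString(UserString):
    """Hypothetical stand-in that mimics the behavior reported above."""

    def encode(self, encoding="utf-8", errors="strict"):
        # Assumption: the bytes result is handed back to the constructor,
        # which stores str(b'hello'), i.e. the repr-like text "b'hello'".
        return self.__class__(self.data.encode(encoding, errors))


result = WrappingUserString("hello").encode("utf-8")

print(type(result).__name__)   # still a UserString subclass, not bytes
print(result.data)             # the text of the bytes literal
print(result == b"hello")      # comparing that text with real bytes
```

If the assumption holds, the wrapped value is a string that merely looks like a bytes literal, which would explain why the comparison with b'hello' fails.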
msg339824 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-04-10 05:22
Trey, would you like to submit a PR to fix this?  (Be sure to add a test case).
msg341550 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2019-05-06 15:55
I think this is an easy issue. The relevant code is at where the encoded result has to be fixed. Trey, if you haven't started working on it I think it's a good first issue for sprints.

Below is a simple unittest patch that fails on master. It could be extended with tests where encoding and errors are both present and where both are absent, so that all three code paths in the function are hit.

diff --git a/Lib/test/ b/Lib/test/
index 71528223d3..81a4908dbd 100644
--- a/Lib/test/
+++ b/Lib/test/
@@ -39,6 +39,11 @@ class UserStringTest(
         # we don't fix the arguments, because UserString can't cope with it
         getattr(object, methodname)(*args)

+    def test_encode(self):
+        data = UserString("hello")
+        self.assertEqual(data.encode(encoding='utf-8'), b'hello')

 if __name__ == "__main__":
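
As a rough sketch of the additional tests suggested above, the standalone module below exercises three call shapes: no arguments, encoding only, and encoding plus errors. The class name, helper name, and sample text are illustrative and are not part of the patch. It is expected to pass only on an interpreter where UserString.encode returns bytes, and to fail on the affected versions.

```python
import unittest
from collections import UserString


class UserStringEncodeSketch(unittest.TestCase):
    """Illustrative tests; names here are not taken from the CPython test suite."""

    def check(self, text, *args, **kwargs):
        # UserString.encode should agree with str.encode for the same arguments.
        encoded = UserString(text).encode(*args, **kwargs)
        self.assertIsInstance(encoded, bytes)
        self.assertEqual(encoded, text.encode(*args, **kwargs))

    def test_no_arguments(self):
        self.check("h\u00e9llo")

    def test_encoding_only(self):
        self.check("h\u00e9llo", "latin-1")

    def test_encoding_and_errors(self):
        self.check("h\u00e9llo", "ascii", "replace")


# Run the suite programmatically so the module can also be imported.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserStringEncodeSketch)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```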
msg341563 - (view) Author: Daniel Fortunov (dfortunov) * Date: 2019-05-06 16:55
I'll pick this up in the PyCon US 2019 sprint this afternoon.
msg341593 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-06 18:38
I will pick this one up.
msg341594 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-06 18:42
My mistake, dfortunov is already working on this one.
msg341649 - (view) Author: Daniel Fortunov (dfortunov) * Date: 2019-05-06 21:23
PR submitted here:

Rather than adding three different tests for the different code paths, I chose to collapse them into one by surfacing the underlying str.encode() defaults in the method signature of UserString.encode(), which takes the implementation down to a single line.
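
A minimal sketch of that approach, written as a subclass so it can be tried without patching the standard library. PatchedUserString is a hypothetical name and the body is an illustration of the idea described above, not necessarily the exact code in the PR; the defaults 'utf-8' and 'strict' are the documented defaults of str.encode().

```python
from collections import UserString


class PatchedUserString(UserString):
    """Hypothetical subclass illustrating the one-line encode()."""

    def encode(self, encoding="utf-8", errors="strict"):
        # Delegate directly to str.encode and return its bytes unchanged.
        return self.data.encode(encoding, errors)


plain = PatchedUserString("hello").encode()
replaced = PatchedUserString("h\u00e9llo").encode("ascii", "replace")

print(plain)
print(replaced)
```

With the defaults in the signature, every call shape takes the same path, so a single test of the return type covers the behavior that previously needed three branches.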

@xtreak: Thanks for the super-helpful triage and failing test case!
