This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Unicode Objects in Tuples
Type: behavior
Stage:
Components: Windows
Versions: Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Stephen_Tucker, eric.smith, loewis, peter.otten, pitrou, r.david.murray
Priority: normal
Keywords:

Created on 2013-10-09 16:39 by Stephen_Tucker, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
UnicodeTupleTestIDLEOutput.txt Stephen_Tucker, 2013-10-10 09:04
UnicodeTupleTest.py Stephen_Tucker, 2013-10-10 09:04
Messages (11)
msg199308 - (view) Author: Stephen Tucker (Stephen_Tucker) Date: 2013-10-09 16:39
If a tuple consists of a single unicode object with non-ASCII characters in it, the printing of the tuple causes the non-ASCII characters to appear correctly as characters.

If the tuple contains such a unicode object and anything else (even if it contains nothing else but two or more such unicode objects), the printing of the tuple causes all non-ASCII characters in the objects to appear as their "\uxxxx" escapes instead of as their characters.

The same thing happens when writing such tuples to a file that has been opened using codecs.open (<filename>, 'w', 'utf-8').
msg199314 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2013-10-09 17:10
Can you provide some code which demonstrates this?

It's easier to address this if we have known working (or non-working) examples.

Thanks.
msg199316 - (view) Author: Peter Otten (peter.otten) * Date: 2013-10-09 17:46
Be aware that for a 1-tuple the trailing comma is mandatory:

>>> print (u"äöü") # this is a string despite the suggestive parens
äöü
>>> print (u"äöü",) # this is a tuple
(u'\xe4\xf6\xfc',)
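
The distinction above can be checked directly. This is a minimal sketch, assuming Python 3 (the issue itself concerns Python 2.7, but the parsing rule is the same); the variable names are hypothetical and chosen only for illustration.

```python
# Parentheses alone only group an expression; the comma is what builds a tuple.
grouped = ("abc")      # still a plain string
one_tuple = ("abc",)   # a tuple with exactly one element

print(type(grouped).__name__)    # str
print(type(one_tuple).__name__)  # tuple
print(len(one_tuple))            # 1
```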
msg199321 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2013-10-09 18:23
This isn't strictly related to printing a tuple. It's the difference between str() and repr():

>>> print (u"äöü")      # uses str
äöü
>>> print repr(u"äöü")
u'\xe4\xf6\xfc'

When the tuple is printed, it uses the repr of its constituent parts.
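
A small sketch of the same point, assuming Python 3. Note that Python 3's repr() no longer escapes non-ASCII characters, so the built-in ascii() is used here to mimic the escaped Python 2.7 output (without the u prefix). The Label class is hypothetical and exists only to make str() and repr() easy to tell apart.

```python
class Label:
    """Hypothetical class whose str() and repr() differ visibly."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def __repr__(self):
        return "Label(%r)" % self.text


item = Label("abc")
direct = str(item)        # str() of the object itself
in_tuple = str((item,))   # str() of a tuple falls back to repr() of each item

print(direct)     # abc
print(in_tuple)   # (Label('abc'),)

# ascii() escapes non-ASCII characters, similar to repr() in Python 2.7.
text = "\xe4\xf6\xfc"     # the string from the thread, spelled with escapes
escaped_alone = ascii(text)
escaped_in_tuple = ascii((text,))
print(escaped_alone)      # '\xe4\xf6\xfc'
print(escaped_in_tuple)   # ('\xe4\xf6\xfc',)
```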
msg199353 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-09 22:26
Indeed, this is a feature, even though it may seem an odd one.
msg199374 - (view) Author: Stephen Tucker (Stephen_Tucker) Date: 2013-10-10 09:04
Dear All (Eric Smith in particular),

I see the issue has been closed - I guess that I have to use e-mail to
continue this discussion.

I attach a source file that demonstrates the "feature", and the output from
IDLE that it generated.

Yours,

Stephen Tucker.

msg199379 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2013-10-10 11:21
Stephen: do you agree that your example actually doesn't demonstrate the issue you originally reported?

Your first two print statements don't actually print a tuple, whereas the latter two do. The string is always escaped when it is inside a tuple, and never when it is printed directly.
msg199380 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2013-10-10 11:29
As Martin points out, your first example is printing a string, not a tuple. The parens here are not building a tuple, they are just used for grouping in the expression, and are not doing anything in this example.

So my explanation still holds: everything is working as designed. The output of the first two print statements uses str(astring), the second two use str(atuple).

str of a tuple is the same as its repr, and repr of a tuple is effectively:
"(" + ", ".join(repr(item) for item in tuple) + ")"

So when printing your tuples, Python is using the repr of each string in the tuple.
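
The formula above can be turned into a runnable sketch, assuming Python 3. The helper name tuple_repr_sketch is hypothetical, and the formula is only an approximation: the real repr() adds a trailing comma for a 1-tuple, which the join-based version does not.

```python
def tuple_repr_sketch(items):
    """Approximate repr() of a tuple using the join-based formula."""
    return "(" + ", ".join(repr(item) for item in items) + ")"


pair = ("abc", 42)
sketch_pair = tuple_repr_sketch(pair)
print(sketch_pair)   # ('abc', 42)
print(repr(pair))    # ('abc', 42)
print(str(pair))     # ('abc', 42), str() of a tuple is its repr()

# The approximation breaks down for a 1-tuple.
single = ("abc",)
sketch_single = tuple_repr_sketch(single)
print(sketch_single)  # ('abc')
print(repr(single))   # ('abc',)
```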

Since this is not a bug or feature request, it's probably best to continue the discussion on python-list.
msg199407 - (view) Author: Stephen Tucker (Stephen_Tucker) Date: 2013-10-10 19:45
Martin: Yes, I agree this does not demonstrate the issue I reported, so far as print is concerned. The other issue in my original report was that the same behaviour is exhibited when tuples are read from a UTF-8-encoded file: a tuple containing a unicode string with a non-ASCII character is displayed with its non-ASCII characters as escapes.

Eric: I am in a quandary. I am not convinced that the appearance of such strings (under either circumstance) should be governed by whether they are in tuples or not. It still seems remarkably like a bug to me. However, I am happy to continue this discussion on python-list, if you consider it better to do that. Please can you tell me how I do that?

msg199409 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-10-10 20:13
python-list is a mailing list, so you would subscribe and post your questions and examples there.  There are very good reasons for the existing behavior, and python-list would be a good place for you to learn about them (by asking questions).

The file case is the same: you are using print to create the output, and it is print's rules that are being used to generate that output, before it ever gets written to the file.
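
A sketch of the file case, assuming Python 3. Here io.StringIO stands in for the file opened with codecs.open in the original report, and the tuple contents are hypothetical. It shows that the tuple is converted to text before anything reaches the file, and that formatting the elements yourself avoids the repr-style output.

```python
import io

# Hypothetical data; the first item is the non-ASCII string from the thread.
names = ("\xe4\xf6\xfc", "abc")

# Printing the tuple writes str(names), which uses repr() of each element,
# so quotes and parentheses end up in the output.
as_tuple = io.StringIO()
print(names, file=as_tuple)

# Joining the elements first writes the strings themselves.
as_items = io.StringIO()
print(", ".join(names), file=as_items)

tuple_text = as_tuple.getvalue()
items_text = as_items.getvalue()
```

On Python 2.7 the first form would also show the \x escapes discussed above; on Python 3 only the quoting and parentheses remain, since repr() there keeps printable non-ASCII characters.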

(And by the way, you are free to post to a closed issue.  Having it closed just means it doesn't show up on our list of issues we need to fix :)
msg199461 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2013-10-11 11:32
It's at https://mail.python.org/mailman/listinfo/python-list
History
Date User Action Args
2022-04-11 14:57:51 admin set github: 63409
2013-10-11 11:32:04 loewis set messages: + msg199461
2013-10-10 20:13:13 r.david.murray set nosy: + r.david.murray; messages: + msg199409
2013-10-10 19:45:05 Stephen_Tucker set messages: + msg199407
2013-10-10 11:29:42 eric.smith set messages: + msg199380
2013-10-10 11:21:28 loewis set nosy: + loewis; messages: + msg199379
2013-10-10 09:04:17 Stephen_Tucker set files: + UnicodeTupleTestIDLEOutput.txt, UnicodeTupleTest.py; messages: + msg199374
2013-10-09 22:26:12 pitrou set status: open -> closed; nosy: + pitrou; messages: + msg199353; resolution: not a bug
2013-10-09 18:23:57 eric.smith set messages: + msg199321
2013-10-09 17:46:14 peter.otten set nosy: + peter.otten; messages: + msg199316
2013-10-09 17:10:21 eric.smith set nosy: + eric.smith; messages: + msg199314
2013-10-09 16:39:21 Stephen_Tucker create