
Author cmcqueen1975
Recipients cmcqueen1975, docs@python
Date 2010-07-08.07:07:06
Message-id <1278572830.17.0.148747735024.issue9196@psf.upfronthosting.co.za>
Content
I have been trying to work out how "%s" string interpolation behaves when Unicode strings are involved. It appears to be fairly complicated, but the Python documentation doesn't describe it; it only says that %s "converts any Python object using str()".

Here is what I have found (I think). It could be worth documenting this behaviour somehow.

Example 1:
    "%s" % test_object

From what I can tell, in this case:
1. test_object.__str__() is called.
2. If test_object.__str__() returns a string object, then that is substituted.
3. If test_object.__str__() returns a Unicode object (for some reason), then test_object.__unicode__() is called and _that_ result is substituted instead. The output string becomes Unicode. This behaviour is surprising.

[Note that calling test_object.__str__() is not the same as calling str(test_object): the former can return a Unicode object without causing an error, whereas the latter, on getting a Unicode object, tries to convert it to a string with encode('ascii'), possibly raising a UnicodeEncodeError exception.]
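
The steps in Example 1, and the contrast with str() in the note, can be sketched as a small model. This is only an illustration of the behaviour as reported here, not CPython's implementation. It is written to run on Python 3, where bytes stands in for the Python 2 str type and str stands in for unicode. The helper names (model_str_percent_s, model_builtin_str) and the ReturnsUnicode class are hypothetical, and the model assumes the object defines __unicode__().

```python
# Python 3 model of the reported Python 2 behaviour (not CPython's real code).
# bytes stands in for Python 2 `str`; str stands in for Python 2 `unicode`.

def model_str_percent_s(obj):
    """Model '"%s" % obj' for a byte-string format (hypothetical helper)."""
    result = obj.__str__()            # step 1: __str__() is called
    if isinstance(result, bytes):
        return result                 # step 2: a plain string is substituted
    # step 3: __str__() returned Unicode, so __unicode__() is called and its
    # result is substituted; the output becomes Unicode.
    # Assumption: obj defines __unicode__().
    return obj.__unicode__()


def model_builtin_str(obj):
    """Model str(obj): a Unicode result is encoded with the ASCII codec."""
    result = obj.__str__()
    if isinstance(result, bytes):
        return result
    return result.encode("ascii")     # may raise UnicodeEncodeError


class ReturnsUnicode:
    """Hypothetical test object whose __str__() returns non-ASCII Unicode."""

    def __str__(self):
        return "caf\u00e9 from __str__"

    def __unicode__(self):
        return "caf\u00e9 from __unicode__"


obj = ReturnsUnicode()
print(ascii(model_str_percent_s(obj)))   # the __unicode__() result
try:
    model_builtin_str(obj)
except UnicodeEncodeError as exc:
    print("str() model failed:", exc)
```

In this model the interpolation path succeeds and yields Unicode, while the str() path fails on the same object, which is the difference the note describes.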


Example 2:
    u"%s" % test_object

In this case:
1. test_object.__unicode__() is called, if it exists, and the result is substituted. The output string is Unicode.
2. If test_object.__unicode__() doesn't exist, then test_object.__str__() is called instead, converted to Unicode, and substituted. The output string is Unicode.
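
A rough model of the two rules in Example 2, again written for Python 3 with bytes standing in for the Python 2 str type and str for unicode. The function and class names are hypothetical, and the conversion to Unicode is assumed to use the ASCII codec (the usual Python 2 default encoding).

```python
# Python 3 model of the reported rules for u"%s" % obj (illustrative only).
# bytes stands in for Python 2 `str`; str stands in for Python 2 `unicode`.

def model_unicode_percent_s(obj):
    """Model 'u"%s" % obj' (hypothetical helper)."""
    to_unicode = getattr(obj, "__unicode__", None)
    if to_unicode is not None:
        return to_unicode()              # rule 1: __unicode__() is preferred
    result = obj.__str__()               # rule 2: fall back to __str__() ...
    if isinstance(result, bytes):
        # ... and convert to Unicode (assumption: ASCII default encoding).
        result = result.decode("ascii")
    return result


class HasBoth:
    """Hypothetical object defining both methods."""

    def __str__(self):
        return b"from __str__"

    def __unicode__(self):
        return "from __unicode__"


class OnlyStr:
    """Hypothetical object defining only __str__()."""

    def __str__(self):
        return b"from __str__"


print(model_unicode_percent_s(HasBoth()))   # from __unicode__
print(model_unicode_percent_s(OnlyStr()))   # from __str__
```

Either way the result is Unicode; only the method that supplies the text differs.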


Example 3:
    "%s %s" % (u'unicode', test_object)

In this case:
1. The first substitution causes the output string to be Unicode.
2. It seems that (1) causes the second substitution to follow the same rules as Example 2, even though the format string itself is not Unicode. This is a little surprising.
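
The switch described in Example 3 can be sketched as follows. This is a simplified Python 3 model of the observed behaviour, not the actual formatting code: bytes stands in for the Python 2 str type, str for unicode, and the format string is reduced to "%s" fields joined by single spaces. All names (str_rule, unicode_rule, model_percent_s_sequence, Both) are hypothetical, and ASCII is assumed for conversions to Unicode.

```python
# Python 3 model of '"%s %s" % args' as reported above (illustrative only).
# bytes stands in for Python 2 `str`; str stands in for Python 2 `unicode`.

def str_rule(obj):
    """Plain-string rule: use the argument itself or its __str__() result."""
    if isinstance(obj, (bytes, str)):
        return obj
    return obj.__str__()


def unicode_rule(obj):
    """Unicode rule (as in Example 2): prefer __unicode__(), else __str__()."""
    if isinstance(obj, str):
        return obj
    if isinstance(obj, bytes):
        return obj.decode("ascii")       # assumption: ASCII default encoding
    to_unicode = getattr(obj, "__unicode__", None)
    result = to_unicode() if to_unicode is not None else obj.__str__()
    return result.decode("ascii") if isinstance(result, bytes) else result


def model_percent_s_sequence(args):
    """Substitute each argument, switching to Unicode rules once one appears."""
    pieces = []
    unicode_mode = False
    for arg in args:
        piece = unicode_rule(arg) if unicode_mode else str_rule(arg)
        if isinstance(piece, str) and not unicode_mode:
            unicode_mode = True          # output is Unicode from here on
            piece = unicode_rule(arg)
        pieces.append(piece)
    if unicode_mode:
        pieces = [p.decode("ascii") if isinstance(p, bytes) else p
                  for p in pieces]
        return " ".join(pieces)
    return b" ".join(pieces)


class Both:
    """Hypothetical object whose two methods return distinguishable text."""

    def __str__(self):
        return b"str form"

    def __unicode__(self):
        return "unicode form"


print(model_percent_s_sequence(("unicode", Both())))   # unicode unicode form
print(model_percent_s_sequence((b"plain", Both())))    # b'plain str form'
```

The same object contributes different text depending on whether an earlier argument was Unicode, which is the surprising part.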
History
Date                 User          Action  Args
2010-07-08 07:07:10  cmcqueen1975  set     recipients: + cmcqueen1975, docs@python
2010-07-08 07:07:10  cmcqueen1975  set     messageid: <1278572830.17.0.148747735024.issue9196@psf.upfronthosting.co.za>
2010-07-08 07:07:08  cmcqueen1975  link    issue9196 messages
2010-07-08 07:07:07  cmcqueen1975  create