classification
Title: format method: c presentation type broken
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 2.7, Python 2.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: eric.smith
Nosy List: doerwalter, eric.smith (2)
Priority:
Keywords:

Created on 2009-11-05 16:22 by doerwalter, last changed 2009-11-10 13:58 by doerwalter.

Messages (6)
msg94935 - Author: Walter Dörwald (doerwalter) Date: 2009-11-05 16:22
The c presentation type in the new format method from PEP 3101 seems to
be broken:

Python 2.6.4 (r264:75706, Oct 27 2009, 15:18:04) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> u'{0:c}'.format(256)
u'\x00'

The PEP states:

'c' - Character. Converts the integer to the corresponding Unicode
character before printing.

So I would have expected this to return u'\u0100' instead of u'\x00'.
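
For comparison, the behaviour the PEP describes can be checked on a Python 3 interpreter, where every str is Unicode. This is only a sketch of the expected result, not output from the 2.6.4 session above; the variable names are illustrative.

```python
# Python 3 sketch: the 'c' presentation type maps an integer to the
# character with that code point, the same result as chr().
code_point = 256
formatted = '{0:c}'.format(code_point)

assert formatted == chr(code_point) == '\u0100'
assert len(formatted) == 1
print(repr(formatted))
```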
msg94936 - Author: Eric Smith (eric.smith) Date: 2009-11-05 16:30
I'll look at it.
msg94969 - Author: Eric Smith (eric.smith) Date: 2009-11-06 14:09
This is a bug in the way ints and longs are formatted. They always do
the formatting as str, then convert to unicode. This works everywhere
except with the 'c' presentation type. I'm still trying to decide how
best to handle this.
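
A rough model of why formatting as str first loses the code point, written for Python 3 so it can be run today. The function names are hypothetical, and the truncation to the low 8 bits is an assumption that happens to match the reported u'\x00' for 256; it is not taken from the actual C implementation.

```python
def format_c_via_byte_string(code_point):
    """Hypothetical model of the buggy path: build a one-byte string
    first, then widen it to Unicode afterwards."""
    # Assumption: only the low 8 bits survive the byte-string step.
    narrow = bytes([code_point & 0xFF])
    # Widening after the fact cannot recover the discarded bits.
    return narrow.decode('latin-1')


def format_c_as_unicode(code_point):
    """Model of the expected path: go straight to a Unicode character."""
    return chr(code_point)


print(repr(format_c_via_byte_string(256)))
print(repr(format_c_as_unicode(256)))
```

Both paths agree for values below 256, which fits the observation that the str-then-unicode approach works for every presentation type except 'c'.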
msg94972 - Author: Walter Dörwald (doerwalter) Date: 2009-11-06 14:52
I'd say that a value >= 128 should generate a Unicode string (as the PEP
explicitly states that the value is a Unicode code point and not a byte
value).

However, str.format() doesn't seem to support mixing str and unicode anyway:

>>> '{0}'.format(u'\u3042')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u3042' in
position 0: ordinal not in range(128)

so str.format() might raise an OverflowError for values >= 128 (or >= 256?)
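
A minimal sketch of that suggestion, modelled in Python 3 with bytes standing in for the 2.x str type. The helper name and the limit parameter are hypothetical; whether the cutoff should be 128 or 256 is the open question raised above, so it is left configurable.

```python
def format_c_as_byte_string(value, limit=128):
    """Hypothetical byte-string 'c' format that refuses values it
    cannot represent instead of silently truncating them."""
    # limit is assumed to be 128 or 256; larger limits are not meaningful
    # for a single byte.
    if not 0 <= value < limit:
        raise OverflowError(
            "%d is out of range for a byte string 'c' format" % value)
    return bytes([value])


print(format_c_as_byte_string(65))
try:
    format_c_as_byte_string(256)
except OverflowError as exc:
    print("OverflowError:", exc)
```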
msg95113 - Author: Eric Smith (eric.smith) Date: 2009-11-10 13:20
> so str.format() might raise an OverflowError for values >= 128 (or >= 256?)

Maybe, but the issue you reported is in unicode.format() (not
str.format()), and I think that should be fixed. I'm trying to think of
how best to address it.

As for the second issue you raise (which I think is that str.format()
can't take a unicode argument), would you mind opening a separate issue
for this and assigning it to me? Thanks.
msg95115 - Author: Walter Dörwald (doerwalter) Date: 2009-11-10 13:58
Done: issue 7300.
History
Date                 User        Action  Args
2009-11-10 13:58:23  doerwalter  set     messages: + msg95115
2009-11-10 13:20:17  eric.smith  set     messages: + msg95113
2009-11-06 14:52:30  doerwalter  set     messages: + msg94972
2009-11-06 14:09:20  eric.smith  set     messages: + msg94969; versions: + Python 2.7
2009-11-05 16:30:22  eric.smith  set     assignee: eric.smith; messages: + msg94936; nosy: + eric.smith
2009-11-05 16:22:47  doerwalter  create