classification
Title: object.__format__ should reject format strings
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.2, Python 2.7
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: eric.smith
Nosy List: BreamoreBoy, eric.smith, ezio.melotti, flox, hct, mark.dickinson, meador.inge, python-dev, r.david.murray
Priority: normal
Keywords: patch

Created on 2010-02-22 21:58 by eric.smith, last changed 2014-03-20 00:54 by hct. This issue is now closed.

Files
File name          Uploaded                        Description
issue7994-2.diff   eric.smith, 2010-02-23 19:47
issue7994-3.diff   meador.inge, 2010-02-26 04:03   Updated patch off of trunk
Messages (28)
msg99847 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-02-22 21:58
Background:

format(obj, fmt) eventually calls object.__format__(obj, fmt) if obj (or one of its bases) does not implement __format__. The behavior of object.__format__ is basically:

def __format__(self, fmt):
    return str(self).__format__(fmt)

So the caller of format() thought they were passing in a format string specific to obj, but it is interpreted as a format string for str.

This is not correct, or at least confusing. The format string is supposed to be type specific. However in this case the object is being changed (to type str), but the format string which was to be applied to its original type is now being passed to str.
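
For illustration, the delegation described above can be reproduced with an opt-in class. This is a minimal sketch rather than the CPython implementation; the names LegacyFormatted and Point are hypothetical and exist only for this example.

```python
class LegacyFormatted:
    """Hypothetical mixin that mimics the old object.__format__ behavior."""

    def __format__(self, fmt):
        # Convert to str first, then let str interpret the format spec.
        return format(str(self), fmt)


class Point(LegacyFormatted):
    """Hypothetical example type with no format spec of its own."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return "({}, {})".format(self.x, self.y)


# The '>10' spec is applied to the string "(1, 2)", not to the Point itself.
padded = format(Point(1, 2), ">10")
```

The spec here is silently interpreted by str, which is exactly the surprise described above: Point never gets a chance to define what '>10' means for it.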

This is an actual problem that occurred in the migration from 3.0 -> 3.1 and from 2.6 -> 2.7 with complex. In the earlier versions, complex did not have a __format__ method, but it does in the latter versions. So this code:
>>> format(1+1j, '10s')
'(1+1j)    '
worked in 2.6 and 3.0, but gives an error in 2.7 and 3.1:
>>> format(1+1j, '10s')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 's' for object of type 'complex'

Proposal:
object.__format__ should give an error if a non-empty format string is specified. In 2.7 and 3.2 make this a PendingDeprecationWarning, in 3.3 make it a DeprecationWarning, and in 3.4 make it an error.

Modify the documentation to make this behavior clear, and let the user know that if they want this behavior they should say:

format(str(obj), '10s')

or the equivalent:

"{0!s:10}".format(obj)

That is, the conversion to str should be explicit.
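
A short sketch of the two explicit spellings, using the complex value from the example above. The variable names are only for illustration.

```python
value = 1 + 1j

# Explicit conversion with str(), then a str format spec.
via_str = format(str(value), "10s")

# Same result using the !s conversion inside str.format().
via_conversion = "{0!s:10}".format(value)

# Passing the str-only spec straight to complex fails, because
# complex.__format__ does not understand the 's' code.
try:
    format(value, "10s")
except ValueError as exc:
    error = exc
else:
    error = None
```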
msg99916 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-02-23 13:59
Proposed patch attached. I need to add tests and docs.
msg99917 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-02-23 14:00
issue7994-0.diff is against trunk.
msg99943 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-02-23 18:08
This version of the patch adds support for classic classes and adds tests. Documentation still needs to be written.

Again, this diff is against trunk.

If anyone wants to review this, in particular the tests that exercise PendingDeprecationWarning, that would be great.
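
As a rough idea of what such a test can look like, here is a sketch that records warnings with the standard warnings module. It does not reproduce the tests in the patch: WarnsOnSpec is a hypothetical stand-in that emits the warning itself, the message text is illustrative, and collect_warnings is a helper invented for this example.

```python
import warnings


class WarnsOnSpec:
    """Hypothetical stand-in that warns like the patched object.__format__."""

    def __format__(self, fmt):
        if fmt:
            warnings.warn(
                "object.__format__ with a non-empty format string is deprecated",
                PendingDeprecationWarning,
                stacklevel=2,
            )
        return format(str(self), fmt)

    def __str__(self):
        return "obj"


def collect_warnings(obj, fmt):
    """Format obj and return (result, list of recorded warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        # PendingDeprecationWarning is ignored by default, so enable it.
        warnings.simplefilter("always", PendingDeprecationWarning)
        result = format(obj, fmt)
    return result, caught


result, caught = collect_warnings(WarnsOnSpec(), "5")
messages = [str(w.message) for w in caught]
```

Reading the text through str(w.message) avoids relying on the deprecated BaseException.message attribute.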
msg99948 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-02-23 19:47
Patch with Misc/NEWS.
msg100135 - (view) Author: Meador Inge (meador.inge) * (Python committer) Date: 2010-02-26 04:03
The patch looks reasonable.  I built on it with the following changes:

   1. Added some extra test cases to cover Unicode format strings, 
      since the code was changed to handle these as well.
   2. Changed test_builtin.py by 
      s/m[0].message.message/str(w[0].message)/, since 
      BaseException.message was deprecated in 2.6.

I also have the following general comments:

   1. PEP 3101 explicitly defines the string conversion for 
      object.__format__.  What is the rationale behind this?  Should
      we find out before making this change?
   2. I don't think the comments in 'abstract.c' and 'typeobject.c'
      explaining that the warning will eventually become an error are
      needed.  I think it would be better to open separate issues for
      these migration steps as they can be tracked easier and will be 
      more visible.
   3. test_unicode, test_str have cases that trigger the added 
      warning.  Should they be altered now or when (if) this becomes 
      an error?
msg100139 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-02-26 08:10
I haven't looked at the patch, but:

Thanks for the additional tests. Missing unicode was definitely a mistake.

str(w[0].message) is an improvement.

The PEP is out of date in many respects. I think it's best to note that in the PEP and continue to keep the documentation up-to-date.

This issue already applies to 3.3, but my plan is to remove that and create a new issue when I close this one. But I'd still like to leave the comments in place.

I'm aware of the existing tests which trigger the warning. I think they should probably be removed, although I haven't really spent much time thinking about it.
msg101921 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-03-30 07:13
Meador: Your patch (-3) looks identical to mine (-2), unless I'm making some mistake. Could you check? I'd like to get this applied in the next few days, before 2.7b1.

Thanks!
msg101943 - (view) Author: Meador Inge (meador.inge) * (Python committer) Date: 2010-03-30 15:27
Hi Eric,

(-2) and (-3) are different.  The changes that I made, however, are pretty minor.  Also, they are all in 'test_builtin.py'.
msg102162 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-04-02 12:35
Committed in trunk in r79596. I'll leave this open until I port to py3k, check the old tests for this usage, and create the issue to make it a DeprecationWarning.
msg116271 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010-09-13 01:54
This should be merged before 3.2 beta.
msg116290 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010-09-13 08:24
The PendingDeprecationWarnings are now checked in the test suite, as of r84772 (for 2.7).
msg116350 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-09-13 20:51
Manually merged to py3k in r84790. I'll leave this open until I create the 3.3 issue to change it to a DeprecationWarning.
msg116414 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-09-14 17:39
See issue 9856 for changing this to a DeprecationWarning in 3.3.
msg211042 - (view) Author: Roundup Robot (python-dev) Date: 2014-02-11 23:34
New changeset f56b98143792 by R David Murray in branch 'default':
whatsnew: object.__format__ raises TypeError on non-empty string.
http://hg.python.org/cpython/rev/f56b98143792
msg214034 - (view) Author: HCT (hct) Date: 2014-03-18 22:44
I just found out about this change in the latest official stable release, and it's breaking my code all over the place. Something like "{:s}".format( self.pc ), which worked in 3.3.4 and prior releases, now raises an exception rather than returning the string 'None' when self.pc was never updated from None (it is initialized to None during object init). This means I have to manually go and change every single line that expects smooth formatting to check whether the variable is still a 'NoneType'.

Should we just create a format for None, alias string format to repr/str on classes without a format implementation, or put more thought into this?
msg214040 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2014-03-19 00:34
I think the best we could do is have None.__format__ be:

def __format__(self, fmt):
   return str(self).__format__(fmt)

Or its logical equivalent.

But this seems more like papering over a bug, instead of actually fixing a problem. My suggestion is to use:
"{!s}".format(None)
That is: if you want to format a string, then explicitly force the argument to be a string.

I don't think None should be special and be auto-converted to a string.
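
A small sketch of the difference, assuming Python 3.4 or later where the error is in effect. The '>6' spec is only an example.

```python
value = None

# Explicit conversion: the spec is applied to the string 'None'.
converted = "{!s:>6}".format(value)

# Without the conversion, a non-empty spec reaches object.__format__,
# which raises TypeError on Python 3.4 and later.
try:
    "{:>6}".format(value)
except TypeError as exc:
    error = exc
else:
    error = None

# An empty spec is still accepted and falls back to str(value).
plain = "{}".format(value)
```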
msg214130 - (view) Author: HCT (hct) Date: 2014-03-19 20:22
I use lots of complicated formats, such as the following:
"{:{:s}{:d}s}".format( self.pcs,self.format_align, self.max_length )

it looks like the way to do it from now on will be
"{!s:{:s}{:d}}".format( self.pcs,self.format_align, self.max_length )
msg214132 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2014-03-19 20:30
Or:

"{:{:s}{:d}s}".format(str(self.pcs), self.format_align, self.max_length)

You're trying to apply the string format specifier (the stuff after the first colon through the final "s", as expanded) to an object that's not always a string: sometimes it's None. So you need to use one of the two supported ways to convert it to a string. Either str() or !s.

str.format() is very much dependent on the types of its arguments: the format specifier needs to be understood by the object being formatted. Similarly, you couldn't pass in a datetime and expect that to work, either.
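
Both supported spellings can be checked side by side. In this sketch the three variables are hypothetical stand-ins for self.pcs, self.format_align and self.max_length from the messages above.

```python
# Hypothetical stand-ins for self.pcs, self.format_align, self.max_length.
pcs = None
format_align = "<"
max_length = 8

# Option 1: convert with str() before formatting.
with_str = "{:{:s}{:d}s}".format(str(pcs), format_align, max_length)

# Option 2: use the !s conversion and drop the trailing 's' type code.
with_conversion = "{!s:{:s}{:d}}".format(pcs, format_align, max_length)
```

In both cases the nested fields expand to the spec '<8' (with or without the trailing 's'), which is then applied to the string 'None'.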
msg214154 - (view) Author: HCT (hct) Date: 2014-03-19 23:53
Unlike NoneType, datetime doesn't throw an exception. Is returning the format specifier the intended behaviour of this fix?


>>> import datetime
>>> a=datetime.datetime(1999,7,7)
>>> str(a)
'1999-07-07 00:00:00'
>>> "{:s}".format(a)
's'
>>> "{:7s}".format(a)
'7s'
>>> "{!s}".format(a)
'1999-07-07 00:00:00'
>>>
msg214156 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-20 00:05
Yes.  It is not returning the format specifier, it is filling in the strftime template "s" from the datetime...which equals "s", since it consists of just that constant string.

Try {:%Y-%m-%d}, for example.
msg214157 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-20 00:08
Which, by the way, has been the behavior all along, it is not something affected by this fix, because datetime *does* have a __format__ method.
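
A brief sketch of the datetime behavior described here, reusing the date from the earlier message. The variable names are illustrative.

```python
import datetime

moment = datetime.datetime(1999, 7, 7)

# A spec without % directives is a constant strftime template.
constant = format(moment, "s")

# A real strftime template produces the formatted date.
iso_date = format(moment, "%Y-%m-%d")

# datetime.__format__ hands a non-empty spec to strftime().
same_as_strftime = moment.strftime("%Y-%m-%d")
```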
msg214158 - (view) Author: HCT (hct) Date: 2014-03-20 00:26
None does have __format__, but it raises an exception

>>> dir(None)
['__bool__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

>>> None.__format__
<built-in method __format__ of NoneType object at 0x50BB2760>
msg214159 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-03-20 00:35
That's not an exception, you've not actually called the function.

>>> None.__format__('')
'None'
msg214160 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2014-03-20 00:39
David is correct.

It's often easiest to think about the builtin format() instead of str.format(). Notice below that the format specifier has to make sense for the object being formatted:

>>> import datetime
>>> now = datetime.datetime.now()

>>> format('somestring', '.12s')
'somestring'

# "works", but not what you want because it calls now.strftime('.12s'):
>>> format(now, '.12s')
'.12s'

# better:
>>> format(now, '%Y-%m-%d')  # better
'2014-03-19'

# int doesn't know what '.12s' format spec means:
>>> format(3, '.12s')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 's' for object of type 'int'

# None doesn't have an __format__, so object.__format__ rejects it:
>>> format(None, '.12s')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: non-empty format string passed to object.__format__

# just like a random class doesn't have an __format__:
>>> class F: pass
... 
>>> format(F(), '.12s')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: non-empty format string passed to object.__format__


Tangentially related:

The best you can do here, given your use case, is to argue that None needs an __format__ that understands str's format specifiers, because you like to mix str and None. But maybe someone else likes to mix int and None. Maybe None should understand int's format specifiers, and not str's:

>>> format(42000, ',d')
'42,000'
>>> format('42000', ',d')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 'd' for object of type 'str'

Why would "format(None, '.12s')" make any more sense than "format(None, ',d')"? Since we can't guess, we chose an error.
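
Code that mixes None with another type can make that choice itself instead of relying on the interpreter to guess. The following is one possible sketch; format_optional and its missing parameter are hypothetical names, not part of any library.

```python
def format_optional(value, spec, missing="n/a"):
    """Hypothetical helper: format value with spec unless it is None."""
    if value is None:
        # The caller decides what None looks like; no spec is guessed.
        return missing
    return format(value, spec)


amount = format_optional(42000, ",d")
absent = format_optional(None, ",d")
name = format_optional("abc", ">5s")
```

The spec is only ever applied to a value whose type understands it, so the helper works equally well for code that mixes None with int or with str.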
msg214161 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-20 00:41
NoneType is a subclass of object.

>>> class Foo(object):
...    pass
... 
>>> f = Foo()
>>> f.__format__
<built-in method __format__ of Foo object at 0xb71543b4>

ie: the exception is being raised by object.__format__, as provided for by this issue.
msg214162 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2014-03-20 00:47
BreamoreBoy:

This is basically the definition of object.__format__:

def __format__(self, specifier):
  if len(specifier) == 0:
    return str(self)
  raise TypeError('non-empty format string passed to object.__format__')

Which is why it works for an empty specifier.



As a reminder, the point of raising this type error is described in the first message posted in this bug. This caused us an actual problem when we implemented complex.__format__, and I don't see object.__format__ changing.

Implementing NoneType.__format__ and having it understand some string specifiers would be possible, but I'm against it, for reasons I hope I've made clear.


As to why None.__format__ appears to be implemented, it's the same as this:

>>> class Foo: pass
... 
>>> Foo().__format__
<built-in method __format__ of Foo object at 0xb74e6a4c>

That's really object.__format__, bound to a Foo instance.
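
One way to check where a type's __format__ actually comes from is to walk its method resolution order. This is a sketch; format_owner is a hypothetical helper written for this example.

```python
import datetime


def format_owner(cls):
    """Return the first class in cls's MRO that defines __format__."""
    for klass in cls.__mro__:
        if "__format__" in vars(klass):
            return klass
    return None


class Foo:
    pass


# Types that do not define __format__ themselves resolve to object.
foo_owner = format_owner(Foo)
none_owner = format_owner(type(None))

# Types with their own format specs resolve to a more specific class.
int_owner = format_owner(int)
datetime_owner = format_owner(datetime.datetime)
```

For Foo the owner is object, and per the discussion above the same is expected for NoneType, while int and datetime resolve to a class that implements its own format specifiers.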
msg214163 - (view) Author: HCT (hct) Date: 2014-03-20 00:54
I think I was confused because I forgot that I was using str.format, where {} takes the format of str. Confusion cleared.
History
Date User Action Args
2014-03-20 00:54:32 hct set messages: + msg214163
2014-03-20 00:47:29 eric.smith set messages: + msg214162
2014-03-20 00:41:16 r.david.murray set messages: + msg214161
2014-03-20 00:39:42 eric.smith set messages: + msg214160
2014-03-20 00:35:56 BreamoreBoy set nosy: + BreamoreBoy; messages: + msg214159
2014-03-20 00:26:08 hct set messages: + msg214158
2014-03-20 00:08:35 r.david.murray set messages: + msg214157
2014-03-20 00:05:58 r.david.murray set messages: + msg214156
2014-03-19 23:53:59 hct set messages: + msg214154
2014-03-19 20:30:56 eric.smith set messages: + msg214132
2014-03-19 20:22:34 hct set messages: + msg214130
2014-03-19 00:34:30 eric.smith set messages: + msg214040
2014-03-18 22:49:36 r.david.murray set nosy: + r.david.murray
2014-03-18 22:44:57 hct set nosy: + hct; messages: + msg214034
2014-02-11 23:34:43 python-dev set nosy: + python-dev; messages: + msg211042
2010-09-14 17:39:15 eric.smith set status: open -> closed; messages: + msg116414
2010-09-13 20:51:53 eric.smith set keywords: - needs review; messages: + msg116350; versions: - Python 3.3
2010-09-13 08:24:14 flox set messages: + msg116290
2010-09-13 01:54:20 flox set nosy: + flox; resolution: accepted; messages: + msg116271
2010-08-07 02:42:41 ezio.melotti set nosy: + ezio.melotti
2010-04-02 12:35:39 eric.smith set messages: + msg102162; stage: patch review -> resolved
2010-03-30 15:27:10 meador.inge set messages: + msg101943
2010-03-30 07:13:16 eric.smith set messages: + msg101921
2010-02-26 08:10:48 eric.smith set messages: + msg100139
2010-02-26 04:03:19 meador.inge set files: + issue7994-3.diff; nosy: + meador.inge; messages: + msg100135
2010-02-23 19:47:32 eric.smith set files: - issue7994-1.diff
2010-02-23 19:47:27 eric.smith set files: - issue7994-0.diff
2010-02-23 19:47:16 eric.smith set files: + issue7994-2.diff
2010-02-23 19:47:05 eric.smith set messages: + msg99948
2010-02-23 18:47:18 mark.dickinson set nosy: + mark.dickinson
2010-02-23 18:08:46 eric.smith set keywords: - easy; files: + issue7994-1.diff; messages: + msg99943
2010-02-23 14:00:41 eric.smith set keywords: + easy, needs review; messages: + msg99917
2010-02-23 13:59:54 eric.smith set stage: needs patch -> patch review
2010-02-23 13:59:37 eric.smith set files: + issue7994-0.diff; keywords: + patch
2010-02-23 13:59:15 eric.smith set messages: + msg99916
2010-02-22 21:59:55 eric.smith set versions: + Python 3.3
2010-02-22 21:58:57 eric.smith create