classification
Title: locale.format() input regression
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, barry, berker.peksag, docs@python, eric.smith, lemburg, lukasz.langa, r.david.murray, wolma
Priority: normal Keywords: easy

Created on 2010-11-09 23:33 by barry, last changed 2017-04-20 04:39 by berker.peksag. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 259 merged garvitdelhi, 2017-02-23 19:29
PR 1145 merged berker.peksag, 2017-04-15 01:16
Messages (19)
msg120905 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-11-09 23:33
@mission[~:1001]% python2.7 -c "import locale; print locale.format('%.0f KB', 100)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/locale.py", line 189, in format
    "format specifier, %s not valid") % repr(percent))
ValueError: format() must be given exactly one %char format specifier, '%.0f KB' not valid
@mission[~:1002]% python2.6 -c "import locale; print locale.format('%.0f KB', 100)"
100 KB
msg120907 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-11-09 23:56
This was changed by issue2522 on purpose; no suffix is allowed in locale.format().
msg120908 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-11-09 23:58
Okay, so line 187 of locale.py has this test:

    if not match or len(match.group())!= len(percent):

The problematic part is the len test.  When the format string is '%.0f KB', match.group() is '%.0f', but of course percent is the full string.

This seems like a bogus test, since clearly the given input is a valid format string.  I'm not sure what the intent of this test is.  The Python 2.6 test is:

    if percent[0] != '%':

which is perhaps too naive.

I guess I don't understand why this test is here.  Wouldn't it make more sense either to let any TypeError from _format() percolate up, or to catch that TypeError and transform it into the ValueError?  Why try to replicate the logic of str.__mod__()?
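
The effect of the length test can be reproduced in isolation. The sketch below uses a simplified specifier pattern (an assumption standing in for the regex in locale.py, which is not quoted in this thread) and a hypothetical helper name to show that the match covers only '%.0f' while percent is the whole string, so the lengths differ.

```python
import re

# Simplified %-specifier pattern; an assumption, not the regex used in locale.py.
PERCENT_RE = re.compile(r'%[-#0-9 +*.hlL]*[eEfFgGdiouxXcrs%]')

def is_single_specifier(percent):
    """Return True only if percent is one specifier with nothing after it."""
    match = PERCENT_RE.match(percent)
    return bool(match) and len(match.group()) == len(percent)

match = PERCENT_RE.match('%.0f KB')
print(match.group())                    # the prefix that the pattern matched
print(is_single_specifier('%.0f KB'))   # trailing ' KB' makes the lengths differ
print(is_single_specifier('%.0f'))      # a bare specifier passes the test
```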
msg120909 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-11-10 00:01
Hmm.  So I guess the answer is to use locale.format_string() instead.  But the documentation for locale.format() is not entirely clear about the prohibition on trailing text.
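
A minimal sketch of that workaround, assuming the numeric locale is pinned to the portable "C" locale so that the result does not depend on the environment:

```python
import locale

# Pin LC_NUMERIC to "C" so the formatted text is the same on every machine.
locale.setlocale(locale.LC_NUMERIC, 'C')

# Unlike format(), format_string() accepts text around the specifier.
label = locale.format_string('%.0f KB', 100)
print(label)
```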
msg120912 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-11-10 00:08
I agree the documentation isn't terribly clear on what a "%char specifier" or "whole format string" is.

FWIW, this is also a 3.1 and greater issue.
msg120921 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-11-10 15:24
Yeah, obviously that language can be improved.  'exactly' was meant to imply 'nothing but', but clearly it doesn't.

If we want to restore more stringent backward compatibility and allow trailing text, it would be possible to make format an alias for format_string.  I'm not sure this is a good idea, but it is the most sensible way to restore backward compatibility while still fixing the original bug that I can think of.

Or...perhaps there is little need of both 'format' and 'format_string' as public APIs, and we could deprecate (without removing) one of them.

On the other hand, I believe the original bug affects the Ubuntu code that triggered this report...in other words, absent this fix chances are there would eventually have been a bug report against that code that would have necessitated that it change to use format_string anyway in order to get the correct locale-specific number formatting.
msg120922 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-10 16:07
Please use the deprecation process when possible. That would mean creating an alias for the function you want to remove somewhat like this (taken from configparser):

        def readfp(self, fp, filename=None):
            """Deprecated, use read_file instead."""
            warnings.warn(
                "This method will be removed in future versions.  "
                "Use 'parser.read_file()' instead.",
                PendingDeprecationWarning, stacklevel=2
            )
            self.read_file(fp, source=filename)
msg120978 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-11-11 22:42
The bug has been fixed upstream by replacing .format() with .format_string().  I'm not sure I understand why there are two different methods - .format() seems kind of pointless to me, but then I don't use the locale module enough to say what's useful.

For Python 2.7 I think the only thing we can do is to update the docs so that the distinction and restrictions are clear.
msg120989 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-11-12 00:15
Well, the distinction is that, before the bug fix that caused your issue, the 'format_string' method would use a regex to extract the % specifiers from the input string, and call 'format' to replace that % specifier with a properly localized result string.  That is, 'format' was designed to handle a single % specifier with no extra text, basically as a helper method for format_string.  The fact that it didn't reject extra text was, according to an internal comment, a defect of the implementation.  (Passing any extra text would cause the implementation to fail to do the internationalization that was the entire reason for calling it.)

When I fixed the bug I extracted the 'replace a single % specifier' code into an internal method, and made the format method live up to what I perceived to be its documented interface by rejecting extra input characters so that it could safely call the new internal substitution routine.

Now, from the perspective of a *user* of the locale module, I fail to see the point in having both 'format' and 'format_string' exposed.  If you want to format a single % specifier, just pass it to format_string.  Thus my suggestion to make them both do the same thing (to cater to other code that may be calling format incorrectly) and then deprecate one of them (presumably format).  

Too bad I didn't think of that when I fixed the original bug.
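
The split described in this message can be sketched roughly as follows. This is an illustration only, not the code in locale.py: the pattern and the names format_one and format_many are hypothetical, and localization is reduced to swapping in the locale's decimal point.

```python
import locale
import re

# Simplified %-specifier pattern; an assumption, not the regex used in locale.py.
SPEC_RE = re.compile(r'%[-#0-9 +*.hlL]*[eEfFgGdiouxXcrs%]')

def format_one(spec, value):
    """Hypothetical helper: format exactly one specifier, rejecting extra text."""
    if SPEC_RE.fullmatch(spec) is None:
        raise ValueError("expected exactly one specifier, got %r" % spec)
    text = spec % value
    if spec[-1] in 'eEfFgG':
        # Stand-in for real localization: use the locale's decimal point.
        text = text.replace('.', locale.localeconv()['decimal_point'])
    return text

def format_many(fmt, values):
    """Hypothetical driver: extract each specifier and delegate to format_one."""
    remaining = iter(values)

    def substitute(match):
        spec = match.group()
        if spec == '%%':
            return '%'          # a literal percent sign consumes no value
        return format_one(spec, next(remaining))

    return SPEC_RE.sub(substitute, fmt)
```

With this shape, surrounding text is handled by the driver, while the single-specifier helper is free to reject anything it cannot localize.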
msg120994 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-11-12 00:25
On Nov 12, 2010, at 12:15 AM, R. David Murray wrote:

>Now, from the perspective of a *user* of the locale module, I fail to see the
>point in having both 'format' and 'format_string' exposed.  If you want to
>format a single % specifier, just pass it to format_string.  Thus my
>suggestion to make them both do the same thing (to cater to other code that
>may be calling format incorrectly) and then deprecate one of them (presumably
>format).

+1

>Too bad I didn't think of that when I fixed the original bug.

Dang.
msg185792 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2013-04-02 02:12
msg120978 "The bug has been fixed upstream...".  Have I missed something as on Windows Vista...?

c:\Users\Mark\MyPython>python
Python 3.3.1rc1 (v3.3.1rc1:92c2cfb92405, Mar 25 2013, 22:39:19) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale; print(locale.format('%.0f KB', 100))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python33\lib\locale.py", line 193, in format
    "format specifier, %s not valid") % repr(percent))
ValueError: format() must be given exactly one %char format specifier, '%.0f KB' not valid
msg185828 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2013-04-02 11:32
Barry meant that the upstream program that triggered this error has been changed to call format_string() instead of format(). The bug still exists in format().

My suggestion is to have format() be an alias for format_string(). Deprecating format() is an optional step, but may not be worth the hassle.
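
A minimal sketch of what such an alias could look like, assuming a deprecation warning is wanted. The name format_compat is hypothetical and stands in for the alias; it is not the patch that was eventually merged.

```python
import locale
import warnings

def format_compat(percent, value, grouping=False):
    """Hypothetical alias: accept the old call shape, delegate to format_string()."""
    warnings.warn(
        "use locale.format_string() instead",
        DeprecationWarning, stacklevel=2
    )
    # format_string() tolerates text around the specifier, so input such as
    # '%.0f KB' no longer raises ValueError.
    return locale.format_string(percent, value, grouping)
```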
msg185838 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-04-02 13:35
On Apr 02, 2013, at 11:32 AM, Eric V. Smith wrote:

>My suggestion is to have format() be an alias for
>format_string(). Deprecating format() is an optional step, but may not be
>worth the hassle.

Agreed on both counts.
msg185840 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2013-04-02 15:04
So I guess the question is: would this be a bug fix and applied to 2.7 and 3.3, or just an enhancement for 3.4?

I think it would be a bug fix and thus should be backported. It's not like we'd be breaking any working code, unless it was expecting the exception.
msg185841 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-04-02 15:38
On Apr 02, 2013, at 03:04 PM, Eric V. Smith wrote:

>I think it would be a bug fix and thus should be backported. It's not like
>we'd be breaking any working code, unless it was expecting the exception.

That would be my preference.
msg290734 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-03-28 15:43
New changeset 1cf93a76c2cf307f2e1e514a8944864f746337ea by R. David Murray (Garvit Khatri) in branch 'master':
bpo-10379: add 'monetary' to format_string, deprecate format
https://github.com/python/cpython/commit/1cf93a76c2cf307f2e1e514a8944864f746337ea
msg290735 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-03-28 15:46
Oops.  I merged the patch without coming back here first :(.  Still getting used to the new workflow.

It turns out that format has a parameter, monetary, that isn't supported by format_string.  So what we did was add that parameter to format_string and deprecate format.

If there is objection to this solution I will revert the merge.
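
For code that relied on the monetary parameter, the migration might look like the sketch below. It assumes Python 3.7 or later, where format_string() accepts monetary; the wrapper name format_amount is hypothetical.

```python
import locale

def format_amount(value):
    """Format value with two decimals using the current locale's monetary rules."""
    # monetary=True selects the LC_MONETARY separators instead of LC_NUMERIC ones.
    return locale.format_string('%.2f', value, grouping=True, monetary=True)
```

The returned text depends on the active LC_MONETARY settings, so no output is shown here; a program would normally call locale.setlocale(locale.LC_ALL, '') first to pick up the user's settings.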
msg291686 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-04-15 01:18
I think the solution in PR 259 is great, but I still find the documentation a little bit vague. I've just opened PR 1145 to add some example specifiers.
msg291945 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-04-20 04:38
New changeset 6dbdedb0b18a5ca850ab8ce512fda24d5a9d0688 by Berker Peksag in branch 'master':
bpo-10379: Add %char examples to locale.format() docs (GH-1145)
https://github.com/python/cpython/commit/6dbdedb0b18a5ca850ab8ce512fda24d5a9d0688
History
Date User Action Args
2017-04-20 04:39:00 berker.peksag set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2017-04-20 04:38:45 berker.peksag set messages: + msg291945
2017-04-15 01:18:44 berker.peksag set versions: + Python 3.7, - Python 2.7, Python 3.5, Python 3.6
nosy: + berker.peksag
messages: + msg291686
stage: needs patch -> patch review
2017-04-15 01:16:17 berker.peksag set pull_requests: + pull_request1276
2017-03-28 15:46:37 r.david.murray set messages: + msg290735
2017-03-28 15:43:40 r.david.murray set messages: + msg290734
2017-03-28 14:41:35 r.david.murray set stage: backport needed -> needs patch
2017-03-28 14:33:09 r.david.murray set stage: needs patch -> backport needed
2017-02-23 19:29:23 garvitdelhi set pull_requests: + pull_request229
2016-11-24 00:38:00 eric.smith set keywords: + easy
assignee: docs@python ->
type: behavior
stage: needs patch
2016-11-23 22:15:56 eric.smith set versions: + Python 3.5, Python 3.6, - Python 3.3, Python 3.4
2016-11-23 20:57:26 wolma set nosy: + wolma
2014-02-03 18:35:48 BreamoreBoy set nosy: - BreamoreBoy
2013-04-02 15:38:06 barry set messages: + msg185841
2013-04-02 15:04:58 eric.smith set messages: + msg185840
versions: + Python 3.3, Python 3.4, - Python 3.1, Python 3.2
2013-04-02 13:35:48 barry set messages: + msg185838
2013-04-02 11:32:54 eric.smith set messages: + msg185828
2013-04-02 02:12:46 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg185792
2010-11-12 00:25:35 barry set messages: + msg120994
2010-11-12 00:15:25 r.david.murray set messages: + msg120989
2010-11-11 22:42:51 barry set messages: + msg120978
2010-11-10 16:07:49 lukasz.langa set nosy: + lukasz.langa
messages: + msg120922
2010-11-10 15:24:56 r.david.murray set nosy: + lemburg
messages: + msg120921
2010-11-10 00:08:19 eric.smith set assignee: docs@python
components: + Documentation
nosy: + docs@python
2010-11-10 00:08:00 eric.smith set messages: + msg120912
versions: + Python 3.1, Python 3.2
2010-11-10 00:01:55 barry set messages: + msg120909
2010-11-09 23:59:30 eric.smith set nosy: + eric.smith
2010-11-09 23:58:52 barry set messages: + msg120908
2010-11-09 23:56:06 amaury.forgeotdarc set nosy: + amaury.forgeotdarc, r.david.murray
messages: + msg120907
2010-11-09 23:33:25 barry create