classification
Title: locale.format() problems with decimal separator
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.1, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, georg.brandl, mishok13, pitrou, r.david.murray
Priority: low
Keywords: patch

Created on 2008-03-31 17:08 by mishok13, last changed 2010-11-10 00:06 by barry. This issue is now closed.

Files
File name Uploaded Description
locale.diff mishok13, 2008-04-01 14:02 patch that (partly) fixes incorrect locale.format() behavior with malformed strings
issue2522.patch r.david.murray, 2009-03-29 17:18 patch and tests
Messages (10)
msg64787 - Author: Andrii V. Mishkovskyi (mishok13) Date: 2008-03-31 17:08
locale.format() doesn't insert the correct decimal separator into the string
representation when the 'format' argument contains '\r' or '\n' characters.
This bug has been reproduced on Python 2.5.2 and svn-trunk.

Python 2.4.5 (#2, Mar 12 2008, 14:42:24)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "ru_RU.UTF-8")
'ru_RU.UTF-8'
>>> a = 1.234
>>> print locale.format("%f", a)
1,234000
>>> print locale.format("%f\n", a)
1,234000

>>> print locale.format("%f\r", a)
1,234000


Python 2.6a1+ (trunk:62083, Mar 31 2008, 19:24:56)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "ru_RU.UTF-8")
'ru_RU.UTF-8'
>>> a = 1.234
>>> print locale.format("%f", a)
1,234000
>>> print locale.format("%f\n", a)
1.234000

>>> print locale.format("%f\r", a)
1.234000


Python 2.5.2 (r252:60911, Mar 12 2008, 13:36:25)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "ru_RU.UTF-8")
'ru_RU.UTF-8'
>>> a = 1.234
>>> print locale.format("%f", a)
1,234000
>>> print locale.format("%f\n", a)
1.234000

>>> print locale.format("%f\r", a)
1.234000
msg64810 - Author: Andrii V. Mishkovskyi (mishok13) Date: 2008-04-01 14:02
I've uploaded a patch that fixes this particular issue, though
locale.format() continues to silently ignore other types of malformed
strings (e.g. locale.format('%fSPAMf')).
I don't think this is correct behavior. Maybe locale.format() should
use a regular expression to catch such cases.
msg84340 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2009-03-28 22:51
This bug is more subtle than it first appears.  As far as I've been able
to figure out, there is in fact no way to reliably detect that there is
non-format text after the format specifier short of completely parsing
the format specifier.  I went through several possibilities and found a
counterexample for each that shows it would introduce new bugs:

See if last character of formatted string is different from last char of
formatter: format('%s', 'things') would then incorrectly be an error.

Make sure last character of format string is a valid format character: 
format('%fx', a) would be valid, but it has an 'x' on the end which is
not part of the format string.  (The suggested patch has a false
negative in this case as well.)

Check for a decimal in the formatted string and if it didn't get
transformed, complain:  format('%s', '1.234') would fail incorrectly.

One could argue that at least \n and \r should be checked for since the
output from those cases is least obviously "wrong", but I don't think
that is a strong argument.  The extra control character is almost as
"visible" in the output as the trailing 'x' would be in the above
example, and the effects of the trailing x are equally mysterious.

To fix this correctly would require reimplementing format parsing in the
locale module, which would be a maintenance headache.

I'm inclined to close this "won't fix", unless someone can come up with
a heuristic that won't give false positives or false negatives.
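
The three rejected heuristics can be made concrete with a minimal sketch.
It uses plain % formatting rather than locale.format(), and the helper
names are hypothetical, written only for illustration. The third helper is
a loose stand-in for "check whether the decimal point got transformed".

```python
# Hypothetical helpers illustrating the rejected heuristics; none of
# these are part of the locale module.

CONVERSION_CHARS = "eEfFgGdiouxXcrs"


def trailing_text_by_last_char(fmt, value):
    """Heuristic 1: guess that fmt has trailing text when the formatted
    output ends with the same character as the format string."""
    return (fmt % value)[-1] == fmt[-1]


def last_char_is_conversion(fmt):
    """Heuristic 2: accept fmt when its last character is a conversion
    character."""
    return fmt[-1] in CONVERSION_CHARS


def has_untransformed_point(fmt, value):
    """Heuristic 3 (simplified): complain when the plain formatted output
    contains a '.' that a locale-aware pass would be expected to replace."""
    return "." in (fmt % value)


# Heuristic 1 catches the reported case ...
print(trailing_text_by_last_char("%f\n", 1.234))    # True
# ... but also flags a perfectly valid specifier (false positive).
print(trailing_text_by_last_char("%s", "things"))   # True

# Heuristic 2 accepts '%fx' although 'x' is trailing text (false negative).
print(last_char_is_conversion("%fx"))               # True

# Heuristic 3 rejects a string value that merely contains a '.'.
print(has_untransformed_point("%s", "1.234"))       # True
```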
msg84343 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2009-03-28 23:03
AFAIK, locale.format() is supposed to be used with a single format
specifier, not a complete format string. It's up to you to concatenate
the various parts afterwards.
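
A minimal sketch of that usage pattern: format the number with exactly one
specifier, then append the line terminator yourself. The helper name
format_line is hypothetical. The sketch calls locale.format_string(), since
locale.format() is not available in recent Python 3 releases; on the Python
2 versions discussed here, locale.format("%f", value) would take its place.

```python
import locale


def format_line(value, spec="%f", terminator="\n"):
    """Format value with a single specifier, then append the terminator.

    Keeping the terminator out of the format argument means the
    locale-aware code only ever sees one format specifier.
    """
    return locale.format_string(spec, value) + terminator


# With a locale such as ru_RU.UTF-8 active, the decimal separator is
# expected to be ',' here; under the default "C" locale it stays '.'.
print(repr(format_line(1.234)))
```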
msg84347 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2009-03-28 23:08
That is true; however, the code contains the comment "this is only for
one-percent-specifier strings and this should be checked", implying
that the intent is to make sure only a single format specifier
has been passed.  I don't think it is reasonable to perfect that
check, though.
msg84416 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2009-03-29 17:18
It occurred to me last night that it could be checked using a regular
expression, and indeed the locale module already has a regular
expression that matches percent codes.  I've uploaded a patch that uses
this regex to fix this issue.  I've removed 2.6 and 3.0 from the affected
versions, as this change could break existing code that is misusing format.

I added georg.brandl to the nosy list since svn blame shows him as the
author of the code being modified.
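
A regex-based check along these lines might look like the sketch below.
The pattern and the function name check_single_specifier are assumptions
made for illustration, modeled on printf-style specifiers; the expression
in the locale module and the uploaded patch may differ in detail.

```python
import re

# Assumed pattern for a single printf-style specifier: '%', an optional
# '(key)' mapping key, optional flag/width/precision characters, then one
# conversion character.
SPECIFIER_RE = re.compile(
    r"%(?:\((?P<key>.*?)\))?(?P<modifiers>[-#0-9 +*.hlL]*?)[eEfFgGdiouxXcrs%]"
)


def check_single_specifier(fmt):
    """Raise ValueError unless fmt consists of exactly one format specifier."""
    match = SPECIFIER_RE.match(fmt)
    # The match must start at position 0 and consume the whole string;
    # anything left over (such as '\n' or 'x') is trailing text.
    if match is None or match.end() != len(fmt):
        raise ValueError("expected exactly one format specifier, got %r" % (fmt,))
    return fmt
```

With this check, "%f" and "%.2f" pass, while "%f\n", "%f\r", "%fx" and
"%fSPAMf" all raise ValueError instead of being formatted without the
locale's decimal separator.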
msg84555 - Author: Andrii V. Mishkovskyi (mishok13) Date: 2009-03-30 14:46
Nice to see this moving forward. Your patch looks nicer than my naive
approach and I hope it's going to be applied. Thanks for the investigation. :)
msg84974 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2009-04-01 03:51
Fixed in r70936/r70938.
msg120910 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2010-11-10 00:06
Hmm.  See bug 10379 for fallout from this change.  I'm not saying it should be reverted, but see that issue for further discussion.
msg120911 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2010-11-10 00:06
I mean issue 10379
History
Date User Action Args
2010-11-10 00:06:52 barry set messages: + msg120911
2010-11-10 00:06:34 barry set nosy: + barry; messages: + msg120910
2009-04-01 03:51:14 r.david.murray set status: open -> closed; resolution: fixed; messages: + msg84974; stage: patch review -> resolved
2009-03-30 14:46:26 mishok13 set messages: + msg84555
2009-03-29 17:18:24 r.david.murray set status: pending -> open; files: + issue2522.patch; versions: - Python 2.6, Python 3.0; nosy: + georg.brandl; messages: + msg84416; resolution: wont fix -> (no value); stage: resolved -> patch review; priority: low
2009-03-28 23:08:55 r.david.murray set messages: + msg84347
2009-03-28 23:03:19 pitrou set nosy: + pitrou; messages: + msg84343
2009-03-28 22:51:13 r.david.murray set status: open -> pending; versions: + Python 3.0, Python 3.1, Python 2.7, - Python 2.5; nosy: + r.david.murray; messages: + msg84340; resolution: wont fix; stage: resolved
2008-04-01 14:03:00 mishok13 set files: + locale.diff; keywords: + patch; messages: + msg64810; components: + Library (Lib)
2008-03-31 17:08:08 mishok13 create