Author skrah
Recipients eric.smith, mark.dickinson, skrah
Date 2009-11-15.10:29:27
Message-id <1258280971.1.0.983226181609.issue7327@psf.upfronthosting.co.za>
Content
This issue affects the format functions of float and decimal.

When calculating the padding necessary to reach the minimum width,
UTF-8 separators and decimal points are counted by their byte
lengths rather than by their number of characters. The result can be
a printed representation that is shorter than the requested width.
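
The miscount described above can be illustrated with a minimal Python 3
sketch. The helper padding_needed is hypothetical and is not the code
used by float or decimal; it only contrasts counting bytes with
counting characters for a string that contains a two-byte separator.

```python
# Hypothetical helper for illustration only; not CPython's implementation.
def padding_needed(text, min_width, count_bytes):
    """Return how many fill characters are needed to reach min_width."""
    length = len(text.encode("utf-8")) if count_bytes else len(text)
    return max(0, min_width - length)

# U+00A0 (NO-BREAK SPACE) encodes to the two bytes \xc2\xa0 in UTF-8.
sep = "\u00a0"
body = "1" + sep + "234" + sep + "567"   # 9 characters, 11 bytes

by_bytes = padding_needed(body, 12, count_bytes=True)
by_chars = padding_needed(body, 12, count_bytes=False)

# Padding computed from the byte length leaves the result too short.
too_short = "0" * by_bytes + body
full_width = "0" * by_chars + body
print(by_bytes, by_chars, len(too_short), len(full_width))
```

Each two-byte separator costs one character of padding, so the
shortfall grows with the number of separators in the output.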


Real-world example (separator):

>>> import locale
>>> from decimal import *
>>> locale.setlocale(locale.LC_NUMERIC, "cs_CZ.UTF-8")
'cs_CZ.UTF-8'
>>> s = format(Decimal("-1.5"),  ' 019.18n')
>>> len(s)
19
>>> len(s.decode('utf-8'))
16
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>> 
>>> 
>>> s = format(-1.5,  ' 019.18n')
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>> len(s.decode('utf-8'))
16
>>> 


Constructed example (separator and decimal point):

>>> u = {'decimal_point': "\xc2\xbf", 'grouping': [3, 3, 0],
...      'thousands_sep': "\xc2\xb4"}
>>> def get_fmt(x, locale, fmt='n'):
...     return Decimal.__format__(Decimal(x), fmt, _localeconv=locale)
... 
>>> s = get_fmt(Decimal("1.5"), u, "020n")
>>> s
'00\xc2\xb4000\xc2\xb4000\xc2\xb4001\xc2\xbf5'
>>> len(s.decode('utf-8'))
16
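
The numbers in both sessions can be cross-checked with a short Python 3
sketch. The byte strings and minimum widths (19 and 20) are copied from
the examples above; everything else is illustrative. Every multibyte
character here is two bytes long, so the shortfall should equal the
number of such characters.

```python
# Byte strings and requested widths copied from the two sessions above.
reported = [
    (b"-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5", 19),
    (b"00\xc2\xb4000\xc2\xb4000\xc2\xb4001\xc2\xbf5", 20),
]

shortfalls = []
for raw, min_width in reported:
    text = raw.decode("utf-8")
    # Count the characters that occupy more than one byte in UTF-8.
    multibyte = sum(1 for ch in text if len(ch.encode("utf-8")) > 1)
    shortfall = min_width - len(text)
    shortfalls.append((len(raw), len(text), multibyte, shortfall))
    print(len(raw), len(text), multibyte, shortfall)
```

In both cases the byte length matches the requested width while the
character count is 16, which is consistent with padding having been
computed from bytes.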
History
Date                 User   Action  Args
2009-11-15 10:29:31  skrah  set     recipients: + skrah, mark.dickinson, eric.smith
2009-11-15 10:29:31  skrah  set     messageid: <1258280971.1.0.983226181609.issue7327@psf.upfronthosting.co.za>
2009-11-15 10:29:29  skrah  link    issue7327 messages
2009-11-15 10:29:27  skrah  create