Message 95283 - Python tracker


Author	skrah
Recipients	eric.smith, mark.dickinson, skrah
Date	2009-11-15.10:29:27
Message-id	<1258280971.1.0.983226181609.issue7327@psf.upfronthosting.co.za>

Content

This issue affects the format functions of float and decimal.

When calculating the padding needed to reach the minimum width,
multi-byte UTF-8 separators and decimal points are counted by their
byte length instead of their character count. As a result, the printed
representation can be shorter than the requested width.
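To make the mechanism concrete, here is a minimal sketch of the two ways of counting. It assumes Python 3 (where `str` holds characters and `bytes` holds the encoded form), and the helper names `pad_by_bytes` and `pad_by_chars` are hypothetical illustrations, not the interpreter's actual formatting code.

```python
# Hypothetical illustration (Python 3); not the real float/decimal formatter.
NBSP = "\u00a0"  # no-break space: one character, two bytes in UTF-8


def pad_by_bytes(digits: str, width: int) -> str:
    """Zero-pad using the UTF-8 byte length, mimicking the reported behaviour."""
    byte_len = len(digits.encode("utf-8"))
    return "0" * max(width - byte_len, 0) + digits


def pad_by_chars(digits: str, width: int) -> str:
    """Zero-pad using the character count, which is what a minimum width means."""
    return "0" * max(width - len(digits), 0) + digits


grouped = NBSP.join(["1", "000", "000"])  # 9 characters, 11 bytes
short = pad_by_bytes(grouped, 12)
full = pad_by_chars(grouped, 12)
print(len(short), len(full))
```

With a requested width of 12, `short` ends up with only 10 characters because each separator was counted twice, while `full` reaches the requested 12.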


Real-world example (separator):

>>> import locale
>>> from decimal import *
>>> locale.setlocale(locale.LC_NUMERIC, "cs_CZ.UTF-8")
'cs_CZ.UTF-8'
>>> s = format(Decimal("-1.5"),  ' 019.18n')
>>> len(s)
19
>>> len(s.decode('utf-8'))
16
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>> 
>>> 
>>> s = format(-1.5,  ' 019.18n')
>>> s
'-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5'
>>> len(s.decode('utf-8'))
16
>>> 


Constructed example (separator and decimal point):

>>> u = {'decimal_point': "\xc2\xbf", 'grouping': [3, 3, 0],
...      'thousands_sep': "\xc2\xb4"}
>>> def get_fmt(x, locale, fmt='n'):
...     return Decimal.__format__(Decimal(x), fmt, _localeconv=locale)
... 
>>> s = get_fmt(Decimal("1.5"), u, "020n")
>>> s
'00\xc2\xb4000\xc2\xb4000\xc2\xb4001\xc2\xbf5'
>>> len(s.decode('utf-8'))
16
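The numbers in both examples can be cross-checked from the byte strings shown above. The following is a small sketch, assuming Python 3 and a hypothetical helper named `shortfall`; it only re-counts the reported output and does not call the formatting code itself.

```python
# Hypothetical cross-check (Python 3) of the byte strings quoted in the report.

def shortfall(raw: bytes) -> int:
    """Return how many characters the string falls short of its byte length."""
    return len(raw) - len(raw.decode("utf-8"))


# Output of format(Decimal("-1.5"), ' 019.18n') under cs_CZ.UTF-8, as reported.
real = b"-0\xc2\xa0000\xc2\xa0000\xc2\xa0001,5"
# Output of the constructed example with width 20, as reported.
constructed = b"00\xc2\xb4000\xc2\xb4000\xc2\xb4001\xc2\xbf5"

for raw in (real, constructed):
    print(len(raw), len(raw.decode("utf-8")), shortfall(raw))
```

In both cases the byte length equals the requested width (19 and 20), and the shortfall equals the number of two-byte characters in the result: three separators in the first example, three separators plus one decimal point in the second.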

History
Date	User	Action	Args
2009-11-15 10:29:31	skrah	set	recipients: + skrah, mark.dickinson, eric.smith
2009-11-15 10:29:31	skrah	set	messageid: <1258280971.1.0.983226181609.issue7327@psf.upfronthosting.co.za>
2009-11-15 10:29:29	skrah	link	issue7327 messages
2009-11-15 10:29:27	skrah	create