Message 320635 - Python tracker



Author	vstinner
Recipients	ezio.melotti, vstinner
Date	2018-06-27.22:35:49
Message-id	<1530138949.59.0.56676864532.issue33954@psf.upfronthosting.co.za>

Content

Aha, the problem occurs when the thousands separator code point is greater than 255.

On my Fedora 28 (glibc 2.27), it's U+202f:

vstinner@apu$ ./python
Python 3.8.0a0 (heads/master-dirty:492572715a, Jun 28 2018, 00:18:54) 
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'fr_FR.UTF-8'
>>> locale.localeconv()['thousands_sep']
'\u202f'
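To check whether a given machine is affected by the same condition, a small helper like the one below can report the code points of the locale's thousands separator. This is only a sketch: describe_separator() is a hypothetical name, not part of the locale module, and the 255 threshold is taken from the observation above.

```python
import locale


def describe_separator(sep: str) -> dict:
    """Report the code points of a separator and whether any exceeds 255."""
    code_points = [ord(ch) for ch in sep]
    return {
        "code_points": [f"U+{cp:04X}" for cp in code_points],
        # Code points above 0xFF do not fit in a 1-byte-per-character string.
        "outside_latin1": any(cp > 0xFF for cp in code_points),
    }


if __name__ == "__main__":
    try:
        # Use the user's default locale, as in the session above.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        pass  # Keep the current locale if the default one is unavailable.
    print(describe_separator(locale.localeconv()["thousands_sep"]))
```

For the fr_FR.UTF-8 locale shown above, the helper would report U+202F as being outside the Latin-1 range.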

The bug is in _PyUnicode_InsertThousandsGrouping(): if the kind of thousands_sep differs from the kind of unicode, "data = _PyUnicode_AsKind(unicode, thousands_sep_kind);" is used to get a converted copy, but this memory is released later. So the function writes into a temporary buffer which is then freed, and the writes never reach the unicode string itself. It doesn't work...
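The pattern can be modelled in pure Python. The sketch below is not the CPython C code: fill_buggy(), fill_fixed() and grouped() are hypothetical names, a bytearray stands in for a 1-byte-kind string, an array of 16-bit items stands in for the 2-byte-kind copy, and a single-character separator with groups of three digits is assumed. fill_fixed() only shows one way to avoid the problem in this model; it is not the patch applied to CPython.

```python
from array import array


def grouped(digits: str, sep: str) -> str:
    """Insert sep between groups of three digits, counting from the right."""
    head = len(digits) % 3 or 3
    parts = [digits[:head]]
    parts += [digits[i:i + 3] for i in range(head, len(digits), 3)]
    return sep.join(parts)


def fill_buggy(target: bytearray, digits: str, sep: str) -> None:
    """Flawed pattern: when sep needs wider items, write into a temporary copy."""
    data = target
    if ord(sep) > 0xFF:
        # Stand-in for the converted copy: a new, wider buffer.
        data = array("H", list(target))
    for i, ch in enumerate(grouped(digits, sep)):
        data[i] = ord(ch)
    # The wider copy is dropped here, so target never sees the writes.


def fill_fixed(target: bytearray, digits: str, sep: str):
    """Same writes, but the buffer that was written to is handed back."""
    data = array("H", list(target)) if ord(sep) > 0xFF else target
    for i, ch in enumerate(grouped(digits, sep)):
        data[i] = ord(ch)
    return data


if __name__ == "__main__":
    buf = bytearray(9)
    fill_buggy(buf, "1234567", "\u202f")
    print(buf == bytearray(9))  # target untouched: the output was lost
    out = fill_fixed(bytearray(9), "1234567", "\u202f")
    print("".join(map(chr, out)) == grouped("1234567", "\u202f"))
```

With an ASCII separator such as "," both functions write directly into target, which matches the observation that the problem only shows up when the two kinds differ.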

It seems like I introduced the regression 6 years ago in bpo-13706:

commit 90f50d4df9e21093f006427fd7ed11a0d704f792
Author: Victor Stinner <victor.stinner@haypocalc.com>
Date:   Fri Feb 24 01:44:47 2012 +0100

    Issue #13706: Fix format(float, "n") for locale with non-ASCII decimal point (e.g. ps_aF)
