Author vstinner
Recipients ezio.melotti, vstinner
Date 2018-06-27.22:35:49
Message-id <1530138949.59.0.56676864532.issue33954@psf.upfronthosting.co.za>
In-reply-to
Content
Aha, the problem occurs when the thousands separator code point is greater than 255.

On my Fedora 28 (glibc 2.27) with the fr_FR.UTF-8 locale, it's U+202F (NARROW NO-BREAK SPACE):

vstinner@apu$ ./python
Python 3.8.0a0 (heads/master-dirty:492572715a, Jun 28 2018, 00:18:54) 
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'fr_FR.UTF-8'
>>> locale.localeconv()['thousands_sep']
'\u202f'
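
To make the "greater than 255" condition concrete, here is a minimal sketch of how the storage width of a string follows from its largest code point, as in the PEP 393 string layout (1 byte up to U+00FF, 2 bytes up to U+FFFF, 4 bytes above that). The helper name pep393_kind is hypothetical and only used for illustration; it is not a CPython API.

```python
def pep393_kind(text):
    """Return the per-character width in bytes (1, 2 or 4) that text needs.

    Hypothetical helper for illustration; it mirrors the PEP 393 rule
    but is not part of CPython.
    """
    max_cp = max(map(ord, text), default=0)
    if max_cp <= 0xFF:
        return 1
    if max_cp <= 0xFFFF:
        return 2
    return 4


digits = "1234567"   # ASCII digits fit in the 1-byte kind
sep = "\u202f"       # the thousands separator reported above

print(pep393_kind(digits), pep393_kind(sep))  # prints: 1 2
```

The digits and the separator therefore have different kinds, which is the case where the problem shows up.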

The bug is in _PyUnicode_InsertThousandsGrouping(): if the kind of thousands_sep differs from the kind of unicode, the function uses "data = _PyUnicode_AsKind(unicode, thousands_sep_kind);" to get a converted copy and writes into it, but this memory is released later. So the function writes into a temporary buffer which is then released, and the written data is lost.
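
The pattern can be sketched with a toy model in Python. This is not the CPython code: lists stand in for the C buffers, the kinds are passed in by hand, and the function name insert_grouping_buggy is made up for illustration. It only shows why writing into a converted copy leaves the real destination untouched.

```python
def insert_grouping_buggy(dest, digits, sep, dest_kind, sep_kind):
    """Toy model: write grouped digits into dest, a list of characters.

    Hypothetical stand-in for the pattern described above, not the
    real _PyUnicode_InsertThousandsGrouping().
    """
    target = dest
    if sep_kind != dest_kind:
        # Stand-in for _PyUnicode_AsKind(): a widened *copy* of dest.
        target = list(dest)
    grouped = f"{int(digits):,}".replace(",", sep)
    # Write the grouped digits in place into whichever buffer was chosen.
    target[:len(grouped)] = grouped
    # When target is the copy, it is dropped here and dest is unchanged.
    return len(grouped)


same_kind = [" "] * 9
insert_grouping_buggy(same_kind, "1234567", ",", dest_kind=1, sep_kind=1)
print("".join(same_kind))         # prints: 1,234,567

other_kind = [" "] * 9
insert_grouping_buggy(other_kind, "1234567", "\u202f", dest_kind=1, sep_kind=2)
print(repr("".join(other_kind)))  # still nine spaces: the write went to the copy
```

In the model, the output only reaches dest when both kinds match; with a wider separator the write goes to the copy and is discarded.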

It seems like I introduced the regression 6 years ago in bpo-13706:

commit 90f50d4df9e21093f006427fd7ed11a0d704f792
Author: Victor Stinner <victor.stinner@haypocalc.com>
Date:   Fri Feb 24 01:44:47 2012 +0100

    Issue #13706: Fix format(float, "n") for locale with non-ASCII decimal point (e.g. ps_aF)
History
Date User Action Args
2018-06-27 22:35:49 vstinner set recipients: + vstinner, ezio.melotti
2018-06-27 22:35:49 vstinner set messageid: <1530138949.59.0.56676864532.issue33954@psf.upfronthosting.co.za>
2018-06-27 22:35:49 vstinner link issue33954 messages
2018-06-27 22:35:49 vstinner create