Message244181
The upgrade from 2.7.9 to 2.7.10 caused test__locale to fail; it had
previously passed. The difference is that the thousands separator for
the fr_FR locale in known_numerics was changed from '' (i.e., unknown)
to ' ' (i.e., space). On Solaris, however, the fr_FR locale returns
'\xa0' (i.e., no-break space in ISO8859-1) for LC_NUMERIC's thousands
separator. I asked our Globalization experts, who replied:
---
The short answer is that CLDR defines the group separator as no-break
space (U+00A0): http://st.unicode.org/cldr-apps/v#/fr/Symbols/
so the Solaris locale fr_FR (=fr_FR.ISO8859-1) is correct.
The long answer is that the situation is confusing: fr_FR.ISO8859-1
defines thousands_sep as no-break space, but fr_FR.UTF-8 defines
thousands_sep as space (U+0020). There is no technical limit, but the
combination of POSIX [1] and the C language [2] limits thousands_sep
to a single-byte character. The no-break space is a single-byte
character in ISO8859-1, but multibyte in UTF-8.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03_04
[2] http://en.cppreference.com/w/c/locale/lconv and
    http://en.cppreference.com/w/c/language/character_constant
---
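The single-byte versus multibyte point in the quoted reply can be checked directly. The following is a minimal sketch in Python 3 syntax (the issue itself concerns 2.7); the variable names are illustrative only.

```python
# U+00A0 (NO-BREAK SPACE) encoded in the two charsets discussed above.
NBSP = "\u00a0"

latin1_bytes = NBSP.encode("iso8859-1")  # one byte: 0xA0
utf8_bytes = NBSP.encode("utf-8")        # two bytes: 0xC2 0xA0

print(len(latin1_bytes), latin1_bytes)
print(len(utf8_bytes), utf8_bytes)
```

Because thousands_sep is limited to a single-byte character, the two-byte UTF-8 form does not fit, which is consistent with fr_FR.UTF-8 falling back to a plain space.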
The attached patch fixes the test on Solaris. It is not clear whether
this is the Right Answer for all platforms, but I offer it in case it
helps anyone else.
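To see which separator a given platform actually reports, something like the following sketch may help. It is not the attached patch; the helper name thousands_sep_for is hypothetical, the code uses Python 3 syntax, and the locale names are the ones mentioned above, which may be spelled differently or be missing on other systems.

```python
import locale

def thousands_sep_for(locale_name):
    """Return LC_NUMERIC's thousands separator for locale_name,
    or None if the locale is not available on this system."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.localeconv()["thousands_sep"]
    except locale.Error:
        return None
    finally:
        # Always restore the previous LC_NUMERIC setting.
        locale.setlocale(locale.LC_NUMERIC, saved)

for name in ("fr_FR.ISO8859-1", "fr_FR.UTF-8"):
    print(name, repr(thousands_sep_for(name)))
```

Printing the repr makes it possible to tell a plain space from a no-break space, which look identical otherwise.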
Date | User | Action | Args
2015-05-27 15:43:31 | jbeck | set | recipients: + jbeck
2015-05-27 15:43:31 | jbeck | set | messageid: <1432741411.37.0.656292350471.issue24299@psf.upfronthosting.co.za>
2015-05-27 15:43:31 | jbeck | link | issue24299 messages
2015-05-27 15:43:31 | jbeck | create |