
Author dcoles
Recipients dcoles, lemburg, loewis, pitrou, vstinner
Date 2011-05-06.21:34:51
Message-id <BANLkTikAcwmm70QiG-2JSLXVoGXgVb+_1A@mail.gmail.com>
In-reply-to <4DC45C15.2060905@egenix.com>
Content
On Fri, May 6, 2011 at 1:31 PM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
> wchar_t should be fairly portable these days. I think the main
> problem is that we never assumed sizeof(wchar_t) == 1 to be a
> possibility. On Windows, wchar_t was 16 bit and the glibc started
> out with 32 bits.

Well, a 1-byte wchar_t is a bit "ass backwards". I think it's very much
an edge case. :)

> Note that HAVE_USABLE_WCHAR_T is only used to check whether
> Python can use wchar_t as alias for Py_UNICODE. Python's Unicode
> implementation needs Py_UNICODE to be an unsigned type with
> either 2 bytes or 4 bytes. If wchar_t does not provide these
> sizes or is a signed type, Python cannot use it for Py_UNICODE
> and must instead use "unsigned short".

Right, that makes sense. In that case it's probably sensible to keep it around.
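
For illustration, the requirements quoted above (an unsigned type of exactly 2 or 4 bytes) could be checked in C roughly as follows. This is only a sketch: the helper name wchar_t_usable_for_py_unicode is made up for this example and is not part of CPython or its configure script.

```c
#include <stddef.h>   /* wchar_t */

/* Hypothetical helper (not part of CPython): returns 1 when wchar_t
 * meets the requirements quoted above for Py_UNICODE, i.e. it is an
 * unsigned type that is exactly 2 or 4 bytes wide; returns 0 otherwise. */
static int wchar_t_usable_for_py_unicode(void)
{
    int size_ok = (sizeof(wchar_t) == 2 || sizeof(wchar_t) == 4);

    /* Converting -1 to an unsigned type yields that type's maximum
     * value, which compares greater than 0. For a signed type the
     * converted value stays negative, so the comparison is false. */
    int is_unsigned = ((wchar_t)-1 > (wchar_t)0);

    return size_ok && is_unsigned;
}
```

The result depends on the platform: it is 0 wherever wchar_t is signed or has some other size, which is the case where Python would have to fall back to "unsigned short".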

> If the configure script does not detect this case, then a patch
> would be helpful.

Yup. I'll put something together that causes configure to bail out if
either HAVE_WCHAR_H is not defined or SIZEOF_WCHAR_T is less than 2 (that
is, wchar_t is narrower than 16 bits).
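
The actual fix would live in the configure script, but the same bail-out can be sketched at the C level. The following assumes a C11 compiler (for _Static_assert); the helper wchar_t_bits is a made-up name used only for this example.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* wchar_t */

/* Compile-time guard (C11): the build stops here if wchar_t is narrower
 * than 16 bits, which is roughly what a configure-time bail-out would do. */
_Static_assert(sizeof(wchar_t) * CHAR_BIT >= 16,
               "wchar_t is narrower than 16 bits");

/* Hypothetical helper so the width can also be reported at run time. */
static int wchar_t_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}
```

On a platform with a 1-byte wchar_t this file would fail to compile with the message above, instead of building something that breaks later.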

> Python should not use wchar_t for Py_UNICODE on such platforms
> and instead go with "unsigned short".
>
> I would assume that the wchar_t C lib routines work based on UTF-8
> with sizeof(wchar_t) == 1, so the PyUnicode_*WideChar*() APIs would
> need to be adjusted to work more or less like the UTF-8 codecs.

Yes. Using UTF-8 would be the sensible solution. Sadly it looks like
all the wide character functions are undefined on Android <2.3, so in this
case Android saying it has wchar_t support is worse than useless.
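
To make the UTF-8 idea concrete, here is a minimal sketch of encoding a single code point into plain bytes without touching any wide character function. The function name utf8_encode and its interface are assumptions for this example; this is not how the PyUnicode_*WideChar*() APIs are implemented.

```c
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint32_t */

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes up to 4 bytes into out and returns the number of bytes
 * written, or 0 if cp is a surrogate or lies beyond U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)     /* surrogates are not encodable */
        return 0;
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

For example, U+20AC (the euro sign) comes out as the three bytes E2 82 AC.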

On Fri, May 6, 2011 at 1:37 PM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
> With none of the wide-char functions working in Android <2.3, I don't
> think you have a good chance of getting Python 3.x working, unless
> you remove all their uses in the code and replace them with standard
> char* functions.

I agree. In my case I should be able to bump the required version
number without too much fuss. It seems a bit silly to write in support
for a platform that no longer supports said feature.

> The last paragraph doesn't sound very promising either. I wonder
> what they mean with "better representation". The C standard doesn't
> have any better representation for Unicode at the moment.

In C I guess the only sensible alternative would be UTF-8 char strings
(or maybe using uint32_t), but in Python's case it really depends on
how the underlying OS represents internationalized characters. Perhaps
in other projects you would use an external library like ICU, but
that's outside the scope of my experience. :)
History
Date User Action Args
2011-05-06 21:34:52 dcoles set recipients: + dcoles, lemburg, loewis, pitrou, vstinner
2011-05-06 21:34:52 dcoles link issue12010 messages
2011-05-06 21:34:51 dcoles create