Author lemburg
Recipients amaury.forgeotdarc, bupjae, ezio.melotti, lemburg, vstinner
Date 2009-02-03.18:47:24
SpamBayes Score 2.77556e-16
Marked as misclassified No
Message-id <49889139.7080005@egenix.com>
In-reply-to <1233669013.15.0.414055262607.issue5127@psf.upfronthosting.co.za>
Content
On 2009-02-03 14:50, Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment:
> 
>> That would cause major breakage in the C API 
> 
> Not if you recompile. I don't see how this breaks the API at the C level.

Well, then try to look at such a change from a C extension
writer's perspective.

They'd have to change all their function calls and routines to work
with Py_UCS4.

Supporting both the old API and the new one would
be nearly impossible and require either an adapter API or a lot
of #ifdef'ery.

Please remember that the public Python C API is not only meant for
Python developers. It's main purpose is for it to be used by
other developers extending or embedding Python and those developers
use different release cycles and want to support more than just the
bleeding edge Python version.

Python has a long history of providing very stable APIs, both in
C and in Python.

FWIW: The last major change in the C API (the change to Py_ssize_t
from Python 2.4 to 2.5) has not even propogated to all major C
extensions yet. It's only now that people start to realize problems
with this, since their extensions start failing with segfaults
on 64-bit machines.

That said, we can of course provide additional UCS4 APIs for
certain things and also provide conversion helpers between
Py_UNICODE and Py_UCS4 where needed.

>> and is not inline with the intention of having a Py_UNICODE 
>> type in the first place.
> 
> Py_UNICODE is still used as the allocation unit for unicode strings.
> 
> To get correct results, we need a way to access the whole unicode
> database even on ucs2 builds; it's possible with the unicodedata module,
> why not from C?

I must be missing some detail, but what does the Unicode database
have to do with the unicodeobject.c C API ?

> My motivation for the change is this post:
> http://mail.python.org/pipermail/python-dev/2008-July/080900.html

There are certainly other ways to make Python deal with surrogates
in more cases than the ones we already support.
History
Date User Action Args
2009-02-03 18:47:32lemburgsetrecipients: + lemburg, amaury.forgeotdarc, vstinner, ezio.melotti, bupjae
2009-02-03 18:47:29lemburglinkissue5127 messages
2009-02-03 18:47:25lemburgcreate