Message 81081 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	lemburg
Recipients	amaury.forgeotdarc, bupjae, ezio.melotti, lemburg, vstinner
Date	2009-02-03.18:47:24
SpamBayes Score	2.7755576e-16
Marked as misclassified	No
Message-id	<49889139.7080005@egenix.com>
In-reply-to	<1233669013.15.0.414055262607.issue5127@psf.upfronthosting.co.za>

Content
On 2009-02-03 14:50, Amaury Forgeot d'Arc wrote: > Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment: > >> That would cause major breakage in the C API > > Not if you recompile. I don't see how this breaks the API at the C level. Well, then try to look at such a change from a C extension writer's perspective. They'd have to change all their function calls and routines to work with Py_UCS4. Supporting both the old API and the new one would be nearly impossible and require either an adapter API or a lot of #ifdef'ery. Please remember that the public Python C API is not only meant for Python developers. It's main purpose is for it to be used by other developers extending or embedding Python and those developers use different release cycles and want to support more than just the bleeding edge Python version. Python has a long history of providing very stable APIs, both in C and in Python. FWIW: The last major change in the C API (the change to Py_ssize_t from Python 2.4 to 2.5) has not even propogated to all major C extensions yet. It's only now that people start to realize problems with this, since their extensions start failing with segfaults on 64-bit machines. That said, we can of course provide additional UCS4 APIs for certain things and also provide conversion helpers between Py_UNICODE and Py_UCS4 where needed. >> and is not inline with the intention of having a Py_UNICODE >> type in the first place. > > Py_UNICODE is still used as the allocation unit for unicode strings. > > To get correct results, we need a way to access the whole unicode > database even on ucs2 builds; it's possible with the unicodedata module, > why not from C? I must be missing some detail, but what does the Unicode database have to do with the unicodeobject.c C API ? > My motivation for the change is this post: > http://mail.python.org/pipermail/python-dev/2008-July/080900.html There are certainly other ways to make Python deal with surrogates in more cases than the ones we already support.

On 2009-02-03 14:50, Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment:
> 
>> That would cause major breakage in the C API 
> 
> Not if you recompile. I don't see how this breaks the API at the C level.

Well, then try to look at such a change from a C extension
writer's perspective.

They'd have to change all their function calls and routines to work
with Py_UCS4.

Supporting both the old API and the new one would
be nearly impossible and require either an adapter API or a lot
of #ifdef'ery.

Please remember that the public Python C API is not only meant for
Python developers. It's main purpose is for it to be used by
other developers extending or embedding Python and those developers
use different release cycles and want to support more than just the
bleeding edge Python version.

Python has a long history of providing very stable APIs, both in
C and in Python.

FWIW: The last major change in the C API (the change to Py_ssize_t
from Python 2.4 to 2.5) has not even propogated to all major C
extensions yet. It's only now that people start to realize problems
with this, since their extensions start failing with segfaults
on 64-bit machines.

That said, we can of course provide additional UCS4 APIs for
certain things and also provide conversion helpers between
Py_UNICODE and Py_UCS4 where needed.

>> and is not inline with the intention of having a Py_UNICODE 
>> type in the first place.
> 
> Py_UNICODE is still used as the allocation unit for unicode strings.
> 
> To get correct results, we need a way to access the whole unicode
> database even on ucs2 builds; it's possible with the unicodedata module,
> why not from C?

I must be missing some detail, but what does the Unicode database
have to do with the unicodeobject.c C API ?

> My motivation for the change is this post:
> http://mail.python.org/pipermail/python-dev/2008-July/080900.html

There are certainly other ways to make Python deal with surrogates
in more cases than the ones we already support.

History
Date	User	Action	Args
2009-02-03 18:47:32	lemburg	set	recipients: + lemburg, amaury.forgeotdarc, vstinner, ezio.melotti, bupjae
2009-02-03 18:47:29	lemburg	link	issue5127 messages
2009-02-03 18:47:25	lemburg	create