Author lemburg
Recipients Rhamphoryncus, amaury.forgeotdarc, bupjae, ezio.melotti, lemburg, vstinner
Date 2009-10-05.14:16:24
Message-id <4AC9FFB7.1050204@egenix.com>
In-reply-to <1254745724.36.0.0195575889541.issue5127@psf.upfronthosting.co.za>
Content
Amaury Forgeot d'Arc wrote:
> 
> Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment:
> 
>> We'd need to expose the UCS4 APIs *in addition*
>> to those APIs and have the UCS2 APIs redirect to the UCS4 ones.
> 
> Why have two names for the same function? it's Python 3, after all.

It's not the same function... the UCS2 version would take a
Py_UNICODE parameter, the UCS4 version a Py_UCS4 parameter.
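
As a rough sketch of the two-API idea: every name below is a hypothetical stand-in, not an actual CPython API, and the typedefs assume a narrow (16-bit) build. The case table is a toy one covering ASCII and the Deseret block. The UCS2 entry point keeps its Py_UNICODE-style signature and redirects to the UCS4 one:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the CPython typedefs on a narrow build. */
typedef uint16_t demo_Py_UNICODE;  /* UCS2 code unit */
typedef uint32_t demo_Py_UCS4;     /* full Unicode code point */

/* New UCS4 entry point. Toy mapping only: ASCII plus the Deseret
 * capitals U+10400..U+10427, which lowercase to U+10428..U+1044F. */
demo_Py_UCS4 demo_ucs4_tolower(demo_Py_UCS4 ch)
{
    if (ch >= 0x41u && ch <= 0x5Au)
        return ch + 0x20u;
    if (ch >= 0x10400u && ch <= 0x10427u)
        return ch + 0x28u;
    return ch;
}

/* Existing UCS2 entry point: same parameter and return type as before,
 * but the actual lookup is delegated to the UCS4 function. */
demo_Py_UNICODE demo_ucs2_tolower(demo_Py_UNICODE ch)
{
    demo_Py_UCS4 mapped = demo_ucs4_tolower(ch);
    /* Placeholder policy (an assumption): leave the character unchanged
     * if the mapping does not fit into 16 bits. */
    return mapped <= 0xFFFFu ? (demo_Py_UNICODE)mapped : ch;
}
```

Existing callers passing Py_UNICODE values would keep compiling against the UCS2 name, while new code could call the UCS4 name directly with non-BMP code points.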

I don't understand the comment about Python 3.x. FWIW, we're no
longer in the "backwards-incompatible changes are allowed" mode
for 3.x.

> Or is this "no recompile" feature so important (as long as changes are
> clearly shown to the user)? It does not work on Windows, FWIW.

There are generally two options for API changes within a
major release branch:

 1. the changes are API backwards compatible and only the Python API
    version is changed

 2. the changes are not API backwards compatible; in such a case,
    Python has to reject imports of old modules (as it always
    does on Windows), so the Python API version has to be changed
    *and* the import mechanism must reject the import

The second option was used when transitioning from 2.4 to 2.5 due
to the Py_ssize_t changes.

We could do the same for 2.7/3.2, but if it's just needed for this
one change, then I'd rather stick to implementing the first option.
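
The difference between the two options can be sketched as a version check at import time. This is only an illustration: the version number, the enum and the function are hypothetical, and the real check in the interpreter is structured differently.

```c
#include <stdbool.h>

/* Hypothetical API version of the running interpreter. */
#define DEMO_API_VERSION 2

typedef enum {
    DEMO_IMPORT_OK,          /* module built against the current API */
    DEMO_IMPORT_OK_OLD_API,  /* option 1: old module is still loaded */
    DEMO_IMPORT_REJECTED     /* option 2: old module must be refused */
} demo_import_result;

/* Decide what to do with an extension module that was compiled against
 * module_api_version. api_change_is_compatible says whether the change
 * that bumped the API version kept the old API working. */
demo_import_result demo_check_import(int module_api_version,
                                     bool api_change_is_compatible)
{
    if (module_api_version == DEMO_API_VERSION)
        return DEMO_IMPORT_OK;
    if (api_change_is_compatible)
        return DEMO_IMPORT_OK_OLD_API;
    return DEMO_IMPORT_REJECTED;
}
```

With option 1 an extension compiled for the previous API version keeps importing without a rebuild; with option 2 the same extension is refused until it is recompiled.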

>> I haven't checked, but it's certainly possible to have a code point
>> use a non-BMP lower/upper/title case mapping, so this should be
>> made possible as well, if we're going to make changes to the type
>> database.
> 
> OK, here is a new patch.  Even if this does not happen with unicodedata
> up to 5.1, the table has only 175 entries so memory usage is not
> dramatically increased.
> Py_UNICODE is no more used at all in unicodectype.c.

Sorry, but this doesn't work: the functions have to return Py_UNICODE
and raise an exception if the return value doesn't fit.

Otherwise, you'd get completely wrong values in code downcasting
the return value to Py_UNICODE on narrow builds.

Another good reason to use two sets of APIs. The new set could
indeed return Py_UCS4 values.
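
To illustrate the downcasting problem, here is a minimal sketch. The typedefs and function names are hypothetical and assume a 16-bit Py_UNICODE as on narrow builds; the -1 return only marks the place where real code would raise an exception.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the CPython typedefs on a narrow build. */
typedef uint16_t demo_Py_UNICODE;
typedef uint32_t demo_Py_UCS4;

/* What existing callers effectively do when a function that used to
 * return Py_UNICODE starts returning Py_UCS4: the high bits are lost. */
demo_Py_UNICODE demo_blind_downcast(demo_Py_UCS4 mapped)
{
    return (demo_Py_UNICODE)mapped;
}

/* Checked variant for the old API set: returns 0 and stores the value
 * if it fits into 16 bits, or returns -1 without touching *out. */
int demo_checked_downcast(demo_Py_UCS4 mapped, demo_Py_UNICODE *out)
{
    if (mapped > 0xFFFFu)
        return -1;
    *out = (demo_Py_UNICODE)mapped;
    return 0;
}
```

For a non-BMP result such as U+10428, the blind downcast silently yields U+0428, a valid but unrelated BMP code point, while the checked variant reports the failure.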
History
Date User Action Args
2009-10-05 14:16:27  lemburg  set     recipients: + lemburg, amaury.forgeotdarc, Rhamphoryncus, vstinner, ezio.melotti, bupjae
2009-10-05 14:16:25  lemburg  link    issue5127 messages
2009-10-05 14:16:24  lemburg  create