Message93594
Amaury Forgeot d'Arc wrote:
>
> Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment:
>
>> we should make sure that it's not possible to load an extension
>> compiled with 3.1 in 3.2 to prevent segfaults and buffer overruns.
>
> This is the case with this patch: today all these functions
> (_PyUnicode_IsAlpha, _PyUnicode_ToLowercase) are actually #defines to
> _PyUnicodeUCS2_* or _PyUnicodeUCS4_*.
> The patch removes the #defines: 3.1 modules that call
> _PyUnicodeUCS4_IsAlpha wouldn't load into a 3.2 interpreter.
True, but we can do better. For narrow builds, the API currently
exposes the UCS2 APIs. We'd need to expose the UCS4 APIs *in addition*
to those APIs and have the UCS2 APIs redirect to the UCS4 ones.
For wide builds, we don't need to change anything.
>> The change affects the Unicode type database which is implemented
>> in unicodectype.c, not the Unicode database, which already uses UCS4.
>
> Are you referring to the _PyUnicode_TypeRecord structure?
> The first three fields only contains values up to 65535, so they could
> use "unsigned short" even for UCS4 builds.
I haven't checked, but it's certainly possible to have a code point
use a non-BMP lower/upper/title case mapping, so this should be
made possible as well, if we're going to make changes to the type
database. |
|
Date |
User |
Action |
Args |
2009-10-05 11:41:17 | lemburg | set | recipients:
+ lemburg, amaury.forgeotdarc, Rhamphoryncus, vstinner, ezio.melotti, bupjae |
2009-10-05 11:41:15 | lemburg | link | issue5127 messages |
2009-10-05 11:41:15 | lemburg | create | |
|