Message81052
amaury> Since r56395, ord() and chr() accept and return surrogate pairs
amaury> even in narrow builds.
Note: my examples were made with Python 2.x.
> The goal is to remove most differences between narrow and wide unicode
> builds (except for string lengths, indices or slices)
It would be nice to get the same behaviour in Python 2.x and 3.x to help
migration from Python 2 to Python 3 ;-)
The unichr() documentation (in Python 2.x) is correct. But I would appreciate
it if unichr() supported surrogates, which also means changing ord() behaviour.
> To address this problem, I suggest to change all functions in
> unicodectype.c so that they accept Py_UCS4 characters (instead of
> Py_UNICODE).
Why? Using surrogates, you can use 16-bit Py_UNICODE units to store non-BMP
characters (code > 0xffff).
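
To illustrate the point about surrogates, here is a rough sketch of the UTF-16 arithmetic that maps a non-BMP code point to two 16-bit units and back. The helper names to_surrogates() and from_surrogates() are made up for this example; they are not part of CPython and this is not how unicodectype.c is written.

```python
def to_surrogates(code_point):
    """Split a non-BMP code point into a (high, low) UTF-16 surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a non-BMP code point: %#x" % code_point)
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one code point.

    Hypothetical helper for illustration only.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+1D11E (MUSICAL SYMBOL G CLEF) fits in two 16-bit units.
high, low = to_surrogates(0x1D11E)
print("%#06x %#06x" % (high, low))      # 0xd834 0xdd1e
print("%#x" % from_surrogates(high, low))  # 0x1d11e
```

Both values of the pair fit in a 16-bit Py_UNICODE, which is why a narrow build can store such a character as two units.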
--
I can open a new issue if you agree that we can change unichr() / ord()
behaviour on narrow builds. Should we ask on the mailing list?

Date | User | Action | Args
2009-02-03 13:14:06 | vstinner | set | recipients: + vstinner, lemburg, amaury.forgeotdarc, ezio.melotti, bupjae
2009-02-03 13:14:04 | vstinner | link | issue5127 messages
2009-02-03 13:14:03 | vstinner | create |