Message 93636 - Python tracker


Author	amaury.forgeotdarc
Recipients	Rhamphoryncus, amaury.forgeotdarc, bupjae, ezio.melotti, lemburg, vstinner
Date	2009-10-06.08:52:28
Message-id	<1254819151.18.0.186597127594.issue5127@psf.upfronthosting.co.za>
In-reply-to

Content

So the discussion is now on 2 points:

1. Is the change backwards compatible? (At the code level, after
recompilation.)  My answer is yes, because all known case
transformations stay in the same plane: if you pass a char in the BMP,
they return a char in the BMP; if you pass a code point >0xFFFF, you
get another code point >0xFFFF. In other words: in narrow builds, when
you pass a Py_UNICODE, the answer will still be correct when downcast
to Py_UNICODE.  If you want, I can add checks to makeunicodedata.py to
verify that future Unicode standards don't break this statement.
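
The kind of check meant here could look roughly like the sketch below. It is only an illustration, not the code proposed for makeunicodedata.py: the names in_bmp and find_bmp_crossings are made up for this example, and it goes through str.upper(), str.lower() and str.title(), which apply full case mappings and may return several characters, whereas the C functions under discussion map a single code point.

```python
BMP_LIMIT = 0x10000  # first code point outside the Basic Multilingual Plane


def in_bmp(code_point):
    """Return True if the code point fits in 16 bits (lies in the BMP)."""
    return code_point < BMP_LIMIT


def find_bmp_crossings(code_points, transforms=(str.upper, str.lower, str.title)):
    """Collect mappings whose result is on the other side of the BMP boundary.

    Hypothetical helper: returns (code_point, transform_name, result) triples
    for every transform that maps a BMP character to a non-BMP one, or the
    reverse.
    """
    crossings = []
    for cp in code_points:
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points have no case mappings
        ch = chr(cp)
        for transform in transforms:
            result = transform(ch)
            # Every character of the result must be on the same side of
            # the boundary as the input.
            if any(in_bmp(ord(c)) != in_bmp(cp) for c in result):
                crossings.append((cp, transform.__name__, result))
    return crossings
```

Calling find_bmp_crossings(range(0x110000)) scans every code point against the Unicode database bundled with the running interpreter; an empty list means the statement above holds for that database.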

"Naive" code that simply walks the Py_UNICODE* buffer will have
identical behavior.  (The current unicode methods fall into this
category.  They should be fixed later.)
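
To make "naive" concrete, here is a rough Python model of such a loop. It is a sketch under assumptions, not the actual C code: utf16_units and naive_upper are invented names, a narrow build's Py_UNICODE buffer is modelled as a list of UTF-16 code units, and str.upper() stands in for the single-code-point C conversion (results longer than one character are ignored).

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text, as a narrow build would store them."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def naive_upper(text):
    """Uppercase text one 16-bit unit at a time, never combining surrogates."""
    out = []
    for unit in utf16_units(text):
        if 0xD800 <= unit <= 0xDFFF:
            # Half of a surrogate pair: no case mapping, copied unchanged.
            out.append(unit)
            continue
        mapped = chr(unit).upper()
        # Keep only one-to-one mappings, as a single-code-point API would.
        out.append(ord(mapped) if len(mapped) == 1 else unit)
    # Packing as 16-bit units relies on BMP input mapping to BMP output.
    return struct.pack("<%dH" % len(out), *out).decode("utf-16-le")
```

With this model, naive_upper("abc") gives "ABC", while a non-BMP letter such as U+10428 comes back unchanged because the loop only ever sees its two surrogate halves. That is the same result such code produces today, which is why the change does not alter its behavior.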

2. Is this change acceptable for 3.2?  I'd say yes, because existing
extension modules that use these functions will need to be recompiled:
the function names change, so the modules won't load otherwise.  There
is no need to change the API number for this.
