Issue 3092: Wrong unicode size detection in pybench

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/47342

classification

Title:	Wrong unicode size detection in pybench
Type:	behavior	Stage:
Components:	Demos and Tools	Versions:	Python 3.0

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	georg.brandl, lemburg, pitrou
Priority:	normal	Keywords:	patch

Created on 2008-06-12 19:49 by pitrou, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
pybench_ucs.patch	pitrou, 2008-06-12 19:49

Messages (9)
msg68076 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-06-12 19:49
In py3k, pybench wrongly detects UCS2 builds as UCS4. Patch attached.
msg68082 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2008-06-12 20:26
On 2008-06-12 21:50, Antoine Pitrou wrote:
> New submission from Antoine Pitrou <pitrou@free.fr>:
>
> In py3k, pybench wrongly detects UCS2 builds as UCS4. Patch attached.
Why is that? Doesn't chr(100000) raise an exception in UCS2 builds?
unichr(100000) does raise an exception in Py2.x.
Note that sys.maxunicode is not available in Python 2.1 which is why I chose try-except approach.
msg68087 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-06-12 20:58
On Thursday 12 June 2008 at 20:26 +0000, Marc-Andre Lemburg wrote:
> Doesn't chr(100000) raise an exception in UCS2 builds ?
No, it returns a 2-character string.
> Note that sys.maxunicode is not available in Python 2.1
> which is why I chose try-except approach.
I understand, but is the py3k version of pybench still compatible with Python 2.1?
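To make the two detection strategies in this exchange concrete, here is a minimal sketch. It is not the attached patch, whose contents are not shown in this thread; the function names `unicode_build_label` and `probe_by_exception` are invented for illustration. The first checks sys.maxunicode, the second is the try-except probe that stops working once chr() no longer raises for code points above 0xFFFF. On Python 3.3 and later every build reports sys.maxunicode == 0x10FFFF, so both functions report "UCS4" there.

```python
import sys


def unicode_build_label(maxunicode=sys.maxunicode):
    """Classify a build from its largest code point (hypothetical helper).

    Narrow (UCS2) builds reported 0xFFFF; wide (UCS4) builds report 0x10FFFF.
    """
    return "UCS2" if maxunicode <= 0xFFFF else "UCS4"


def probe_by_exception():
    """Try-except probe: only valid if chr() raises for non-BMP code points."""
    try:
        chr(100000)
    except ValueError:
        return "UCS2"
    # Reached on wide builds, but also on narrow py3k builds where chr()
    # returned a 2-character string instead of raising.
    return "UCS4"


print(unicode_build_label())
print(probe_by_exception())
```

Passing the limit explicitly, as in `unicode_build_label(0xFFFF)`, makes the classification easy to check without access to a narrow build.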
msg68089 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2008-06-12 21:00
> No, it returns a 2-character string.
Which hopefully is the proper surrogate sequence :)
msg68094 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-06-12 21:33
> Which hopefully is the proper surrogate sequence :)
Well at least that's what the doc string says!
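For reference, the surrogate sequence for code point 100000 can be worked out with standard UTF-16 arithmetic. This is a sketch of the encoding rule only, not of CPython's internals; `to_surrogate_pair` is a name chosen here for illustration.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in the range 0x10000..0x10FFFF")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(100000)
print(hex(high), hex(low))  # 0xd821 0xdea0
```

The result matches the bytes produced by `chr(100000).encode("utf-16-be")` on a current Python.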
msg68095 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2008-06-12 21:41
On 2008-06-12 22:58, Antoine Pitrou wrote:
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> On Thursday 12 June 2008 at 20:26 +0000, Marc-Andre Lemburg wrote:
>> Doesn't chr(100000) raise an exception in UCS2 builds ?
>
> No, it returns a 2-character string.
Interesting... I wonder how applications will deal with this. They'd normally expect to get a length 1 string from chr() or unichr().
I think chr() should only behave in this way if given an option. Otherwise, it will definitely introduce hard-to-find bugs in ported applications (and probably even in newly written ones). Something like chr(x, surrogates=True) to enable returning 2 code points instead of raising an exception.
>> Note that sys.maxunicode is not available in Python 2.1
>> which is why I chose try-except approach.
>
> I understand, but is the py3k version of pybench still
> compatible with Python 2.1?
You're right: probably not. Would be great to have the test on the Py2.x version as well - to see the difference in performance.
Thanks,
Marc-Andre Lemburg
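The opt-in behaviour suggested in this message could look roughly like the sketch below. This is purely hypothetical: the built-in chr() has no `surrogates` parameter, and `chr_narrow` is an invented wrapper that emulates a narrow build's behaviour for illustration.

```python
def chr_narrow(code_point, surrogates=False):
    """Hypothetical emulation of chr(x, surrogates=True) on a narrow build."""
    if code_point <= 0xFFFF:
        # BMP code points always fit in a single code unit.
        return chr(code_point)
    if code_point > 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if not surrogates:
        # Default: refuse, as unichr() did on narrow Py2.x builds.
        raise ValueError("non-BMP code point; pass surrogates=True")
    offset = code_point - 0x10000
    return chr(0xD800 + (offset >> 10)) + chr(0xDC00 + (offset & 0x3FF))


print(len(chr_narrow(100000, surrogates=True)))  # 2
```

With this shape, callers that expect a length 1 string get an exception unless they explicitly ask for the two-unit result.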
msg68897 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-06-28 19:33
> You're right: probably not. Would be great to have the test on the
> Py2.x version as well - to see the difference in performance.
I'm not following you, what test are you talking about?
The patch is only about reporting of the Python build characteristics, it does not (AFAIK) change anything to how or what tests are run.
msg69004 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2008-06-30 12:44
On 2008-06-28 21:33, Antoine Pitrou wrote:
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> You're right: probably not. Would be great to have the test on the
>> Py2.x version as well - to see the difference in performance.
>
> I'm not following you, what test are you talking about?
> The patch is only about reporting of the Python build characteristics,
> it does not (AFAIK) change anything to how or what tests are run.
Sorry for the confusion. I was thinking of the new test added to the Py3k version of pybench - doesn't have anything to do with the Unicode size detection.
msg70161 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-07-22 18:03
Fixed in r65186

History
Date	User	Action	Args
2022-04-11 14:56:35	admin	set	github: 47342
2008-07-22 18:03:24	pitrou	set	status: open -> closed resolution: fixed messages: + msg70161
2008-06-30 12:44:16	lemburg	set	messages: + msg69004
2008-06-28 19:33:40	pitrou	set	messages: + msg68897
2008-06-12 21:41:33	lemburg	set	messages: + msg68095
2008-06-12 21:33:24	pitrou	set	messages: + msg68094
2008-06-12 21:00:43	georg.brandl	set	nosy: + georg.brandl messages: + msg68089
2008-06-12 20:58:15	pitrou	set	messages: + msg68087
2008-06-12 20:26:50	lemburg	set	nosy: + lemburg messages: + msg68082
2008-06-12 19:49:57	pitrou	create