
classification
Title: Wrong unicode size detection in pybench
Type: behavior
Stage:
Components: Demos and Tools
Versions: Python 3.0
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: georg.brandl, lemburg, pitrou
Priority: normal
Keywords: patch

Created on 2008-06-12 19:49 by pitrou, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
pybench_ucs.patch pitrou, 2008-06-12 19:49
Messages (9)
msg68076 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-06-12 19:49
In py3k, pybench wrongly detects UCS2 builds as UCS4. Patch attached.
msg68082 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2008-06-12 20:26
On 2008-06-12 21:50, Antoine Pitrou wrote:
> New submission from Antoine Pitrou <pitrou@free.fr>:
> 
> In py3k, pybench wrongly detects UCS2 builds as UCS4. Patch attached.

Why is that?

Doesn't chr(100000) raise an exception in UCS2 builds?

unichr(100000) does raise an exception in Py2.x.

Note that sys.maxunicode is not available in Python 2.1,
which is why I chose the try-except approach.
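
For illustration, the try-except detection described in this message could look roughly like the sketch below. The function name is hypothetical and the code is not taken from pybench itself; on Python 2 the call would be unichr(100000) rather than chr(100000).

```python
def detect_unicode_build():
    """Guess the build type from whether chr() accepts a non-BMP code point.

    Hypothetical helper for illustration only, not the pybench source.
    """
    try:
        chr(100000)
    except ValueError:
        # Python 2.x narrow (UCS2) builds raise here when using unichr().
        return "UCS2"
    # Reached on wide builds, but also on py3k narrow builds if chr()
    # returns a 2-character string instead of raising, which is the
    # misdetection this issue reports.
    return "UCS4"


print(detect_unicode_build())
```

The weakness is that the check relies on an exception as the signal, so it silently reports UCS4 on any build where chr() accepts the value.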
msg68087 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-06-12 20:58
On Thursday 12 June 2008 at 20:26 +0000, Marc-Andre Lemburg wrote:
> Doesn't chr(100000) raise an exception in UCS2 builds?

No, it returns a 2-character string.

> Note that sys.maxunicode is not available in Python 2.1,
> which is why I chose the try-except approach.

I understand, but is the py3k version of pybench still
compatible with Python 2.1?
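
As a sketch of the alternative discussed here, a check based on sys.maxunicode might look like the following. This is an assumption about the general shape of such a check, not the contents of the attached patch; the function name and the "unknown" fallback are hypothetical.

```python
import sys


def unicode_build(maxunicode=None):
    """Classify the build from sys.maxunicode (hypothetical helper).

    The optional argument exists only so the logic can be exercised
    with explicit values.
    """
    if maxunicode is None:
        # getattr() covers very old interpreters lacking sys.maxunicode.
        maxunicode = getattr(sys, "maxunicode", None)
    if maxunicode is None:
        return "unknown"
    # Narrow builds report 0xFFFF; wide builds report 0x10FFFF.
    return "UCS2" if maxunicode == 0xFFFF else "UCS4"


print(unicode_build())
```

Unlike the try-except check, this does not depend on how chr() treats code points above 0xFFFF.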
msg68089 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2008-06-12 21:00
> No, it returns a 2-character string.

Which hopefully is the proper surrogate sequence :)
msg68094 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-06-12 21:33
> Which hopefully is the proper surrogate sequence :)

Well, at least that's what the docstring says!
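
To see which surrogate sequence is meant, the UTF-16 code units for code point 100000 can be listed by encoding the character. This is a small sketch with a hypothetical helper name; it inspects the UTF-16 encoding rather than the string's in-memory representation.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    # Each code unit is one big-endian unsigned 16-bit value.
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


units = utf16_code_units(chr(100000))
print([hex(u) for u in units])  # ['0xd821', '0xdea0']
```

The two values are the high and low surrogates for U+186A0, which is what a 2-character result on a narrow build would be expected to contain.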
msg68095 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2008-06-12 21:41
On 2008-06-12 22:58, Antoine Pitrou wrote:
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
> On Thursday 12 June 2008 at 20:26 +0000, Marc-Andre Lemburg wrote:
>> Doesn't chr(100000) raise an exception in UCS2 builds?
> 
> No, it returns a 2-character string.

Interesting... I wonder how applications will deal with this. They'd
normally expect to get a length-1 string from chr() or unichr().

I think chr() should only behave this way if given an option.
Otherwise, it will definitely introduce hard-to-find bugs in
ported applications (and probably even in newly written ones).

Something like chr(x, surrogates=True) to enable returning
a surrogate pair (2 code units) instead of raising an exception.
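
The proposed behaviour can be modelled with a small wrapper. This is only a sketch of the idea: the function name is hypothetical, the built-in chr() has no surrogates parameter, and the wrapper emulates a narrow build by constructing the pair arithmetically.

```python
def chr_with_surrogates(codepoint, surrogates=False):
    """Model of the proposed opt-in behaviour (hypothetical helper)."""
    if not 0 <= codepoint <= 0x10FFFF:
        raise ValueError("code point out of range")
    if codepoint <= 0xFFFF:
        return chr(codepoint)
    if not surrogates:
        # Default: refuse rather than silently return 2 code units.
        raise ValueError("code point above 0xFFFF; pass surrogates=True")
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return chr(high) + chr(low)


pair = chr_with_surrogates(100000, surrogates=True)
print(len(pair), [hex(ord(c)) for c in pair])  # 2 ['0xd821', '0xdea0']
```

With the default surrogates=False, callers that assume a length-1 result get an exception instead of a surprising 2-character string.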

>> Note that sys.maxunicode is not available in Python 2.1,
>> which is why I chose the try-except approach.
> 
> I understand, but is the py3k version of pybench still
> compatible with Python 2.1?

You're right: probably not. Would be great to have the test on the
Py2.x version as well - to see the difference in performance.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

msg68897 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-06-28 19:33
> You're right: probably not. Would be great to have the test on the
> Py2.x version as well - to see the difference in performance.

I'm not following you. Which test are you talking about?
The patch is only about how the Python build characteristics are reported;
it does not (AFAIK) change which tests are run or how they are run.
msg69004 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2008-06-30 12:44
On 2008-06-28 21:33, Antoine Pitrou wrote:
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>> You're right: probably not. Would be great to have the test on the
>> Py2.x version as well - to see the difference in performance.
> 
> I'm not following you. Which test are you talking about?
> The patch is only about how the Python build characteristics are reported;
> it does not (AFAIK) change which tests are run or how they are run.

Sorry for the confusion. I was thinking of the new test added to
the Py3k version of pybench; it doesn't have anything to do with the
Unicode size detection.
msg70161 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-07-22 18:03
Fixed in r65186
History
Date User Action Args
2022-04-11 14:56:35 admin set github: 47342
2008-07-22 18:03:24 pitrou set status: open -> closed; resolution: fixed; messages: + msg70161
2008-06-30 12:44:16 lemburg set messages: + msg69004
2008-06-28 19:33:40 pitrou set messages: + msg68897
2008-06-12 21:41:33 lemburg set messages: + msg68095
2008-06-12 21:33:24 pitrou set messages: + msg68094
2008-06-12 21:00:43 georg.brandl set nosy: + georg.brandl; messages: + msg68089
2008-06-12 20:58:15 pitrou set messages: + msg68087
2008-06-12 20:26:50 lemburg set nosy: + lemburg; messages: + msg68082
2008-06-12 19:49:57 pitrou create