Message 219824 - Python tracker


Author	gvanrossum
Recipients	Rosuav, docs@python, gvanrossum, ncoghlan, pitrou, serhiy.storchaka, vstinner
Date	2014-06-05.16:32:42
Message-id	<1401985962.62.0.0707459516558.issue21667@psf.upfronthosting.co.za>

Content
I don't want the O(1) property explicitly denounced in the reference manual. It's fine if the manual is silent on this -- maybe someone can prove that it isn't a problem based on benchmarks of an alternate implementation, but until then, I'm skeptical -- after all we have a bunch of important APIs (indexing, slicing, find()/index(), the re module) that use integer indexes, and some important algorithms/patterns are based off this behavior.

E.g. searching a string for something, returning the position where it's found, and then continuing the search from that position. Even if the basic search uses find() or regex matching, there's still a position being returned and accepted, and if it took O(N) time to find that position in the representation, the whole algorithm could degenerate from O(N) to O(N**2).
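The search-and-continue pattern described above can be sketched roughly as follows. This is only an illustration: `find_all` is a hypothetical helper name, not something from the standard library, and it relies solely on `str.find()` with its optional start argument. Each iteration passes the previously returned integer position back into `find()`, so if locating that position inside the string's internal representation cost O(N), the loop as a whole could drift toward O(N**2).

```python
def find_all(haystack, needle):
    """Return the start index of every non-overlapping occurrence of needle.

    Hypothetical helper for illustration only.
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    positions = []
    start = 0
    while True:
        # The integer position from the previous hit is handed back to find().
        pos = haystack.find(needle, start)
        if pos == -1:
            return positions
        positions.append(pos)
        # Continue the search just past the match that was found.
        start = pos + len(needle)


print(find_all("abcabcabc", "bc"))
```

The same shape appears with the re module, where a compiled pattern's `search()` accepts a `pos` argument and match objects report integer offsets through `start()` and `end()`.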

I am fine with the changes related to code points.

For the pedants amongst us, surrogates are also code points. A surrogate pair is two code points that encode a single code point. Fortunately we don't have to deal with those any more outside codecs.
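To make the surrogate remark concrete, here is a small sketch of the UTF-16 arithmetic that maps one code point above U+FFFF to a pair of surrogate code points and back. The function names `to_surrogate_pair` and `from_surrogate_pair` are made up for this example; only the constants come from the UTF-16 encoding scheme.

```python
def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary planes")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F600)
print(hex(high), hex(low))
```

In a Python 3.3 or later interpreter, `len("\U0001F600")` is 1, and the pair only shows up when encoding, for example with `"\U0001F600".encode("utf-16-be")`.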

History
Date	User	Action	Args
2014-06-05 16:32:42	gvanrossum	set	recipients: + gvanrossum, ncoghlan, pitrou, vstinner, docs@python, Rosuav, serhiy.storchaka
2014-06-05 16:32:42	gvanrossum	set	messageid: <1401985962.62.0.0707459516558.issue21667@psf.upfronthosting.co.za>
2014-06-05 16:32:42	gvanrossum	link	issue21667 messages
2014-06-05 16:32:42	gvanrossum	create