Author pitrou
Recipients ajaksu2, amaury.forgeotdarc, collinwinter, ezio.melotti, jafo, lemburg, orivej, pitrou, vstinner
Date 2009-05-25.09:24:19
Message-id <1243243585.5584.37.camel@localhost>
In-reply-to <1243239476.41.0.694089874892.issue1943@psf.upfronthosting.co.za>
Content
Marc-André, the problem is that all your arguments are fallacious at
best. Let me see:

> Like I said: The current design of the Unicode object implementation
> would benefit more from advances in pymalloc tuning, not from making it
> next to impossible to extend the Unicode objects to e.g. [...]

Saying that is like saying "we shouldn't try to improve ceval.c because
it makes it harder to write a JIT". You are dismissing concrete, actual
improvements in favour of pie-in-the-sky improvements that nobody seems
to have tried (you're welcome to prove me wrong) in the 10 years that
the unicode type has existed.

Besides, if someone wants to experiment with such improvements, it is
not difficult to switch back to the old representation (my patch is very
short if you discard the mechanical replacement of "self->length" with
"PyUnicode_GET_SIZE(self)", which doesn't have to be undone to switch
representations). So, I fail to see the relevance of that argument.
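
To illustrate why the accessor replacement does not need to be undone,
here is a minimal sketch. It assumes a toy string type, not the real
PyUnicodeObject: ToyString, TOY_GET_SIZE, TOY_DATA, TOY_SEPARATE_BUFFER
and the toy_* functions are all hypothetical names made up for this
example. Code that reads the length only through the macro compiles
unchanged against either layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy type for illustration; NOT the real PyUnicodeObject. */
#ifdef TOY_SEPARATE_BUFFER
/* Two allocations: a fixed-size header plus a separate character buffer. */
typedef struct {
    size_t length;
    char *str;
} ToyString;
#define TOY_GET_SIZE(s) ((s)->length)
#else
/* One allocation: the characters are stored inline after the header. */
typedef struct {
    size_t size;
    char str[];          /* C99 flexible array member */
} ToyString;
#define TOY_GET_SIZE(s) ((s)->size)
#endif
#define TOY_DATA(s) ((s)->str)

/* Only the allocation code needs to know which layout is in use. */
static ToyString *toy_new(const char *text)
{
    assert(text != NULL);
    size_t n = strlen(text);
#ifdef TOY_SEPARATE_BUFFER
    ToyString *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->str = malloc(n + 1);
    if (s->str == NULL) {
        free(s);
        return NULL;
    }
    s->length = n;
#else
    ToyString *s = malloc(sizeof *s + n + 1);
    if (s == NULL)
        return NULL;
    s->size = n;
#endif
    memcpy(TOY_DATA(s), text, n + 1);   /* copy the text and its trailing NUL */
    return s;
}

static void toy_free(ToyString *s)
{
    if (s == NULL)
        return;
#ifdef TOY_SEPARATE_BUFFER
    free(s->str);
#endif
    free(s);
}

/* Caller code: uses only the accessors, so it is layout-independent. */
static size_t toy_count(const ToyString *s, char c)
{
    size_t hits = 0;
    for (size_t i = 0; i < TOY_GET_SIZE(s); i++) {
        if (TOY_DATA(s)[i] == c)
            hits++;
    }
    return hits;
}
```

Building with or without -DTOY_SEPARATE_BUFFER switches the layout;
toy_count() is not touched in either case, which is the property I am
relying on when I say the representation is easy to switch back.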

> Antoine, I have explained the reasons for rejecting the patch. In short,
> it violates a design principle behind the Unicode implementation.

You seem to be the only one who thinks so although, AFAIK, you are not
the only one who has worked on that datatype.

> (10% speedup in
> some micro benchmarks is not significant; memory tests need to be run
> without pymalloc and require extra care to work around OS malloc
> optimization strategies).

Actually, running performance or resource consumption tests without
pymalloc is pointless since it makes the test completely artificial and
unrelated to real-world conditions (who runs Python without pymalloc in
real-world conditions?).

>  * reuse existing memory blocks for allocation, 
>  * pointing straight into memory mapped files, 
>  * providing highly efficient ways to tokenize Unicode data,
>  * sharing of data between Unicode objects,
>  etc.

By the way, I haven't seen your patches or experiments for those. Giving
guidance is nice, but proofs of concept, at the minimum, are more
convincing. None of the suggestions above strike me as very /easy/
(actually, they are at least an order of magnitude harder than the
present patch), or even guaranteed to give any tangible benefits.

To be clear, I don't think this proposal is more important than any
other one giving similar results (provided these exist). But your
arguments are never factual and, what's more, although I have already
given these same replies in other messages, you never bothered to be
more factual. I would accept your refusal if your arguments had some
semblance of concrete support for them.
History
Date                 User    Action  Args
2009-05-25 09:24:25  pitrou  set     recipients: + pitrou, lemburg, collinwinter, jafo, amaury.forgeotdarc, vstinner, ajaksu2, orivej, ezio.melotti
2009-05-25 09:24:24  pitrou  link    issue1943 messages
2009-05-25 09:24:19  pitrou  create