Author Dennis Sweeney
Recipients Dennis Sweeney, brandtbucher, rhettinger
Date 2021-06-02.04:46:21
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1622609182.18.0.179745475546.issue44283@roundup.psfhosted.org>
In-reply-to
Content
I agree that we should benchmark to find what the value of "two" should be ;) . Maybe only match statements with >=20 consecutive simple cases end up being worth it. I would assume that if "case 1, case 2, ..., case 20" are all in a row, then "case 1" wouldn't be *that* much more likely than "case 20", so a slowdown on case 1 wouldn't matter too much if the average gets better.

I'm under the impression that match-case statements would frequently be used only once per function call. If I understand your proposal, that would mean that calling a function with an N-case constant-pattern match would require N hashes and N insertions into a dict (including memory allocation), followed by O(1) lookup. Wouldn't that eliminate any hope of making such a function O(1)? Unless there's a way to cache the populated dict somewhere else?
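The cost concern above can be sketched in plain Python (hypothetical function names, with the dispatch table standing in for whatever the compiler would emit):

```python
# If the dispatch table is rebuilt on each call, every call pays
# N hashes and N insertions before the single lookup: O(N) per call.
def dispatch_rebuilt(code):
    table = {1: "one", 2: "two", 3: "three"}  # built afresh each call
    return table.get(code, "other")

# If the populated table can be cached somewhere (the analogue of
# storing it pre-built in co_consts), each call is one O(1) lookup.
_TABLE = {1: "one", 2: "two", 3: "three"}  # built once

def dispatch_cached(code):
    return _TABLE.get(code, "other")
```

Both behave identically; only the per-call cost differs, which is the whole point of wanting the table to live somewhere persistent.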

I suggested HAMT because it's a well-tested pre-existing fast immutable mapping that already implements __hash__, so it would be safe to store in co_consts already populated. Re-implementing all of the dict logic seems like an unnecessary pain, which is why I was suggesting either the HAMT or a thin wrapper around dict, not a re-implementation of a new hash table. I was hoping to do the minimum amount of disruption possible to get reasonable O(1) performance, and then see where that leaves us.
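As an illustration of the "thin wrapper around dict" alternative, here is a minimal sketch (a hypothetical class, not CPython's actual HAMT or any proposed implementation) of a hashable immutable mapping, which is the property that makes it eligible for co_consts:

```python
class FrozenMap:
    """Hypothetical thin hashable wrapper around dict, sketching why an
    immutable mapping with __hash__ is safe to store pre-populated in
    co_consts. Not a re-implementation of dict's hash-table logic."""
    __slots__ = ("_d", "_hash")

    def __init__(self, items):
        self._d = dict(items)
        # Hash computed once from the frozen contents; requires the
        # values (as well as the keys) to be hashable.
        self._hash = hash(frozenset(self._d.items()))

    def __getitem__(self, key):
        return self._d[key]          # O(1) lookup, delegated to dict

    def get(self, key, default=None):
        return self._d.get(key, default)

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, FrozenMap) and self._d == other._d
```

All the hash-table work is delegated to the underlying dict; the wrapper only adds immutability-by-convention and hashability.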
History
Date User Action Args
2021-06-02 04:46:22 Dennis Sweeney set recipients: + Dennis Sweeney, rhettinger, brandtbucher
2021-06-02 04:46:22 Dennis Sweeney set messageid: <1622609182.18.0.179745475546.issue44283@roundup.psfhosted.org>
2021-06-02 04:46:22 Dennis Sweeney link issue44283 messages
2021-06-02 04:46:21 Dennis Sweeney create