classification
Title: Replace if-elif-else structure with match-case (PEP634)
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.11
process
Status: closed Resolution: postponed
Dependencies: Superseder:
Assigned To: Nosy List: Kshitiz17, brandtbucher, rhettinger, steven.daprano, terry.reedy, xtreak
Priority: normal Keywords:

Created on 2021-06-01 05:08 by Kshitiz17, last changed 2021-06-05 05:07 by Kshitiz17. This issue is now closed.

Files
File name Uploaded Description Edit
match_test.py Kshitiz17, 2021-06-02 07:43
if_match.py brandtbucher, 2021-06-02 16:51
Messages (14)
msg394839 - (view) Author: Kshitiz Arya (Kshitiz17) * Date: 2021-06-01 05:08
Replace if-elif-else with match-case for pattern matching, which is generally faster and more intuitive.
msg394840 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2021-06-01 05:16
This is done on a case by case basis and in places necessary in future code. Modifying existing code can pollute git history, make backports hard, might introduce subtle bugs etc. This is similar to proposal to use walrus operator, f-strings, etc. throughout the codebase.
msg394843 - (view) Author: Kshitiz Arya (Kshitiz17) * Date: 2021-06-01 07:15
Pardon my ignorance here but can I start working on this issue? I am a new contributor therefore I am unsure about how to proceed from here
msg394847 - (view) Author: Kshitiz Arya (Kshitiz17) * Date: 2021-06-01 10:22
This is a relatively simple example of how this will improve readability of the code.
(This example is form Lib/json/encoder.py)

Original

-------------------------------------------------------------------------
    if isinstance(value, str):
        yield _encoder(value)
    elif value is None:
        yield 'null'
    elif value is True:
        yield 'true'
    elif value is False:
        yield 'false'
    elif isinstance(value, int):
        yield _intstr(value)
    elif isinstance(value, float):
        yield _floatstr(value)
    else:
        if isinstance(value, (list, tuple)):
            chunks = _iterencode_list(value, _current_indent_level)
        elif isinstance(value, dict):
            chunks = _iterencode_dict(value, _current_indent_level)
        else:
            chunks = _iterencode(value, _current_indent_level)
        yield from chunks 
--------------------------------------------------------------------------

Suggested

--------------------------------------------------------------------------
    match value:
        case str():
            yield _encoder(value)
        case None:
            yield 'null'
        case True:
            yield 'true'
        case False:
            yield 'false'
        case int():
            # see comment for int/float in _make_iterencode
            yield _intstr(value)
        case float():
            # see comment for int/float in _make_iterencode
            yield _floatstr(value)
        case _:
            match value:
                case list()|tuple():
                    chunks = _iterencode_list(value, _current_indent_level)
                case dict():
                    chunks = _iterencode_dict(value, _current_indent_level)
                case _:
                    chunks = _iterencode(value, _current_indent_level)
            yield from chunks
----------------------------------------------------------------------------
msg394850 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2021-06-01 10:40
match-case has not even reached a stable version of Python yet, it is only available in Python 3.10 which is still in beta. Are we sure that it is faster in all cases and how do you know it is more intuitive when the vast majority of Python users have likely not even heard of match-case yet?

Personally, I am looking forward to using match-case, but I can see myself having to write lots of comments explaining what the code is doing for many years to come.
msg394855 - (view) Author: Kshitiz Arya (Kshitiz17) * Date: 2021-06-01 12:21
I guess we will have to wait until Python 3.10 is released and we have more conclusive data about speed and readability of match-case before we can go ahead with this issue.
msg394879 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-06-02 00:06
If the json.encoder code does get updated, it doesn't need two levels of matching.  It can be flattened by eliminating the *chunks* variable.

    match value:
        case str():
            yield _encoder(value)
        case None:
            yield 'null'
        case True:
            yield 'true'
        case False:
            yield 'false'
        case int():
            yield _intstr(value)
        case float():
            yield _floatstr(value)
        case list() | tuple():
            yield from _iterencode_list(value, _current_indent_level)
        case dict():
            yield from _iterencode_dict(value, _current_indent_level)
        case _:
            yield from _iterencode(value, _current_indent_level)
msg394881 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-06-02 01:11
In difflib, there's an example where it would be easy to run performance tests.

            match tag:
                case 'replace':
                    g = self._fancy_replace(a, alo, ahi, b, blo, bhi)
                case 'delete':
                    g = self._dump('-', a, alo, ahi)
                case 'insert':
                    g = self._dump('+', b, blo, bhi)
                case 'equal':
                    g = self._dump(' ', a, alo, ahi)
                case _:
                    raise ValueError('unknown tag %r' % (tag,))
msg394893 - (view) Author: Kshitiz Arya (Kshitiz17) * Date: 2021-06-02 07:30
I have timed the execution of if-else and match-case on Differ().compare from difflib module and here is what I got

When both strings are same

**********************************************************************************************

if-else: 2.504900476196781e-06, match-case: 2.587399649200961e-06, comparisions : 10
if-else: 2.222519979113713e-06, match-case: 1.7874199693324045e-06, comparisions : 100
if-else: 1.954343999386765e-06, match-case: 1.8695319959078916e-06, comparisions : 1000
if-else: 3.2847548005520366e-06, match-case: 1.928162499825703e-06, comparisions : 10000
if-else: 1.2241538699890953e-06, match-case: 7.870903900038684e-07, comparisions : 100000
if-else: 7.950048359998618e-07, match-case: 7.883418589990469e-07, comparisions : 1000000
if-else: 7.941918295000505e-07, match-case: 7.882559125006082e-07, comparisions : 10000000
if-else: 7.928842861700104e-07, match-case: 7.855620772999828e-07, comparisions : 100000000

**********************************************************************************************

When strings have some difference

**********************************************************************************************

if-else: 2.7084999601356687e-06, match-case: 2.6756002625916154e-06, comparisions : 10
if-else: 2.207159996032715e-06, match-case: 1.8606500088935719e-06, comparisions : 100
if-else: 2.139014999556821e-06, match-case: 1.928671001223847e-06, comparisions : 1000
if-else: 2.682303600158775e-06, match-case: 2.626289399631787e-06, comparisions : 10000
if-else: 1.1338948200136655e-06, match-case: 7.989683500636602e-07, comparisions : 100000
if-else: 7.862168830033624e-07, match-case: 7.83532044995809e-07, comparisions : 1000000
if-else: 7.918311449997419e-07, match-case: 7.843428884996683e-07, comparisions : 10000000
if-else: 7.843063791000168e-07, match-case: 7.842913352399773e-07, comparisions : 100000000

**********************************************************************************************
msg394894 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2021-06-02 07:41
How did you do the timing?
msg394895 - (view) Author: Kshitiz Arya (Kshitiz17) * Date: 2021-06-02 07:43
I have used timeit module. I have also attached the test file with this message
msg394927 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-06-02 16:51
Hm, that benchmark seems really noisy. Looking at your code, it appears that you aren't actually iterating over the results.

I've attached a version of your benchmark rewritten to use pyperf. Here are the results on a system with a fresh PGO/LTO build of CPython main, run with CPU isolation:

(py) bucher@is-bucher-p1:~/src/patmaperformance$ sudo $(which pyperf) system tune > /dev/null
(py) bucher@is-bucher-p1:~/src/patmaperformance$ sudo $(which python) if_match.py --rigorous
.........................................
if: Mean +- std dev: 2.16 ms +- 0.13 ms
.........................................
match: Mean +- std dev: 2.13 ms +- 0.07 ms

It appears that there is no significant difference between the two (which is what I expect, given the current implementation). With that said, our work in issue 44283 will likely have a significant impact for cases like this.
msg395120 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-06-04 20:54
To me, Raymond's flattening is a plausible replacement, especially ater a future speedup.  However, I would re-order the patterns to None, False, True, str(), int(), ..., _.  These independent conditions, other than _ (or else in the if chain), can be ordered for best reading and comprehension speed, and I think the above is an improvement.

Ordering for execution speed would instead start with the most frequently true conditions.  An advantage of match over 'if' is that patterns are apparently easier to re-order or otherwise automatically optimize.

I agree that replacement should only be done when there is significant improvement in reading and/or execution speed.  To avoid backport issues, they should best be preceded by a check for bug reports and especially PRs that will or do affect the same area of code.

The proposal for a general, +- global replacement has been and will be rejected. I think that this issue should therefore be closed, fairly soon, as such.
msg395153 - (view) Author: Kshitiz Arya (Kshitiz17) * Date: 2021-06-05 05:07
As Brandt shows us, match-case in its current implementation is not significantly different from an if-else ladder in term of performance, though I still maintain that match-case is much more readable than an if-else ladder. I also agree with Karthikeyan and Terry that global replacement is not a good idea, at least not without a significant performance improvement. Therefore I will close this issue.

Currently, some people at issue 44283 are working on improving the performance of match-case so fingers are crossed. If there is some significant gain in performance then I will open new issues on case to case basis.
History
Date User Action Args
2021-06-05 05:07:28Kshitiz17setstatus: open -> closed
resolution: postponed
messages: + msg395153

stage: resolved
2021-06-04 20:54:54terry.reedysetnosy: + terry.reedy
messages: + msg395120
2021-06-02 16:51:47brandtbuchersetfiles: + if_match.py

messages: + msg394927
2021-06-02 07:43:23Kshitiz17setfiles: + match_test.py

messages: + msg394895
2021-06-02 07:41:27steven.dapranosetmessages: + msg394894
2021-06-02 07:30:41Kshitiz17setmessages: + msg394893
2021-06-02 01:11:06rhettingersetmessages: + msg394881
2021-06-02 00:06:36rhettingersetnosy: + rhettinger
messages: + msg394879
2021-06-01 12:21:44Kshitiz17setmessages: + msg394855
2021-06-01 10:40:32steven.dapranosetnosy: + steven.daprano
messages: + msg394850
2021-06-01 10:22:01Kshitiz17setmessages: + msg394847
2021-06-01 07:15:25Kshitiz17setmessages: + msg394843
2021-06-01 05:24:26brandtbuchersetnosy: + brandtbucher
2021-06-01 05:16:24xtreaksetnosy: + xtreak
messages: + msg394840
2021-06-01 05:08:47Kshitiz17create