classification
Title: Incorrect evaluation order of function arguments with *args
Type: behavior Stage: needs patch
Components: Interpreter Core Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Guido.van.Rossum, Joshua.Landau, NeilGirdhar, SilentGhost, gvanrossum, iritkatriel, r.david.murray, serhiy.storchaka, terry.reedy
Priority: low Keywords:

Created on 2015-01-25 15:46 by Joshua.Landau, last changed 2021-06-22 22:13 by iritkatriel.

Messages (25)
msg234672 - (view) Author: Joshua Landau (Joshua.Landau) * Date: 2015-01-25 15:46
It is claimed that all expressions are evaluated left-to-right, including in functions¹. However,

    f(*a(), b=b())

will evaluate b() before a().

¹ https://docs.python.org/3/reference/expressions.html#evaluation-order
msg234673 - (view) Author: SilentGhost (SilentGhost) * (Python triager) Date: 2015-01-25 16:28
It seems, if I read https://docs.python.org/3/reference/expressions.html#calls correctly that the evaluation order of the function arguments is not defined in general, as it depends on your use of keyword argument and exact function signature. Naturally, f(a(), b()) would be evaluated in the order you're expecting.
msg234675 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-01-25 16:47
The resolution of issue 16967 argues that this should probably be considered a bug.  It certainly goes against normal Python expectations.  I think it should also be considered to be of low priority, though.
msg234697 - (view) Author: Neil Girdhar (NeilGirdhar) * Date: 2015-01-25 22:20
I assume this is the problem:

>>> dis.dis('f(*a(), b=b())')
  1           0 LOAD_NAME                0 (f)
              3 LOAD_NAME                1 (a)
              6 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              9 LOAD_CONST               0 ('b')
             12 LOAD_NAME                2 (b)
             15 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             18 CALL_FUNCTION_VAR      256 (0 positional, 1 keyword pair)
             21 RETURN_VALUE

— looks fine.

>>> dis.dis('f(b=b(), *a())')
  1           0 LOAD_NAME                0 (f)
              3 LOAD_NAME                1 (a)
              6 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              9 LOAD_CONST               0 ('b')
             12 LOAD_NAME                2 (b)
             15 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             18 CALL_FUNCTION_VAR      256 (0 positional, 1 keyword pair)
             21 RETURN_VALUE

Joshua, we could make function calls take:

x lists
y dictionaries
one optional list
z dictionaries

but we as well do all the merging in advance:

one optional list
one optional dictionary
one optional list
one optional dictionary

which is representable in three bits, but four is easier to decode I think.
msg234698 - (view) Author: Neil Girdhar (NeilGirdhar) * Date: 2015-01-25 22:28
actually, we accept alternation, so maybe a bit to say whether we start with a list or a dict followed by a length of the alternating sequence?
msg234702 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-01-25 22:38
Neil: I presume you are speaking of your in-progress PEP patch, and not the current python code?  If so, please keep the discussion of handling this in the context of the PEP to the PEP issue.  This issue should be for resolving the bug in the current code (if we choose to do so...if the PEP gets accepted for 3.5 this issue may become irrelevant, as I'm not sure we'd want to fix it in 3.4 for backward compatibility reasons).
msg234703 - (view) Author: Neil Girdhar (NeilGirdhar) * Date: 2015-01-25 22:50
Yes, sorry David.  I got linked here from the other issue.  In any case, in the current code, the longest alternating sequence possible is 4.  So one way to solve this is to change the CALL_FUNCTION parameters to be lists and dicts only and then process the final merging in CALL_FUNCTION.
msg234704 - (view) Author: Neil Girdhar (NeilGirdhar) * Date: 2015-01-25 23:02
another option is to add a LIST_EXTEND(stack_difference) opcode that would allow us to take the late iterable and extend a list at some arbitrary stack position.  I had to add something like that for dicts for the other issue, so it would follow that pattern.
msg234756 - (view) Author: Neil Girdhar (NeilGirdhar) * Date: 2015-01-26 16:43
After thinking about this a bit more, my suggestion is not to fix it.  Instead, I suggest that PEP 8 be modified to suggest that all positional arguments and iterable argument unpackings precede keyword arguments and keyword argument unpackings.  Then, a tool like autopep8 is free to reorganize argument lists.  Such reorganization will not be possible if f(*a(), b=b()) is different than f(b=b(), *a()).
msg234773 - (view) Author: Neil Girdhar (NeilGirdhar) * Date: 2015-01-26 19:41
(I also suggest that the evaluation order within a function argument list to be defined to be positional and iterable before keyword, otherwise left-to-right — rather than strictly left-to-right).
msg235083 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-01-31 07:32
I expect left to right as documented (and designed by Guido).  His OK or clarification would be to intentionally do differently.
msg395929 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-06-16 15:07
I don't think this is true anymore:

>>> def a(): 
...   print('a')
...   return (1,2)
... 
>>> def b():
...   print('b')
...   return (3,4)
... 
>>> def f(*args, b=None):
...   print('f')
... 
>>> f(*a(), b=b())
a
b
f
>>>
msg395951 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-06-16 17:24
The particular example of left-to-right function evaluation in
https://docs.python.org/3/reference/expressions.html#evaluation-order
is "expr1(expr2, expr3, *expr4, **expr5)".

Joshua's claim, without evidence, that "'f(*a(), b=b())' will evaluate b() before a()" was false, as shown by Neil with dis and is now, as shown by Irit with a code example.

But Neil also showed, again with dis, that a revised discrepancy claim, ""'f(b=b(), *a())' will evaluate a() before b()", is true, because reversing the order of these particular arguments does not reverse the evaluation order.  The dis today is equivalent to the one 7 years ago, with CALL_FUNCTION_VAR expanded to two operations.

dis.dis('f(b=b(), *a())')
  1           0 LOAD_NAME                0 (f)
              2 LOAD_NAME                1 (a)
              4 CALL_FUNCTION            0
              6 LOAD_CONST               0 ('b')
              8 LOAD_NAME                2 (b)
             10 CALL_FUNCTION            0
             12 BUILD_MAP                1
             14 CALL_FUNCTION_EX         1
             16 RETURN_VALUE

Irit's example, carried one step further confirms this.

>>> f(b=b(), *a())
a
b
f

Although I considered SilentGhost's reading of
https://docs.python.org/3/reference/expressions.html#calls
as too 'loose' (evaluation order of function argument *is* defined in general), that section address this exact case.

"A consequence of this is that although the *expression syntax may appear after explicit keyword arguments, it is processed before the keyword arguments ..."

I read this as an acknowledgement that this order violates the general rule.

I do wonder whether exception is specific to using a stack machine and implementation convenience or whether *all* implementations must follow this order.  If C-Python specific, it should be labelled as such.

The doc continues with an example call, for a different 'f', where, unlike Irit's 'f', the result is "TypeError: f() got multiple values for keyword argument 'a'".  It concludes "It is unusual for both keyword arguments and the *expression syntax to be used in the same call, so in practice this confusion does not arise."

The problem with mixing unpacking and named values might be even clearer if "f(b=1, *(2,1))" were added to the example box.  Perhaps *exp after x=exp should be prohibited.

If the exception is not removed from the implementation, then perhaps it should be added to the evaluation order section.  Something like

"The one exception is in function calls with *expression after a keyword argument: f(x=expr2, *expr1)."
msg395953 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-06-16 18:17
> "The one exception is in function calls with *expression after a keyword argument: f(x=expr2, *expr1)."

So the question before us is whether to add this phrase to the docs, or to tweak the compiler to fix it. It is certainly convenient to update the docs rather than expend the resources to change the compiler for what looks like a rare corner case.

How certain are we that this is truly the only exception? Can someone read the compiler source code related to argument order, or experiment with all the different permutations of positional args, keyword args, *args, and **kwargs? (Fortunately the evaluation order is independent from the function definition, so it's really just all permutations of those four things.)
msg395962 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-06-16 21:23
The following does not consider keyword-only args with or without defaults.  My understanding is the the compiler does not know or considered the signature of the function when evaluating the args.

from dis import dis
from itertools import permutations as pm  # My first use of this.

for a1, a2, a3, a4 in pm(('a', 'b=b', '*g', '**k')):
    s = f"e({a1}, {a2}, {a3}, {a4})"
    print(s)
    try:
        cd = compile(s, '', 'eval')
    except SyntaxError as e:
        print(e)
        continue
    dis(cd)

Since positional arg cannot follow keyword arg or keyword arg unpacking and positional arg unpacking cannot follow keyword arg unpacking, only 5 orders of 'a', 'b=b', '*g', and '**k' are legal.

e(a, b=b, *g, **k)  g before b
e(a, *g, b=b, **k)  normal
e(a, *g, **k, b=b)  normal
e(*g, a, b=b, **k)  normal
e(*g, a, **k, b=b)  normal

For whatever reason, the byte code is more complicated without function arguments.

e(a, b=b, *g, **k)
  1           0 LOAD_NAME                0 (e)
              2 LOAD_NAME                1 (a)
              4 BUILD_LIST               1
              6 LOAD_NAME                2 (g)
              8 LIST_EXTEND              1
             10 LIST_TO_TUPLE
             12 LOAD_CONST               0 ('b')
             14 LOAD_NAME                3 (b)
             16 BUILD_MAP                1
             18 LOAD_NAME                4 (k)
             20 DICT_MERGE               1
             22 CALL_FUNCTION_EX         1
             24 RETURN_VALUE

The exceptional case essentially says that positional arg unpacking cannot follow keyword args.  This is consistent with the other 3 prohibitions.  Arguments passed by position form one group, those passed by name another.  The nice, simple, and consistent rule is that positional args (and unpacking thereof) cannot follow keyword args (and unpacking there).

But in this one case, instead of raising SyntaxError like the other 3 prohibited orders, the compiler in effect rewrites* the code in a legal order -- and then compiles as if written that way.  How often does it rewrite code to make it correct?

I think it would have been better to have stuck with 'positional before keyword'.  Can we consider deprecating something rare and easy to get wrong?

Thinking more, I might prefer leaving the Evaluation Order section alone, as it is correct with respect to the internal rewrite, and revise the explanation of the quirk in the function doc. We could deprecate the mis-ordering at least in the doc.  

* After writing this, I notice Neil's suggestion that PEP 8 'prohibit' named params before iterable (positional) unpacking, so that a fixer program could literally rewrite in the correct order.  I am suggesting that it is useful to take the viewpoint that the compiler is doing this.
msg395965 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-16 21:47
See also thread "Order of positional and keyword arguments" on the Python-Dev mailing list (https://mail.python.org/archives/list/python-dev@python.org/thread/I6AUYV6DVEMP7XVYVDWC62N6PK6FHX3H/).
msg395968 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-06-16 22:39
Well, it seems we're stuck -- we can't make the grammar more restrictive (at least, I don't think we should, since it is a backwards incompatibility), so we'll have to live with the exception. Since it is now clear what the rule is, we can update the docs.
msg395970 - (view) Author: Neil Girdhar (NeilGirdhar) * Date: 2021-06-17 05:01
FYI: https://github.com/PyCQA/pylint/issues/4586
msg395972 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-17 05:13
Would not be be better in long term to get rid of irregularities? It would make the grammar simpler and the documentation clearer.

The only use case of such syntax in wrappers, but they always can be rewritten in other style, with natural order of arguments evaluation.

    def wrapper(*args, **kw):
        return wrapped_fn(*args, some_arg=1, **kw)
msg395974 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-06-17 06:12
It should be possible to make the compiler emit code that evaluates the arg values left to right in all cases.
msg395975 - (view) Author: Guido van Rossum (Guido.van.Rossum) Date: 2021-06-17 06:15
Looks like Terry accidentally removed Serhiy. Adding him back.
msg395978 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-06-17 08:05
(The nosy list change was an accident of my local copy not being complete refreshed before posting.)

If (b=b, *c) were evaluated in order, then the byte code for b=b and any subsequent keyword arguments would have to be put aside, such as in a separate buffer, until it was known that there would be no following *exp.  Without lookahead, this is known when either **kw or closing ) is reached.  At that point, the keyword buffer would be copied to the main buffer.

It might actually be easier to remove all order restrictions and compile all keyword values to a side buffer, to be copied to the main buffer when the closing ) is reached. 

One version of the question I am raising is this: given that f(a, b=b) and f(*(a,), b=b) have the same effect (are alternate spellings of the same instruction(s)), why should f(b=b, a) and f(b=b, *(a,)) *not* have the same effect, with one spelling being prohibited and the other not?

The meaning of '*expression' is defined as having the same effect as an equivalent sequence of positional argument expression in the same place as the expression. 

"If the syntax *expression appears in the function call, expression must evaluate to an iterable. Elements from these iterables are treated as if they were additional positional arguments. For the call f(x1, x2, *y, x3, x4), if y evaluates to a sequence y1, …, yM, this is equivalent to a call with M+4 positional arguments x1, x2, y1, …, yM, x3, x4."

The first sentence of the next paragrapsh, allowing only *exp to follow b=exp, contradicts the last line above.  I am sorry I did not read these paregraphs carefully until now.
msg396006 - (view) Author: Guido van Rossum (Guido.van.Rossum) Date: 2021-06-17 15:25
I'm currently not very interested in philosophizing about why we allow one
syntax but not the other.
msg396009 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-06-17 16:08
I was pointing out what to me is a second related contradiction in the doc, between one sentence and the next.
msg396369 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-06-22 22:13
See also issue33492.
History
Date User Action Args
2021-06-22 22:13:28iritkatrielsetmessages: + msg396369
2021-06-17 16:08:45terry.reedysetmessages: + msg396009
2021-06-17 15:25:53Guido.van.Rossumsetmessages: + msg396006
2021-06-17 08:05:54terry.reedysetmessages: + msg395978
2021-06-17 06:15:05Guido.van.Rossumsetnosy: + serhiy.storchaka, Guido.van.Rossum
messages: + msg395975
2021-06-17 06:12:45iritkatrielsetmessages: + msg395974
2021-06-17 05:55:47terry.reedysetnosy: - serhiy.storchaka
2021-06-17 05:13:15serhiy.storchakasetversions: + Python 3.11, - Python 3.4, Python 3.5
2021-06-17 05:13:04serhiy.storchakasetmessages: + msg395972
2021-06-17 05:01:58NeilGirdharsetmessages: + msg395970
2021-06-16 22:39:52gvanrossumsetmessages: + msg395968
2021-06-16 21:47:24serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg395965
2021-06-16 21:23:58terry.reedysetmessages: + msg395962
2021-06-16 18:17:22gvanrossumsetmessages: + msg395953
2021-06-16 17:24:01terry.reedysetnosy: + gvanrossum
messages: + msg395951
2021-06-16 15:07:47iritkatrielsetnosy: + iritkatriel
messages: + msg395929
2015-01-31 07:32:21terry.reedysetnosy: + terry.reedy
messages: + msg235083
2015-01-26 19:41:26NeilGirdharsetmessages: + msg234773
2015-01-26 16:43:30NeilGirdharsetmessages: + msg234756
2015-01-25 23:02:42NeilGirdharsetmessages: + msg234704
2015-01-25 22:50:56NeilGirdharsetmessages: + msg234703
2015-01-25 22:38:47r.david.murraysetmessages: + msg234702
2015-01-25 22:28:14NeilGirdharsetmessages: + msg234698
2015-01-25 22:20:36NeilGirdharsetnosy: + NeilGirdhar
messages: + msg234697
2015-01-25 16:47:43r.david.murraysetpriority: normal -> low

nosy: + r.david.murray
messages: + msg234675

stage: needs patch
2015-01-25 16:28:49SilentGhostsetnosy: + SilentGhost
messages: + msg234673
2015-01-25 15:46:56Joshua.Landaucreate