This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: inspect.getcallargs doesn't properly interpret set comprehension code objects.
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: JelleZijlstra, ncoghlan, nedbat, python-dev, r.david.murray, serhiy.storchaka, yselivanov
Priority: normal Keywords: patch

Created on 2013-11-15 12:03 by nedbat, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue19611.patch JelleZijlstra, 2016-06-02 18:43 patch against master review
issue19611.patch JelleZijlstra, 2016-06-02 19:16
issue19611.patch JelleZijlstra, 2016-06-03 16:18 updated patch review
issue19611-decorators.patch JelleZijlstra, 2016-06-04 00:16 review
Messages (21)
msg202944 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2013-11-15 12:03
In 2.7, set comprehensions are compiled to code objects expecting an argument named ".0".  This convention is also used for the unnamed arguments needed by tuple arguments.  inspect.getcallargs understands the tuple argument case, but not the set comprehension case, and throws errors for correct arguments.

This is also true for generator expressions and dictionary comprehensions.

Demonstration:

#-----
import inspect
import sys
import types

def make_set():
    return {z*z for z in range(5)}

print(make_set())

# The set comprehension is turned into a code object expecting a single
# argument called ".0" with should be an iterator over range(5).
if sys.version_info < (3,):
    setcomp_code = make_set.func_code.co_consts[1]
else:
    setcomp_code = make_set.__code__.co_consts[1]
setcomp_func = types.FunctionType(setcomp_code, {})

# We can successfully call the function with the argument it expects.
print(setcomp_func(iter(range(5))))

# But inspect can't figure that out, because the ".0" argument also means
# tuple arguments, which this code object doesn't expect.
print(inspect.getcallargs(setcomp_func, iter(range(5))))

#-----



When run on Python 3.3, this produces:

    {0, 1, 4, 16, 9} 
    {0, 1, 4, 16, 9}
    {'.0': <range_iterator object at 0x10834be70>}

When run on Python 2.7, it produces:

    set([0, 1, 4, 16, 9])
    set([0, 1, 4, 16, 9])
    Traceback (most recent call last):
      File "foo.py", line 17, in <module>
        print(inspect.getcallargs(setcomp_func, iter(range(5))))
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 935, in getcallargs
        assign(arg, value)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 922, in assign
        raise ValueError('too many values to unpack')
    ValueError: too many values to unpack
msg202945 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2013-11-15 12:08
BTW: I don't hold any illusions that this bug is important enough to fix, but I would be interested in hearing ideas about how I could work around it...
msg266909 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2016-06-02 18:43
This doesn't work in Python 3.6 (current dev version) either. Using Ned's example, I get (snipping some of the ipython stack trace):

/home/jelle/cpython-dev/cpython/Lib/inspect.py in __init__(self, name, kind, default, annotation)
   2399         if not name.isidentifier():
-> 2400             raise ValueError('{!r} is not a valid parameter name'.format(name))
   2401 

ValueError: '.0' is not a valid parameter name

print(inspect.Signature.from_callable(setcomp_func).bind(iter(range(5))))
fails with the same error.

However, both work if I take out the isidentifier check.

In Python 2.7, the bug is actually in inspect.getargs:

In [7]: inspect.getargs(setcomp_func.func_code)
Out[7]: Arguments(args=[['z']], varargs=None, keywords=None)

which assumes that any ".0" argument is a tuple.


I'm not sure the bug in 2.7 is worth fixing, since it will require fragile bytecode manipulation to distinguish tuple arguments from set comprehensions, and you can get to this case only by manually creating function objects. 

I think the Python 3 level bug is worth fixing though, since it's an easy fix (just make the .isidentifier call accept names of the form .0). I'm attaching a patch against master with tests. I have signed the contributor agreement.
msg266916 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-06-02 19:02
The patch looks reasonable to me.  I've added Yuri to nosy since he's responsible for the signature code in inspect, and he might have an opinion on the appriateness of the fix.

I think this special case needs to be documented, since currently the name is documented as needing to be a valid python identifier.

Sounds like we don't have a good answer to Ned's question about a 2.7 workaround.  On the other hand, "fragile byte code inspection" is not really fragile, since the 2.7 bytecode will never change.  The real question is if someone cares enough to do the work, since as you say (and Ned admits) the benefit is small.
msg266919 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2016-06-02 19:05
Thanks for the comment!

It is fragile in the sense that I think there is a good chance that a fix (which is going to rely on details of generated bytecode) will have bugs and break someone's working 2.7 code.

The patch's tests have a bug, which I'll fix together with a documentation patch.
msg266923 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2016-06-02 19:16
Fixed patch, added documentation
msg266932 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-06-02 19:35
I wonder if we should do the mention of .0 as a footnote, since it won't matter to most users.  What do you think, Yuri?
msg266934 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2016-06-02 19:49
Jelle, regarding your patch: it would probably be a good idea to only accept ".0"-like names for POSITIONAL_ONLY parameters.  Since functions with positional-only parameters can only be created in C, that should make inspect.Parameter less error prone.

David, maybe this is something that can be fixed using the Arguments Clinic?  I'm not sure how to feel about parsing ".0" for parameter names.
msg266943 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-06-02 20:52
Yes, it feels ugly.  I'd rather "fix" the source of the problem (the weird positional attribute name).  I have no idea if argument clinic would be of any help here, I haven't looked at how these things come to be.
msg266945 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2016-06-02 20:57
> Yes, it feels ugly.  I'd rather "fix" the source of the problem (the weird positional attribute name).

+1.

> I have no idea if argument clinic would be of any help here, I haven't looked at how these things come to be.

Me neither.  Serhiy, what do you think about this?
msg266947 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2016-06-02 21:03
Since this only comes up with manually created functions (through calling types.FunctionType), I think it wouldn't be unreasonable to just not handle this case in the inspect module and close this as wontfix.

Ned, is there a use case for converting comprehension code objects to functions and inspecting them?
msg266966 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-06-02 21:55
The origin of the ".0" parameter name is the compiler's implicit symbol definition - we use the "." prefix for those to ensure we don't accidentally collide with actual identifiers used in the code.

We can see that via dis.dis:

>>> def make_set():
...     return {z*z for z in range(5)}
... 
>>> import dis
>>> dis.dis(make_set)
  2           0 LOAD_CONST               1 (<code object <setcomp> at 0x7f54c5b2cd20, file "<stdin>", line 2>)
              3 LOAD_CONST               2 ('make_set.<locals>.<setcomp>')
              6 MAKE_FUNCTION            0
              9 LOAD_GLOBAL              0 (range)
             12 LOAD_CONST               3 (5)
             15 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             18 GET_ITER
             19 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             22 RETURN_VALUE
>>> dis.dis(make_set.__code__.co_consts[1])
  2           0 BUILD_SET                0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                16 (to 25)
              9 STORE_FAST               1 (z)
             12 LOAD_FAST                1 (z)
             15 LOAD_FAST                1 (z)
             18 BINARY_MULTIPLY
             19 SET_ADD                  2
             22 JUMP_ABSOLUTE            6
        >>   25 RETURN_VALUE

It isn't just set comprehensions that will do this - it's all the implicit nested functions that accept a positional argument (list/dict/set comprehensions, generator expressions)

I chatted to Brett Cannon about it, and we agree inspect should handle the implicit functions generated by the compiler, even if those signatures can't be expressed in normal Python code. The fact those may be emitted by the affected inspect functions can then be documented as a CPython implementation detail, rather than as a language level behaviour.
msg266968 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2016-06-02 22:03
> I chatted to Brett Cannon about it, and we agree inspect should handle the implicit functions generated by the compiler, even if those signatures can't be expressed in normal Python code.

Can we make ".0" parameters positional-only then?  
And rename ".{digit}" into "arg{digit}"?
msg266978 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-06-02 22:48
We definitely can't use a valid identifier in the code generator, since any valid identifier we used might shadow a nonlocal, global or builtin name (and the latter two cases aren't visible to the compiler at compile time).

They're also genuinely not positional only:

>>> print(setcomp_func(iter(range(5))))
{0, 1, 4, 9, 16}
>>> print(setcomp_func(**{".0": iter(range(5))}))
{0, 1, 4, 9, 16}
msg266980 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2016-06-02 22:52
> We definitely can't use a valid identifier in the code generator, since any valid identifier we used might shadow a nonlocal, global or builtin name (and the latter two cases aren't visible to the compiler at compile time).

I wasn't proposing to fix code generator, but rather to transform the name to something that users will understand in inspect.signature code.  ".0" will be extremely hard to google.

> They're also genuinely not positional only:
> >>> print(setcomp_func(**{".0": iter(range(5))}))
> {0, 1, 4, 9, 16}

My opinion on this: this is a very low-level implementation detail of CPython.  If users start abusing Signature.bind for binding ".0" they will write code that will be hard to port to other implementations, and that will also create a requirement for us (CPython devs) to maintain backwards compatibility here.

What I propose, is to rename ".0" args and to classify them as "positional-only". This would still make it possible to use Signature.bind on such callables, but in a safe way.
msg266988 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-06-02 23:35
Oh, I see what you mean, now - we can special case the compiler-generated ".N" names when extracting the signature.

I like it - make the claimed parameter something like "implicit0" instead of ".0", mention that in the note on the docs, and folks may have some chance of working out what's going on if they manage to stumble across this behaviour.
msg267040 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-03 05:14
I have no idea what relations Argument Clinic have with this issue.
msg267109 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2016-06-03 16:18
This new patch makes the inspect module treat .0-type names as positional-only and renames them to implicit0. Includes a documentation patch too.

I'm not too happy with the wording in the documentation, so suggestions for better naming are appreciated.
msg267194 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2016-06-04 00:16
Patch adding @cpython_only decorators to the tests.
msg267324 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-06-04 21:44
New changeset 6247dd0f230e by Nick Coghlan in branch 'default':
Issue #19611: handle implicit parameters in inspect.signature
https://hg.python.org/cpython/rev/6247dd0f230e
msg267326 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-06-04 21:45
Fix committed for 3.6 - this is obscure enough that I'm not going to worry about fixing it on maintenance branches.
History
Date User Action Args
2022-04-11 14:57:53adminsetgithub: 63810
2016-06-04 21:45:58ncoghlansetstatus: open -> closed
versions: - Python 2.7
type: behavior
messages: + msg267326

resolution: fixed
stage: resolved
2016-06-04 21:44:29python-devsetnosy: + python-dev
messages: + msg267324
2016-06-04 00:16:40JelleZijlstrasetfiles: + issue19611-decorators.patch

messages: + msg267194
2016-06-03 16:18:38JelleZijlstrasetfiles: + issue19611.patch

messages: + msg267109
2016-06-03 05:14:52serhiy.storchakasetmessages: + msg267040
2016-06-02 23:35:36ncoghlansetmessages: + msg266988
2016-06-02 22:52:33yselivanovsetmessages: + msg266980
2016-06-02 22:48:00ncoghlansetmessages: + msg266978
2016-06-02 22:03:43yselivanovsetmessages: + msg266968
2016-06-02 21:55:49ncoghlansetmessages: + msg266966
2016-06-02 21:27:03ncoghlansetassignee: ncoghlan

nosy: + ncoghlan
2016-06-02 21:03:27JelleZijlstrasetmessages: + msg266947
2016-06-02 20:57:48yselivanovsetnosy: + serhiy.storchaka
messages: + msg266945
2016-06-02 20:52:46r.david.murraysetmessages: + msg266943
2016-06-02 19:49:16yselivanovsetmessages: + msg266934
2016-06-02 19:35:47r.david.murraysetmessages: + msg266932
2016-06-02 19:16:52JelleZijlstrasetfiles: + issue19611.patch

messages: + msg266923
2016-06-02 19:05:07JelleZijlstrasetmessages: + msg266919
2016-06-02 19:02:07r.david.murraysetnosy: + r.david.murray, yselivanov

messages: + msg266916
versions: + Python 3.6
2016-06-02 18:43:32JelleZijlstrasetfiles: + issue19611.patch

nosy: + JelleZijlstra
messages: + msg266909

keywords: + patch
2013-11-15 12:08:24nedbatsetmessages: + msg202945
2013-11-15 12:03:03nedbatcreate