Message 267241 - Python tracker

Author	serhiy.storchaka
Recipients	serhiy.storchaka
Date	2016-06-04.07:45:33
Message-id	<1465026334.02.0.369802561902.issue27213@psf.upfronthosting.co.za>
In-reply-to

Content

Currently there are 4 opcodes (CALL_FUNCTION, CALL_FUNCTION_VAR, CALL_FUNCTION_KW, CALL_FUNCTION_VAR_KW) for calling a function, depending on the presence of var-positional and var-keyword arguments:

    func(arg1, ..., argN, name1=kwarg1, ..., nameM=kwargM)
    func(arg1, ..., argN, *args, name1=kwarg1, ..., nameM=kwargM)
    func(arg1, ..., argN, name1=kwarg1, ..., nameM=kwargM, **kwargs)
    func(arg1, ..., argN, *args, name1=kwarg1, ..., nameM=kwargM, **kwargs)

The numbers of positional and keyword arguments are packed into oparg, and each of them is limited to 255. Thus even a single keyword argument makes oparg not fit in 8 bits and requires EXTENDED_ARG. The stack contains the positional arguments first, then the optional args tuple, then the keyword arguments, then the optional kwargs dict. For every keyword argument two values are pushed on the stack: the argument name (always a constant string) and the argument value.
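The packing described above can be sketched in a few lines. This is only an illustration of the layout (low byte for the positional count, high byte for the keyword count); the helper names pack_oparg, unpack_oparg and needs_extended_arg are made up for this example and are not CPython functions.

```python
# Illustrative model of the current oparg layout for CALL_FUNCTION and its
# variants: low byte = number of positional arguments, high byte = number of
# keyword arguments. All function names here are hypothetical.

def pack_oparg(npositional, nkeyword):
    """Combine both counts into a single oparg value."""
    if not (0 <= npositional <= 255 and 0 <= nkeyword <= 255):
        raise ValueError("each count is limited to 255")
    return npositional | (nkeyword << 8)

def unpack_oparg(oparg):
    """Split an oparg back into (npositional, nkeyword)."""
    return oparg & 0xFF, (oparg >> 8) & 0xFF

def needs_extended_arg(oparg):
    """An oparg that does not fit in 8 bits needs an EXTENDED_ARG prefix."""
    return oparg > 0xFF

# func(a, b) fits in one byte; func(a, b, x=1) already does not.
print(pack_oparg(2, 0), needs_extended_arg(pack_oparg(2, 0)))  # 2 False
print(pack_oparg(2, 1), needs_extended_arg(pack_oparg(2, 1)))  # 258 True
```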

I collected statistics about opcodes in compiled code while running the Python tests [1] (the sample may be biased, but it is a large and varied body of code, and I hope it is reasonably representative of average Python code). According to them, about 90% of compiled function calls are calls with a fixed number of positional arguments (CALL_FUNCTION with oparg < 256), about 10% are calls with a fixed number of positional and keyword arguments (CALL_FUNCTION with oparg >= 256), and only about 0.5% are calls with var-positional or var-keyword arguments.
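For readers who want to gather similar numbers on their own code, here is a rough sketch based on the dis module. It is not the script that produced the figures above; the helper names and the sample source are invented for the example, and the opcode names it reports depend on the Python version it runs on.

```python
import dis
from collections import Counter
from types import CodeType

def iter_code_objects(code):
    """Yield a code object and, recursively, every code object nested in it."""
    yield code
    for const in code.co_consts:
        if isinstance(const, CodeType):
            yield from iter_code_objects(const)

def count_call_opcodes(code):
    """Count call instructions by opcode name in a compiled code object."""
    counts = Counter()
    for co in iter_code_objects(code):
        for instr in dis.get_instructions(co):
            name = instr.opname
            # Opcode names differ between Python versions; CALL_INTRINSIC_*
            # (newer versions) is not a function call and is skipped.
            if name.startswith("CALL") and not name.startswith("CALL_INTRINSIC"):
                counts[name] += 1
    return counts

# Hypothetical sample: one keyword call, one positional call, one *args/**kwargs call.
source = "def f(a, b=0):\n    return g(a, b=b)\nf(1)\nf(*x, **y)\n"
counts = count_call_opcodes(compile(source, "<example>", "exec"))
for name, n in sorted(counts.items()):
    print(name, n)
```

Running this over the code objects of a whole package (for example by compiling every source file) would give a static count comparable in spirit to the one quoted above.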

I propose using a different set of opcodes that corresponds to these cases:

    func(arg1, ..., argN)
    func(arg1, ..., argN, name1=kwarg1, ..., nameM=kwargM)
    func(*args, **kwargs)

1. CALL_FUNCTION takes a fixed number of positional arguments. oparg is the number of arguments. The stack contains the positional arguments.

2. CALL_FUNCTION_KW takes a fixed number of positional and keyword arguments. oparg is the total number of arguments. The stack contains the argument values (first positional, then keyword), then a tuple of keyword names (as in the proposed new opcode BUILD_CONST_KEY_MAP in issue27140).

3. CALL_FUNCTION_EX takes a variable number of positional and keyword arguments. oparg is 0. The stack contains a tuple of positional arguments and a dict of keyword arguments (they are built in the bytecode using BUILD_TUPLE, BUILD_CONST_KEY_MAP, BUILD_LIST_UNPACK and BUILD_MAP_UNPACK_WITH_CALL). This is the most general variant; the others exist just to optimize the common cases.
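The three stack layouts can be modelled with a plain Python list acting as the value stack. This is a toy model, not interpreter code. It assumes that the callable sits on the stack just below its arguments, which the description above does not state, and the function names simply mirror the opcode names.

```python
# Toy model of the three proposed opcodes. The value stack is a list whose
# last element is the top of the stack.
# Assumption: the callable is stored directly below its arguments.

def call_function(stack, oparg):
    """oparg = number of positional arguments."""
    args = [stack.pop() for _ in range(oparg)][::-1]
    func = stack.pop()
    stack.append(func(*args))

def call_function_kw(stack, oparg):
    """oparg = number of all arguments; a tuple of keyword names is on top."""
    names = stack.pop()
    values = [stack.pop() for _ in range(oparg)][::-1]
    func = stack.pop()
    npositional = oparg - len(names)
    kwargs = dict(zip(names, values[npositional:]))
    stack.append(func(*values[:npositional], **kwargs))

def call_function_ex(stack, oparg=0):
    """The stack holds a tuple of positional arguments and a dict of keyword arguments."""
    kwargs = stack.pop()
    args = stack.pop()
    func = stack.pop()
    stack.append(func(*args, **kwargs))

def describe(*args, **kwargs):
    """Hypothetical callee that just reports what it received."""
    return args, kwargs

stack = [describe, 1, 2]                  # describe(1, 2)
call_function(stack, 2)
positional_result = stack.pop()

stack = [describe, 1, 2, 3, ("x",)]       # describe(1, 2, x=3)
call_function_kw(stack, 3)
keyword_result = stack.pop()

stack = [describe, (1, 2), {"x": 3}]      # describe(*args, **kwargs)
call_function_ex(stack)
general_result = stack.pop()

print(positional_result)  # ((1, 2), {})
print(keyword_result)     # ((1, 2), {'x': 3})
print(general_result)     # ((1, 2), {'x': 3})
```

Note that in the keyword case the number of keyword arguments is recovered from the length of the names tuple, so oparg only has to carry the total count.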

Benefits:

1. Calling a function with keyword arguments uses less stack and fewer LOAD_CONST instructions.
2. Calling a function with keyword arguments no longer needs EXTENDED_ARG.
3. The number of positional and keyword arguments is no longer limited to 255 (at least not in the bytecode).
4. The bytecode looks simpler: oparg is always just the number of arguments taken from the stack.
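A small back-of-the-envelope comparison of the two schemes for a call with N positional and M keyword arguments. The numbers follow from the stack layouts described above; the function names are invented for the example, the callable itself is not counted, and the single LOAD_CONST for the names tuple is my reading of the proposal.

```python
def current_scheme(npositional, nkeyword):
    """Current layout: name and value are pushed for every keyword argument."""
    oparg = npositional | (nkeyword << 8)
    return {
        "stack_items": npositional + 2 * nkeyword,
        "load_const_for_names": nkeyword,
        "oparg": oparg,
        "needs_extended_arg": oparg > 0xFF,
    }

def proposed_scheme(npositional, nkeyword):
    """Proposed layout: values only, plus one tuple of names if there are keywords."""
    names_tuple = 1 if nkeyword else 0   # plain CALL_FUNCTION pushes no tuple
    oparg = npositional + nkeyword
    return {
        "stack_items": npositional + nkeyword + names_tuple,
        "load_const_for_names": names_tuple,
        "oparg": oparg,
        "needs_extended_arg": oparg > 0xFF,
    }

# func(a, b, x=1, y=2, z=3)
print(current_scheme(2, 3))
print(proposed_scheme(2, 3))
```

For this call the current scheme uses 8 stack items, 3 LOAD_CONST instructions for the names and an oparg of 770, while the proposed scheme uses 6 stack items, 1 LOAD_CONST for the names tuple and an oparg of 5.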

This proposal was discussed on Python-Ideas [2].

[1] http://permalink.gmane.org/gmane.comp.python.ideas/39993
[2] http://comments.gmane.org/gmane.comp.python.ideas/39961

History
Date	User	Action	Args
2016-06-04 07:45:34	serhiy.storchaka	set	recipients: + serhiy.storchaka
2016-06-04 07:45:34	serhiy.storchaka	set	messageid: <1465026334.02.0.369802561902.issue27213@psf.upfronthosting.co.za>
2016-06-04 07:45:33	serhiy.storchaka	link	issue27213 messages
2016-06-04 07:45:33	serhiy.storchaka	create