This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: unoptimal calls to PyNumber_Check
Type: enhancement Stage: resolved
Components: IO Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Oren Milman, mark.dickinson, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-03-05 23:11 by Oren Milman, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
patchVer2.diff Oren Milman, 2017-03-12 16:29 patch2 diff file
Pull Requests
URL Status Linked Edit
PR 622 closed Oren Milman, 2017-03-11 22:23
PR 650 merged Oren Milman, 2017-03-12 22:00
Messages (12)
msg289048 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-05 23:11
------------ current state ------------
if (PyNumber_Check(obj)) {
    someVar = PyNumber_AsSsize_t(obj, SomeError);
    if (someVar == -1 && PyErr_Occurred()) {
        return errVal;
    }
}
else {
    PyErr_Format(PyExc_TypeError,
                 "integer argument expected, got '%.200s'",
                 Py_TYPE(obj)->tp_name);
    return errVal;
}

Something similar to this happens in:
    - Modules/mmapmodule.c in mmap_convert_ssize_t
    - Modules/_io/_iomodule.c in _PyIO_ConvertSsize_t
    - Modules/_io/stringio.c in:
        * _io_StringIO_read_impl
        * _io_StringIO_readline_impl
        * _io_StringIO_truncate_impl

(Moreover, in:
    - Objects/bytes_methods.c in parse_args_finds_byte
    - Objects/exceptions.c in oserror_init
PyNumber_AsSsize_t is called only if PyNumber_Check returns true.)

Note that:
    - PyNumber_Check checks whether nb_int != NULL or nb_float != NULL.
    - PyNumber_AsSsize_t calls PyNumber_Index, which, before calling
      nb_index, raises a TypeError (with a similar error message) in case
      nb_index == NULL.
    - The docs say '... when __index__() is defined __int__() should also be
      defined ...'.
So the behavior with and without the call to PyNumber_Check is quite the same.
The only potential advantage of calling PyNumber_Check is skipping the call to
PyNumber_AsSsize_t.
But PyNumber_AsSsize_t would be called also in case
nb_index == NULL and (nb_int != NULL or nb_float != NULL).
Thus, the only case in which the call to PyNumber_Check might be useful, is
when nb_int == nb_float == nb_index == NULL.


------------ proposed changes ------------
Either remove each of these calls to PyNumber_Check, or at least replace it
with a call to PyIndex_Check, which checks whether nb_index != NULL, and thus
would be more useful than PyNumber_Check.

Note that such a change shouldn't affect the behavior, except for a slightly
different wording of the error message in case a TypeError is raised.
msg289051 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-03-05 23:29
+1 for removing each of these calls to PyNumber_Check.
msg289458 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-11 22:11
after some closer examination, ISTM that in Objects/exceptions.c, we can't
remove PyNumber_Check to optimize or simplify the code, as the argument
'filename' can be either an integer type (only in case the error is a
BlockingIOError), or quite anything else, except for None.

We could replace PyNumber_Check(filename) with PyIndex_Check(filename), but
this would change the behavior of BlockingIOError.
IMHO, this isn't that important, and thus we should leave the code in
Objects/exceptions.c as is.

anyway, I would soon create a pull request to remove all other
aforementioned calls to PyNumber_Check.
msg289483 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-12 07:41
I think it would be better to replace PyNumber_Check() with PyIndex_Check() and change error messages to more specific. In IO related functions the accepted values are integers and None. In bytes/bytearray methods -- integers and bytes-like objects.
msg289488 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-12 09:17
"In bytes/bytearray methods -- integers and bytes-like objects."
(I guess you refer to parse_args_finds_byte (in Objects/bytes_methods.c))
so you suggest that in case (!PyIndex_Check(tmp_subobj)), we also check whether
(PyByteArray_Check(tmp_subobj) or PyBytes_Check(tmp_subobj)), and raise an
error if needed?

anyway, you and Raymond suggest different things. how do we proceed?
msg289489 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-12 09:34
The call of PyNumber_Check() is redundant if we don't bother about error message. But if we want to have accurate error message we should check types before converting.

In parse_args_finds_byte we should check rather PyObject_CheckBuffer() (and maybe PyBytes_Check() for fast path if this makes sense). Perhaps some methods need to check also PyTuple_Check().
msg289502 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-12 16:29
"Perhaps some methods need to check also PyTuple_Check()."
which methods?

anyway, a new patch diff file is attached, according to your other seggustions,
Serhiy.
(I ran the test module, and some tests of my own, and it seems that the patch
doesn't break anything.)
msg289506 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-12 20:00
Please open a PR for review.

Checking PyTuple_Check() is needed if this code is used in startswith() and endswith().

"integer argument or None expected" -- I'm not sure this wording is correct enough. What wording is used in other similar code? I'm sure there other functions that accept int or None.
msg289514 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-12 22:01
I am sorry, but I guess I am missing something about startswith() and
endswith().
ISTM that PyTuple_Check() is already called by unicode_startswith,
unicode_endswith and _Py_bytes_tailmatch, which already raise a TypeError
with an appropriate error message
(e.g. "%s first arg must be bytes or a tuple of bytes, not %s").


I searched the codebase, and found that in most places, if PyIndex_Check(obj)
is true, then obj is described as 'integer' in error messages. rarely, obj
would be described as 'int'.
I found only two relevant TypeError messages of functions that accept an
integer type or None (BTW, I took the term 'integer type' from
https://docs.python.org/3.7/reference/atamodel.html?highlight=__index__#object.__index__.):
    - in Modules/posixmodule.c in dir_fd_converter():
      PyErr_Format(PyExc_TypeError,
                   "argument should be integer or None, not %.200s",
                   Py_TYPE(o)->tp_name);
    - in Modules/_functoolsmodule.c in lru_cache_new():
      PyErr_SetString(PyExc_TypeError, "maxsize should be integer or None");

so I changed the error messages in my patch to the form of
'argument should be ...' (which is, IMHO, much clearer).


also, now that the new PR was created, should I close the old one?
msg289516 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-12 22:38
LGTM. Thank you Oren.
msg289532 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-13 06:10
thanks for the reviews :)
msg290191 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:21
New changeset 004251059b9c5e48d47cb30b94bcb92cb44d3adf by Serhiy Storchaka (Oren Milman) in branch 'master':
bpo-29730: replace some calls to PyNumber_Check and improve some error messages (#650)
https://github.com/python/cpython/commit/004251059b9c5e48d47cb30b94bcb92cb44d3adf
History
Date User Action Args
2022-04-11 14:58:43adminsetgithub: 73916
2017-03-24 22:21:13serhiy.storchakasetmessages: + msg290191
2017-03-13 06:10:05Oren Milmansetmessages: + msg289532
2017-03-12 22:39:00serhiy.storchakasetstatus: open -> closed
messages: + msg289516

assignee: serhiy.storchaka
resolution: fixed
stage: resolved
2017-03-12 22:01:23Oren Milmansetmessages: + msg289514
2017-03-12 22:00:47Oren Milmansetpull_requests: + pull_request537
2017-03-12 20:00:20serhiy.storchakasetmessages: + msg289506
2017-03-12 16:29:57Oren Milmansetfiles: + patchVer2.diff
keywords: + patch
messages: + msg289502
2017-03-12 09:34:48serhiy.storchakasetmessages: + msg289489
2017-03-12 09:17:33Oren Milmansetmessages: + msg289488
2017-03-12 07:41:51serhiy.storchakasetmessages: + msg289483
2017-03-11 22:23:01Oren Milmansetpull_requests: + pull_request510
2017-03-11 22:11:53Oren Milmansetmessages: + msg289458
2017-03-06 06:05:41serhiy.storchakasetnosy: + mark.dickinson, serhiy.storchaka
2017-03-05 23:29:25rhettingersetnosy: + rhettinger
messages: + msg289051
2017-03-05 23:11:18Oren Milmancreate