classification
Title: Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT)
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Greg Price, Ma Lin, aeros167, mark.dickinson, rhettinger, sir-sigurd
Priority: normal Keywords: patch

Created on 2019-08-10 21:07 by Greg Price, last changed 2019-08-24 06:34 by aeros167.

Pull Requests
URL Status Linked Edit
PR 15203 closed Greg Price, 2019-08-10 21:22
PR 15216 open Greg Price, 2019-08-11 23:30
PR 15448 open Greg Price, 2019-08-24 05:56
Messages (13)
msg349359 - (view) Author: Greg Price (Greg Price) * Date: 2019-08-10 21:07
In longobject.c we have the following usage a few times:

PyObject *
PyLong_FromLong(long ival)
{
    PyLongObject *v;
    // ... more locals ...

    CHECK_SMALL_INT(ival);

    if (ival < 0) {
        /* negate: can't write this as abs_ival = -ival since that
           invokes undefined behaviour when ival is LONG_MIN */
        abs_ival = 0U-(unsigned long)ival;
        sign = -1;
    }
    else {
    // ... etc. etc.

The CHECK_SMALL_INT macro contains a `return`, so the function can actually return before it reaches any of the other code.

#define CHECK_SMALL_INT(ival) \
    do if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) { \
        return get_small_int((sdigit)ival); \
    } while(0)

That's not even an error condition -- in fact it's the fast, hopefully reasonably-common, path.

An implicit return like this is pretty surprising for the reader. And it only takes one more line (plus a close-brace) to make it explicit:

    if (IS_SMALL_INT(ival)) {
        return get_small_int((sdigit)ival);
    }

so that seems like a much better trade.

Patch written, will post shortly.
msg349360 - (view) Author: Greg Price (Greg Price) * Date: 2019-08-10 21:31
I've just sent GH-15203 which is the first of two patches for this. It's quite small.

The second patch, which completes the change, is also small:
https://github.com/gnprice/cpython/commit/c6b905104

It depends on the first one, so I think the easiest is for me to send it as a PR after the first is merged? Happy to do another workflow if that'd make review easier. (I don't see guidance in the devguide on handling a sequence of patches, so I'm making this up.)
msg349376 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-11 06:18
Thanks for the contribution Greg.  I understand your reasoning but am going to decline PR 15203.  Though the feature is not experimentnal, it is something we want be able to modify, shut-off, or extend at build time.  Sometimes for testing we turn it off in order to identify identity test bugs.  Also, eveonline was controlling this to save memory.
msg349377 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-11 06:18
P.S.  Making the returns explicit is a reasonable change to make, but the core #ifdef logic should remain.
msg349379 - (view) Author: Greg Price (Greg Price) * Date: 2019-08-11 08:06
> Sometimes for testing we turn it off in order to identify identity test
bugs.

Interesting! Well, if the option is useful for testing, that's certainly a
good reason to keep it.

> Also, eveonline was controlling this to save memory.

Also interesting. I feel like there must be something about their setup
that I'm not imagining, because this is a pretty tiny amount of memory (9
kB) on the scale of a typical Python server process in my experience.

Or do you mean they were increasing the numbers? One thought I had looking
at this was actually that it would be interesting to do some measurements
and try to pick new (default) values for these parameters - it looks like
they're the same today as in 1993, and it's likely that with all that's
changed in typical hardware since then (plus the switch from Python ints +
longs to only what we used to call longs), the optimum values today are
different from the optimum in 1993. And certainly one of the effects of
this optimization when its hit rate is good is reducing memory consumption
- you have only one 17 instead of however many you make.

I'll rework the explicit-return patch to send without this change. It'll be
a little trickier but should be entirely doable.

On Sat, Aug 10, 2019, 23:19 Raymond Hettinger <report@bugs.python.org>
wrote:

>
> Raymond Hettinger <raymond.hettinger@gmail.com> added the comment:
>
> P.S.  Making the returns explicit is a reasonable change to make, but the
> core #ifdef logic should remain.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue37812>
> _______________________________________
>
msg349414 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-11 22:11
> I'll rework the explicit-return patch to send without this
> change. It'll be a little trickier but should be entirely doable.

An explicit return would be a nice improvement.

On the other hand, if it is tricky and requires something more than minor surgery, that would be a hint that it isn't worth it.  There is some value in code that is stable and known to be working just fine.  See: https://en.wikipedia.org/wiki/There_are_known_knowns
msg349418 - (view) Author: Greg Price (Greg Price) * Date: 2019-08-11 23:32
> On the other hand, if it is tricky and requires something more than minor surgery, that would be a hint that it isn't worth it.  There is some value in code that is stable and known to be working just fine.

Definitely true!

I think this change turned out quite clean, though. Just sent, as GH-15216 -- please take a look.
msg349420 - (view) Author: Ma Lin (Ma Lin) * Date: 2019-08-12 00:13
How about using a hybrid implementation for `int`.

For example, on 64-bit platform, if (a integer >=-9223372036854775808 and <=9223372036854775807), use a native `signed long` to represent it.

People mostly use +-* operations, maybe using native int is faster, even including the cost of overflow checking.

If operation will overflow or other operation like **, we can transform native int to current form and run in current code path.
msg349425 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-12 01:13
Ma Lin:  Everything in Python is an object, nothing is native.  There is room to reconsider the implementation of integer objects, known internally as PyLong objects.  If you want to tease-out an alternative implementation, please do so on the python-ideas maillist.  Small integer operations (including) indexing used to be smaller and faster in Python 2.7, so there is likely some room for improvement on what we have now.
msg349429 - (view) Author: Ma Lin (Ma Lin) * Date: 2019-08-12 02:51
I sent twice, but it doesn't appear in Python-Ideas list.
I will try to post to Python-Dev tomorrow.
msg349430 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-12 03:11
python-dev likely isn't the right place.  That is a forum for more mature ideas.
msg350352 - (view) Author: Greg Price (Greg Price) * Date: 2019-08-24 06:05
Thanks, Raymond, for the review on GH-15216!

Shortly after posting this issue, I noticed a very similar story in CHECK_BINOP.  I've just posted GH-15448 to similarly make returns explicit there.  It basically consists of a number of repetitions of

-    CHECK_BINOP(self, other);
+    if (!PyLong_Check(self) || !PyLong_Check(other)) {
+        Py_RETURN_NOTIMPLEMENTED;
+    }

with the names `self` and `other` varying from site to site.  Though the expanded version isn't literally a `return` statement, I think it functions almost as well as one for the purposes that explicitness serves here, and much better so than a reference to `CHECK_BINOP`.
msg350353 - (view) Author: Kyle Stanley (aeros167) * (Python triager) Date: 2019-08-24 06:14
> python-dev likely isn't the right place.  That is a forum for more mature ideas.

Agreed, that idea should be posted to python-list if they're having issues posting to python-ideas. Python-dev certainly wouldn't be appropriate for that.
History
Date User Action Args
2019-08-24 06:34:10aeros167settype: enhancement
2019-08-24 06:14:39aeros167setmessages: + msg350353
2019-08-24 06:05:00Greg Pricesetmessages: + msg350352
2019-08-24 05:56:48Greg Pricesetpull_requests: + pull_request15140
2019-08-12 03:11:01rhettingersetmessages: + msg349430
2019-08-12 02:51:14Ma Linsetmessages: + msg349429
2019-08-12 01:13:35rhettingersetmessages: + msg349425
2019-08-12 00:13:09Ma Linsetnosy: + Ma Lin
messages: + msg349420
2019-08-11 23:32:09Greg Pricesetmessages: + msg349418
2019-08-11 23:30:32Greg Pricesetpull_requests: + pull_request14944
2019-08-11 22:11:54rhettingersetmessages: + msg349414
2019-08-11 08:06:28Greg Pricesetmessages: + msg349379
2019-08-11 06:18:58rhettingersetmessages: + msg349377
2019-08-11 06:18:02rhettingersetassignee: rhettinger

messages: + msg349376
nosy: + rhettinger
2019-08-10 21:31:48Greg Pricesetnosy: + mark.dickinson, sir-sigurd, aeros167
messages: + msg349360
2019-08-10 21:22:58Greg Pricesetkeywords: + patch
stage: patch review
pull_requests: + pull_request14932
2019-08-10 21:07:30Greg Pricecreate