classification
Title: make function parameter sentinel value true singletons
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.smith, iritkatriel, rhettinger, serhiy.storchaka, taleinat, vstinner, zach.ware
Priority: normal Keywords:

Created on 2021-05-13 13:17 by taleinat, last changed 2021-05-15 19:23 by taleinat.

Messages (13)
msg393580 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-13 13:17
I recently noticed that some functions use sentinel values to differentiate between not having been passed any value vs. None. One example of this is traceback.print_exception(), hence I'm adding Irit to the nosy list.

However, using e.g. `sentinel = object()` or a single instance of a dedicated class/type can break when combined with pickling (see examples below).

Since these sentinel are usually compared using the `is` operator, having more than a single instance will break things, sometimes in very confusing ways.

I suggest ensuring that a single instance is always used, probably by using a class with a __new__() method making sure to always return a single instance.


>>> sentinel = object()
>>> sentinel2 = pickle.loads(pickle.dumps(sentinel))
>>> sentinel is sentinel2
False
>>> sentinel
<object object at 0x7fd00a986560>
>>> sentinel2
<object object at 0x7fd00a986500>


>>> class A:
...     pass
...     
>>> a = A()
>>> a2 = pickle.loads(pickle.dumps(a))
>>> a is a2
False
>>> a
<__main__.A object at 0x7fd00a9972f0>
>>> a2
<__main__.A object at 0x7fd009599450>
msg393581 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-13 13:52
Additional examples of such sentinel values in the stdlib are those in the dataclasses module, e.g. dataclasses.MISSING.
msg393582 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2021-05-13 13:59
FWIW, at work we've been using `...` as a handy not-None singleton.
msg393590 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-13 17:21
Here is what I suggest working with sentinels would look like:


>>> from dataclasses import MISSING
>>> MISSING
dataclasses.MISSING
>>> M2 = pickle.loads(pickle.dumps(MISSING))
>>> M2
dataclasses.MISSING
>>> M2 is MISSING
True


Here's an implementation which ensures a single instance is used, even considering multi-threading and pickling, which sets a nice repr according to the module and class name:


try:
    from threading import Lock
except ImportError:
    class Lock:
        def __enter__(self):
            pass
        def __exit__(self, exc_type, exc_value, traceback):
            pass


class Sentinel:
    _instance = None
    _lock = Lock()
    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance
    def __repr__(self):
        *path_parts, classname = self.__class__.__qualname__.split('.')
        return '.'.join([self.__class__.__module__, *path_parts, classname.removeprefix('_')])


class _MISSING(Sentinel):
    pass
MISSING = _MISSING()
msg393594 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-13 17:38
Alternatively, sentinels can simply be classes:

class Sentinel:
    def __new__(cls, *args, **kwargs):
        raise TypeError(f'{cls.__qualname__} cannot be instantiated')

class MISSING(Sentinel):
    pass
msg393606 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-13 21:09
... and they can be given excellent reprs by using a meta-class:

class Sentinel(type):
    @classmethod
    def __prepare__(cls, name, bases, **kwds):
        d = super().__prepare__(name, bases, **kwds)
        def __new__(cls_, *args, **kwargs):
            raise TypeError(
                f'{cls_!r} is a sentinel and cannot be instantiated')
        d.update(__new__=__new__)
        return d

    def __repr__(cls):
        return f'{cls.__module__}.{cls.__qualname__}'


class MISSING(metaclass=Sentinel): pass


This also has another nice benefit:

>>> type(MISSING)
<class 'sentinels.Sentinel'>
msg393627 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-05-14 06:59
This needs a good deal more discussion before sweeping through the code and change a long standing Python idiom that has stood the test of time.  Until now, no one has claimed that this is broken.  You "recently noticing this" doesn't mean that it is wrong.
msg393628 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-05-14 07:36
I do not understand the problem with pickling sentinel values used as default values for function parameters. Could you please show an example?
msg393632 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-14 08:37
> This needs a good deal more discussion before sweeping through the code and change a long standing Python idiom that has stood the test of time.

I agree this should be discussed, which is why I created this issue about it, as place for that discussion to take place. This is also being discussed on python-dev.

The examples I've mentioned are both rather new, and they have implemented sentinel values using three different methods, with those in dataclasses even being inconsistent among themselves. I fail to see how this can be considered "chang[ing] a long standing Python idiom that has stood the test of time".

> You "recently noticing this" doesn't mean that it is wrong.

I find this wording makes things unnecessarily personal and negative. Let's discuss things based on technical merit.
msg393634 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-14 08:56
Tal: "This is also being discussed on python-dev."

https://mail.python.org/archives/list/python-dev@python.org/thread/ZLVPD2OISI7M4POMTR2FCQTE6TPMPTO3/
msg393674 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-14 16:41
> I do not understand the problem with pickling sentinel values used as default values for function parameters. Could you please show an example?

A category of relevant uses I can think of is when wrapping a function and storing the parameters it was called with for some later use. Some examples of this are logging, caching and RPC.

Specifically for RPC, using pickle to serialize/deserialize the parameters and then call the function with them seems reasonable. RPyC[1] does this in some cases, though I haven't yet checked how it handles these kinds of sentinel objects specifically.

I'll try to find a good concrete example.

[1] https://rpyc.readthedocs.io/
msg393680 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-05-14 17:47
Parameters are not passed to a function. Arguments are passed. And what you need is just to not pass to the wrapped function arguments which were not passed to wrapper. *args and **kwargs do not contain arguments which were not passed.

I understand that it would be pleasant to have better some singletons and sentinels (with better repr, and sometimes pickleable), but I don't think that the problem with pickling default values of function parameters even exists.
msg393724 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-15 19:23
> And what you need is just to not pass to the wrapped function arguments which were not passed to wrapper. *args and **kwargs do not contain arguments which were not passed.

It's common for wrapper functions to mirror the defaults of the wrapped functions and pass those along... A quick search brings up an example from dataclasses where sentinel values are passed as arguments to a wrapped callable[1].

I concede that issues with pickling such values are very unlikely. I stumbled upon such an issue once and it was difficult to debug, but that was a very unusual circumstance.

Still, it's a bit surprising that we have public facing values in stdlib modules which don't behave as one would expect with regard to pickling. This is why I created this 

It seems that I should have marked this as "enhancement" rather than the default of "behavior", though, since I meant this to be an improvement on the current situation, rather then assert the current situation is "broken".

[1] https://github.com/python/cpython/blob/c10c2ec7a0e06975e8010c56c9c3270f8ea322ec/Lib/dataclasses.py#L346
History
Date User Action Args
2021-05-15 19:23:47taleinatsettype: behavior -> enhancement
messages: + msg393724
versions: - Python 3.10
2021-05-14 17:47:35serhiy.storchakasetmessages: + msg393680
2021-05-14 16:41:16taleinatsetmessages: + msg393674
2021-05-14 08:56:30vstinnersetnosy: + vstinner
messages: + msg393634
2021-05-14 08:37:26taleinatsetmessages: + msg393632
2021-05-14 07:36:30serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg393628
2021-05-14 06:59:36rhettingersetnosy: + rhettinger
messages: + msg393627
2021-05-13 21:09:32taleinatsetmessages: + msg393606
2021-05-13 17:38:36taleinatsetmessages: + msg393594
2021-05-13 17:21:50taleinatsetmessages: + msg393590
2021-05-13 14:24:18eric.smithsetnosy: + eric.smith
2021-05-13 13:59:31zach.waresetnosy: + zach.ware
messages: + msg393582
2021-05-13 13:52:54taleinatsetmessages: + msg393581
2021-05-13 13:17:06taleinatcreate