Issue44123
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2021-05-13 13:17 by taleinat, last changed 2022-04-11 14:59 by admin.
Messages (13) | |||
---|---|---|---|
msg393580 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-13 13:17 | |
I recently noticed that some functions use sentinel values to differentiate between not having been passed any value vs. None. One example of this is traceback.print_exception(), hence I'm adding Irit to the nosy list. However, using e.g. `sentinel = object()` or a single instance of a dedicated class/type can break when combined with pickling (see examples below). Since these sentinel are usually compared using the `is` operator, having more than a single instance will break things, sometimes in very confusing ways. I suggest ensuring that a single instance is always used, probably by using a class with a __new__() method making sure to always return a single instance. >>> sentinel = object() >>> sentinel2 = pickle.loads(pickle.dumps(sentinel)) >>> sentinel is sentinel2 False >>> sentinel <object object at 0x7fd00a986560> >>> sentinel2 <object object at 0x7fd00a986500> >>> class A: ... pass ... >>> a = A() >>> a2 = pickle.loads(pickle.dumps(a)) >>> a is a2 False >>> a <__main__.A object at 0x7fd00a9972f0> >>> a2 <__main__.A object at 0x7fd009599450> |
|||
msg393581 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-13 13:52 | |
Additional examples of such sentinel values in the stdlib are those in the dataclasses module, e.g. dataclasses.MISSING. |
|||
msg393582 - (view) | Author: Zachary Ware (zach.ware) * | Date: 2021-05-13 13:59 | |
FWIW, at work we've been using `...` as a handy not-None singleton. |
|||
msg393590 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-13 17:21 | |
Here is what I suggest working with sentinels would look like: >>> from dataclasses import MISSING >>> MISSING dataclasses.MISSING >>> M2 = pickle.loads(pickle.dumps(MISSING)) >>> M2 dataclasses.MISSING >>> M2 is MISSING True Here's an implementation which ensures a single instance is used, even considering multi-threading and pickling, which sets a nice repr according to the module and class name: try: from threading import Lock except ImportError: class Lock: def __enter__(self): pass def __exit__(self, exc_type, exc_value, traceback): pass class Sentinel: _instance = None _lock = Lock() def __new__(cls): if cls._instance is None: with cls._lock: if cls._instance is None: cls._instance = super().__new__(cls) return cls._instance def __repr__(self): *path_parts, classname = self.__class__.__qualname__.split('.') return '.'.join([self.__class__.__module__, *path_parts, classname.removeprefix('_')]) class _MISSING(Sentinel): pass MISSING = _MISSING() |
|||
msg393594 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-13 17:38 | |
Alternatively, sentinels can simply be classes: class Sentinel: def __new__(cls, *args, **kwargs): raise TypeError(f'{cls.__qualname__} cannot be instantiated') class MISSING(Sentinel): pass |
|||
msg393606 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-13 21:09 | |
... and they can be given excellent reprs by using a meta-class: class Sentinel(type): @classmethod def __prepare__(cls, name, bases, **kwds): d = super().__prepare__(name, bases, **kwds) def __new__(cls_, *args, **kwargs): raise TypeError( f'{cls_!r} is a sentinel and cannot be instantiated') d.update(__new__=__new__) return d def __repr__(cls): return f'{cls.__module__}.{cls.__qualname__}' class MISSING(metaclass=Sentinel): pass This also has another nice benefit: >>> type(MISSING) <class 'sentinels.Sentinel'> |
|||
msg393627 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2021-05-14 06:59 | |
This needs a good deal more discussion before sweeping through the code and change a long standing Python idiom that has stood the test of time. Until now, no one has claimed that this is broken. You "recently noticing this" doesn't mean that it is wrong. |
|||
msg393628 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2021-05-14 07:36 | |
I do not understand the problem with pickling sentinel values used as default values for function parameters. Could you please show an example? |
|||
msg393632 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-14 08:37 | |
> This needs a good deal more discussion before sweeping through the code and change a long standing Python idiom that has stood the test of time. I agree this should be discussed, which is why I created this issue about it, as place for that discussion to take place. This is also being discussed on python-dev. The examples I've mentioned are both rather new, and they have implemented sentinel values using three different methods, with those in dataclasses even being inconsistent among themselves. I fail to see how this can be considered "chang[ing] a long standing Python idiom that has stood the test of time". > You "recently noticing this" doesn't mean that it is wrong. I find this wording makes things unnecessarily personal and negative. Let's discuss things based on technical merit. |
|||
msg393634 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-05-14 08:56 | |
Tal: "This is also being discussed on python-dev." https://mail.python.org/archives/list/python-dev@python.org/thread/ZLVPD2OISI7M4POMTR2FCQTE6TPMPTO3/ |
|||
msg393674 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-14 16:41 | |
> I do not understand the problem with pickling sentinel values used as default values for function parameters. Could you please show an example? A category of relevant uses I can think of is when wrapping a function and storing the parameters it was called with for some later use. Some examples of this are logging, caching and RPC. Specifically for RPC, using pickle to serialize/deserialize the parameters and then call the function with them seems reasonable. RPyC[1] does this in some cases, though I haven't yet checked how it handles these kinds of sentinel objects specifically. I'll try to find a good concrete example. [1] https://rpyc.readthedocs.io/ |
|||
msg393680 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2021-05-14 17:47 | |
Parameters are not passed to a function. Arguments are passed. And what you need is just to not pass to the wrapped function arguments which were not passed to wrapper. *args and **kwargs do not contain arguments which were not passed. I understand that it would be pleasant to have better some singletons and sentinels (with better repr, and sometimes pickleable), but I don't think that the problem with pickling default values of function parameters even exists. |
|||
msg393724 - (view) | Author: Tal Einat (taleinat) * | Date: 2021-05-15 19:23 | |
> And what you need is just to not pass to the wrapped function arguments which were not passed to wrapper. *args and **kwargs do not contain arguments which were not passed. It's common for wrapper functions to mirror the defaults of the wrapped functions and pass those along... A quick search brings up an example from dataclasses where sentinel values are passed as arguments to a wrapped callable[1]. I concede that issues with pickling such values are very unlikely. I stumbled upon such an issue once and it was difficult to debug, but that was a very unusual circumstance. Still, it's a bit surprising that we have public facing values in stdlib modules which don't behave as one would expect with regard to pickling. This is why I created this It seems that I should have marked this as "enhancement" rather than the default of "behavior", though, since I meant this to be an improvement on the current situation, rather then assert the current situation is "broken". [1] https://github.com/python/cpython/blob/c10c2ec7a0e06975e8010c56c9c3270f8ea322ec/Lib/dataclasses.py#L346 |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:45 | admin | set | github: 88289 |
2021-05-15 19:23:47 | taleinat | set | type: behavior -> enhancement messages: + msg393724 versions: - Python 3.10 |
2021-05-14 17:47:35 | serhiy.storchaka | set | messages: + msg393680 |
2021-05-14 16:41:16 | taleinat | set | messages: + msg393674 |
2021-05-14 08:56:30 | vstinner | set | nosy:
+ vstinner messages: + msg393634 |
2021-05-14 08:37:26 | taleinat | set | messages: + msg393632 |
2021-05-14 07:36:30 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg393628 |
2021-05-14 06:59:36 | rhettinger | set | nosy:
+ rhettinger messages: + msg393627 |
2021-05-13 21:09:32 | taleinat | set | messages: + msg393606 |
2021-05-13 17:38:36 | taleinat | set | messages: + msg393594 |
2021-05-13 17:21:50 | taleinat | set | messages: + msg393590 |
2021-05-13 14:24:18 | eric.smith | set | nosy:
+ eric.smith |
2021-05-13 13:59:31 | zach.ware | set | nosy:
+ zach.ware messages: + msg393582 |
2021-05-13 13:52:54 | taleinat | set | messages: + msg393581 |
2021-05-13 13:17:06 | taleinat | create |