msg391039 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2021-04-14 07:34 |
The purpose of MappingProxyType is to provide a read-only proxy for mapping. It should not expose the underlying mapping because it would invalidate the purpose of read-only proxy. But there is a way to do this using comparison operator:
from types import MappingProxyType
orig = {1: 2}
proxy = MappingProxyType(orig)
class X:
def __eq__(self, other):
other[1] = 3
assert proxy[1] == 2
proxy == X()
assert proxy[1] == 3
assert orig[1] == 3
In particularly it allows to modify __dict__ of builtin types.
|
msg397244 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2021-07-10 10:06 |
Example of modifying a builtin type:
>>> class Sneaky:
... def __eq__(self, other):
... other['real'] = 42
...
>>> int.__dict__ == Sneaky()
>>> (1).real
42
But it can also lead to crash (due to outdated type cache):
>>> class Sneaky:
... def __eq__(self, other):
... other['bit_length'] = 42
...
>>> int.__dict__ == Sneaky()
>>> (1).bit_length
Segmentation fault (core dumped)
|
msg397245 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2021-07-10 10:27 |
I stumbled across this independently in bpo-44596, but missed this existing report due to the specific word I was looking for (encapsulation).
In addition to the comparison operand coercion, there's now another access option through the `__ror__` method.
The bug in both cases arises from delegating a method/function implementation to the underlying mapping type in a way that invokes the full operand coercion dance. (PyObject_RichCompare in one case, PyNumber_Or in the other)
Delegating these operations to the underlying mapping does make sense, but it needs to be a lower level delegation that bypasses the operand coercion dance, so the other operand only ever sees the proxy object, not the underlying mapping.
|
msg398023 - (view) |
Author: Brandt Bucher (brandtbucher) * |
Date: 2021-07-23 01:09 |
I believe that delegating to the actual underlying mapping without exposing it is a bit of a lost cause, since for some type m we still need these to work:
>>> types.MappingProxyType(m({"a": 0)) | types.MappingProxyType(m({"b": 1}))
m({'a': 0, 'b': 1})
>>> types.MappingProxyType(m({"a": 0)) == types.MappingProxyType(m({"a": 0}))
True
(Note that because both sides are proxies, it's impossible for any resolution to happen without m explicitly knowing how to handle them unless both mappings are unwrapped simultaneously.)
Instead, the attached PR delegates to a *copy* of the underlying mapping for these operations instead. I think this is the easiest way to approximate the current behavior while maintaining proper encapsulation.
|
msg398025 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2021-07-23 02:02 |
Maybe we should not fix this then? Copying seems unattractive, and the
reason we proxy in the first place is not absolute safety but to make sure
people don’t accidentally update the dict when they intend to update the
class (for the side effect of updating slots when e.g. __add__ is
added/removed.--
--Guido (mobile)
|
msg398033 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2021-07-23 07:13 |
For __or__ we need to copy the content of both mapping to the resulting mapping in any case. We can implement it as {**self, **other}. We should not use the copy() method because it is not a part of the Mapping interface.
For __eq__, no copying is needed if we just re-implement Mapping.__eq__ (with special cases for few known types for performance).
|
msg398045 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2021-07-23 10:59 |
Delegating operations to the underlying mapping is still OK, all we're wanting to bypass is the operand coercion dances in abstract.c and object.c that may expose the underlying mapping to the *other* operand.
We don't want to implicitly copy though, as the performance hit can be non-trivial for large mappings, even if the algorithmic complexity doesn't change.
For the `nb_or` slot, it should be correct to retrieve and call `__or__` from the underlying mapping when the left operand is a MappingProxy instance, and `__ror__` when the right operand is a MappingProxy instance (but the left operand is not).
Rich comparison is messier though, since PyObject_RichCompare handles both the operand coercion dance *and* the mapping of the specific comparison operations to their implementation slots.
I don't see a clean solution to that other than defining a new PyObject_DelegateRichCompare helper function for the mapping proxy to use that provides the mapping from specific operands to their implementation slots *without* doing the coercion dance.
However we resolve it, I think it probably won't be worth the risk of backporting the change (this has been the way it currently is for a long time, and it's a "reduce the risk of error" feature rather than any kind of security boundary).
|
msg398072 - (view) |
Author: Brandt Bucher (brandtbucher) * |
Date: 2021-07-23 15:54 |
> Delegating operations to the underlying mapping is still OK, all we're wanting to bypass is the operand coercion dances in abstract.c and object.c that may expose the underlying mapping to the *other* operand.
But this won't work if *both* operands are proxies, right? The left wrapped mapping will only ever see itself and the right mappingproxy, and the right operand will only ever see itself and the left mappingproxy. The overwhelmingly common case is where these proxies wrap dicts, and dicts only play nicely with other dicts.
I agree that creating redundant copies isn't optimal; it's just the only way that I see we'd be able to fix this without backward compatibility headaches. The fix is easy to explain to users without needing to get into the nuances of operator dispatch or bloating the code to handle weird edge-cases (trust me, I originally tried coding logic to handle all the different possibilities of subclasses and proxies and such before landing on the copying solution).
With that said, I think that if we decide it's really not worth it to copy here, Serhiy's proposal is probably "good enough". Just return a merged dict for the union operation, and implement mapping equality in a sane, usable way. I just worry that it will break backward-compatibility in subtle ways that are basically impossible to fix (comparing equality of proxied OrderedDicts, for example).
|
msg398700 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2021-08-01 14:33 |
Ouch, you're right - I forgot that dict just returned NotImplemented from comparisons and unions and made it the other operand's problem:
>>> d1 = dict(x=1)
>>> d2 = dict(x=1)
>>> from types import MappingProxyType as proxy
>>> d1.__eq__(proxy(d2))
NotImplemented
>>> d1.__or__(proxy(d2))
NotImplemented
Perhaps we could mitigate the performance hit by skipping the copy when the other side is either a builtin dict (subclasses don't count), or a proxy around a builtin dict?
|
msg399022 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2021-08-05 18:10 |
[Serhiy]
> In particularly it allows to modify __dict__ of builtin types.
This doesn't look like something that would happen accidentally. We've never had any bug reports about this occurring in the wild.
[GvR]
> Maybe we should not fix this then?
That seems reasonable. In general, it is hard to completely wall-off instances against deliberate efforts to pry them open:
>>> from types import MappingProxyType
>>> import gc
>>> orig = {1: 2}
>>> proxy = MappingProxyType(orig)
>>> refs = gc.get_referents(proxy)
>>> refs
[{1: 2}]
>>> refs[0][1] = 3
>>> orig
{1: 3}
|
msg399057 - (view) |
Author: Brandt Bucher (brandtbucher) * |
Date: 2021-08-06 05:42 |
I’m okay with closing as “won’t fix”.
|
msg399059 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2021-08-06 06:46 |
Good catch! Thank you Raymond for the third example.
I though that it can be easily fixed by calling tp_traverse() for submapping:
static int
mappingproxy_traverse(PyObject *self, visitproc visit, void *arg)
{
mappingproxyobject *pp = (mappingproxyobject *)self;
- Py_VISIT(pp->mapping);
- return 0;
+ return Py_TYPE(pp->mapping)->tp_traverse(pp->mapping, visit, arg);
}
but it does not work.
So I am okay with closing this issue.
|
msg399171 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2021-08-07 08:52 |
That resolution makes sense to me as well.
Should we make a note in the documentation for https://docs.python.org/3/library/types.html#types.MappingProxyType that the "read-only" protection is to guard against *accidental* modification, not against active attempts to subvert the protection?
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:44 | admin | set | github: 88004 |
2021-08-07 08:52:38 | ncoghlan | set | messages:
+ msg399171 |
2021-08-07 07:55:34 | gvanrossum | set | status: open -> closed resolution: rejected stage: resolved |
2021-08-06 06:46:25 | serhiy.storchaka | set | messages:
+ msg399059 |
2021-08-06 05:42:57 | brandtbucher | set | messages:
+ msg399057 |
2021-08-05 18:10:07 | rhettinger | set | messages:
+ msg399022 |
2021-08-01 14:33:00 | ncoghlan | set | messages:
+ msg398700 |
2021-07-23 17:51:57 | domdfcoding | set | nosy:
+ domdfcoding
|
2021-07-23 15:54:23 | brandtbucher | set | messages:
+ msg398072 |
2021-07-23 10:59:16 | ncoghlan | set | messages:
+ msg398045 |
2021-07-23 07:13:58 | serhiy.storchaka | set | messages:
+ msg398033 |
2021-07-23 02:02:36 | gvanrossum | set | messages:
+ msg398025 |
2021-07-23 01:09:36 | brandtbucher | set | messages:
+ msg398023 stage: patch review -> (no value) |
2021-07-23 01:08:55 | brandtbucher | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request25843 |
2021-07-11 16:15:52 | brandtbucher | set | nosy:
+ brandtbucher
|
2021-07-10 14:51:15 | gvanrossum | set | nosy:
+ gvanrossum
|
2021-07-10 10:27:23 | ncoghlan | set | nosy:
+ ncoghlan messages:
+ msg397245
|
2021-07-10 10:07:00 | serhiy.storchaka | set | messages:
+ msg397244 |
2021-07-10 10:00:37 | serhiy.storchaka | link | issue44596 superseder |
2021-04-14 07:34:28 | serhiy.storchaka | create | |