classification
Title: There is a way to access an underlying mapping in MappingProxyType
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: brandtbucher, domdfcoding, gvanrossum, ncoghlan, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2021-04-14 07:34 by serhiy.storchaka, last changed 2021-08-07 08:52 by ncoghlan. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 27300 closed brandtbucher, 2021-07-23 01:08
Messages (13)
msg391039 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-04-14 07:34
The purpose of MappingProxyType is to provide a read-only proxy for mapping. It should not expose the underlying mapping because it would invalidate the purpose of read-only proxy. But there is a way to do this using comparison operator:

from types import MappingProxyType

orig = {1: 2}
proxy = MappingProxyType(orig)

class X:
    def __eq__(self, other):
        other[1] = 3

assert proxy[1] == 2
proxy == X()
assert proxy[1] == 3
assert orig[1] == 3

In particularly it allows to modify __dict__ of builtin types.
msg397244 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-07-10 10:06
Example of modifying a builtin type:

>>> class Sneaky:
...     def __eq__(self, other):
...         other['real'] = 42
... 
>>> int.__dict__ == Sneaky()
>>> (1).real
42

But it can also lead to crash (due to outdated type cache):

>>> class Sneaky:
...     def __eq__(self, other):
...         other['bit_length'] = 42
... 
>>> int.__dict__ == Sneaky()
>>> (1).bit_length
Segmentation fault (core dumped)
msg397245 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2021-07-10 10:27
I stumbled across this independently in bpo-44596, but missed this existing report due to the specific word I was looking for (encapsulation).

In addition to the comparison operand coercion, there's now another access option through the `__ror__` method.

The bug in both cases arises from delegating a method/function implementation to the underlying mapping type in a way that invokes the full operand coercion dance. (PyObject_RichCompare in one case, PyNumber_Or in the other)

Delegating these operations to the underlying mapping does make sense, but it needs to be a lower level delegation that bypasses the operand coercion dance, so the other operand only ever sees the proxy object, not the underlying mapping.
msg398023 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-07-23 01:09
I believe that delegating to the actual underlying mapping without exposing it is a bit of a lost cause, since for some type m we still need these to work:

>>> types.MappingProxyType(m({"a": 0)) | types.MappingProxyType(m({"b": 1}))
m({'a': 0, 'b': 1}) 
>>> types.MappingProxyType(m({"a": 0)) == types.MappingProxyType(m({"a": 0}))
True

(Note that because both sides are proxies, it's impossible for any resolution to happen without m explicitly knowing how to handle them unless both mappings are unwrapped simultaneously.)

Instead, the attached PR delegates to a *copy* of the underlying mapping for these operations instead. I think this is the easiest way to approximate the current behavior while maintaining proper encapsulation.
msg398025 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-07-23 02:02
Maybe we should not fix this then? Copying seems unattractive, and the
reason we proxy in the first place is not absolute safety but to make sure
people don’t accidentally update the dict when they intend to update the
class (for the side effect of updating slots when e.g. __add__ is
added/removed.--
--Guido (mobile)
msg398033 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-07-23 07:13
For __or__ we need to copy the content of both mapping to the resulting mapping in any case. We can implement it as {**self, **other}. We should not use the copy() method because it is not a part of the Mapping interface.

For __eq__, no copying is needed if we just re-implement Mapping.__eq__ (with special cases for few known types for performance).
msg398045 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2021-07-23 10:59
Delegating operations to the underlying mapping is still OK, all we're wanting to bypass is the operand coercion dances in abstract.c and object.c that may expose the underlying mapping to the *other* operand.

We don't want to implicitly copy though, as the performance hit can be non-trivial for large mappings, even if the algorithmic complexity doesn't change.

For the `nb_or` slot, it should be correct to retrieve and call `__or__` from the underlying mapping when the left operand is a MappingProxy instance, and `__ror__` when the right operand is a MappingProxy instance (but the left operand is not).

Rich comparison is messier though, since PyObject_RichCompare handles both the operand coercion dance *and* the mapping of the specific comparison operations to their implementation slots.

I don't see a clean solution to that other than defining a new PyObject_DelegateRichCompare helper function for the mapping proxy to use that provides the mapping from specific operands to their implementation slots *without* doing the coercion dance.

However we resolve it, I think it probably won't be worth the risk of backporting the change (this has been the way it currently is for a long time, and it's a "reduce the risk of error" feature rather than any kind of security boundary).
msg398072 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-07-23 15:54
> Delegating operations to the underlying mapping is still OK, all we're wanting to bypass is the operand coercion dances in abstract.c and object.c that may expose the underlying mapping to the *other* operand.

But this won't work if *both* operands are proxies, right? The left wrapped mapping will only ever see itself and the right mappingproxy, and the right operand will only ever see itself and the left mappingproxy. The overwhelmingly common case is where these proxies wrap dicts, and dicts only play nicely with other dicts.

I agree that creating redundant copies isn't optimal; it's just the only way that I see we'd be able to fix this without backward compatibility headaches. The fix is easy to explain to users without needing to get into the nuances of operator dispatch or bloating the code to handle weird edge-cases (trust me, I originally tried coding logic to handle all the different possibilities of subclasses and proxies and such before landing on the copying solution).

With that said, I think that if we decide it's really not worth it to copy here, Serhiy's proposal is probably "good enough". Just return a merged dict for the union operation, and implement mapping equality in a sane, usable way. I just worry that it will break backward-compatibility in subtle ways that are basically impossible to fix (comparing equality of proxied OrderedDicts, for example).
msg398700 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2021-08-01 14:33
Ouch, you're right - I forgot that dict just returned NotImplemented from comparisons and unions and made it the other operand's problem:

>>> d1 = dict(x=1)
>>> d2 = dict(x=1)
>>> from types import MappingProxyType as proxy
>>> d1.__eq__(proxy(d2))
NotImplemented
>>> d1.__or__(proxy(d2))
NotImplemented

Perhaps we could mitigate the performance hit by skipping the copy when the other side is either a builtin dict (subclasses don't count), or a proxy around a builtin dict?
msg399022 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-08-05 18:10
[Serhiy]
> In particularly it allows to modify __dict__ of builtin types.

This doesn't look like something that would happen accidentally.  We've never had any bug reports about this occurring in the wild.


[GvR]
> Maybe we should not fix this then?

That seems reasonable.  In general, it is hard to completely wall-off instances against deliberate efforts to pry them open:

    >>> from types import MappingProxyType
    >>> import gc
    >>> orig = {1: 2}
    >>> proxy = MappingProxyType(orig)
    >>> refs = gc.get_referents(proxy)
    >>> refs
    [{1: 2}]
    >>> refs[0][1] = 3
    >>> orig
    {1: 3}
msg399057 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-08-06 05:42
I’m okay with closing as “won’t fix”.
msg399059 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-08-06 06:46
Good catch! Thank you Raymond for the third example.

I though that it can be easily fixed by calling tp_traverse() for submapping:

 static int
 mappingproxy_traverse(PyObject *self, visitproc visit, void *arg)
 {
     mappingproxyobject *pp = (mappingproxyobject *)self;
-    Py_VISIT(pp->mapping);
-    return 0;
+    return Py_TYPE(pp->mapping)->tp_traverse(pp->mapping, visit, arg);
 }

but it does not work.

So I am okay with closing this issue.
msg399171 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2021-08-07 08:52
That resolution makes sense to me as well.

Should we make a note in the documentation for https://docs.python.org/3/library/types.html#types.MappingProxyType that the "read-only" protection is to guard against *accidental* modification, not against active attempts to subvert the protection?
History
Date User Action Args
2021-08-07 08:52:38ncoghlansetmessages: + msg399171
2021-08-07 07:55:34gvanrossumsetstatus: open -> closed
resolution: rejected
stage: resolved
2021-08-06 06:46:25serhiy.storchakasetmessages: + msg399059
2021-08-06 05:42:57brandtbuchersetmessages: + msg399057
2021-08-05 18:10:07rhettingersetmessages: + msg399022
2021-08-01 14:33:00ncoghlansetmessages: + msg398700
2021-07-23 17:51:57domdfcodingsetnosy: + domdfcoding
2021-07-23 15:54:23brandtbuchersetmessages: + msg398072
2021-07-23 10:59:16ncoghlansetmessages: + msg398045
2021-07-23 07:13:58serhiy.storchakasetmessages: + msg398033
2021-07-23 02:02:36gvanrossumsetmessages: + msg398025
2021-07-23 01:09:36brandtbuchersetmessages: + msg398023
stage: patch review -> (no value)
2021-07-23 01:08:55brandtbuchersetkeywords: + patch
stage: patch review
pull_requests: + pull_request25843
2021-07-11 16:15:52brandtbuchersetnosy: + brandtbucher
2021-07-10 14:51:15gvanrossumsetnosy: + gvanrossum
2021-07-10 10:27:23ncoghlansetnosy: + ncoghlan
messages: + msg397245
2021-07-10 10:07:00serhiy.storchakasetmessages: + msg397244
2021-07-10 10:00:37serhiy.storchakalinkissue44596 superseder
2021-04-14 07:34:28serhiy.storchakacreate