classification
Title: Inconsistent behavior if globals is a dict subclass
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.9
process
Status: closed Resolution: rejected
Dependencies: Superseder: Allow mappings as globals (was: Fix dictionary subclass ...)
View: 1402289
Assigned To: Nosy List: brett.cannon, methane, pablogsal, pfalcon, ppperry, terry.reedy, vstinner
Priority: normal Keywords: 3.6regression

Created on 2018-01-21 23:46 by ppperry, last changed 2020-01-17 20:09 by brett.cannon. This issue is now closed.

Messages (32)
msg310388 - (view) Author: (ppperry) Date: 2018-01-21 23:46
Take the following code:

    import builtins
    class K(dict):
	def __getitem__(self, k):
		if k not in builtins.__dict__:
			print("get %s" % k)
		return dict.__getitem__(self, k)
	def __setitem__(self, k, v):
		print("set %s" % k)
		dict.__setitem__(self, k, v)
    exec("""
        foo = "bar"
        foo
        try:
            qyz
        except NameError:
	    pass
        class K:
	    baz = foo
	    def f(ggg=foo): pass
        def g(ggg=foo):
	    global f
	    f = 87
            f
        g()
    """,K())

This should consitently either call or not call the overridden methods on the dictionary, producing either no output or:

    set foo
    get foo
    get qyz
    get foo
    get foo
    set K
    get foo
    set g
    get g
    set f
    get f
    
Instead, only sometime the override gets called, producing

    set foo
    get foo
    get qyz
    set K
    get foo
    set g
    get g
    get f

meaning that 
    (a) modifications of global variables via global statements
    (b) code at class-level

ignore overridden methods, whereas everything else follows them
msg310810 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-01-26 22:20
I did take the code, and it would not run because of the indent.  With that fixed, it still did not run because the the indent in the quoted code fed to exec.  This does run:

import builtins
class K(dict):
    def __getitem__(self, k):
            if k not in builtins.__dict__:
                    print("get %s" % k)
            return dict.__getitem__(self, k)
    def __setitem__(self, k, v):
            print("set %s" % k)
            dict.__setitem__(self, k, v)
exec("""
foo = "bar"
foo
try:
    qyz
except NameError:
    pass
class K:
    baz = foo
    def f(ggg=foo): pass
def g(ggg=foo):
    global f
    f = 87
    f
g()
""", K())

When I run in IDLE, which execs the entire code, so none is run at module level, I get the same result.  The same is true running directly in Python after wrapping the entire code with

exec('''
# and
''')

In this case, nothing other than the outer exec is run at module level.  So I am not sure there is a bug.  If you don't get a more definitive response here, I suggest posting to python-list for opinions on whether bug or feature.
msg310819 - (view) Author: (ppperry) Date: 2018-01-26 23:06
Uh, I'm not undertsanding the relevance of whether the code is run at module-level or not. It can't possibly be a feature that * some code * uses the overrides and *other code* doesn't.
msg310822 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-01-26 23:24
I was referring back to your original a) and b) items.

Perhaps 'accidental' versus 'intentional, for whatever reason', would be the better contrast.

The weekly new issue report comes out on Fridays around noon eastern US.  I hope someone else clicks on this one.

I suggested the possibility of python-list because there would usually be people 'competing' to explain the puzzling behavior and check for obscure clauses on the reference.  In any case, some analysis would have to precede action.
msg359066 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2019-12-30 22:24
Some smart maintainer closed https://bugs.python.org/issue36220 as a duplicate of this one. That ticket might have more details of the underlying issues.
msg359067 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-12-30 22:40
Victor, Pablo, I added you two since you commented on the duplicate.

I believe this should be closed as 'not a bug'.  The doc for globals() says is returns a 'dictionary', meaning an instance of dict.  The doc for exec says globals "must be a dictionary (and not a subclass of dictionary)".  (This is mere implied when locals is also given, but it *is* implied.).  The behavior when globals is otherwise is undefined.
msg359068 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2019-12-30 23:00
> The doc for exec says globals "must be a dictionary (and not a subclass of dictionary)"

Docs are full of mistakes and outdated information.

Fixing STORE_GLOBAL case from https://bugs.python.org/issue36220#msg359046 would be trivial and cheap re: overheads. And useful, yeah.

And the gentleman who submitted the other ticket argues that:

> but the inconsistency between LOAD_NAME and LOAD_GLOBAL definitely seems wrong.

Which is easy to agree with, though there're more concerns re: runtime overhead.
msg359069 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-30 23:17
> The doc for exec says globals "must be a dictionary (and not a subclass of dictionary)"

I agree with Terry, the moment you pass a dict subclass to exec you are out of contract. If any, we may need to sanitize the input to exec, although I don't think is worth paying the performance price for that.

> Some smart maintainer closed https://bugs.python.org/issue36220 as a duplicate of this one.

I closed issue36220 as a duplicate. Paul, I don't know if you are being sarcastic here so I will assume that you are not but in case you are, I suggest you stop as this violates our CoC.

> Docs are full of mistakes and outdated information.

They are also full of contracts and as of today, what the docs document is what is considered the public API. 

In any case, my opinion is that LOAD_NAME and LOAD_GLOBAL should not pay the performance price to handle less-specific APIs just to support a very uncommon case (using dict subclasses as globals) that are also out of contract. For STORE_GLOBAL I feel less strongly about that, but I am still -1 on it.
msg359073 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2019-12-31 00:16
> I agree with Terry, the moment you pass a dict subclass to exec you are out of contract. If any, we may need to sanitize the input to exec, although I don't think is worth paying the performance price for that.

exec() params are already checked, as a seconds param, only dict or dict subclasses are accepted. Seems like good enough contract.

> Paul, I don't know if you are being sarcastic here so I will assume that you are not but in case you are, I suggest you stop as this violates our CoC.

I am not, and please stop your accusations, that violates CoC.
msg359075 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-31 00:59
> exec() params are already checked, as a seconds param, only dict or dict subclasses are accepted. Seems like good enough contract.

As I said, the documentation is considered the public API contract and again, the documentation says that dict subclasses are not accepted:

>> it must be a dictionary (and not a subclass of dictionary)

My opinion is that as Terry said, this is a violation of the contract and supporting dict subclasses is a case unusual enough to not sacrifice performance for the general case. 

> I am not, and please stop your accusations, that violates CoC.

A respectful indication to keep the discussion civilized does not violate the CoC. If you are not being sarcastic, then everything is fine.
msg359086 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-12-31 01:46
Guido, some people would like manipulation and use of globals to work with dict subclasses as well as dicts themselves.  I believe that the current restriction/optimization is at least partly your work.  Do you have any quick opinion, or do you think discussion should go to python-ideas?
msg359089 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-12-31 01:48
There are more examples on duplicate #36220.
msg359091 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-12-31 02:17
This is probably an issue for the new Steering Council.
msg360078 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-15 21:45
I modified ceval.c to accept dict subclasses when I was working on PEP 416:
https://www.python.org/dev/peps/pep-0416/

This PEP motivation was to implement a sandbox. The PEP was rejected. Later, I closed my pysandbox beause it was "broken by design":
https://lwn.net/Articles/574215/

Paul Sokolovsky wrote in bpo-36220 than his idea is also to implement a sandbox:
https://bugs.python.org/issue36220#msg359046

IMHO we should reject dict subclasses to make Python (especially ceval.c) more efficient. The global type must be checked at every place where it's possible to specify or override globals. Like the exec() function.
msg360139 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-16 19:55
> Later, I closed my pysandbox beause it was "broken by design":
https://lwn.net/Articles/574215/

Thanks for the link, insightful. Still unclear, by design of what it's broken ;-).


> Paul Sokolovsky wrote in bpo-36220 than his idea is also to implement a sandbox

Depends on the definition of "sandbox". If it's "But I was contacted by
different companies interested to use pysandbox in production on their
web application.  So I think that there is a real need to execute
arbitrary untrusted code." (re: https://lwn.net/Articles/574323/). Then no, I definitely not developing such a "sandbox".

If by "sandbox" you mean "ability to override some namespacing rules used by Python", then yep, I, and other folks (according to these >1 issue reports) interested in that. It's unclear why demise of your particular project should affect all other possible projects interested in dealing with namespacing matters. I for example interested in study of some properties of corpus of Python code. I can understand your claim that you "own" the partial features you introduced, but claiming them to be useful for only our own usecase is somewhat ..., well, short-sighted, just as claiming that all such uses should cease and desist is as insightful as claiming that, for example, list comprehensions are useless because you faced a case when one needs to be rewritten as explicit loop.

> IMHO we should reject dict subclasses to make Python (especially ceval.c) more efficient.

I'm afraid efficiency story for CPython is irrecoverably lost, unless someone secretly works on adding JIT to it, but as we know, JIT work happily happens outside of it, in multiples. CPython still can either serve as a common platform for prototyping and experimentation, or complicate that by a committee fiat. Because the current situation is that CPython is too dynamic to be "slow", but not enough dynamic to perform useful instrumentation easily. For example, to comfortably do study on the proportions of actual usage of various "overdynamic" features.
msg360140 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-16 19:57
s/only our own usecase/only your own usecase/ (missing "y" typo)
msg360141 - (view) Author: (ppperry) Date: 2020-01-16 20:25
I was not attempting to run untrusted Python code when I filed this bug report.
msg360143 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-16 20:51
Ok, so the patch for my usecase (STORE_GLOBAL) is vividly trivial, so to go thru the full circle, I posted it: https://github.com/python/cpython/pull/18033 .
msg360144 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-01-16 20:53
> IMHO we should reject dict subclasses to make Python (especially ceval.c) more efficient. The global type must be checked at every place where it's possible to specify or override globals. Like the exec() function.

I agree with Victor in that we should reject dict subclasses everywhere.
msg360146 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-16 21:38
> Depends on the definition of "sandbox".

In bpo-36220, it clearly means "execute untrusted code": hide access to the filesystem for example.


> I for example interested in study of some properties of corpus of Python code.

This issue is a feature request: you ask to modify Python so you can pass dict subclasses as namespaces and expect CPython to respect the mapping protocol (__getitem__/__setitem__, rather than accessing directly dict internals (well, don't call overriden __getitem__/__setitem__ methods).

Currently, we are in a gray area: it's supported in some places, but not everywhere.

Implementing such features which require to introduce new code, which means increasing the maintenance burden, and it will slow down Python, since every access to a namespace would require to check if the namespace is a dict or a dict subclass.

Paul: you're are in front of 3 core developers who are rejecting your feature request. It doesn't mean that it is not valid. It means that we don't consider your use case as important enough to justify to slow down Python for everything and to increase the maintenance burden.

Namespace performances are really critical for overall Python performances.

My idea is only to reject the feature in CPython implementation, not the in Python language specification. It means that other Python implementation would be free to decide to implement it.

You're free to fork Python and modify the C code to implement your use case, to study Python internals.
msg360147 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-16 21:44
More detail about the implementation. Currently, namespaces are accessed directly with C functions which expect to get an exact dict type: these functions are designed for performance, not to handle dict subclasses, on purpose.

Supporting dict subclasses first require to check if the namespace type is dict: that's the first performance overhead. Then new code should be added to handle subclasses: that's the additional maintenance burden. This code path should be tested, maintained, reviewed, etc.

The risk is also to forget to support subclasses in one function which would break the feature. Same problem is new code is added which doesn't support dict subclasses.

This feature is far from being free to implement. It's very expensive on multiple aspects.
msg360149 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-16 22:18
exec() function is currently quite clear on which types are accepted or not: exact dict for globals, mapping for locals.

*  "If only globals is provided, it must be a dictionary (and not a subclass of dictionary), which will be used for both the global and the local variables."
* "If provided, locals can be any mapping object."

Moreover, the requirement on the exact dict type allowed to implement an optimization on globals in Python 3.8:

"LOAD_GLOBAL instruction now uses new “per opcode cache” mechanism. It is about 40% faster now. (Contributed by Yury Selivanov and Inada Naoki in bpo-26219.)"
msg360153 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-16 23:32
> you ask to modify Python so you can pass dict subclasses as namespaces and expect CPython to respect the mapping protocol

But no, per your own account, you made noticeable, though not complete, code changes in that direction. The only thing I'm saying "now that it's all there, it's only logical to fix the missing parts, about which people report issues". While you suddenly come up with "alternative" solution - throw it all away.

And I always just took your word for it, but now did some git logging/blaming. So, LOAD_GLOBAL handling of dict subclasses was introduced by you in https://github.com/python/cpython/commit/b0b224233e481d979430a54d257d871424ff19fb , in 2012, almost 8 years ago. 

And what about STORE_NAME handling? I traced that to https://github.com/python/cpython/commit/214b1c3aaea3e83302df9ea37a37b3c7548b92b1 of 2004.
msg360154 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-16 23:33
> Paul: you're are in front of 3 core developers who are rejecting your feature request.

But no, it's not my feature request. There were 2 tickets by at least 2 people. I just saw my case to be similar to cases of those people, so instead of creating duplicate bug reports, I chimed in, to show the general issue: various name-related opcodes don't treat namespace objects consistently.

And if I'm in front on 3 core developers here, then only because someone's Rubber Ducky (https://en.wikipedia.org/wiki/Rubber_duck_debugging) took a good vacation. Because I may imagine following "debugging session":

- Hey Ducky, some time ago I hacked on one project. As you know, I'm a core developer, so I kinda can add things on my whim, so I added some stuff to CPython core for that project, not consistent, not complete. Since then, I lost interest in my project, and as you can imagine, couldn't care less about the code in the core. The problem? It was almost 8 years ago. People discovered those features, and came to use on them. Not only that, they raise heads and demand it to be implemented more thoroughly and consistently. So, don't you think now would be good time to punish them and just rip that code out?

- You say how could I possible to do that on my own? No worries, I have 2 more buddies vouching for me, we core developers never let each other down.

- You're saying that removing features after 8 years is problematic? No worries, we can always say it was a regression from just 3 years ago.

- Hey Ducky, there's a more problem still, there's a particularly obnoxious dude, who initially tried to argue need for adding a feature based on my 8-year old code. If we support stuff in LOAD_GLOBAL, he says, then it should be piece of cake to support it in STORE_GLOBAL, which is called much less frequently. But we got him into the cold with a news of removal that 8-year code. Still he can't calm down, and changing tactics arguing that due to the way that globals are used at the module level (as locals too), then STORE_GLOBAL should behave consistently with STORE_NAME, tossing some pretty simple exec() code showing inconsistency. Should we just ignore him, or go for total punishment and make it "consistent" just the same as above, by removing support in STORE_NAME, instead of adding to STORE_GLOBAL? Now, that code is 16 years old. Do you think we can go for a twist like that?
msg360155 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-16 23:34
> Namespace performances are really critical for overall Python performances.

Yeah, that's why I'd like for myself and other people to make it easy to explore the behavior of namespace lookups, to see how to optimize them.


> You're free to fork Python and modify the C code to implement your use case, to study Python internals.

Thanks, already done. I'm now looking for a way to share those studies with other people, who won't patch and compile CPython. So, no worries, I've already lost, I need that fix yesterday, in actual versions of CPython deployed around the world. The rest if just idea that bugs which aren't fixed, persist; and the very interesting case of decision making by CPython core developers.
msg360156 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-16 23:43
> exec() function is currently quite clear

A recent case: https://bugs.python.org/issue38316, co_stacksize was quite clear what it is. Turned out, a bug in the documentation (likely, just someone forgot to update it to the actual code). That's just one case, there're tons of bugs in the docs.
msg360165 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-01-17 09:59
This is duplicate of #1402289, which was rejected.

I agree with Victor.  Python should be free to optimize namespace access.

Namespace access is an implementation detail and user should not assume such "consistency" here.  It is like this:

  x = 1000
  y = x - 1 + 1
  print(x is y)  # True or False, it's implementation detail.


It seems inconsistent for some people, but there are some areas people shouldn't assume consistency.
msg360167 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-17 10:27
Absolutely should be able to optimize namespace access. The fact that namespace is a dict is an implementation detail, it's still inefficient even with all those version counters and inline caches. Ironically, to let people prototype better, more efficient ways to deal with namespace access, it should be possible to override an object used as a namespace. Thanks for making that much more complicated.
msg360172 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-01-17 11:59
> Ironically, to let people prototype better, more efficient ways to deal with namespace access, it should be possible to override an object used as a namespace.

You can benchmark your prototype namespace object by regular Python code:

  d["name"]  # LOAD_GLOBAL
  d["name"] = 1  # STORE_GLOBAL

So I don't think no overriding blocks prototyping.


On the other hand, allowing overriding makes future optimizations difficult.  For example:

```
>>> import dis
>>> def counter():
...     global _cnt
...     _cnt += 1
...     return _cnt
...
>>> dis.dis(counter)
  3           0 LOAD_GLOBAL              0 (_cnt)
              2 LOAD_CONST               1 (1)
              4 INPLACE_ADD
              6 STORE_GLOBAL             0 (_cnt)

  4           8 LOAD_GLOBAL              0 (_cnt)
             10 RETURN_VALUE
```

We may be possible to replace bytecode from `STORE_GLOBAL _cnt; LOAD_GLOBAL _cnt` into `DUP_TOP; STORE_GLOBAL _cnt`.

If we guarantee namespace overriding, it's very easy to break the guarantee while such optimization.
msg360174 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-17 12:08
As explained multiple times, this feature request is rejected. I close the issue.

Paul: Your behavior is not appropriate and violates the Python code of conduct. You have already been warned by Pablo.
msg360177 - (view) Author: Paul Sokolovsky (pfalcon) * Date: 2020-01-17 12:34
> We may be possible to replace bytecode from `STORE_GLOBAL _cnt; LOAD_GLOBAL _cnt` into `DUP_TOP; STORE_GLOBAL _cnt`.

Sounds good, and that's why I personally care about the "STORE" case, and the patch I submit touches only it, which would affect only cases when explicit "global var" declarations are used, which aren't frequent. I definitely would be too shy to submit a patch for more LOAD_* cases, though I may image why people may want that still.
msg360212 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2020-01-17 20:09
Paul, your attitude on this issue has unfortunately begun violating the PSF Code of Conduct which this issue tracker adheres to. I realize you're frustrated that the change you are after is not happening, but that doesn't mean you can deride others like you have been.

Pablo tried to ask you nicely to change your tone, but now please consider this an official warning that if your communication style does not become more civil you will most likely end up with your account suspended. I have also let the Conduct WG know about this thread and warning for record-keeping purposes.
History
Date User Action Args
2021-08-05 06:20:54terry.reedylinkissue44749 superseder
2020-01-17 20:09:57brett.cannonsetnosy: + brett.cannon
messages: + msg360212
2020-01-17 12:34:56pfalconsetmessages: + msg360177
2020-01-17 12:08:49vstinnersetstatus: open -> closed
resolution: rejected
messages: + msg360174

stage: test needed -> resolved
2020-01-17 11:59:42methanesetmessages: + msg360172
2020-01-17 10:27:10pfalconsetmessages: + msg360167
2020-01-17 09:59:24methanesetnosy: + methane
messages: + msg360165
2020-01-17 05:51:12methanesetsuperseder: Allow mappings as globals (was: Fix dictionary subclass ...)
2020-01-16 23:43:32pfalconsetmessages: + msg360156
2020-01-16 23:34:27pfalconsetmessages: + msg360155
2020-01-16 23:33:02pfalconsetmessages: + msg360154
2020-01-16 23:32:05pfalconsetmessages: + msg360153
2020-01-16 22:18:38vstinnersetmessages: + msg360149
2020-01-16 21:44:26vstinnersetmessages: + msg360147
2020-01-16 21:38:19vstinnersetmessages: + msg360146
2020-01-16 20:53:43pablogsalsetkeywords: + 3.6regression

messages: + msg360144
2020-01-16 20:51:16pfalconsetmessages: + msg360143
2020-01-16 20:25:30ppperrysetmessages: + msg360141
2020-01-16 20:09:30gvanrossumsetnosy: - gvanrossum
2020-01-16 19:57:28pfalconsetmessages: + msg360140
2020-01-16 19:55:34pfalconsetmessages: + msg360139
2020-01-15 21:45:42vstinnersetmessages: + msg360078
2019-12-31 02:17:12gvanrossumsetmessages: + msg359091
2019-12-31 01:48:09terry.reedysetmessages: + msg359089
2019-12-31 01:46:48terry.reedysetnosy: + gvanrossum
messages: + msg359086

type: behavior -> enhancement
stage: test needed
2019-12-31 00:59:59pablogsalsetmessages: + msg359075
2019-12-31 00:16:04pfalconsetmessages: + msg359073
2019-12-30 23:17:06pablogsalsetmessages: + msg359069
2019-12-30 23:00:11pfalconsetmessages: + msg359068
2019-12-30 22:40:05terry.reedysetnosy: + vstinner, pablogsal

messages: + msg359067
versions: + Python 3.9, - Python 3.6
2019-12-30 22:24:43pfalconsetnosy: + pfalcon
messages: + msg359066
2018-01-26 23:24:33terry.reedysetnosy: - pmpp
messages: + msg310822
2018-01-26 23:10:25pmppsetnosy: + pmpp
2018-01-26 23:06:03ppperrysetmessages: + msg310819
2018-01-26 22:21:00terry.reedysetnosy: + terry.reedy
messages: + msg310810
2018-01-22 20:33:07ppperrysettype: behavior
2018-01-21 23:46:26ppperrycreate