msg321495 - (view) |
Author: Dan Rose (Dan Rose) * |
Date: 2018-07-11 19:08 |
In the Python 3.7.0 interpreter, the following evaluates to False. In 3.6.4, it was True:
c,d = 500,500
c is d
This seems to be because, in some cases, Python 3.7 fails to intern integers inside tuples:
a = (500,500)
print(a[0] is a[1]) # False
a = (500,500,42)
print(a[0] is a[1]) # False
a = (500,500,'42')
print(a[0] is a[1]) # False
answer = 42
a = (500,500,answer)
print(a[0] is a[1]) # True
a = (500,500,[42])
print(a[0] is a[1]) # True
a = [500,500]
print(a[0] is a[1]) # True
I believe the above should all return True.
|
msg321497 - (view) |
Author: Dan Rose (Dan Rose) * |
Date: 2018-07-11 19:32 |
Another curious case:
a = (500,500); b = (500,500)
print(a[0] is b[0]) # True
print(a[0] is b[1]) # False
print(a[1] is b[0]) # False
print(a[1] is b[1]) # True
|
msg321498 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2018-07-11 20:38 |
This was a side-effect of #29469.
|
msg321499 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-07-11 20:54 |
Should the AST optimizer "interns" constants?
|
msg321513 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2018-07-12 03:11 |
Thank you for finding it.
I had worked on constant merging. [1]
It didn't fix this case for now. I'll continue the suspended work.
[1]: https://github.com/methane/cpython/pull/14/files
But it may be a too big to fix only this regression.
How is this regression critical?
If it's important, I'll try to create minimal version of the patch to backport.
|
msg321515 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-07-12 04:52 |
This is not only with integers.
>>> a = ((1, 2), (1, 2))
>>> a[0] is a[1]
False
>>> a = ('@#$', '@#$')
>>> a[0] is a[1]
False
>>> a = (1.0, 1.0)
>>> a[0] is a[1]
False
The only exception is short ASCII identifier-like strings (as a side effect of interning them):
>>> a = ('foo', 'foo')
>>> a[0] is a[1]
True
I'm not sure this is a problem which should be resolved.
|
msg321516 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2018-07-12 04:58 |
The language doesn't define anything about this - any program relying on accidental identity is in error itself.
Still, it's nice if a code object's co_consts vector is as short as reasonably possible. That's a matter of pragmatics, not of correctness.
|
msg321517 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2018-07-12 05:02 |
OK, unless example code this regression affects seriously is shown, I target only 3.8.
|
msg321518 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-07-12 05:08 |
The co_consts vector is already as short as possible, except cases when tuples are created at code generation time, but this is not related to this issue (see issue33318 and issue33325).
>>> def f():
... a = (1.0, 1.0)
... b = (1.0, 1.0)
...
>>> f.__code__.co_consts
(None, (1.0, 1.0))
>>> f.__code__.co_consts[1][0] is f.__code__.co_consts[1][1]
False
|
msg321519 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2018-07-12 05:21 |
Fine, Serhiy, so reword it a tiny bit: it's nice if a code object's co_consts vector references as few distinct objects as possible. Still a matter of pragmatics, not of correctness.
|
msg321520 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2018-07-12 05:22 |
FYI, same constants are shared even when they are in different tuples on 3.6.
For modules having large constant table containing large integer or floats, non name-like string, and bytes are affected significantly.
|
msg321526 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-07-12 08:27 |
Honestly, I don't think that it's a bug. We might enhance Python 3.8 to reduce constants duplication, but there is no need to "fix" Python 3.7.
|
msg321576 - (view) |
Author: Brett Cannon (brett.cannon) *  |
Date: 2018-07-12 18:56 |
I also agree that there's no bug here, so I'm marking this an enhancement and removing the regression label.
|
msg321633 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2018-07-13 20:06 |
I think (space) 'performance' would be a better label, as this is strictly an implementation improvement, not a language change. But we often (usually? sometimes?) limit performance improvements to the 'next version' so we have the alpha/beta/candidate releases to discover possible regressions.
I think this is worth considering just because the pattern is so odd. But if this is ironed out as part of a broader and better patch, great.
|
msg321943 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2018-07-19 09:32 |
Counting object types in logging/__pycache__/__init__.cpython-38.pyc:
master:
[('r', 1815), (')', 467), ('Z', 339), ('s', 314), ('z', 273), ('c', 157), ('N', 154), ('a', 24), ('F', 14), ('i', 11), ('T', 8)]
GH-8341:
[('r', 1737), (')', 375), ('Z', 339), ('s', 314), ('z', 264), ('c', 157), ('N', 138), ('a', 24), ('F', 14), ('i', 11), ('T', 8)]
It reduced about 5% objects.
I chose logging/__init__ because it's large except tests, and it's written in OO-based. (Large application has many OO-based code).
|
msg321989 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2018-07-20 08:34 |
This is another memory overhead comparison.
It seems merging constants reduces 2~3% memory usage.
import sys, django, flask
sys._debugmallocstats()
### master branch
class size num pools blocks in use avail blocks
----- ---- --------- ------------- ------------
0 8 1 312 194
1 16 1 135 118
2 24 2 198 138
3 32 22 2690 82
4 40 62 6214 48
5 48 149 12465 51
6 56 254 18260 28
7 64 354 22300 2
8 72 208 11589 59
9 80 124 6173 27
10 88 59 2708 6
...
# bytes in allocated blocks = 9,261,952
### merge-const branch
class size num pools blocks in use avail blocks
----- ---- --------- ------------- ------------
0 8 1 312 194
1 16 1 135 118
2 24 1 161 7
3 32 22 2657 115
4 40 62 6202 60
5 48 102 8501 67
6 56 235 16879 41
7 64 346 21751 47
8 72 204 11387 37
9 80 122 6083 17
10 88 58 2656 12
...
# bytes in allocated blocks = 8,919,824
|
msg330428 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-11-26 12:23 |
New changeset c2e1607a51d7a17f143b5a34e8cff7c6fc58a091 by Miss Islington (bot) (INADA Naoki) in branch 'master':
bpo-34100: Merge constants recursively (GH-8341)
https://github.com/python/cpython/commit/c2e1607a51d7a17f143b5a34e8cff7c6fc58a091
|
msg330519 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-11-27 13:39 |
> New changeset c2e1607a51d7a17f143b5a34e8cff7c6fc58a091 by Miss Islington (bot) (INADA Naoki) in branch 'master'
This change introduced a lot of memory leaks (reference leaks):
https://buildbot.python.org/all/#/builders/1/builds/422
The following change fix "make && ./python -m test -R 3:3 test_code":
diff --git a/Python/compile.c b/Python/compile.c
index acb5cfe29b..cb3e73740d 100644
--- a/Python/compile.c
+++ b/Python/compile.c
@@ -1277,6 +1277,7 @@ merge_consts_recursive(struct compiler *c, PyObject *o)
else {
u = k;
}
+ Py_DECREF(k);
Py_INCREF(u);
PyTuple_SET_ITEM(tuple, i, u);
i++;
@@ -1288,6 +1289,7 @@ merge_consts_recursive(struct compiler *c, PyObject *o)
Py_DECREF(key);
return NULL;
}
+ Py_DECREF(PyTuple_GET_ITEM(key, 1));
PyTuple_SET_ITEM(key, 1, new);
}
But I dislike the frozenset branch of merge_consts_recursive(): modifying a tuple is a bad idea. For example, if you replace PyTuple_SET_ITEM() with PyTuple_SetItem(), you get a PyErr_BadInternalCall() because the reference count is greater than 1.
I don't see how to rewrote properly the code, so I reverted (removed) it to let someone else fix the code: PR 10743.
|
msg330521 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-11-27 14:12 |
New changeset 1005c84535191a72ebb7587d8c5636a065b7ed79 by Victor Stinner in branch 'master':
bpo-34100: Partially revert merge_consts_recursive() (GH-10743)
https://github.com/python/cpython/commit/1005c84535191a72ebb7587d8c5636a065b7ed79
|
msg330522 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-11-27 14:13 |
merge_consts_recursive(): rather than modifying the 'key' tuple, would it make sense to *add* the new key to the cache?
|
msg330572 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2018-11-28 03:13 |
I agree that modifying tuple is bad idea in general case.
But in case of constants, in-place modify is the easiest approach.
Our purpose is "merging" constant. On the other hand, "build new tuple and
replace old tuple" approach makes two same constant tuples.
If old tuple is referenced from somewhere, we fail to "merging".
So I don't think we need to follow the general principle unless it reduces code.
FYI, we have similar in-place editing for interning already.
https://github.com/python/cpython/blob/master/Objects/codeobject.c#L47-L95
|
msg330582 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-11-28 09:16 |
Well, it's up to you. I will review your PR ;-)
|
msg330605 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-11-28 15:58 |
New changeset f7e4d3642fbb88f4e6243c952a0e223fb5df1c65 by Victor Stinner (INADA Naoki) in branch 'master':
bpo-34100: compile: Re-enable frozenset merging (GH-10760)
https://github.com/python/cpython/commit/f7e4d3642fbb88f4e6243c952a0e223fb5df1c65
|
msg330607 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-11-28 16:05 |
Thanks INADA Naoki! Thanks a simple and cool optimization!
Maybe you might want to document it at https://docs.python.org/dev/whatsnew/3.8.html#optimizations ?
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:03 | admin | set | github: 78281 |
2018-11-28 16:05:45 | vstinner | set | messages:
+ msg330607 |
2018-11-28 15:58:49 | vstinner | set | messages:
+ msg330605 |
2018-11-28 11:26:23 | vstinner | set | pull_requests:
+ pull_request10011 |
2018-11-28 10:18:03 | methane | set | pull_requests:
+ pull_request10008 |
2018-11-28 09:16:33 | vstinner | set | messages:
+ msg330582 |
2018-11-28 03:14:00 | methane | set | messages:
+ msg330572 |
2018-11-27 14:13:40 | vstinner | set | messages:
+ msg330522 |
2018-11-27 14:12:51 | vstinner | set | messages:
+ msg330521 |
2018-11-27 13:39:21 | vstinner | set | messages:
+ msg330519 |
2018-11-27 13:33:44 | vstinner | set | pull_requests:
+ pull_request9990 |
2018-11-26 12:24:08 | methane | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
2018-11-26 12:23:32 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg330428
|
2018-07-23 05:22:57 | methane | link | issue29336 superseder |
2018-07-20 08:34:13 | methane | set | messages:
+ msg321989 |
2018-07-19 09:32:56 | methane | set | keywords:
+ needs review, - 3.7regression type: enhancement -> resource usage stage: needs patch -> patch review |
2018-07-19 09:32:05 | methane | set | keywords:
- patch
messages:
+ msg321943 stage: patch review -> needs patch |
2018-07-19 09:24:55 | methane | set | keywords:
+ patch stage: needs patch -> patch review pull_requests:
+ pull_request7875 |
2018-07-13 20:06:24 | terry.reedy | set | nosy:
+ terry.reedy
messages:
+ msg321633 stage: needs patch |
2018-07-12 18:56:15 | brett.cannon | set | type: resource usage -> enhancement
messages:
+ msg321576 nosy:
+ brett.cannon |
2018-07-12 08:35:29 | methane | set | title: Same integers in a tuple of constant literals are not merged -> Same constants in tuples are not merged while compile() |
2018-07-12 08:27:46 | vstinner | set | messages:
+ msg321526 |
2018-07-12 05:22:53 | methane | set | messages:
+ msg321520 |
2018-07-12 05:21:03 | tim.peters | set | messages:
+ msg321519 |
2018-07-12 05:08:56 | serhiy.storchaka | set | messages:
+ msg321518 |
2018-07-12 05:02:26 | methane | set | messages:
+ msg321517 versions:
- Python 3.7 |
2018-07-12 04:58:09 | tim.peters | set | nosy:
+ tim.peters messages:
+ msg321516
|
2018-07-12 04:52:16 | serhiy.storchaka | set | messages:
+ msg321515 |
2018-07-12 04:00:52 | methane | set | keywords:
+ 3.7regression title: Python doesn't intern integers in a tuple of constant literals -> Same integers in a tuple of constant literals are not merged type: behavior -> resource usage components:
+ Interpreter Core versions:
+ Python 3.8 |
2018-07-12 03:11:37 | methane | set | assignee: methane messages:
+ msg321513 |
2018-07-11 20:54:32 | vstinner | set | messages:
+ msg321499 |
2018-07-11 20:38:14 | mark.dickinson | set | nosy:
+ mark.dickinson, vstinner, serhiy.storchaka, methane messages:
+ msg321498
|
2018-07-11 19:32:57 | Dan Rose | set | messages:
+ msg321497 |
2018-07-11 19:08:31 | Dan Rose | create | |