msg332857 - (view) |
Author: Ma Lin (malin) * |
Date: 2019-01-02 07:20 |
Please see the PR
|
msg332858 - (view) |
Author: Ma Lin (malin) * |
Date: 2019-01-02 07:30 |
Every non-empty str will be checked twice at present.
|
msg332861 - (view) |
Author: Ma Lin (malin) * |
Date: 2019-01-02 11:08 |
This redundancy has existed since Python 3.4 or earlier.
|
msg332864 - (view) |
Author: Stefan Behnel (scoder) * |
Date: 2019-01-02 11:25 |
Unlikely to get changed in Py3.4/5 anymore, since this is not even a bug fix. I wouldn't even fight for backporting, although 3.7 seems ok for it.
I agree that this code duplication is worth removing. I don't consider hashing the empty string important enough for leaving it in, especially because the net performance effect can at most be zero, if not negative, for the normal case of non-empty strings.
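The duplication under discussion can be modelled in a few lines. The sketch below is a pure-Python illustration, not CPython's C code: `CheckCounter`, `hash_bytes`, `str_hash_before`, `str_hash_after`, and the toy mixing loop are hypothetical stand-ins. It only assumes that the inner helper already handles empty input on its own, and it counts how many emptiness checks each variant performs.

```python
from dataclasses import dataclass


@dataclass
class CheckCounter:
    """Counts how many times an emptiness check is evaluated."""
    count: int = 0


def hash_bytes(data: bytes, counter: CheckCounter) -> int:
    """Hypothetical low-level helper that already special-cases empty input."""
    counter.count += 1          # emptiness check inside the helper
    if not data:
        return 0
    value = 0
    for byte in data:           # toy mixing step, not a real hash algorithm
        value = (value * 31 + byte) & 0xFFFFFFFFFFFFFFFF
    return value


def str_hash_before(text: str, counter: CheckCounter) -> int:
    """Caller that repeats the emptiness check before delegating."""
    data = text.encode("utf-8")
    counter.count += 1          # redundant check in the caller
    if not data:
        return 0
    return hash_bytes(data, counter)


def str_hash_after(text: str, counter: CheckCounter) -> int:
    """Caller that relies on the helper's own check."""
    return hash_bytes(text.encode("utf-8"), counter)


before, after = CheckCounter(), CheckCounter()
assert str_hash_before("abc", before) == str_hash_after("abc", after)
print(before.count, after.count)  # prints: 2 1
```

In this model the empty string costs one check in either variant, while every non-empty string pays for two checks in the "before" variant and one in the "after" variant.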
|
msg332866 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2019-01-02 12:16 |
New changeset a1d14253066f7dd60cfb465c6511fa565f312b42 by Serhiy Storchaka (animalize) in branch 'master':
bpo-35636: Remove redundant check in unicode_hash(). (GH-11402)
https://github.com/python/cpython/commit/a1d14253066f7dd60cfb465c6511fa565f312b42
|
msg332908 - (view) |
Author: Ma Lin (malin) * |
Date: 2019-01-03 05:10 |
Thanks for the review.
I don't know why bytes and str generate the same hash value for an ASCII sequence.
>>> hash('abc') == hash(b'abc')
True
This may cause some hash collisions. Does it affect performance slightly?
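To make the collision question concrete, here is a small sketch of what happens when both keys land in one dict. The equal hashes shown in the session above are a CPython implementation detail rather than a language guarantee, so the sketch only reports them; the variable names are illustrative.

```python
text_key = "abc"
bytes_key = b"abc"

# Reported as True in the session above; treated here as an implementation detail.
hashes_match = hash(text_key) == hash(bytes_key)

# str and bytes never compare equal in Python 3, so both keys are kept
# as separate entries even when their hashes coincide.
table = {text_key: "str entry", bytes_key: "bytes entry"}

print(hashes_match, len(table), table[text_key], table[bytes_key])
```

Both entries survive, so the cost of a matching hash is limited to the extra key comparison and probe that a hash table typically performs when two keys collide.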
|
msg332909 - (view) |
Author: Stefan Behnel (scoder) * |
Date: 2019-01-03 07:02 |
> why bytes and str generate the same hash value for an ASCII sequence
Probably mostly for historical Py2 reasons. These days, both are somewhat unlikely to appear in the same dict. But still, I'd advise against changing the hash function without a very good reason. You never know how much code relies on it in one way or another.
|
msg332910 - (view) |
Author: Ma Lin (malin) * |
Date: 2019-01-03 07:13 |
> I'd advise against changing the hash function without a very good reason. You never know how much code relies on it in one way or another.
OK, maybe this can be changed in Python 4.0.
|
msg332911 - (view) |
Author: Stefan Behnel (scoder) * |
Date: 2019-01-03 07:20 |
> maybe this can be changed in Python 4.0
Well, if you find a *very* good reason for changing it, as I said. Py4 won't be special in that regard, I suppose.
|
msg332916 - (view) |
Author: Ma Lin (malin) * |
Date: 2019-01-03 07:45 |
One scenario is caching regular expressions: b'[a-z]' and '[a-z]' may exist in the same dict. Anyway, it's trivial on the whole.
|
msg333068 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2019-01-05 19:14 |
For historical reasons. In Python 2, str and unicode objects consisting of ASCII characters can be equal, and equal values should have the same hash. In Python 3, bytes and str are always different, which can cause subtle bugs in code ported from Python 2. The -b and -bb options were added to help catch such bugs. To increase the chance of catching them, the hashes of bytes and str consisting of ASCII characters with the same codes should be equal.
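The effect of the -b and -bb options can be observed with a short script. This is a sketch that assumes a CPython interpreter is reachable through `sys.executable`; `run_snippet` is a hypothetical helper name. It runs a str/bytes comparison in a child interpreter with and without `-bb`.

```python
import subprocess
import sys


def run_snippet(snippet: str, *flags: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a child interpreter started with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", snippet],
        capture_output=True,
        text=True,
    )


comparison = "print('abc' == b'abc')"

plain = run_snippet(comparison)          # no flag: the comparison is silently False
strict = run_snippet(comparison, "-bb")  # -bb: the comparison raises BytesWarning

print(plain.stdout.strip(), plain.returncode)
print(strict.returncode, "BytesWarning" in strict.stderr)
```

Because the hashes coincide, a dict lookup that mixes the two key types can end up comparing a str key with a bytes key, which is the kind of comparison these options are meant to flag.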
|
Date | User | Action | Args |
2022-04-11 14:59:09 | admin | set | github: 79817 |
2019-01-05 19:14:26 | serhiy.storchaka | set | status: open -> closed; messages: + msg333068; keywords: patch, patch, patch; resolution: fixed; stage: patch review -> resolved |
2019-01-03 07:45:46 | malin | set | messages: + msg332916 |
2019-01-03 07:20:48 | scoder | set | messages: + msg332911 |
2019-01-03 07:13:32 | malin | set | messages: + msg332910 |
2019-01-03 07:02:11 | scoder | set | messages: + msg332909 |
2019-01-03 05:10:25 | malin | set | messages: + msg332908 |
2019-01-02 12:16:10 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg332866 |
2019-01-02 11:33:15 | serhiy.storchaka | set | pull_requests: - pull_request10785 |
2019-01-02 11:33:03 | serhiy.storchaka | set | pull_requests: - pull_request10784 |
2019-01-02 11:32:47 | serhiy.storchaka | set | keywords: patch, patch, patch; versions: - Python 3.6, Python 3.7 |
2019-01-02 11:25:50 | scoder | set | nosy: + scoder; messages: + msg332864; versions: - Python 3.4, Python 3.5 |
2019-01-02 11:08:15 | malin | set | type: enhancement -> performance; versions: + Python 3.4, Python 3.5; messages: + msg332861; title: remove redundant code in unicode_hash(PyObject *self) -> remove redundant check in unicode_hash(PyObject *self) |
2019-01-02 07:49:36 | malin | set | versions: + Python 3.6, Python 3.7 |
2019-01-02 07:30:20 | malin | set | type: enhancement; messages: + msg332858; components: + Interpreter Core; versions: + Python 3.8 |
2019-01-02 07:25:14 | malin | set | keywords: + patch; stage: patch review; pull_requests: + pull_request10785 |
2019-01-02 07:25:10 | malin | set | keywords: + patch; stage: (no value); pull_requests: + pull_request10784 |
2019-01-02 07:25:06 | malin | set | keywords: + patch; stage: (no value); pull_requests: + pull_request10783 |
2019-01-02 07:20:57 | malin | create | |