New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PATCH] remove bogus codepath from pprint._safe_repr #62882
Comments
pprint._safe_repr for type str uses much slower codepath by default, which does not makes a sense in Python3 context. Instead of simply using repr, it check the existence of 'locale' in sys.modules and if found, it goes one-by-one-char call str.isalpha() on each and apply the quoting for non-alpha chars. This is extremely slow, but as locale is usually in sys.modules, it's used by default. The point of such code was because in python2, str.isalpha() depends on locale, so for locale-aware Python builds, there was a different path needed. But this does not apply for Python3, where all strings are unicode, so .isaplha() is not locale sensitive anymore. |
This is simple code checks if .isalnum is or is not locale sensitive and a small measurement of how much is the repr faster, compared to old codepath. BTW: python3 test_pprint.py on patched version have succeeded OK (expected failures=1) |
Ran some benchmark numbers: $ ./python -m timeit -s "from pprint import saferepr; s='é\"x' * 1000" "saferepr(s)"
-> before patch: 555 usec per loop
-> after patch: 10.9 usec per loop
$ ./python -m timeit -s "from pprint import saferepr; s='xxx' * 1000" "saferepr(s)"
-> before patch: 290 usec per loop
-> after patch: 8.68 usec per loop |
We can just remove "if typ is str:" case at all. A general case works for strings. But for performance (this will slowdown Antoine's tests by 15-20%) we should left it and may be even extend it to other builtin types: builtin_type = (str, int, float, NoneType, list, tuple, dict, set, bytes): # or may be use a set?
...
if typ in builtin_type:
return repr(object), True, False |
Here is a patch which adds fast path for builtin scalar types. Or we can just remove a special case for strings. |
The fast scalars approach looks great! |
Antoine, what you say? |
Are there tests for all the "builtin scalars"? |
To made some sense in Python 3 context we should remove the check for the locale module and replace repr() by ascii(). |
New changeset c77a42c234d6 by Serhiy Storchaka in branch 'default': |
Yes, now there are. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: