Potential UnicodeDecodeError in dis #85669
A UnicodeDecodeError can be raised when running "python -m dis" in an environment whose default encoding is not UTF-8. Suppose a file named "a.py" contains "print('喵')" and is saved with UTF-8 encoding. Running "python -m dis ./a.py" in a non-UTF-8 environment, for example a Windows PC whose default language is Chinese, raises a UnicodeDecodeError:
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python38\lib\dis.py", line 553, in <module>
    _test()
  File "C:\Program Files\Python38\lib\dis.py", line 548, in _test
    source = infile.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xb5 in position 9: illegal multibyte sequence

This happens because the default encoding on Windows is determined by the system language. Chinese Windows uses cp936 (GBK, an extension of GB2312) as the default encoding, which cannot decode UTF-8 encoded source. The fix is simply to read the file in "rb" mode instead of "r".
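The failure can be reproduced on any platform by passing the gbk encoding to open() explicitly, which mimics the default on a Chinese Windows system. The following is a minimal sketch, not the actual patch to dis.py: the temporary file stands in for "a.py", and the variable names are illustrative. It also shows that compile() accepts the raw bytes read in "rb" mode.

```python
import os
import tempfile

# Write the example source as UTF-8 bytes (stands in for a.py).
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as f:
    f.write("print('喵')".encode("utf-8"))
    path = f.name

try:
    # Text mode with an explicit gbk encoding mimics the Chinese Windows default.
    try:
        with open(path, "r", encoding="gbk") as infile:
            infile.read()
        failed = False
    except UnicodeDecodeError as exc:
        failed = True
        error_text = str(exc)

    # Binary mode skips decoding entirely; compile() accepts bytes and
    # treats them as UTF-8 unless a coding cookie or BOM says otherwise.
    with open(path, "rb") as infile:
        source = infile.read()
    code = compile(source, path, "exec")
finally:
    os.remove(path)
```

Reading bytes leaves encoding detection to the compiler, so the result no longer depends on the locale.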
I searched the whole Lib folder and found a lot of code that uses "open(filename, 'r')" without specifying an encoding. Should we open another issue for these problems?
Good catch. Yes, when reading Python source files you should either open them in binary mode if bytes are enough, or open them with tokenize.open() if string data is needed, or use tokenize.detect_encoding() and pass the result to open().
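The two text-based options can be sketched as follows. This is an illustrative example rather than code from the standard library: the temporary file and its gbk coding cookie are assumptions chosen to show that both approaches honor the encoding declared in the source instead of the locale default.

```python
import os
import tempfile
import tokenize

# A source file in a non-UTF-8 encoding, declared with a PEP 263 coding cookie.
data = "# -*- coding: gbk -*-\nprint('喵')\n".encode("gbk")
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as f:
    f.write(data)
    path = f.name

try:
    # Option 1: tokenize.open() detects the encoding and returns a text-mode file.
    with tokenize.open(path) as infile:
        text_via_open = infile.read()

    # Option 2: detect the encoding from the raw bytes, then pass it to open().
    with open(path, "rb") as raw:
        encoding, _ = tokenize.detect_encoding(raw.readline)
    with open(path, "r", encoding=encoding) as infile:
        text_via_detect = infile.read()
finally:
    os.remove(path)
```

tokenize.open() is the shorter form. Calling tokenize.detect_encoding() directly is useful when the encoding name itself is needed.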