Py3k fails to parse a file with an iso-8859-1 string #46912
Comments
While running the 2to3 script on the scons codebase, I ran into an
Attached is just the portion of the script that causes the error.
2to3 throws an error on the string regardless of whether the unicode
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: ws_comma
Traceback (most recent call last):
  File "/usr/local/bin/2to3", line 5, in <module>
    sys.exit(refactor.main())
  File "/usr/local/lib/python3.0/lib2to3/refactor.py", line 81, in main
    rt.refactor_args(args)
  File "/usr/local/lib/python3.0/lib2to3/refactor.py", line 188, in refactor_args
    self.refactor_file(arg)
  File "/usr/local/lib/python3.0/lib2to3/refactor.py", line 217, in refactor_file
    input = f.read() + "\n" # Silence certain parse errors
  File "/usr/local/lib/python3.0/io.py", line 1611, in read
    decoder.decode(self.buffer.read(), final=True))
  File "/usr/local/lib/python3.0/io.py", line 1199, in decode
    output = self.decoder.decode(input, final=final)
  File "/usr/local/lib/python3.0/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 59-60: invalid data
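The UnicodeDecodeError in the traceback can be reproduced in isolation. The following is a minimal sketch, assuming the offending file contains an ISO-8859-1 encoded byte such as 0xE9; the sample string and the helper name `try_decode` are hypothetical and are not taken from the scons file.

```python
# Source bytes containing an ISO-8859-1 encoded "é" (single byte 0xE9).
# The sample text is hypothetical; it is not the string from the scons file.
source_bytes = "name = u'caf\u00e9 au lait'\n".encode("iso-8859-1")


def try_decode(data, encoding):
    """Return the decoded text, or the UnicodeDecodeError that was raised."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


# Decoding as UTF-8 fails: 0xE9 starts a multi-byte sequence that never completes.
utf8_result = try_decode(source_bytes, "utf-8")

# Decoding with the encoding the bytes were written in succeeds.
latin1_result = try_decode(source_bytes, "iso-8859-1")
```

Here `utf8_result` holds the exception object and `latin1_result` holds the decoded text, which matches the behavior seen when the file is read as UTF-8.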
2to3 running under Python 2.5.1 handles this file just fine. 2to3
collinwinter@Silves:
This suggests this problem isn't 2to3-specific. Refiling this issue
Someone on the #python IRC channel suggested that the default for python
If you remove the unicode string literal (u'') from the front of the
Also, I can confirm that running 2to3 with Python 2.6 correctly converts
Confirmed in py3k on rev71995.
The problem is that 2to3 just reads the file with whatever
Patch using tokenize.detect_encoding() to read the encoding of Python
We might write a unit test. See also related issue: bpo-5093
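For illustration, here is a minimal sketch of reading Python source with tokenize.detect_encoding(). It is not the patch itself and not necessarily what the committed fix does. The helper name `read_python_source` and the sample bytes are hypothetical, and the sketch assumes the file declares its encoding in a coding cookie on the first line.

```python
import io
import tokenize


def read_python_source(raw):
    """Decode Python source bytes using the encoding declared in the source.

    detect_encoding() looks for a BOM or a coding cookie in the first two
    lines and falls back to UTF-8 when neither is present.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)
    return encoding, raw.decode(encoding)


# Hypothetical file contents: a coding cookie followed by an ISO-8859-1 byte.
raw = (b"# -*- coding: iso-8859-1 -*-\n"
       b"name = 'caf\xe9'\n")

encoding, text = read_python_source(raw)
```

With a real file, the same call can be given the `readline` method of a file opened in binary mode. Later Python 3 releases also provide tokenize.open(), which wraps this detection step.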
Fixed in r72491.