test_codecs currently failing on several Windows buildbots #64770
The Windows buildbots are currently broken due to a codec issue. I populated the "nosy" list based on the "unicode" experts from the Experts Index.
http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.x/builds/4040
test_streamreaderwriter (test.test_codecs.WithStmtTest) ... test test_codecs failed
======================================================================
Traceback (most recent call last):
  File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_codecs.py", line 157, in test_readline
    self.assertEqual(readalllines("".join(vw), True), "|".join(vw))
  File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_codecs.py", line 136, in readalllines
    line = reader.readline(size=size, keepends=keepends)
  File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\codecs.py", line 548, in readline
    data = self.read(readsize, firstline=True)
  File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\codecs.py", line 494, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'CP_UTF8' codec can't decode bytes in position 0--1: No mapping for the Unicode character exists in the target code page.
Note that this appears to be in Windows-specific code ("CP_UTF8"), rather than being cross-platform code which happens to only fail on Windows. So we need someone who does both Windows and Unicode.
It looks to be related to changeset e988661e458c5402c0236cd1084a8671249a760d
Serhiy said on IRC that he doesn't have a Windows development environment, so he didn't think he could help.
The UTF-7 decoder is not related to this test. The test_readline test had been broken from the start, and part of it did nothing. After fixing it in bpo-20520, new bugs were exposed: bpo-20538 and this one. This bug was hidden until bpo-20538 was fixed. Note that there is no test_partial in CP65001Test. Perhaps that is related. The simplest solution would be to temporarily skip test_readline in CP65001Test: test_readline = unittest.expectedFailure(ReadTest.test_readline)
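For readers unfamiliar with the suggestion above, here is a minimal sketch of how an inherited test can be marked as an expected failure in a single subclass. `ReadTestSketch` and `FakeCP65001Test` are hypothetical stand-ins, not the real classes in test_codecs.py, and the sketch wraps the inherited method in an override rather than assigning the decorated base method directly, so that the marker stays local to the one subclass.

```python
import io
import unittest


class ReadTestSketch:
    """Hypothetical stand-in for the shared ReadTest mixin."""

    def test_readline(self):
        # Stand-in for the real test body: fails for this codec.
        self.fail("partial input is not supported by this codec")


class FakeCP65001Test(ReadTestSketch, unittest.TestCase):
    """Hypothetical stand-in for CP65001Test."""

    @unittest.expectedFailure
    def test_readline(self):
        # Run the inherited test, but record its failure as expected.
        super().test_readline()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FakeCP65001Test)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

With this in place the run is reported as successful, and the failure is counted under expected failures instead of breaking the build. If the underlying bug is later fixed, the test shows up as an unexpected success, which is a reminder to remove the marker.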
The test tries to decode a partial UTF-8 byte string. The problem is that codecs.code_page_decode() doesn't fully implement partial decoding. The decoder only supports partial decoding for a few code pages: 932, 936, 949, 950, and 1361. Partial decoding is currently based on IsDBCSLeadByteEx(). It may be possible to enhance the decoders, but this is not a regression from Python 3.3, so it can be done in Python 3.5. Please just skip the failing tests for CP_UTF8 (cp 65001) and maybe other Windows code pages in test_codecs. (I don't have time to write a patch to skip them, sorry.)
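To illustrate what "partial decoding" means here, the following sketch uses the built-in UTF-8 codec, which does support it. It relies on `codecs.utf_8_decode`, the low-level function that the standard `utf_8` codec module builds on; it is used instead of `codecs.code_page_decode` only because the latter exists on Windows alone. The variable names are illustrative.

```python
import codecs

# The euro sign takes three bytes in UTF-8: b'\xe2\x82\xac'.
encoded = "\u20ac".encode("utf-8")
partial = encoded[:2]  # input cut in the middle of the character

# final=False: the incomplete trailing sequence is left unconsumed,
# so the caller can retry once more bytes have arrived.
partial_result = codecs.utf_8_decode(partial, "strict", False)

# With all three bytes available, the character decodes normally.
complete_result = codecs.utf_8_decode(encoded, "strict", False)

# final=True: the same truncated input is treated as an error.
try:
    codecs.utf_8_decode(partial, "strict", True)
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc
```

A stream reader depends on the first behaviour: it reads bytes in chunks, and a chunk boundary can fall inside a multi-byte character. A decoder that raises on such a chunk instead of reporting zero consumed bytes produces the kind of UnicodeDecodeError shown in the traceback.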
New changeset 4f6499fc2f09 by Victor Stinner in branch 'default':
I opened bpo-20574 to implement the missing feature for cp65001.
New changeset d8f48717b74e by Victor Stinner in branch '3.3':
Would have been nice to do this also on 3.3 branch...
Ah yes, sorry. I forgot that the utf-7 change was also applied to 3.3.