Message323214
The test fails because
byte_str.decode('ascii', 'surragateescape')
is not what ascii(byte_str) - returns when called from the commandline.
Assumption: since " check('utf8', [arg_utf8])" succeeds I assume the parsing of the command-line is correct.
DETAILS
>>> arg = 'h\xe9\u20ac'.encode('utf-8')
>>> arg
b'h\xc3\xa9\xe2\x82\xac'
>>> arg.decode('ascii', 'surrogateescape')
'h\udcc3\udca9\udce2\udc82\udcac'
I am having a difficult time getting the syntax correct for all the "escapes", so I added a print statement in the check routine:
test_cmd_line (test.test_utf8_mode.UTF8ModeTests) ...
code:import locale, sys; print("%s:%s" % (locale.getpreferredencoding(), ascii(sys.argv[1:]))) arg:b'h\xc3\xa9\xe2\x82\xac'
out:UTF-8:['h\xe9\u20ac']
code:import locale, sys; print("%s:%s" % (locale.getpreferredencoding(), ascii(sys.argv[1:]))) arg:b'h\xc3\xa9\xe2\x82\xac'
out:ISO8859-1:['h\xc3\xa9\xe2\x82\xac']
test code with my debug statement (to generate above):
def test_cmd_line(self):
arg = 'h\xe9\u20ac'.encode('utf-8')
arg_utf8 = arg.decode('utf-8')
arg_ascii = arg.decode('ascii', 'surrogateescape')
code = 'import locale, sys; print("%s:%s" % (locale.getpreferredencoding(), ascii(sys.argv[1:])))'
def check(utf8_opt, expected, **kw):
out = self.get_output('-X', utf8_opt, '-c', code, arg, **kw)
print("\ncode:%s arg:%s\nout:%s" % (code, arg, out))
args = out.partition(':')[2].rstrip()
self.assertEqual(args, ascii(expected), out)
check('utf8', [arg_utf8])
if sys.platform == 'darwin' or support.is_android:
c_arg = arg_utf8
else:
c_arg = arg_ascii
check('utf8=0', [c_arg], LC_ALL='C')
So the first check succeeds:
check('utf8', [arg_utf8])
But the second does not:
FAIL: test_cmd_line (test.test_utf8_mode.UTF8ModeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/data/prj/python/src/python3-3.7.0/Lib/test/test_utf8_mode.py", line 225, in test_cmd_line
check('utf8=0', [c_arg], LC_ALL='C')
File "/data/prj/python/src/python3-3.7.0/Lib/test/test_utf8_mode.py", line 218, in check
self.assertEqual(args, ascii(expected), out)
AssertionError: "['h\\xc3\\xa9\\xe2\\x82\\xac']" != "['h\\udcc3\\udca9\\udce2\\udc82\\udcac']"
- ['h\xc3\xa9\xe2\x82\xac']
+ ['h\udcc3\udca9\udce2\udc82\udcac']
: ISO8859-1:['h\xc3\xa9\xe2\x82\xac']
I tried saying the "expected" is arg, but arg is still a byte object, the cmd_line result is not (printed as such).
AssertionError: "['h\\xc3\\xa9\\xe2\\x82\\xac']" != "[b'h\\xc3\\xa9\\xe2\\x82\\xac']"
- ['h\xc3\xa9\xe2\x82\xac']
+ [b'h\xc3\xa9\xe2\x82\xac']
? +
: ISO8859-1:['h\xc3\xa9\xe2\x82\xac'] |
|
Date |
User |
Action |
Args |
2018-08-06 16:06:50 | Michael.Felt | set | recipients:
+ Michael.Felt |
2018-08-06 16:06:50 | Michael.Felt | set | messageid: <1533571610.9.0.56676864532.issue34347@psf.upfronthosting.co.za> |
2018-08-06 16:06:50 | Michael.Felt | link | issue34347 messages |
2018-08-06 16:06:50 | Michael.Felt | create | |
|