Author Michael.Felt
Recipients Michael.Felt
Date 2018-08-07.20:23:35
Message-id <1533673415.26.0.56676864532.issue34347@psf.upfronthosting.co.za>
In-reply-to
Content
Come on, "experts" - feedback needed!

Original:
test test_utf8_mode failed -- Traceback (most recent call last):
  File "/data/prj/python/git/python3-3.8/Lib/test/test_utf8_mode.py", line 225, in test_cmd_line
    check('utf8=0', [c_arg], LC_ALL='C')
  File "/data/prj/python/git/python3-3.8/Lib/test/test_utf8_mode.py", line 217, in check
    self.assertEqual(args, ascii(expected), out)
AssertionError: "['h\\xc3\\xa9\\xe2\\x82\\xac']" != "['h\\udcc3\\udca9\\udce2\\udc82\\udcac']"
- ['h\xc3\xa9\xe2\x82\xac']
+ ['h\udcc3\udca9\udce2\udc82\udcac']
 : ISO8859-1:['h\xc3\xa9\xe2\x82\xac']

Modification #1:
        if sys.platform == 'darwin' or support.is_android:
            c_arg = arg_utf8
        elif sys.platform.startswith("aix"):
            c_arg = arg_ascii.encode('utf-8', 'surrogateescape')
        else:
            c_arg = arg_ascii
        check('utf8=0', [c_arg], LC_ALL='C')

Result:
AssertionError: "['h\\xc3\\xa9\\xe2\\x82\\xac']" != "[b'h\\xc3\\xa9\\xe2\\x82\\xac']"
- ['h\xc3\xa9\xe2\x82\xac']
+ [b'h\xc3\xa9\xe2\x82\xac']
?  +
 : ISO8859-1:['h\xc3\xa9\xe2\x82\xac']

Modification #2:
        if sys.platform == 'darwin' or support.is_android:
            c_arg = arg_utf8
        elif sys.platform.startswith("aix"):
            c_arg = arg
        else:
            c_arg = arg_ascii
        check('utf8=0', [c_arg], LC_ALL='C')

Result:
AssertionError: "['h\\xc3\\xa9\\xe2\\x82\\xac']" != "[b'h\\xc3\\xa9\\xe2\\x82\\xac']"
- ['h\xc3\xa9\xe2\x82\xac']
+ [b'h\xc3\xa9\xe2\x82\xac']
?  +
 : ISO8859-1:['h\xc3\xa9\xe2\x82\xac']

In both modifications the "expected" value ends up as a bytes object (note the b prefix), while the command line returns a str, so the comparison cannot match.
In the original, both sides are str, but the expected value escapes each non-ASCII byte as \udcXX where the actual output uses \xXX.

\udcXX is common (I see it frequently in search results on unrelated topics). Should something in ascii() be changed to output \udc rather than \x?
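For reference, the three forms seen above can be reproduced without AIX. This is a minimal sketch, assuming `arg` is the UTF-8 encoding of 'h\xe9\u20ac' (which matches the bytes in the tracebacks) and that AIX decodes the argument as ISO8859-1 under LC_ALL=C, as the "ISO8859-1:" prefix in the failure output suggests. The name `arg_latin1` is only illustrative and does not appear in the test.

```python
# Assumed test input: the UTF-8 bytes b'h\xc3\xa9\xe2\x82\xac'.
arg = 'h\xe9\u20ac'.encode('utf-8')

# ASCII + surrogateescape maps each undecodable byte 0xNN to U+DCNN.
arg_ascii = arg.decode('ascii', 'surrogateescape')

# ISO8859-1 maps each byte 0xNN to the code point U+00NN.
arg_latin1 = arg.decode('iso-8859-1')

# Encoding the surrogate-escaped str gives back the original bytes,
# which is why Modification #1 produces a b'...' value.
roundtrip = arg_ascii.encode('utf-8', 'surrogateescape')

print(ascii([arg_ascii]))   # ['h\udcc3\udca9\udce2\udc82\udcac']
print(ascii([arg_latin1]))  # ['h\xc3\xa9\xe2\x82\xac']
print(ascii([roundtrip]))   # [b'h\xc3\xa9\xe2\x82\xac']
```

ascii() only reports the code points a str actually holds, so the \udc versus \x difference comes from how the bytes were decoded, not from ascii() itself. If the ISO8859-1 assumption holds, an expected value built like `arg_latin1` would match what AIX prints.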

Thx!