
classification
Title: test.test_codeccallbacks.CodecCallbackTest.test_xmlcharrefreplace_with_surrogates() and test.test_unicode.UnicodeTest.test_encode_decimal_with_surrogates() loaded from *.pyc files fail with Python supporting wide unicode
Type: behavior
Stage: resolved
Components: Tests
Versions: Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: Arfrever, benjamin.peterson, ezio.melotti, lemburg, python-dev, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2013-10-30 23:33 by Arfrever, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue19457.patch serhiy.storchaka, 2013-10-31 11:28 review
Messages (7)
msg201786 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2013-10-30 23:33
test.test_codeccallbacks.CodecCallbackTest.test_xmlcharrefreplace_with_surrogates() and test.test_unicode.UnicodeTest.test_encode_decimal_with_surrogates() fail with Python supporting wide unicode, when they have been loaded from *.pyc files (test_codeccallbacks.pyc, test_unicode.pyc).
(This bug can be reproduced by running `make test`, which runs the test suite twice: the first run starts with no *.pyc files present, so the second run loads the tests from *.pyc files.)

This bug is a regression in 2.7.6rc1. These tests are absent in 2.7.5. These tests were added in 719ee60fc5e2.

$ ./configure --enable-unicode=ucs4
...
$ make
...
$ LD_LIBRARY_PATH="$(pwd)" ./python Lib/test/regrtest.py -v test_codeccallbacks
...
$ LD_LIBRARY_PATH="$(pwd)" ./python Lib/test/regrtest.py -v test_codeccallbacks
== CPython 2.7.6rc1 (2.7:dd12639b82bf, Oct 30 2013, 23:53:21) [GCC 4.8.1]
==   Linux-3.11.6
==   /tmp/cpython/build/test_python_6715
Testing with flags: sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, tabcheck=0, verbose=0, unicode=0, bytes_warning=0, hash_randomization=0)
test_codeccallbacks
test_backslashescape (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badandgoodbackslashreplaceexceptions (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badandgoodignoreexceptions (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badandgoodreplaceexceptions (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badandgoodstrictexceptions (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badandgoodxmlcharrefreplaceexceptions (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badhandlerresults (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badlookupcall (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_badregistercall (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_bug828737 (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_callbacks (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_charmapencode (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_decodehelper (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_decodeunicodeinternal (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_decoding_callbacks (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_encodehelper (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_longstrings (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_lookup (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_translatehelper (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_unencodablereplacement (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_unicodedecodeerror (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_unicodeencodeerror (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_unicodetranslateerror (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_uninamereplace (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_unknownhandler (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_xmlcharnamereplace (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_xmlcharrefreplace (test.test_codeccallbacks.CodecCallbackTest) ... ok
test_xmlcharrefreplace_with_surrogates (test.test_codeccallbacks.CodecCallbackTest) ... FAIL
test_xmlcharrefvalues (test.test_codeccallbacks.CodecCallbackTest) ... ok

======================================================================
FAIL: test_xmlcharrefreplace_with_surrogates (test.test_codeccallbacks.CodecCallbackTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/cpython/Lib/test/test_codeccallbacks.py", line 93, in test_xmlcharrefreplace_with_surrogates
    exp, msg='%r.encode(%r)' % (s, encoding))
AssertionError: u'\U0001f49d'.encode('ascii')

----------------------------------------------------------------------
Ran 29 tests in 0.071s

FAILED (failures=1)
test test_codeccallbacks failed -- Traceback (most recent call last):
  File "/tmp/cpython/Lib/test/test_codeccallbacks.py", line 93, in test_xmlcharrefreplace_with_surrogates
    exp, msg='%r.encode(%r)' % (s, encoding))
AssertionError: u'\U0001f49d'.encode('ascii')

1 test failed:
    test_codeccallbacks
$ LD_LIBRARY_PATH="$(pwd)" ./python Lib/test/regrtest.py -v test_unicode
...
$ LD_LIBRARY_PATH="$(pwd)" ./python Lib/test/regrtest.py -v test_unicode
== CPython 2.7.6rc1 (2.7:dd12639b82bf, Oct 30 2013, 23:53:21) [GCC 4.8.1]
==   Linux-3.11.6
==   /tmp/cpython/build/test_python_7518
Testing with flags: sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, tabcheck=0, verbose=0, unicode=0, bytes_warning=0, hash_randomization=0)
test_unicode
test___contains__ (test.test_unicode.UnicodeTest) ... ok
test__format__ (test.test_unicode.UnicodeTest) ... ok
test_bug1001011 (test.test_unicode.UnicodeTest) ... ok
test_capitalize (test.test_unicode.UnicodeTest) ... ok
test_center (test.test_unicode.UnicodeTest) ... ok
test_codecs (test.test_unicode.UnicodeTest) ... ok
test_codecs_charmap (test.test_unicode.UnicodeTest) ... ok
test_codecs_errors (test.test_unicode.UnicodeTest) ... ok
test_codecs_idna (test.test_unicode.UnicodeTest) ... ok
test_codecs_utf7 (test.test_unicode.UnicodeTest) ... ok
test_codecs_utf8 (test.test_unicode.UnicodeTest) ... ok
test_comparison (test.test_unicode.UnicodeTest) ... ok
test_concatenation (test.test_unicode.UnicodeTest) ... ok
test_constructor (test.test_unicode.UnicodeTest) ... ok
test_contains (test.test_unicode.UnicodeTest) ... ok
test_conversion (test.test_unicode.UnicodeTest) ... ok
test_count (test.test_unicode.UnicodeTest) ... ok
test_encode_decimal (test.test_unicode.UnicodeTest) ... ok
test_encode_decimal_with_surrogates (test.test_unicode.UnicodeTest) ... FAIL
test_endswith (test.test_unicode.UnicodeTest) ... ok
test_expandtabs (test.test_unicode.UnicodeTest) ... ok
test_expandtabs_overflows_gracefully (test.test_unicode.UnicodeTest) ... ok
test_extended_getslice (test.test_unicode.UnicodeTest) ... ok
test_find (test.test_unicode.UnicodeTest) ... ok
test_find_etc_raise_correct_error_messages (test.test_unicode.UnicodeTest) ... ok
test_floatformatting (test.test_unicode.UnicodeTest) ... ok
test_format (test.test_unicode.UnicodeTest) ... ok
test_format_auto_numbering (test.test_unicode.UnicodeTest) ... ok
test_format_float (test.test_unicode.UnicodeTest) ... ok
test_format_huge_item_number (test.test_unicode.UnicodeTest) ... ok
test_format_huge_precision (test.test_unicode.UnicodeTest) ... ok
test_format_huge_width (test.test_unicode.UnicodeTest) ... ok
test_format_subclass (test.test_unicode.UnicodeTest) ... ok
test_formatting (test.test_unicode.UnicodeTest) ... ok
test_formatting_huge_precision (test.test_unicode.UnicodeTest) ... ok
test_formatting_huge_width (test.test_unicode.UnicodeTest) ... ok
test_hash (test.test_unicode.UnicodeTest) ... ok
test_index (test.test_unicode.UnicodeTest) ... ok
test_inplace_rewrites (test.test_unicode.UnicodeTest) ... ok
test_isalnum (test.test_unicode.UnicodeTest) ... ok
test_isalnum_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_isalpha (test.test_unicode.UnicodeTest) ... ok
test_isalpha_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_isdecimal (test.test_unicode.UnicodeTest) ... ok
test_isdecimal_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_isdigit (test.test_unicode.UnicodeTest) ... ok
test_isdigit_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_islower (test.test_unicode.UnicodeTest) ... ok
test_islower_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_isnumeric (test.test_unicode.UnicodeTest) ... ok
test_isnumeric_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_isspace (test.test_unicode.UnicodeTest) ... ok
test_isspace_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_issue8271 (test.test_unicode.UnicodeTest) ... ok
test_istitle (test.test_unicode.UnicodeTest) ... ok
test_istitle_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_isupper (test.test_unicode.UnicodeTest) ... ok
test_isupper_non_bmp (test.test_unicode.UnicodeTest) ... ok
test_join (test.test_unicode.UnicodeTest) ... ok
test_literals (test.test_unicode.UnicodeTest) ... ok
test_ljust (test.test_unicode.UnicodeTest) ... ok
test_lower (test.test_unicode.UnicodeTest) ... ok
test_mul (test.test_unicode.UnicodeTest) ... ok
test_none_arguments (test.test_unicode.UnicodeTest) ... ok
test_partition (test.test_unicode.UnicodeTest) ... ok
test_printing (test.test_unicode.UnicodeTest) ... ok
test_raiseMemError (test.test_unicode.UnicodeTest) ... ok
test_replace (test.test_unicode.UnicodeTest) ... ok
test_replace_overflow (test.test_unicode.UnicodeTest) ... ok
test_repr (test.test_unicode.UnicodeTest) ... ok
test_rfind (test.test_unicode.UnicodeTest) ... ok
test_rindex (test.test_unicode.UnicodeTest) ... ok
test_rjust (test.test_unicode.UnicodeTest) ... ok
test_rpartition (test.test_unicode.UnicodeTest) ... ok
test_rsplit (test.test_unicode.UnicodeTest) ... ok
test_slice (test.test_unicode.UnicodeTest) ... ok
test_split (test.test_unicode.UnicodeTest) ... ok
test_splitlines (test.test_unicode.UnicodeTest) ... ok
test_startswith (test.test_unicode.UnicodeTest) ... ok
test_startswith_endswith_errors (test.test_unicode.UnicodeTest) ... ok
test_strip (test.test_unicode.UnicodeTest) ... ok
test_subscript (test.test_unicode.UnicodeTest) ... ok
test_surrogates (test.test_unicode.UnicodeTest) ... ok
test_swapcase (test.test_unicode.UnicodeTest) ... ok
test_title (test.test_unicode.UnicodeTest) ... ok
test_translate (test.test_unicode.UnicodeTest) ... ok
test_ucs4 (test.test_unicode.UnicodeTest) ... ok
test_unicode_repr (test.test_unicode.UnicodeTest) ... ok
test_upper (test.test_unicode.UnicodeTest) ... ok
test_utf8_decode_invalid_sequences (test.test_unicode.UnicodeTest) ... ok
test_utf8_decode_valid_sequences (test.test_unicode.UnicodeTest) ... ok
test_zfill (test.test_unicode.UnicodeTest) ... ok

======================================================================
FAIL: test_encode_decimal_with_surrogates (test.test_unicode.UnicodeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/cpython/Lib/test/test_unicode.py", line 1672, in test_encode_decimal_with_surrogates
    '123' + exp)
  File "/tmp/cpython/Lib/test/test_unicode.py", line 45, in assertEqual
    super(UnicodeTest, self).assertEqual(first, second, msg)
AssertionError: '123&#128157;' != '123&#55357;&#56477;'

----------------------------------------------------------------------
Ran 92 tests in 16.002s

FAILED (failures=1)
test test_unicode failed -- Traceback (most recent call last):
  File "/tmp/cpython/Lib/test/test_unicode.py", line 1672, in test_encode_decimal_with_surrogates
    '123' + exp)
  File "/tmp/cpython/Lib/test/test_unicode.py", line 45, in assertEqual
    super(UnicodeTest, self).assertEqual(first, second, msg)
AssertionError: '123&#128157;' != '123&#55357;&#56477;'

1 test failed:
    test_unicode
msg201803 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-31 11:28
This issue is not a release blocker because it affects only testing.

Here is a patch which should fix the tests.
msg201804 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-31 11:31
+        if u'\ud83d\udc9d' != u'\U0001f49d':

I would prefer a test on sys.maxunicode, something like:

   if sys.maxunicode == 0xffff:

Oh, I didn't remember that Python supports surrogate pairs, but not always. Support of non-BMP characters in Python 2 is ugly :-)
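
For readers unfamiliar with the check suggested above, here is a minimal sketch of how sys.maxunicode separates the two build types. The helper name build_kind is hypothetical and not part of the patch; note that Python 3.3 and later always report 0x10FFFF, so the narrow case can only be observed on older interpreters.

```python
import sys


def build_kind(maxunicode=sys.maxunicode):
    """Classify an interpreter by its sys.maxunicode value.

    build_kind is a hypothetical helper for illustration only.
    """
    if maxunicode == 0xFFFF:
        # Narrow (UCS-2) build: non-BMP characters are stored as
        # surrogate pairs.
        return "narrow"
    if maxunicode == 0x10FFFF:
        # Wide (UCS-4) build, or any Python 3.3+ interpreter.
        return "wide"
    raise ValueError("unexpected sys.maxunicode: %#x" % maxunicode)


print(build_kind())
```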
msg201813 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-31 13:19
> +        if u'\ud83d\udc9d' != u'\U0001f49d':
>
> I would prefer a test on sys.maxunicode, something like:
>
>    if sys.maxunicode == 0xffff:

No. 1. The check is true only on a wide build. 2. It depends on how the test module was loaded: true if it was loaded from a .py file and false if it was loaded from a .py[co] file.
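
The load-dependent behaviour can be sketched by analogy in Python 3. The sketch below is not the Python 2 marshal code path; it assumes only that some serialization round trip merges a surrogate pair into one code point, and uses a UTF-16 round trip with the surrogatepass error handler as a stand-in. The function name roundtrip is hypothetical.

```python
# Two spellings of U+1F49D: an explicit surrogate pair and a single
# non-BMP code point. On an interpreter with full code point storage
# they are different strings.
pair = '\ud83d\udc9d'
single = '\U0001f49d'


def roundtrip(s):
    """Stand-in for a save/load cycle that merges surrogate pairs.

    Assumption: this is an analogy for the .pyc round trip, not the
    actual mechanism used by Python 2.
    """
    return s.encode('utf-16-le', 'surrogatepass').decode('utf-16-le')


print(pair != single)             # the check as seen in freshly compiled source
print(roundtrip(pair) != single)  # the same check after the round trip
```

Under that assumption, a condition such as `pair != single` holds for the freshly compiled literal but no longer holds once the literal has been through the round trip, which is why the check cannot be replaced by a build-wide constant.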
msg201814 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-31 13:27
> 2. It depends on how the test module was loaded: true if it was loaded from a .py file and false if it was loaded from a .py[co] file.

I tested with Python compiled as a narrow build and as a wide build: the value of len(u'\ud83d\udc9d') changes depending on whether the file was compiled to PYC in a narrow or a wide build. Oh, I have a headache. I didn't remember that Python 2 was so badly broken with non-BMP characters :-p
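
The arithmetic behind the two lengths can be worked through directly. The sketch below uses hypothetical helper names and plain integer arithmetic, so it does not depend on the build type; it shows how U+1F49D maps to the pair D83D/DC9D and which numeric character references each spelling would produce.

```python
def split_surrogates(cp):
    """Split a non-BMP code point into a (high, low) surrogate pair."""
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def join_surrogates(high, low):
    """Combine a (high, low) surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def charref(cp):
    """Format a code point as a decimal XML character reference."""
    return '&#%d;' % cp


high, low = split_surrogates(0x1F49D)
print(hex(high), hex(low))            # 0xd83d 0xdc9d
print(charref(0x1F49D))               # &#128157;
print(charref(high) + charref(low))   # &#55357;&#56477;
```

A string holding one code point yields one reference, while a string holding the two surrogates yields two, so a test that expects one form fails when the literal has silently become the other.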
msg201821 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-10-31 15:06
New changeset 8d5df9602a72 by Serhiy Storchaka in branch '2.7':
Issue #19457: Fixed xmlcharrefreplace tests on wide build when tests are
http://hg.python.org/cpython/rev/8d5df9602a72
msg201826 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-31 15:24
Thank you, Arfrever, for your report. And please describe the problem in the body of the issue, not in its title.
History
Date User Action Args
2022-04-11 14:57:52 admin set github: 63656
2013-10-31 15:24:47 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg201826; stage: patch review -> resolved
2013-10-31 15:06:48 python-dev set nosy: + python-dev; messages: + msg201821
2013-10-31 13:27:33 vstinner set messages: + msg201814
2013-10-31 13:19:06 serhiy.storchaka set messages: + msg201813
2013-10-31 11:31:26 vstinner set messages: + msg201804
2013-10-31 11:28:38 serhiy.storchaka set files: + issue19457.patch; priority: release blocker -> normal; components: + Tests; keywords: + patch; type: behavior; messages: + msg201803; stage: patch review
2013-10-30 23:33:42 Arfrever create