classification
Title: test failure in test_codeccallbacks
Type: behavior Stage: resolved
Components: Tests Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ezio.melotti Nosy List: brett.cannon, eric.snow, ezio.melotti, ncoghlan, pitrou, python-dev, serhiy.storchaka, terry.reedy
Priority: high Keywords: patch

Created on 2013-08-11 00:03 by pitrou, last changed 2013-08-17 00:52 by eric.snow. This issue is now closed.

Files
File name Uploaded Description Edit
issue18706.patch serhiy.storchaka, 2013-08-11 06:49 review
issue18706.diff ezio.melotti, 2013-08-11 10:24
Messages (11)
msg194854 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-11 00:03
Since issue #18681, the following test sequence fails:

./python -m test -v test_codecencodings_kr test_imp test_codeccallbacks

[...]

======================================================================
ERROR: test_xmlcharnamereplace (test.test_codeccallbacks.CodecCallbackTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/cpython/default/Lib/test/test_codeccallbacks.py", line 112, in test_xmlcharnamereplace
    self.assertEqual(sin.encode("ascii", "test.xmlcharnamereplace"), sout)
  File "/home/antoine/cpython/default/Lib/test/test_codeccallbacks.py", line 102, in xmlcharnamereplace
    l.append("&%s;" % html.entities.codepoint2name[ord(c)])
AttributeError: 'module' object has no attribute 'entities'
msg194859 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-08-11 01:09
Failure is intermittent. The only related lines in test_codeccallbacks are

002: import html.entities
102:    l.append("&%s;" % html.entities.codepoint2name[ord(c)])
msg194868 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-11 06:49
There are two ways to fix this issue -- one change test_imp.py and other change test_codeccallbacks.py. The proposed patch contains both.
msg194870 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-11 07:42
However it looks weird:

>>> def f():
...     import html.entities
...     del sys.modules['html']
... 
>>> f()
>>> import html.entities
>>> html.entities
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'entities'

Perhaps import machinery should be fixed instead of tests.
msg194878 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-08-11 09:46
> ./python -m test -v test_codecencodings_kr test_imp test_codeccallbacks

Thanks, I was trying to reproduce the failure yesterday with test_imp test_codeccallbacks but it wasn't working -- now I can reproduce it.

My idea was to import html and html.parser and then remove them both, and that's what the patch did:

>>> import sys
>>> m = set(sys.modules)
>>> from html import parser
>>> set(sys.modules) - m
{'html.parser', 'linecache', 'tokenize', 'warnings', '_markupbase', 'html', 'token'}
>>> del sys.modules['html']
>>> set(sys.modules) - m
{'html.parser', 'linecache', 'tokenize', '_markupbase', 'warnings', 'token'}
>>> del sys.modules['html.parser']
>>> set(sys.modules) - m
{'token', '_markupbase', 'tokenize', 'warnings', 'linecache'}

I think the problem is that if test_codecencodings_kr is run before test_imp, test_imp deletes html and html.parser but leaves the html.entities imported by test_codecencodings_kr, and when test_codeccallbacks tries to use it again it fails:

>>> import sys
>>> m = set(sys.modules)
>>> import html.entities
>>> set(sys.modules) - m
{'html.entities', 'html'}
>>> from html import parser
>>> set(sys.modules) - m
{'_markupbase', 'html', 'html.entities', 'tokenize', 'html.parser', 'linecache', 'warnings', 'token'}
>>> del sys.modules['html']
>>> set(sys.modules) - m
{'_markupbase', 'tokenize', 'html.entities', 'token', 'html.parser', 'linecache', 'warnings'}
>>> del sys.modules['html.parser']
>>> set(sys.modules) - m
{'_markupbase', 'tokenize', 'html.entities', 'token', 'linecache', 'warnings'}
>>> html.entities
<module 'html.entities' from '/home/wolf/dev/py/py3k/Lib/html/entities.py'>
>>> import html.entities
>>> html.entities
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'entities'

A solution would be to remove all the modules that start with 'html.' (see attached patch):

>>> import sys
>>> m = set(sys.modules)
>>> import html.entities
>>> from html import parser
>>> del sys.modules['html']
>>> del sys.modules['html.parser']
>>> set(sys.modules) - m
{'warnings', 'linecache', 'token', 'html.entities', 'tokenize', '_markupbase'}
>>> del sys.modules['html.entities']
>>> import html.entities
>>> html.entities
<module 'html.entities' from '/home/wolf/dev/py/py3k/Lib/html/entities.py'>

Another (possibly better) solution would be to copy sys.modules before messing with it and restore it afterwards.  I was looking in test.support for a context manager to do it but I didn't find it, so I just deleted the modules I imported manually.
msg194879 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-11 09:52
> Perhaps import machinery should be fixed instead of tests.

Yes, the import machinery is acting weird here.
msg194886 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-11 10:23
Perhaps `del sys.modules['module']` should remove all 'module.submodule' from sys.modules. Or perhaps `import module.submodule` should ensure that 'module' is in sys.modules and has the 'submodule' attribute.

Ezio, seems you forgot to attach a patch.
msg194888 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-11 11:05
My preference goes to Serhiy's fix for test_imp.
Note that the import machinery oddity would deserve fixing too (in a separate issue perhaps ?).
msg194895 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-08-11 15:38
> My preference goes to Serhiy's fix for test_imp.

Fair enough.  Serhiy do you want to commit your fix?

> Note that the import machinery oddity would deserve fixing too
> (in a separate issue perhaps ?).

This should be a separate issue.
msg194904 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-08-11 17:14
New changeset dab790a17c4d by Serhiy Storchaka in branch '3.3':
Issue #18706: Fix a test for issue #18681 so it no longer breaks test_codeccallbacks*.
http://hg.python.org/cpython/rev/dab790a17c4d

New changeset 1f4aed2c914c by Serhiy Storchaka in branch 'default':
Issue #18706: Fix a test for issue #18681 so it no longer breaks test_codeccallbacks*.
http://hg.python.org/cpython/rev/1f4aed2c914c
msg195445 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-08-17 00:51
Regarding this:

>>> def f():
...     import html.entities
...     del sys.modules['html']
...
>>> f()
>>> import html.entities
>>> html.entities
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'entities'

This is due to the fact that the code to add the submodule to the package's namespace is associated with the import of the submodule, not the parent (aka, package).  So the import after calling f() happily bypasses all that since the submodule already exists in sys.modules.

skip loading the module: 

http://hg.python.org/cpython/file/default/Lib/importlib/_bootstrap.py#l1612

bind the submodule to the parent:

http://hg.python.org/cpython/file/default/Lib/importlib/_bootstrap.py#l1568

I wouldn't necessarily say the import machinery is acting weird.  Directly messing with sys.modules, particularly for submodules, isn't something the rest of the import code explicitly accommodates (outside of loaders).  In this case the code doing the removing needs to make sure submodules are removed too.

If anybody thinks we should support this sort of interaction with sys.modules then we could address it in another issue.  Outside of testing, what are the use cases for removing modules from sys.modules?  Regardless, a solution might be as simple as moving the parent-binding code in _find_and_load_unlocked() to _gcd_import().
History
Date User Action Args
2013-08-17 00:52:01eric.snowsetmessages: + msg195445
2013-08-16 16:44:19ezio.melottisetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2013-08-11 17:14:24python-devsetnosy: + python-dev
messages: + msg194904
2013-08-11 15:38:10ezio.melottisetmessages: + msg194895
2013-08-11 11:05:22pitrousetmessages: + msg194888
2013-08-11 10:24:25ezio.melottisetfiles: + issue18706.diff
2013-08-11 10:23:45serhiy.storchakasetmessages: + msg194886
2013-08-11 09:52:16pitrousetmessages: + msg194879
2013-08-11 09:46:52ezio.melottisetmessages: + msg194878
2013-08-11 07:42:45serhiy.storchakasetnosy: + brett.cannon, ncoghlan, eric.snow
messages: + msg194870
2013-08-11 06:49:36serhiy.storchakasetfiles: + issue18706.patch
keywords: + patch
messages: + msg194868

stage: needs patch -> patch review
2013-08-11 06:11:33serhiy.storchakasetnosy: + serhiy.storchaka
2013-08-11 01:09:01terry.reedysetnosy: + terry.reedy
messages: + msg194859
2013-08-11 00:03:15pitroucreate