classification
Title: embedded interpreter or virtualenv fails with "ImportError: cannot import name MAXREPEAT"
Type: crash Stage: resolved
Components: Library (Lib), Regular Expressions Versions: Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Alex.Burka, Julian, benjamin.peterson, ezio.melotti, georg.brandl, mrabarnett, ned.deily, python-dev, samueljohn, serhiy.storchaka, xuhdev
Priority: high Keywords: patch

Created on 2013-05-24 17:40 by samueljohn, last changed 2013-09-20 18:42 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
sre_MAXREPEAT.patch serhiy.storchaka, 2013-09-16 13:32
Messages (13)
msg189924 - (view) Author: Samuel John (samueljohn) Date: 2013-05-24 17:40
As also discussed at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=704084 and https://github.com/mxcl/homebrew/pull/19300, Python 2.7.4 and 2.7.5 seem to have added a `from _sre import MAXREPEAT` to the sre_compile.py, sre_parse.py and sre_constants.py modules.

But Python 2.7.3 (and older?) doesn't have the built-in MAXREPEAT in _sre. Some virtualenvs have to be updated (which is easy), but some software (such as vim, shipped with OS X 10.8.3) is statically linked to an older Python 2.7.2 (I guess) yet somehow picks up my newly built Python 2.7.5 and attempts to load its site.py. (Weechat is also reported to be affected.)

I think this is more a bug in vim/weechat etc., but we at Homebrew have to do some "hacky" fix, because Apple is not going to update vim very soon, and so having a newer Python in the path breaks system stuff.

So I am fine if you close this again. But at least we have a reference, or perhaps you guys have a better idea how to work around it.

For homebrew, I propose a monkey-patch in re.py to the _sre module if it does not have a MAXREPEAT.


    try:
        from _sre import MAXREPEAT
    except ImportError:
        import _sre
        # This monkey-patches all other places that do "from _sre import MAXREPEAT".
        _sre.MAXREPEAT = 65535
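
To illustrate why patching the module object is enough, here is a minimal sketch that simulates the situation with a stand-in module instead of the real `_sre`. The module name `_old_sre_demo` is hypothetical and exists only for this demonstration; the fallback value 65535 is the one from the snippet above.

```python
import sys
import types

# Hypothetical stand-in for an old built-in _sre that predates MAXREPEAT.
old_sre = types.ModuleType("_old_sre_demo")
sys.modules["_old_sre_demo"] = old_sre

try:
    from _old_sre_demo import MAXREPEAT  # fails: the attribute is missing
except ImportError:
    import _old_sre_demo
    # Assumed fallback value, taken from the snippet above.
    MAXREPEAT = _old_sre_demo.MAXREPEAT = 65535

# Any later "from ... import MAXREPEAT" now succeeds against the patched module.
from _old_sre_demo import MAXREPEAT as maxrepeat_again
print(MAXREPEAT, maxrepeat_again)
```

Because the attribute is set on the module object held in `sys.modules`, every later importer sees it, which is why one patch in one place would cover the other sre modules.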
msg190132 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-05-27 06:01
After spending some time investigating this issue, I believe that potential upgrade compatibility issues have been introduced by the changes for Issue13169.  How critical they are and, in particular, whether they violate our implicit promises of maintenance (point) release compatibility are questions for discussion.

The signature of the problem is "ImportError: cannot import name MAXREPEAT" as a result of an attempt to import re or a module that itself imports re:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nad/issue18050/u/lib/python2.7/re.py", line 105, in <module>
    import sre_compile
  File "/home/nad/issue18050/u/lib/python2.7/sre_compile.py", line 14, in <module>
    import sre_parse
  File "/home/nad/issue18050/u/lib/python2.7/sre_parse.py", line 17, in <module>
    from sre_constants import *
  File "/home/nad/issue18050/u/lib/python2.7/sre_constants.py", line 18, in <module>
    from _sre import MAXREPEAT
ImportError: cannot import name MAXREPEAT

The changes for Issue13169 moved the definition of MAXREPEAT into C code and then added an import of the new C constant into Lib/sre_constants.py to continue to provide sre_constants.MAXREPEAT for third-party modules that have been using it.  As long as the versions of the Python interpreter and the standard library Python files (sys.prefix/lib/pythonX.Y) remain in sync, there is not a problem.  However, if a situation arises where a pre-13169 interpreter is used with a post-13169 standard library, the "cannot import name MAXREPEAT" ImportError will occur.  I have found at least two situations where this can happen:

1. when a C application has statically embedded a pre-13169 interpreter and the standard library pointed to by its sys.prefix gets upgraded to a post-13169 version.  The interpreter then crashes during initialization in Lib/site.py which imports re in both Python 2 and 3 (for different purposes).

2. when a virtualenv created with a pre-13169 non-shared interpreter is used with an upgraded post-13169 standard library.  In this case, the interpreter makes it past initialization because virtualenv (at least, the current version) creates a modified site.py in the virtualenv lib/pythonX.Y that happens to not import re.  However, the import error will occur on the first use of re.  Side note: the 3.3 standard library pyvenv does not seem to have this problem, since the created venv symlinks to the sys.prefix interpreter and libs rather than copying them, as virtualenv does.

Note that Pythons built with --enable-shared (or --enable-framework on OS X) generally will not have a problem as long as the shared libpythonX.Y and the standard library remain consistent.  That is, in both cases above, a Python upgrade will automatically cause both the embedded app and the virtualenv to run with the newer interpreter.  AFAICT, the problems will only be seen when using a non-shared Python.
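
A rough way to check, from inside a running interpreter, whether it is exposed to this mismatch. This is only a sketch that assumes CPython; the function name `sre_mismatch_report` and the dictionary keys are made up for illustration. It deliberately avoids importing re, so it should still run in an interpreter where `import re` fails.

```python
import sys

def sre_mismatch_report():
    """Report whether the running interpreter's _sre provides MAXREPEAT."""
    import _sre  # normally compiled into the interpreter on CPython
    return {
        "interpreter_version": sys.version.split()[0],
        "stdlib_prefix": sys.prefix,
        "sre_has_maxrepeat": hasattr(_sre, "MAXREPEAT"),
    }

report = sre_mismatch_report()
print(report)
```

If `sre_has_maxrepeat` is false while the standard library under `stdlib_prefix` comes from one of the newer releases listed below, importing re would be expected to fail as described above.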

I believe the upgrades affected by this problem are:

2.7 through 2.7.3 upgraded to 2.7.4 or 2.7.5
3.3.0 upgraded to 3.3.1 or 3.3.2
3.2 through 3.2.3 upgraded to 3.2.4 or 3.2.5 (unverified)

The problem should be fixable by applying a patch along the lines suggested by Samuel.  Regardless of whether this is a compatibility break or not, I think we should fix the problem because people are already running into it.  (Nosying the release managers for their input.)

While related, the root cause of the vim problem reported above is probably more complicated because, although it appears to embed a Python interpreter, the standard library used by the OS X system vim appears to depend on $PATH, apparently incorrect behavior in vim.  Unfortunately, OS X vim users on 10.8 (probably also on 10.7) may encounter this problem when they try to use :py if they install an updated version of Python 2.7, such as from python.org or a third-party distributor like Homebrew or MacPorts.  And, when vim crashes due to the import error, it leaves the terminal settings in an unusable state.  One user workaround might be to create a shell function or alias to tweak PATH before using vim to ensure /usr/bin/python2.7 is found first.  Or simply patch re.py in the upgraded Python.
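
The PATH workaround could also be expressed as a small Python launcher instead of a shell function. This is a sketch only: the helper name `env_with_dir_first` is hypothetical, and whether reordering PATH is sufficient for a given vim build has not been verified here.

```python
import os

def env_with_dir_first(environ, preferred_dir="/usr/bin"):
    """Return a copy of environ whose PATH lists preferred_dir first."""
    parts = [p for p in environ.get("PATH", "").split(os.pathsep)
             if p and p != preferred_dir]
    new_env = dict(environ)
    new_env["PATH"] = os.pathsep.join([preferred_dir] + parts)
    return new_env

# Demonstration on a made-up PATH value.
demo = env_with_dir_first(
    {"PATH": os.pathsep.join(["/usr/local/bin", "/usr/bin", "/bin"])}
)
print(demo["PATH"])
```

A wrapper script could then start vim with something like `subprocess.call(["vim"], env=env_with_dir_first(os.environ))`, so that /usr/bin/python2.7 is found before any newer Python.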
msg190138 - (view) Author: Samuel John (samueljohn) Date: 2013-05-27 12:59
Ned, incredibly helpful description. Thanks for investigating! I have nothing to add to that.
msg190195 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-28 11:19
I am afraid that importing MAXREPEAT is not the only issue. A short time ago, CODESIZE was increased from 2 to 4 on narrow builds (issue1160). This makes compiled patterns generated by Lib/sre_compile.py incompatible with the old _sre module on narrow builds.

I think it is a bad idea to mix different versions of the Python stdlib code and the corresponding binary extensions. There are other examples where Python and C code changed synchronously and the Python code depends on new names exposed by a C module (e.g. 35ef949e85d7, b6ec3b717f7e).
msg190252 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-05-28 21:44
Serhiy, while I don't disagree with your points, I should have made clearer that the issue here is that the _sre module is a static module (built into the interpreter executable or shared lib, as shown in Modules/Setup.dist) and *not* included in the shared library directory (sys.prefix/lib/pythonX.Y/), whereas sre_constants *is*. If the files produced by both the Python and the C changes end up in sys.prefix/lib/pythonX.Y, there is not a problem.  That's normally the case, and I believe that is the case with both of the other examples you cited.  So they are not going to exhibit this problem.  The problem arises when a change introduces a dependency between static and shared modules, like this one does.
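
The static-versus-file distinction can be inspected at run time. A minimal sketch, assuming CPython; the helper name `module_origin` is made up for illustration.

```python
import importlib
import sys

def module_origin(name):
    """Return 'built-in' for modules compiled into the interpreter, else the file path."""
    if name in sys.builtin_module_names:
        return "built-in"
    module = importlib.import_module(name)
    return getattr(module, "__file__", None) or "unknown"

for name in ("sys", "os"):
    print(name, module_origin(name))
```

Calling `module_origin("_sre")` on a typical build should report it as built-in, while pure-Python modules such as sre_constants resolve to a path under the standard library directory. That split is what allows the two halves to get out of sync.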
msg190343 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-05-30 06:16
Another report of users being affected by this issue:  https://trac.macports.org/ticket/39207
msg190344 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-05-30 06:25
Also: http://stackoverflow.com/questions/16301735/importerror-cannot-import-name-maxrepeat-with-cx-freeze
msg190351 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-30 09:38
I just think that the patch only silences an import error. Does test_re pass with this patch on a 32-bit platform with <=2.7.3 static binaries and 2.7.5 .py files?
msg197897 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-16 13:32
Well, while running different versions of binaries and Python files is not a good idea, perhaps we can apply this change. But only for 2.7 and 3.3. There is no need for this garbage in 3.4.

I'm still not sure that there are no other inconsistencies between old static binaries and newer Python files.
msg197955 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-09-17 00:29
The patch LGTM. And I agree that the fix is not needed for 3.4.  Thanks, Serhiy.

I verified that it does solve the "embedded" problem (case 1 above) when using embedded versions of all previous releases of 2.7.x (except 2.7.0) and 3.3.x.

For the record, it appears 2.7.1 introduced a separate incompatibility issue that causes a similar initialization crash with an embedded version of 2.7.0:

Traceback (most recent call last):
  File "/py/test_issue18050/root/lib/python2.7/site.py", line 62, in <module>
    import os
  File "/py/test_issue18050/root/lib/python2.7/os.py", line 398, in <module>
    import UserDict
  File "/py/test_issue18050/root/lib/python2.7/UserDict.py", line 83, in <module>
    import _abcoll
  File "/py/test_issue18050/root/lib/python2.7/_abcoll.py", line 11, in <module>
    from abc import ABCMeta, abstractmethod
  File "/py/test_issue18050/root/lib/python2.7/abc.py", line 8, in <module>
    from _weakrefset import WeakSet
  File "/py/test_issue18050/root/lib/python2.7/_weakrefset.py", line 5, in <module>
    from _weakref import ref
ImportError: No module named _weakref

Googling shows a number of reports of users who have run into that one, too, though no one seems to have opened an issue here about it.  I don't think it's worth trying to fix that one at this point as there probably aren't that many instances of system executables that still have embedded static 2.7.0 interpreters anymore (I hope).  Unfortunately, that's not the case for embedded static 2.7.2, e.g. vim on OS X 10.8.x.
msg197958 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-09-17 03:02
To answer your earlier question, there are other inter-version incompatibilities in some of the non-static standard library modules such that test_re cannot be run without errors.  However, applying the patch at least allows the embedded interpreter to not crash during initialization, a big improvement over the current situation.
msg198156 - (view) Author: Roundup Robot (python-dev) Date: 2013-09-20 18:30
New changeset 68a7d77a90c3 by Serhiy Storchaka in branch '3.3':
Issue #18050: Fixed an incompatibility of the re module with Python 3.3.0
http://hg.python.org/cpython/rev/68a7d77a90c3

New changeset f27af2243e2a by Serhiy Storchaka in branch '2.7':
Issue #18050: Fixed an incompatibility of the re module with Python 2.7.3
http://hg.python.org/cpython/rev/f27af2243e2a
msg198159 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-20 18:42
Thank you, Samuel, for your report and suggested solution. Thank you, Ned, for the additional investigation.
History
Date User Action Args
2013-09-20 18:42:48 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg198159; stage: commit review -> resolved
2013-09-20 18:30:34 python-dev set nosy: + python-dev; messages: + msg198156
2013-09-17 03:02:21 ned.deily set messages: + msg197958
2013-09-17 00:29:19 ned.deily set messages: + msg197955; stage: patch review -> commit review
2013-09-16 13:32:16 serhiy.storchaka set files: + sre_MAXREPEAT.patch; versions: - Python 3.4; messages: + msg197897; assignee: serhiy.storchaka; keywords: + patch; stage: needs patch -> patch review
2013-06-01 20:20:02 xuhdev set nosy: + xuhdev
2013-05-30 09:38:17 serhiy.storchaka set messages: + msg190351
2013-05-30 06:25:28 ned.deily set messages: + msg190344
2013-05-30 06:16:37 ned.deily set messages: + msg190343
2013-05-28 21:44:48 ned.deily set messages: + msg190252
2013-05-28 11:19:09 serhiy.storchaka set messages: + msg190195
2013-05-27 12:59:31 samueljohn set messages: + msg190138
2013-05-27 12:49:50 Julian set nosy: + Julian
2013-05-27 06:01:10 ned.deily set priority: normal -> high; components: + Library (Lib), - Extension Modules; title: _sre.MAXREPEAT not defined in 2.7.3 -> embedded interpreter or virtualenv fails with "ImportError: cannot import name MAXREPEAT"; nosy: + georg.brandl, benjamin.peterson, serhiy.storchaka; versions: + Python 3.3, Python 3.4; messages: + msg190132; stage: needs patch
2013-05-24 22:08:38 ned.deily set nosy: + ned.deily
2013-05-24 21:10:14 Alex.Burka set nosy: + Alex.Burka
2013-05-24 17:40:43 samueljohn create