Created on 2013-05-24 17:40 by samueljohn, last changed 2013-09-20 18:42 by serhiy.storchaka. This issue is now closed.
|sre_MAXREPEAT.patch||serhiy.storchaka, 2013-09-16 13:32||review|
|msg189924 - (view)||Author: Samuel John (samueljohn)||Date: 2013-05-24 17:40|
As also discussed at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=704084 and https://github.com/mxcl/homebrew/pull/19300, Python 2.7.4 and 2.7.5 seem to have added a `from _sre import MAXREPEAT` to the sre_compile.py, sre_parse.py and sre_constants.py modules. But Python 2.7.3 (and older?) doesn't have the built-in MAXREPEAT in _sre.

Some virtualenvs have to be updated (which is easy), but some software (such as vim, shipped with OS X 10.8.3) is statically linked to an older Python 2.7.2 (I guess) but somehow picks up my newly built Python 2.7.5 and attempts to load its site.py. (Weechat is also reported to be affected.)

I think this is more a bug of vim/weechat etc., but we at homebrew have to do some "hacky" fix, because Apple is not going to update vim very soon, and so having a newer Python in the path breaks system stuff. So I am fine if you close this here again. But at least we have a reference, or perhaps you guys have a better idea how to work around it.

For homebrew, I propose a monkey-patch in re.py to the _sre module if it does not have a MAXREPEAT:

    try:
        from _sre import MAXREPEAT
    except ImportError:
        import _sre
        _sre.MAXREPEAT = 65535  # this monkey-patches all other places of "from _sre import MAXREPEAT"
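The proposed fallback can be tried without an old interpreter at hand. What follows is a minimal sketch, not the patch itself: it builds a stand-in module object that lacks MAXREPEAT and applies the same "use it if present, otherwise patch it in" logic. The names `ensure_maxrepeat` and `fake_sre` are hypothetical, and the default of 65535 is simply the value from the snippet above.

```python
import types


def ensure_maxrepeat(module, default=65535):
    """Return module.MAXREPEAT, setting it to `default` first if it is missing."""
    try:
        # Attribute access raises AttributeError where
        # "from module import MAXREPEAT" would raise ImportError.
        return module.MAXREPEAT
    except AttributeError:
        # Old binary: add the attribute so that later lookups succeed too.
        module.MAXREPEAT = default
        return default


# Hypothetical stand-in for an old _sre that does not export MAXREPEAT.
fake_sre = types.ModuleType("fake_sre")
value = ensure_maxrepeat(fake_sre)
print("MAXREPEAT is now %d" % value)
```

Because the attribute is written back onto the module object, every later import of the name from that module sees the same value, which is the point of patching _sre rather than only defining a local constant.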
|msg190132 - (view)||Author: Ned Deily (ned.deily) *||Date: 2013-05-27 06:01|
After spending some time investigating this issue, I believe that potential upgrade compatibility issues have been introduced by the changes for Issue13169. How critical they are and, in particular, whether they violate our implicit promises of maintenance (point) release compatibility are questions for discussion.

The signature of the problem is "ImportError: cannot import name MAXREPEAT" as a result of an attempt to import re or a module that itself imports re:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/nad/issue18050/u/lib/python2.7/re.py", line 105, in <module>
        import sre_compile
      File "/home/nad/issue18050/u/lib/python2.7/sre_compile.py", line 14, in <module>
        import sre_parse
      File "/home/nad/issue18050/u/lib/python2.7/sre_parse.py", line 17, in <module>
        from sre_constants import *
      File "/home/nad/issue18050/u/lib/python2.7/sre_constants.py", line 18, in <module>
        from _sre import MAXREPEAT
    ImportError: cannot import name MAXREPEAT

The changes for Issue13169 moved the definition of MAXREPEAT into C code and then added an import of the new C constant into Lib/sre_constants.py to continue to provide sre_constants.MAXREPEAT for third-party modules that have been using it. As long as the versions of the Python interpreter and the standard library Python files (sys.prefix/lib/pythonX.Y) remain in sync, there is not a problem. However, if a situation arises where a pre-13169 interpreter is used with a post-13169 standard library, the "cannot import name MAXREPEAT" ImportError will occur. I have found at least two situations where this can happen:

1. when a C application has statically embedded a pre-13169 interpreter and the standard library pointed to by its sys.prefix gets upgraded to a post-13169 version. The interpreter then crashes during initialization in Lib/site.py which imports re in both Python 2 and 3 (for different purposes).

2. when a virtualenv created with a pre-13169 non-shared interpreter is used with an upgraded post-13169 standard library. In this case, the interpreter makes it past initialization because virtualenv (at least, the current version) creates a modified site.py in the virtualenv lib/pythonX.Y that happens to not import re. However, the import error will occur on the first use of re. Side note: 3.3 standard library pyvenv does not seem to have this problem since the created venv symlinks to the sys.prefix interpreter and libs rather than copying it, like virtualenv does.

Note that Pythons built with --enable-shared (or --enable-framework on OS X) generally will not have a problem as long as the shared libpythonX.Y and the standard library remain consistent. That is, in both cases above, a Python upgrade will automatically cause both the embedded app and the virtualenv to run with the newer interpreter. AFAICT, the problems will only be seen when using a non-shared Python.

I believe the upgrades affected by this problem are:

    2.7 through 2.7.3 upgraded to 2.7.4 or 2.7.5
    3.3.0 upgraded to 3.3.1 or 3.3.2
    3.2 through 3.2.3 upgraded to 3.2.4 or 3.2.5 (unverified)

The problem should be fixable by applying a patch along the lines suggested by Samuel. Regardless of whether this is a compatibility break or not, I think we should fix the problem because people are already running into it. (Nosying the release managers for their input.)

While related, the root cause of the vim problem reported above is probably more complicated because, although it appears to embed a Python interpreter, the standard library used by the OS X system vim appears to depend on $PATH, apparently incorrect behavior in vim. Unfortunately, OS X vim users on 10.8 (probably also on 10.7) may encounter this problem when they try to use :py if they install an updated version of Python 2.7, such as from python.org or a third-party distributor like Homebrew or MacPorts. And, when vim crashes due to the import error, it leaves the terminal settings in an unusable state. One user workaround might be to create a shell function or alias to tweak PATH before using vim to ensure /usr/bin/python2.7 is found first. Or simply patch re.py in the upgraded Python.
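To tell whether a given process is exposed to the mismatch described in this message, it may help to look at the interpreter from the inside. The following is a rough diagnostic sketch only; the function name `describe_sre` and the choice of fields are illustrative. It reports the interpreter version, the sys.prefix whose standard library is in use, whether _sre is compiled into the interpreter, and whether that _sre exports MAXREPEAT.

```python
import sys


def describe_sre():
    """Collect facts relevant to the MAXREPEAT mismatch for this interpreter."""
    import _sre  # the C regex engine that sre_constants imports from

    return {
        "interpreter_version": sys.version.split()[0],
        "prefix": sys.prefix,
        "sre_is_builtin": "_sre" in sys.builtin_module_names,
        "has_MAXREPEAT": hasattr(_sre, "MAXREPEAT"),
    }


info = describe_sre()
for key in sorted(info):
    print("%s: %s" % (key, info[key]))
```

A process where `has_MAXREPEAT` is false while the standard library under `prefix` is from 2.7.4 or later (or 3.3.1 or later) would be expected to hit the ImportError on the first import of re.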
|msg190138 - (view)||Author: Samuel John (samueljohn)||Date: 2013-05-27 12:59|
Ned, incredibly helpful description. Thanks for investigating! I have nothing to add to that.
|msg190195 - (view)||Author: Serhiy Storchaka (serhiy.storchaka) *||Date: 2013-05-28 11:19|
I am afraid that importing MAXREPEAT is not the only issue. A short time ago, CODESIZE was increased from 2 to 4 on narrow builds (issue1160). This makes compiled patterns generated by Lib/sre_compile.py incompatible with the old _sre module on narrow builds. I think it is a bad idea to mix different versions of Python stdlib code and the corresponding binary extensions. There are other examples where Python and C code changed synchronously and the Python code depends on new names exposed by the C module (e.g. 35ef949e85d7, b6ec3b717f7e).
|msg190252 - (view)||Author: Ned Deily (ned.deily) *||Date: 2013-05-28 21:44|
Serhiy, while I don't disagree with your points, I should have made clearer that the issue here is that the _sre module is a static module (built into the interpreter executable or shared lib as shown in Modules/Setup.dist) and *not* included in the standard library directory (sys.prefix/lib/pythonX.Y/), whereas sre_constants *is*. If the binaries produced by both the Python and C file changes end up in sys.prefix/lib/pythonX.Y, there is not a problem. That's normally the case, and I believe that is the case with both of the other examples you cited. So they are not going to exhibit this problem. The problem arises when a change introduces a dependency between static and shared modules, like this one does.
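The static-versus-file distinction drawn here can be inspected at run time. This is a small sketch under the assumption that sys.builtin_module_names reflects the modules compiled into the running interpreter; the helper name `module_origin` is hypothetical.

```python
import sys


def module_origin(name):
    """Classify a module as built into the interpreter or loaded from a file."""
    if name in sys.builtin_module_names:
        return "built-in"
    module = __import__(name)
    path = getattr(module, "__file__", None)
    return path if path else "no file recorded"


for name in ("_sre", "re"):
    print("%s -> %s" % (name, module_origin(name)))
```

On a build where _sre is reported as built-in while re resolves to a path under sys.prefix/lib/pythonX.Y, upgrading only the files on disk changes re and its helpers but leaves _sre at the version of the executable.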
|msg190343 - (view)||Author: Ned Deily (ned.deily) *||Date: 2013-05-30 06:16|
Another report of users being affected by this issue: https://trac.macports.org/ticket/39207
|msg190344 - (view)||Author: Ned Deily (ned.deily) *||Date: 2013-05-30 06:25|
|msg190351 - (view)||Author: Serhiy Storchaka (serhiy.storchaka) *||Date: 2013-05-30 09:38|
I just think that the patch only silences an import error. Does test_re pass with this patch on a 32-bit platform with <=2.7.3 static binaries and 2.7.5 py-files?
|msg197897 - (view)||Author: Serhiy Storchaka (serhiy.storchaka) *||Date: 2013-09-16 13:32|
Well. While running different versions of binaries and Python files is not a good idea, perhaps we can apply this change. But only for 2.7 and 3.3. There is no need for this garbage in 3.4. I'm still not sure that there are no other inconsistencies between old static binaries and newer Python files.
|msg197955 - (view)||Author: Ned Deily (ned.deily) *||Date: 2013-09-17 00:29|
The patch LGTM. And I agree that the fix is not needed for 3.4. Thanks, Serhiy. I verified that it does solve the "embedded" problem (case 1 above) when using embedded versions of all previous releases of 2.7.x (except 2.7.0) and 3.3.x.

For the record, it appears 2.7.1 introduced a separate incompatibility issue that causes a similar initialization crash with an embedded version of 2.7.0:

    Traceback (most recent call last):
      File "/py/test_issue18050/root/lib/python2.7/site.py", line 62, in <module>
        import os
      File "/py/test_issue18050/root/lib/python2.7/os.py", line 398, in <module>
        import UserDict
      File "/py/test_issue18050/root/lib/python2.7/UserDict.py", line 83, in <module>
        import _abcoll
      File "/py/test_issue18050/root/lib/python2.7/_abcoll.py", line 11, in <module>
        from abc import ABCMeta, abstractmethod
      File "/py/test_issue18050/root/lib/python2.7/abc.py", line 8, in <module>
        from _weakrefset import WeakSet
      File "/py/test_issue18050/root/lib/python2.7/_weakrefset.py", line 5, in <module>
        from _weakref import ref
    ImportError: No module named _weakref

Googling shows a number of reports of users who have run into that one, too, though no one seems to have opened an issue here about it. I don't think it's worth trying to fix that one at this point as there probably aren't that many instances of system executables that still have embedded static 2.7.0 interpreters anymore (I hope). Unfortunately, that's not the case for embedded static 2.7.2, e.g. vim on OS X 10.8.x.
|msg197958 - (view)||Author: Ned Deily (ned.deily) *||Date: 2013-09-17 03:02|
To answer your earlier question, there are other inter-version incompatibilities in some of the non-static standard library modules such that test_re cannot be run without errors. However, applying the patch at least allows the embedded interpreter to not crash during initialization, a big improvement over the current situation.
|msg198156 - (view)||Author: Roundup Robot (python-dev)||Date: 2013-09-20 18:30|
New changeset 68a7d77a90c3 by Serhiy Storchaka in branch '3.3':
Issue #18050: Fixed an incompatibility of the re module with Python 3.3.0
http://hg.python.org/cpython/rev/68a7d77a90c3

New changeset f27af2243e2a by Serhiy Storchaka in branch '2.7':
Issue #18050: Fixed an incompatibility of the re module with Python 2.7.3
http://hg.python.org/cpython/rev/f27af2243e2a
|msg198159 - (view)||Author: Serhiy Storchaka (serhiy.storchaka) *||Date: 2013-09-20 18:42|
Thank you, Samuel, for your report and suggested solution. Thank you, Ned, for the additional investigation.
|2013-09-20 18:42:48||serhiy.storchaka||set||status: open -> closed|
messages: + msg198159
stage: commit review -> resolved
messages: + msg198156
|2013-09-17 03:02:21||ned.deily||set||messages: + msg197958|
stage: patch review -> commit review
versions: - Python 3.4
messages: + msg197897
keywords: + patch
stage: needs patch -> patch review
|2013-05-30 09:38:17||serhiy.storchaka||set||messages: + msg190351|
|2013-05-30 06:25:28||ned.deily||set||messages: + msg190344|
|2013-05-30 06:16:37||ned.deily||set||messages: + msg190343|
|2013-05-28 21:44:48||ned.deily||set||messages: + msg190252|
|2013-05-28 11:19:09||serhiy.storchaka||set||messages: + msg190195|
|2013-05-27 12:59:31||samueljohn||set||messages: + msg190138|
|2013-05-27 06:01:10||ned.deily||set||priority: normal -> high|
components: + Library (Lib), - Extension Modules
title: _sre.MAXREPEAT not defined in 2.7.3 -> embedded interpreter or virtualenv fails with "ImportError: cannot import name MAXREPEAT"
nosy: + georg.brandl, benjamin.peterson, serhiy.storchaka
versions: + Python 3.3, Python 3.4
messages: + msg190132
stage: needs patch