classification
Title: Fix fpectl-induced ABI breakage
Type: Stage:
Components: Extension Modules Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Dima Pasechnik, mark.dickinson, ncoghlan, njs
Priority: normal Keywords:

Created on 2017-01-03 00:15 by njs, last changed 2017-05-17 22:42 by njs.

Messages (12)
msg284511 - (view) Author: Nathaniel Smith (njs) * Date: 2017-01-03 00:15
It turns out that CPython built with --with-fpectl has a different ABI than CPython built without --with-fpectl (which is the default). Specifically, if you have an extension module built against a --with-fpectl CPython, and it uses the PyFPE_{START,END}_PROTECT macros (as e.g. Cython modules do), then it ends up referring to some symbols that aren't provided by no-fpectl CPython builds.

These macros are part of the stable ABI, so it's possible (though unlikely?) that there are existing modules using the stable ABI that refer to these symbols.

Mailing list discussion:
   December: https://mail.python.org/pipermail/python-dev/2016-December/147065.html
   January: https://mail.python.org/pipermail/python-dev/2017-January/147092.html
   Guido's "let's get rid of it": https://mail.python.org/pipermail/python-dev/2017-January/147094.html

There are 3 parts to the fpectl machinery:
- macros PyFPE_{START,END}_PROTECT in Include/pyfpe.h, which are part of the public C API. Depending on --with-fpectl, these are either no-ops, or else refer to:
- global symbols PyFPE_{jbuf,counter,dummy}, defined in Python/pyfpe.c iff --with-fpectl is enabled.
- the stdlib module 'fpectl' (Modules/fpectlmodule.c) which provides some Python APIs that make use of the information that the macros put into the global symbols. (If the user doesn't use fpectl, then the macros do nothing except update the global variables.)

From the python-dev discussion, I think the resolution is:

- for all supported CPython releases, we should modify Python/pyfpe.c so that the PyFPE_jbuf and PyFPE_counter are defined unconditionally. (I.e., remove the #ifdef WANT_SIGFPE_HANDLER that currently protects their definitions). After this change, all CPython builds will be able to import all CPython extension modules (though the actual fpectl functionality may or may not be available).

- in the next 3.5 and maybe 3.4 releases, we should add a deprecation warning to the fpectl module.

- in the next 2.7 release, we should add a Py3k warning to the fpectl module.

- in trunk / 3.7, we should remove --with-fpectl and the fpectl stdlib module entirely. For stable API compatibility we might want to leave the PyFPE_* macros (defined as no-ops) and/or the PyFPE_{jbuf,counter,dummy} symbols, but nothing will actually use them.
msg284529 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-01-03 01:47
- 3.4 is already in security-fix only mode so we can safely ignore it for this purpose

- 3.5.3 is likely to be the last general bugfix release of 3.5, so we can probably skip that as well

- that would mean the ABI compability shims would only go in 3.6.1 and the next 2.7 release

At that point, does it actually make sense to provide the shims? Or should we instead just add the deprecation warnings and say "Don't use the --with-fpectl option, as it doesn't work properly, and breaks ABI compatibility for extension modules built with that Python"?

And then add a build time "#pragma message '--with-fpectl' is deprecated as it breaks extension module ABI compatibility" to the WANT_SIGFPE_HANDLER branch in Include/fpectl.h
msg284532 - (view) Author: Nathaniel Smith (njs) * Date: 2017-01-03 02:26
> At that point, does it actually make sense to provide the shims? Or should we instead just add the deprecation warnings and say "Don't use the --with-fpectl option, as it doesn't work properly, and breaks ABI compatibility for extension modules built with that Python"?

"Providing the shims" consists of deleting two lines of code, so eh, why not? And in theory there could be "stable ABI" extensions that depend on the shims. But I agree that it doesn't make much difference either way.

> And then add a build time "#pragma message '--with-fpectl' is deprecated as it breaks extension module ABI compatibility" to the WANT_SIGFPE_HANDLER branch in Include/fpectl.h

This would hassle every end user who builds extension modules on Debian/Ubuntu (b/c their default Python build uses --with-fpectl). But end users can't do anything about how Debian/Ubuntu build CPython. And in fact Debian/Ubuntu can't do anything about how Debian/Ubuntu build CPython either until 3.7 comes out and breaks ABI, because switching now would break installed systems...

So long as we keep the PyFPE_* macros as no-ops (which technically we have to to preserve the stable ABI), then there's no need to break compatibility at the C API level. The place we want to break compatibility is by dropping the Python-level fpectl package, so I think that's where we should warn.
msg284540 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-01-03 06:29
Ah, cool - I didn't know that Debian built with fpectl enabled by default. In that case, +1 for:

- low maintenance ABI and API compatibility shims that are kept around indefinitely (since they leaked into the limited API/stable ABI) but don't actually do anything
- deprecating and removing the fpectl module
- adding the ABI compatibility shims in the maintenance releases where the retroactive deprecation warnings are introduced
msg293803 - (view) Author: Dima Pasechnik (Dima Pasechnik) Date: 2017-05-16 21:50
While fpectl might be a bit rusty, this is a priceless tool in debugging situations, where one needs to identify components that do something wrong to FPU, see e.g.
https://github.com/numpy/numpy/issues/9007
and the related
https://trac.sagemath.org/ticket/22799

As long as there is no equivalent, it's too early to discuss getting rid of it, IMHO.
msg293806 - (view) Author: Nathaniel Smith (njs) * Date: 2017-05-16 22:06
@Dima: are you volunteering to fix and maintain it? I can see why it's useful to have some way to get at the fpu flags, but I don't see how fpectl specifically helps with that issue, and fpectl has always been broken on x86-64.
msg293816 - (view) Author: Dima Pasechnik (Dima Pasechnik) Date: 2017-05-16 22:36
@njs: Fixing this entails switching over from ieeefp to fenv, right? Looks doable, although not trivial.
It might potentially be useful for various numerics, IMHO.
msg293819 - (view) Author: Nathaniel Smith (njs) * Date: 2017-05-16 23:12
Also fixing the abi issues that started this, and probably making an argument for why it makes sense for all of cpython's built-in float operations to check the fpu flags, and to do so using a weird longjmp-based mechanism that only some platforms support. The fact that it's disabled by default and has been broken for a decade+ without anyone noticing might be working against you here...

You might get the impression that I think this is a bad idea. I do :-). But I am genuinely trying to helpful; I'm sure people would be willing to listen to an argument, and if you want to make one then those are the sorts of issues you're likely to need some answer for.
msg293820 - (view) Author: Nathaniel Smith (njs) * Date: 2017-05-16 23:15
Another option you might want to consider is proposing to add a proper fpu control flag setting/checking API to the math module.
msg293870 - (view) Author: Dima Pasechnik (Dima Pasechnik) Date: 2017-05-17 18:42
@njs: to point out that usefulness of this module is not just wishful thinking. I just used it to locate, up to the line in a Python extension module written in C, a bug in Sagemath (that has perhaps 20 FPU-using extensions, some of them as large as numpy).  (Without using it we were pulling out our hair for weeks over this)

https://trac.sagemath.org/ticket/22799#comment:103

Thanks goodness that fpectl@FreeBSD is easy to fix by commenting out a couple of "fpresetsticky(fpgetsticky());" lines---fpresetsticky() is not available on 64-bit platforms.
(our primary concern is OSX, where fpectl never worked---fortunately FreeBSD is close enough relative...)
msg293872 - (view) Author: Dima Pasechnik (Dima Pasechnik) Date: 2017-05-17 18:46
PS. I would volunteer to fix it and maintain it, assuming I have some modest funding to support such an activity. (What precisely "it" should be, is another question).
msg293901 - (view) Author: Nathaniel Smith (njs) * Date: 2017-05-17 22:42
@Dima:
> @njs: to point out that usefulness of this module is not just wishful thinking. I just used it to locate, up to the line in a Python extension module written in C, a bug in Sagemath (that has perhaps 20 FPU-using extensions, some of them as large as numpy).  (Without using it we were pulling out our hair for weeks over this)

That's pretty cool :-). But from skimming your link, it sounds like it would have been sufficient in your case to add a call to "fesetmask(FP_X_INV)" using C or Cython or ctypes (or directly in gdb), and then running the program under gdb to get a backtrace where the SIGFPE was delivered? Is that correct? Or did your debugging depend on the specific fpectl machinery for responding to that signal?

> PS. I would volunteer to fix it and maintain it, assuming I have some modest funding to support such an activity.

I'm not personally aware of any funding sources for this, if that's the question.
History
Date User Action Args
2017-05-17 22:42:30njssetmessages: + msg293901
2017-05-17 18:46:50Dima Pasechniksetmessages: + msg293872
2017-05-17 18:42:25Dima Pasechniksetmessages: + msg293870
2017-05-16 23:15:22njssetmessages: + msg293820
2017-05-16 23:12:06njssetmessages: + msg293819
2017-05-16 22:36:07Dima Pasechniksetmessages: + msg293816
2017-05-16 22:06:41njssetmessages: + msg293806
2017-05-16 21:50:16Dima Pasechniksetnosy: + Dima Pasechnik
messages: + msg293803
2017-01-10 08:59:36mark.dickinsonsetnosy: + mark.dickinson
2017-01-03 06:29:29ncoghlansetmessages: + msg284540
2017-01-03 02:26:36njssetmessages: + msg284532
2017-01-03 01:47:58ncoghlansetmessages: + msg284529
2017-01-03 01:26:38ncoghlansetnosy: + ncoghlan
2017-01-03 00:15:10njscreate