classification
Title: access to cdecimal / libmpdec API
Type: enhancement
Stage: resolved
process
Status: closed
Resolution: duplicate
Superseder: Add a minimal decimal capsule API (#41324)
Nosy List: belopolsky, larry, lemburg, loewis, p-ganssle, pitrou, scoder, skrah
Priority: normal
Keywords: patch

Created on 2014-08-14 13:29 by pitrou, last changed 2020-07-17 11:24 by skrah. This issue is now closed.

Files
api-demo.c (uploaded by skrah, 2014-08-15 18:27)
Messages (31)
msg225297 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-14 13:29
Currently cdecimal exports no C API that I know of, and it makes sure the libmpdec symbols are kept private in the .so file. It would be nice for external C code (or, in general, non-Python code) to be able to access cdecimal objects, and perform operations on them, without the huge overhead of the regular Python C API.
msg225357 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-08-15 18:37
I'm a little unsure what to do with the API, see also #15237:

  1) I'm not too fond of the Capsule method, especially because
     it *seems* possible to get at the symbols directly on Linux
     and Windows (provided that they aren't static of course).

  2) I would not like to spend time on it if we go ahead and
     make decimal a builtin (double effort).

  3) It's not clear whether users would not be served better by
     using functions from libmpdec directly (much faster,
     probably less error handling).


For 3) I've attached a draft that illustrates the issue.
msg225588 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-20 20:31
>  3) It's not clear whether users would not be served better by
>     using functions from libmpdec directly (much faster,
>     probably less error handling).

That's what I meant. The issue here is that Python's libmpdec is not exposed to third-party code at all. Also there should probably be a (thin?) API to get at the underlying mpdec object from a cdecimal PyObject (apologies for the poor wording, I'm actually not acquainted with the libmpdec APIs).

As for the Capsule method, well, at least it would be better than nothing (or than any platform-specific hack).

>  2) I would not like to spend time on it if we go ahead and
>     make decimal a builtin (double effort).

I haven't heard any consensus on that yet :-)

(for the record, the context is that we would like to support decimal objects efficiently in Numba)
msg225589 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-20 20:54
Note that you could expose a C API even if decimal didn't become built-in, see Include/datetime.h for example.
msg225590 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-20 21:10
Relatedly, is the libmpdec ABI stable? That is, if I build a separate libmpdec, can I expect it to handle cdecimal's innards fine, or will there be problems?
msg225599 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2014-08-21 06:21
> (for the record, the context is that we would like to support decimal objects efficiently in Numba)

Same for Cython, although I guess we wouldn't do more than ship the necessary declarations and (likely) also enable auto-coercion between the libmpdec decimal type (struct?) and CPython's decimal type, in the same way that we do it for byte strings.

Thus, a public header file with the necessary type checking and packing/unpacking C-API functions would be nice and also sufficient for now.
msg225762 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-08-23 20:42
> That's what I meant. The issue here is that Python's libmpdec is not exposed
> to third-party code at all. Also there should probably be a (thin?) API to get
> at the underlying mpdec object from a cdecimal PyObject (apologies for the poor
> wording, I'm actually not acquainted with the libmpdec APIs).


People were asking for libmpdec symbols to be hidden (#16745). It's easy to
revert, just a couple of pragmas in the headers.



> As for the Capsule method, well, at least it would be better than nothing
> (or than any platform-specific hack).

Platform specific maybe, but no hack:  I was thinking about storing the DSO
handle in the PyModuleObject struct and add functions to lookup symbols.  
I'm attaching a diff for Linux -- It has been a while, but I'm rather certain 
that the corresponding scheme also worked on Windows (can't test now, my 
Windows VM is defunct).

That would leave the usual troublemakers AIX etc., which have sketchy support
anyway.

Of course getting all symbols for a module should be done in one separate
file (_decimal_api.c), the diff is just a demonstration.



> Relatedly, is the libmpdec ABI stable?

Yes, starting from 2.4.0.


> That is, if I build a separate libmpdec, can I expect it to handle cdecimal's
> innards fine, or will there be problems?

You need to initialize libmpdec in the exact same way as in _decimal.c.
Generally it should be fine (though I'm not thrilled by the idea :), but
I suspect there could be locale problems on Windows with a second libmpdec.
msg225768 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-23 22:44
On 23/08/2014 16:42, Stefan Krah wrote:
>
>> That's what I meant. The issue here is that Python's libmpdec is not exposed
>> to third-party code at all. Also there should probably be a (thin?) API to get
>> at the underlying mpdec object from a cdecimal PyObject (apologies for the poor
>> wording, I'm actually not acquainted with the libmpdec APIs).
>
> People were asking for libmpdec symbols to be hidden (#16745). It's easy to
> revert, just a couple of pragmas in the headers.

Who are those people? #16745 was opened by you :-)

> Platform specific maybe, but no hack:  I was thinking about storing the DSO
> handle in the PyModuleObject struct and add functions to lookup symbols.
> I'm attaching a diff for Linux -- It has been a while, but I'm rather certain
> that the corresponding scheme also worked on Windows (can't test now, my
> Windows VM is defunct).

How does it work? I've tried to dlopen() and then dlsym() the _decimal 
file manually, it wouldn't work for private (e.g. mpd) symbols.

> That would leave the usual troublemakers AIX etc., which have sketchy support
> anyway.

That sounds rather worrying. How about OS X? Why would that whole scheme 
be better than a capsule?
msg225805 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-08-24 08:56
Antoine Pitrou <report@bugs.python.org> wrote:
> Who are those people? #16745 was opened by you :-)

MvL, in #4555 (msg176486).

> > Platform specific maybe, but no hack:  I was thinking about storing the DSO
> > handle in the PyModuleObject struct and add functions to lookup symbols.
> > I'm attaching a diff for Linux -- It has been a while, but I'm rather certain
> > that the corresponding scheme also worked on Windows (can't test now, my
> > Windows VM is defunct).
> 
> How does it work? I've tried to dlopen() and then dlsym() the _decimal 
> file manually, it wouldn't work for private (e.g. mpd) symbols.

Yes, the symbols would need to be public, see module_get_symbol.diff.

> 
> > That would leave the usual troublemakers AIX etc., which have sketchy support
> > anyway.
> 
> That sounds rather worrying. How about OS X? Why would that whole scheme 
> be better than a capsule?

I'm not sure about OS X, but I would be surprised if it did not work.

To my limited knowledge, Capsules are slow, see also here (the penultimate
paragraph):

https://mail.python.org/pipermail/python-dev/2013-March/124481.html

Stefan (Behnel), could you comment on the strategy that you had in mind?
Is it similar to module_get_symbol.diff or entirely different?
msg225821 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-24 14:15
On 24/08/2014 04:56, Stefan Krah wrote:
>
> I'm not sure about OS X, but I would be surprised if it did not work.
>
> To my limited knowledge, Capsules are slow, see also here (the penultimate
> paragraph):

They are slow if you have to look up and unwrap a capsule every time you 
use it. But the way it would work for _decimal (and the way it already 
works for _datetime), AFAIK, would be different: you would look up the 
"capsule API structure" once and then simply dereference function 
pointers from that structure.

You can actually take a look at Include/datetime.h and see if the 
approach would work. See especially:

/* Define global variable for the C API and a macro for setting it. */
static PyDateTime_CAPI *PyDateTimeAPI = NULL;

#define PyDateTime_IMPORT \
    PyDateTimeAPI = (PyDateTime_CAPI *)PyCapsule_Import(PyDateTime_CAPSULE_NAME, 0)

which encourages a once-in-a-process-lifetime API lookup pattern.
I don't think a hand-coded dlsym()-based approach can be 
significantly faster.
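
The lookup-once pattern described above can be sketched in plain C without any Python headers. This is only an illustration under stated assumptions: `Demo_CAPI`, `demo_capsule_import()`, `demo_import()` and the name "demo._CAPI" are all hypothetical stand-ins (for a table such as PyDateTime_CAPI and for PyCapsule_Import()), not part of CPython or libmpdec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical API table, standing in for a struct such as PyDateTime_CAPI. */
typedef struct {
    int (*add)(int a, int b);
    int (*negate)(int a);
} Demo_CAPI;

static int demo_add(int a, int b) { return a + b; }
static int demo_negate(int a) { return -a; }

/* The providing module owns a single static table of function pointers. */
static Demo_CAPI demo_capi = { demo_add, demo_negate };

/* Counts by-name lookups, so the "once" property can be observed. */
int demo_lookup_count = 0;

/* Hypothetical stand-in for PyCapsule_Import(): the comparatively slow
   lookup by name. Returns NULL for an unknown name. */
void *demo_capsule_import(const char *name)
{
    demo_lookup_count++;
    if (strcmp(name, "demo._CAPI") == 0)
        return &demo_capi;
    return NULL;
}

/* Client side: cached pointer to the table, filled in on first import. */
Demo_CAPI *DemoAPI = NULL;

/* Performs the by-name lookup only if the table is not cached yet.
   Returns 0 on success, -1 on failure. */
int demo_import(void)
{
    if (DemoAPI == NULL)
        DemoAPI = (Demo_CAPI *)demo_capsule_import("demo._CAPI");
    return DemoAPI != NULL ? 0 : -1;
}

/* Hot path: every call is a plain function-pointer dereference,
   with no lookup and no unwrapping. */
int demo_sum_to(int n)
{
    int total = 0;
    assert(DemoAPI != NULL);
    for (int i = 1; i <= n; i++)
        total = DemoAPI->add(total, i);
    return total;
}
```

After one call to `demo_import()`, each call in the loop costs an indirect function call, which is roughly what a dlsym()-resolved pointer would cost as well.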
msg225825 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-08-24 14:58
Ah yes, the array of function pointers is directly accessible. I did not look
closely enough -- reading the word "spam" 100x in the docs always makes me skim
the text. ;)
msg225826 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-24 15:02
> MvL, in #4555 (msg176486).

Ok, I'm cc'ing Martin then :-)
Note RTLD_LOCAL seems to be the default with dlopen(). Now I don't know how that behaves when you have chained library loading, e.g.:

  Apache --> dlopen(Python dll) --> dlopen(_decimal dll)

_decimal is an interesting case, since AFAIK with other C extensions it isn't really interesting to access the underlying C library, but with _decimal, not being able to access libmpdec would force _decimal to duplicate a large part of the libmpdec API with a PyDec prefix added in front of it.

(which means that, perhaps, the answer is to make the mpd_ prefix configurable with a #define?)
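
For illustration, a configurable symbol prefix can be done with preprocessor token pasting. The sketch below is hypothetical: `DEMO_PREFIX`, `DEMO_NAME` and the `pydec_` functions are invented for this example and do not reflect how libmpdec declares its functions.

```c
/* Hypothetical prefix; a build could override it with -DDEMO_PREFIX=mpd_ */
#ifndef DEMO_PREFIX
#define DEMO_PREFIX pydec_
#endif

/* Two macro levels are needed so that DEMO_PREFIX is expanded
   before the ## operator pastes the tokens together. */
#define DEMO_CONCAT2(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_CONCAT2(a, b)
#define DEMO_NAME(name) DEMO_CONCAT(DEMO_PREFIX, name)

/* With the default prefix these define pydec_add and pydec_sub. */
int DEMO_NAME(add)(int a, int b) { return a + b; }
int DEMO_NAME(sub)(int a, int b) { return a - b; }
```

Callers can either spell the prefixed name directly (`pydec_add`) or go through `DEMO_NAME(add)` to stay independent of the configured prefix.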
msg225827 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-08-24 15:11
Antoine Pitrou <report@bugs.python.org> wrote:
> (which means that, perhaps, the answer is to make the mpd_ prefix configurable with a #define?)

I don't know 100% what you have in mind, but Debian and Arch already ship
--with-system-libmpdec, so only the mpd_* functions would be available.
msg225828 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-24 15:14
On 24/08/2014 11:11, Stefan Krah wrote:
>
> Antoine Pitrou <report@bugs.python.org> wrote:
>> (which means that, perhaps, the answer is to make the mpd_ prefix configurable with a #define?)
>
> I don't know 100% what you have in mind, but Debian and Arch already ship
> --with-system-libmpdec, so only the mpd_* functions would be available.

Ah... that probably kills the idea then. I was thinking in the context 
of a private-built libmpdec and had forgotten about that possibility.
msg226035 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2014-08-28 17:56
> Stefan (Behnel), could you comment on the strategy that you had in mind?
> Is it similar to module_get_symbol.diff or entirely different?

I agree with Antoine that a Capsule would work better here (and yes, the performance problem of capsules is only with cases where they need to be unpacked frequently). Here is our current API for datetime as an example:

https://github.com/cython/cython/blob/e6c13f8922d6406f64f2578f5a0041e1615291a3/Cython/Includes/cpython/datetime.pxd

It's not great (it would be possible to make it look more OOP-ish), but it's simple and quite easy to use. The top of the file contains the necessary header declarations, and the rest are inline C wrapper functions that basically just rename the existing capsule C-API functions and macros to make them easily and nicely callable from Cython code without having to care about the Capsule and its list of C functions.

The declarations for _cdecimal would use a similar scheme and additionally include the libmpdec header declarations so that users could work with the underlying C data directly with a single (c-)import. That would then require the libmpdec symbols to be available, though, also when it's linked internally into _cdecimal.
msg226176 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-08-31 13:05
Thanks, Stefan.  So everyone agrees that Capsule is the right way for the API.


Then this issue is about making the libmpdec symbols public.  I've tried
to produce a collision with duplicate symbols as outlined in msg176486,
but I haven't been successful (on Linux).
msg226299 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-09-03 07:32
I think this is a duplicate of #15237.
msg226304 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-09-03 09:07
Well, we have two issues now:

  1) Make the _decimal API available via capsule.

  2) Make the libmpdec symbols public (i.e. remove "GCC visibility push(hidden)"
     from Modules/_decimal/libmpdec/mpdecimal.h).


The question here is now whether 2) is safe. Note that active symbol
hiding has always only worked for gcc (but I think on Windows and AIX
the symbols are hidden by default anyway).


A third option is to make both the _decimal and libmpdec APIs available
via capsule, which is a lot of work (300 functions). Also people would
likely want the API to work on 2.7, which would mean that large parts
of the cdecimal on PyPI (which uses the incompatible libmpdec-2.3)
would need to be rewritten.
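
As a rough sketch of what the visibility pragma in 2) does: declarations placed between `push(hidden)` and `pop` get hidden visibility under GCC-compatible compilers, so they stay callable inside the shared object but are left out of its exported symbols. The function names below are hypothetical and only stand in for library-internal and exported functions.

```c
/* Declarations inside the push/pop pair get hidden visibility when
   compiled with GCC or Clang; other compilers skip the pragma here. */
#if defined(__GNUC__)
#pragma GCC visibility push(hidden)
#endif
int demo_internal_add(int a, int b);   /* hypothetical internal function */
#if defined(__GNUC__)
#pragma GCC visibility pop
#endif

/* Declared outside the pair: keeps the default (exported) visibility. */
int demo_public_add3(int a, int b, int c);

int demo_internal_add(int a, int b)
{
    return a + b;
}

/* Code in the same shared object can still call the hidden function. */
int demo_public_add3(int a, int b, int c)
{
    return demo_internal_add(demo_internal_add(a, b), c);
}
```

Removing the pragma pair would give `demo_internal_add` default visibility as well, which is the change being weighed for the mpd_* functions.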
msg226310 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-03 12:54
> large parts of the cdecimal on PyPI (which uses the incompatible libmpdec-2.3) would need to be rewritten.

Ah, so it has an incompatible ABI? That will complicate things a bit :-)
msg226311 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-03 12:56
> Note that active symbol hiding has always only worked for gcc (but I think on Windows and AIX the symbols are hidden by default anyway).

Does it mean a separate Windows and AIX solution should be found?
I think if we can't make the mpd symbols available in a cross-platform way, the incentive starts getting much lower :-)
msg226313 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-09-03 16:33
Antoine Pitrou <report@bugs.python.org> wrote:
> > large parts of the cdecimal on PyPI (which uses the incompatible libmpdec-2.3) would need to be rewritten.
>
> Ah, so it has an incompatible ABI? That will complicate things a bit :-)

Yes, cdecimal on PyPI is slower, has a different ABI, uses libmpdec-2.3,
has subtle differences in the context handling, cannot subclass the
context, ... ;)

I think a common C API for 2.7 is only possible with a real backport.
msg226314 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-03 17:00
On 03/09/2014 18:33, Stefan Krah wrote:
> 
> 
> Yes, cdecimal on PyPI is slower, has a different ABI, uses libmpdec-2.3,
> has subtle differences in the context handling, cannot subclass the
> context, ... ;)
> 
> I think a common C API for 2.7 is only possible with a real backport.

Ok, so let's ignore that and focus on 3.5?
msg226317 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-09-03 19:37
Sure, if there are people who write python3-only C modules (I can't think
of one right now).
msg226318 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-03 20:08
If there aren't today, there will be in a few years' time, and they'll be glad they can support Python 3.5.
msg226322 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-09-03 20:48
Are there any other modules where the capsule API works in both CPython and PyPy?  I thought capsule APIs were decidedly implementation-specific.

Not that I'm not for it in theory.  But this is some crazy uncharted hyper-compatibility territory we're talking about here.
msg226323 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-09-03 20:58
The compatibility discussion was for the cdecimal-2.3 package that's
hosted on PyPI and used for Python 2.7.  IOW, will people use a capsule
API that's Python-3 only?

Compatibility with pypy would be esoteric indeed. :)
msg228427 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-10-04 02:57
I'd like to focus this issue a bit. Antoine originally proposed that non-Python code might want to access libmpdec. However, given that this is now a separate project (as it seems), I don't think it's Python's task to make the API available. If it is a separate library really, you shouldn't need to link to Python in order to use it. IOW, the solution should be to use the system libmpdec.

So the focus should be on other Python C modules that want to access decimal objects. Are there precedents of such modules? What API features do they actually need?
msg228452 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-10-04 12:47
libmpdec has always been a separate project, predating the integration
into Python-3.3.  Before the Python-3.3 release, Jim Jewett suggested
a cleaner library/module separation (and he was right, it made the
code much more readable).

Then distributors wanted --with-system-libmpdec, so here we are.


Let's discuss the _decimal capsule API in #15237.


As for the libmpdec symbols: I agree it is somewhat unclean to
do this on the Python side.

For *some* functions it is also tricky (but by no means impossible)
to use them directly and expect the same results as from _decimal.

Maybe a solution for numba is to just assume --with-system-libmpdec.
libmpdec is extremely easy to package, distributions just have to
ensure that their Python version matches the libmpdec version.
msg228453 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-10-04 12:49
Of course Windows is a problem. I do not know how to implement
--with-system-libmpdec on Windows.
msg228474 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-04 18:35
For the record, we have realized that pure C code brought less than a 2x speedup compared to straightforward Python code, so cdecimal support has become a low-priority concern for Numba. For the curious, the C code is from libmpdec's own benchmark (Web site seems down at the moment) and the Python code is here: https://gist.github.com/pitrou/9eb2a5651b99f2e3c4ce
msg373825 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2020-07-17 11:24
Closing in favor of #41324, which adds just the most important
functions.
History
Date User Action Args
2020-07-17 11:24:39  skrah  set  status: open -> closed; resolution: duplicate; messages: + msg373825; superseder: Add a minimal decimal capsule API; stage: resolved
2019-02-27 16:40:10  p-ganssle  set  nosy: + p-ganssle
2014-10-04 18:35:24  pitrou  set  messages: + msg228474
2014-10-04 12:49:54  skrah  set  messages: + msg228453
2014-10-04 12:47:39  skrah  set  messages: + msg228452
2014-10-04 02:57:51  loewis  set  messages: + msg228427
2014-09-29 19:35:50  belopolsky  set  nosy: + belopolsky
2014-09-03 20:58:14  skrah  set  messages: + msg226323
2014-09-03 20:48:17  larry  set  messages: + msg226322
2014-09-03 20:08:25  pitrou  set  messages: + msg226318
2014-09-03 19:37:18  skrah  set  messages: + msg226317
2014-09-03 17:00:14  pitrou  set  messages: + msg226314
2014-09-03 16:33:21  skrah  set  messages: + msg226313
2014-09-03 12:56:41  pitrou  set  messages: + msg226311
2014-09-03 12:54:09  pitrou  set  messages: + msg226310
2014-09-03 09:07:59  skrah  set  messages: + msg226304
2014-09-03 07:32:41  larry  set  nosy: + larry; messages: + msg226299
2014-08-31 13:05:59  skrah  set  messages: + msg226176
2014-08-31 12:59:42  skrah  set  files: - module_get_symbol.diff
2014-08-28 17:56:00  scoder  set  messages: + msg226035
2014-08-24 15:14:10  pitrou  set  messages: + msg225828
2014-08-24 15:11:20  skrah  set  messages: + msg225827
2014-08-24 15:02:33  pitrou  set  nosy: + loewis; messages: + msg225826
2014-08-24 14:58:47  skrah  set  messages: + msg225825
2014-08-24 14:15:26  pitrou  set  messages: + msg225821
2014-08-24 08:56:07  skrah  set  messages: + msg225805
2014-08-23 22:44:48  pitrou  set  messages: + msg225768
2014-08-23 20:42:30  skrah  set  files: + module_get_symbol.diff; keywords: + patch; messages: + msg225762
2014-08-21 13:06:26  pitrou  set  nosy: + lemburg
2014-08-21 06:21:24  scoder  set  nosy: + scoder; messages: + msg225599
2014-08-20 21:10:51  pitrou  set  messages: + msg225590
2014-08-20 20:54:28  pitrou  set  messages: + msg225589
2014-08-20 20:31:27  pitrou  set  messages: + msg225588
2014-08-15 18:37:34  skrah  set  messages: + msg225357
2014-08-15 18:27:33  skrah  set  files: + api-demo.c
2014-08-14 13:29:57  pitrou  create