PyLong: use GMP #66121

Closed
hvenev mannequin opened this issue Jul 5, 2014 · 22 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage)

Comments

@hvenev (mannequin) commented Jul 5, 2014

BPO 21922
Nosy @tim-one, @rhettinger, @mdickinson, @pitrou, @vstinner, @skrah, @MojoVampire, @hvenev, @skirpichev
Files
  • pygmp.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2014-10-01.00:43:42.831>
    created_at = <Date 2014-07-05.12:22:24.854>
    labels = ['interpreter-core', 'performance']
    title = 'PyLong: use GMP'
    updated_at = <Date 2021-03-07.09:44:41.079>
    user = 'https://github.com/hvenev'

    bugs.python.org fields:

    activity = <Date 2021-03-07.09:44:41.079>
    actor = 'Sergey.Kirpichev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-10-01.00:43:42.831>
    closer = 'benjamin.peterson'
    components = ['Interpreter Core']
    creation = <Date 2014-07-05.12:22:24.854>
    creator = 'h.venev'
    dependencies = []
    files = ['35867']
    hgrepos = []
    issue_num = 21922
    keywords = ['patch']
    message_count = 22.0
    messages = ['222347', '222348', '222350', '222351', '222361', '222368', '222369', '222370', '222371', '222372', '222389', '222507', '222508', '222547', '222548', '222550', '222826', '222851', '222988', '222999', '223044', '228044']
    nosy_count = 10.0
    nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'pitrou', 'vstinner', 'casevh', 'skrah', 'josh.r', 'h.venev', 'Sergey.Kirpichev']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue21922'
    versions = ['Python 3.5']

@hvenev (mannequin, author) commented Jul 5, 2014

I have implemented the PyLong interface using the GMP mpn functions. API/ABI compatibility is retained (except for longintrepr.h).
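
As a rough illustration of what using the mpn layer means in C, here is a minimal, hypothetical helper; it is not taken from the patch, and the name and calling convention are assumptions. It adds two limb magnitudes the way an mpn-based PyLong addition might:

```c
#include <gmp.h>

/* Hypothetical helper, not from the patch: add the magnitudes {ap, an} and
 * {bp, bn}, where an >= bn.  rp must have room for an + 1 limbs.  Returns
 * the number of limbs actually used by the result. */
static mp_size_t
limbs_abs_add(mp_limb_t *rp, const mp_limb_t *ap, mp_size_t an,
              const mp_limb_t *bp, mp_size_t bn)
{
    mp_limb_t carry = mpn_add(rp, ap, an, bp, bn);  /* low-level GMP addition */
    rp[an] = carry;                                 /* store the final carry limb */
    return an + (carry != 0);
}
```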

    It can be enabled by passing --enable-big-digits=gmp to ./configure.

    No large performance regressions have been observed for small numbers (a few operations are about 10% slower). For large numbers some operations are a lot faster.

    There is also int.__gcd__ which may be used by fractions.gcd.

The GIL is sometimes released. Minimum number of digits for releasing the GIL (a sketch of the pattern follows the list):

• multiplication - 64
• division - 64
• modular exponentiation - 16
• base conversion - 64 (256 for binary bases)
• GCD - 16
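
A hedged sketch of the GIL-release pattern (treating "digits" as GMP limbs and using an assumed threshold constant; this is not the patch itself). Dropping the GIL is only safe here because the mpn call works on raw limb buffers and touches no Python objects:

```c
#include <Python.h>
#include <gmp.h>

#define MUL_RELEASE_GIL_MIN_LIMBS 64   /* assumed name for the "64" threshold above */

/* Multiply {ap, an} by {bp, bn} (an >= bn >= 1) into rp, which must have
 * room for an + bn limbs and may not overlap the inputs. */
static void
limbs_mul(mp_limb_t *rp, const mp_limb_t *ap, mp_size_t an,
          const mp_limb_t *bp, mp_size_t bn)
{
    if (an >= MUL_RELEASE_GIL_MIN_LIMBS) {
        Py_BEGIN_ALLOW_THREADS          /* no Python objects are touched below */
        mpn_mul(rp, ap, an, bp, bn);
        Py_END_ALLOW_THREADS
    }
    else {
        mpn_mul(rp, ap, an, bp, bn);
    }
}
```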

The tests for long, float, decimal, fractions, string, unicode, bytes, pickle, marshal and enum pass. The tests for int fail because the error messages are a bit different (when creating an int from bytes or bytearray, the value is not shown). The other tests I have run did not fail either. I have not tested on anything but x86-64.

The following test cases yield a 42x performance improvement:

• 16384-bit RSA on 8 threads on a quad-core CPU with HT (GIL released)
• Multiplying 5600000-bit ints
• Dividing 6000000-bit ints
• Converting a 300000-character str to int (base 10)
• Converting a 1250000-bit int to str

@hvenev (mannequin) added the interpreter-core (Objects, Python, Grammar, and Parser dirs) and performance (Performance or resource usage) labels Jul 5, 2014
@skrah (mannequin) commented Jul 5, 2014

    Did you mean to upload a patch?

@vstinner (Member) commented Jul 5, 2014

Hi, I worked on a similar patch 6 years ago, while Python 3.0 was being developed:
https://mail.python.org/pipermail/python-dev/2008-November/083315.html
http://bugs.python.org/issue1814

The summary is that using GMP makes Python slower, because most numbers are small (they fit in [-2^31, 2^31-1]) and GMP allocation is expensive.

There is also a license issue: the GMP license is GPL, which is not compatible with the Python license.

If you want to work on large numbers, you can use gmpy:
https://code.google.com/p/gmpy/

    """
The following test cases yield a 42x performance improvement:

• 16384-bit RSA on 8 threads on a quad-core CPU with HT (GIL released)
• Multiplying 5600000-bit ints
• Dividing 6000000-bit ints
• Converting a 300000-character str to int (base 10)
• Converting a 1250000-bit int to str
      """

That's not a common use case. Run the Python benchmark suite with your patch to see whether your patch has overhead similar to my old patch.
    http://hg.python.org/benchmarks

@hvenev (mannequin, author) commented Jul 5, 2014

PyLongObject is a PyVarObject; it contains the mp_limb_t limbs directly, so there is little overhead. For some operations, if the result is in [-20, 256], no memory is allocated at all. There are special code paths for 1-limb operations.
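
A hypothetical sketch of such a layout (illustrative only; the struct and field names are assumptions, not taken from the patch):

```c
#include <Python.h>   /* PyVarObject */
#include <gmp.h>      /* mp_limb_t */

/* Hypothetical layout: the variable-length tail stores GMP limbs inline,
 * much like ob_digit in the stock implementation, so no separate GMP
 * allocation is needed.  The sign and limb count would live in ob_size,
 * and a 1-limb fast path only needs to check that |ob_size| <= 1. */
typedef struct {
    PyVarObject ob_base;
    mp_limb_t   ob_limb[1];   /* really |ob_size| limbs */
} PyGMPLongObject;
```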

    And I just finished GDB support.

    Please test if it works for you.

@skrah (mannequin) commented Jul 5, 2014

    Hmm, the license (LGPL) should only matter for the Windows binaries
    and we can just compile without --enable-big-digits=gmp.

    Even *if* the Windows binaries were built with gmp support, it would
    be sufficient for any redistributor to point to the external library
    sources for gmp that the build has actually used -- your own code
    does not automatically become LGPL licensed and you are under no
    obligation to reveal it.

    I haven't looked at the patch in detail, but I don't have any
    fundamental objection against adding a configure option, provided
    that the additional code is well isolated (which seems to be the
    case).

    Of course we'd have to set up buildbots for the option etc...

@hvenev (mannequin, author) commented Jul 5, 2014

    After some minor optimizations my implementation is about 1.8% slower on pystone and about 4% slower on bm_nqueens. It's 4 times faster on bm_pidigits.

@vstinner (Member) commented Jul 5, 2014

    Please try the Python benchmark suite.

@mdickinson (Member) commented:

    I *do* have an objection to adding the configure option: from that point on, it means that maintaining the GMP-based long implementation is now the responsibility of the core developers, and I think that's an unnecessary maintenance burden, for an option that few users will care about. I think having two long integer implementations in the core is worse than having one.

@vstinner (Member) commented Jul 5, 2014

> I think having two long integer implementations in the core is worse than having one.

    I agree. If the GMP implementation is accepted, the old implementation must be dropped and replaced by the GMP implementation.

@mdickinson (Member) commented:

> If the GMP implementation is accepted, the old implementation must be dropped and replaced by the GMP implementation.

    Agreed. I'm open to that, but it's critical that common use-cases (i.e., those *not* using 1000-digit integers!) aren't slowed down.

@pitrou (Member) commented Jul 6, 2014

    Note that we could probably release the GIL in the current implementation, too - we just haven't bothered adding such an optimization.

@hvenev (mannequin, author) commented Jul 7, 2014

After optimization, here are the results of tests on small ints (< 2**30):

    Currently only addition, subtraction, negation and ~ are a bit slower (< 5%).

    Most other operations are the same. Bitwise operators, //, %, ** and pow are faster.

    Converting to and from strings is a bit faster. pickle, marshal and json are faster.

    bm_nqueens is a bit slower. pystone is a bit faster. There are no performance regressions in other benchmarks.

When I fix +, - and ~, I will re-upload the patch.

@vstinner (Member) commented Jul 7, 2014

IMO you must discuss the GMP license on the python-dev mailing list, since we said above that if the GMP patch is accepted it will not be optional, and so it will affect all platforms.

@mdickinson (Member) commented:

> IMO you must discuss the GMP license on the python-dev mailing list

    I think the maintenance implications of having another external dependency would also need discussion on python-dev.

@mdickinson (Member) commented:

    Hmm. Looking back at my previous comments, I should explain my negativity a bit better.

    1. Like Victor's bpo-1814 work, this is great stuff, and very impressive.

    2. I'm really not convinced that it belongs in the Python core.

To expand on 2: we already have a simple, highly portable, battle-tested implementation of big integers that's reasonably efficient for normal uses and requires little day-to-day maintenance. We'd be replacing that with something that's a lot more complicated, less thoroughly tested, and *not* significantly more efficient in normal use-cases. Apart from the pain of the transition (and any refactor of this size is bound to involve some pain), I'm particularly worried about future headaches involved in maintaining the external GMP dependency: keeping up with bugfixes, managing the binary builds, etc. I anticipate that that would add quite a lot of work for the core team in general and those building releases in particular. (And that's another reason that we should have a python-dev discussion, so that those involved in the release process get a chance to weigh in.) To offset that, there needs to be a clear *benefit* to making this change.

    A couple of specific questions for Hristo Venev:

• What's the current status of GMP on Windows? IIUC, one of the motivations for MPIR (an unfriendly fork of GMP) was better Windows support. Does your patch already build cleanly on Windows?

    • Whom do you see this change benefiting? Those doing serious large-number stuff on Python (SAGE users, for example) are presumably already using gmpy2 or something similar.

    • Do you want to initiate the python-dev discussion, or leave that to someone else?

    [I'm deliberately steering clear of the licensing issues; it needs discussion, but IANAL and I have nothing useful to contribute here.]

@mdickinson (Member) commented:

    ... and if there's one person who's *very* well placed to comment on the ease or difficulty of keeping up with GMP/MPIR (especially on Windows), it's Case Van Horsen, who I notice has recently added himself to the nosy. @casevh: any comments?

@casevh (mannequin) commented Jul 12, 2014

    Disclaimer: as Mark alluded to, I maintain gmpy2.

    Some comments on MPIR/GMP:

For all practical purposes, building GMP on Windows requires some version of the mingw compiler toolchain. None of the performance gains from custom assembly code are available if GMP is built with the VS compiler. When compiled with mingw, GMP supports CPU detection to automatically use code optimized for the specific instruction set. This level of optimization may not be needed for Python, though.

    The MPIR fork of GMP can be built with VS. Assembly code is supported via the YASM assembler plugin. Only a single instruction set is supported by the lib/dll.

    gmpy2 currently uses MPIR. I've had no issues with its stability.

The mpz type has a maximum precision. IIRC, the maximum length is 2^31 bits on a 32-bit platform and 2^37 bits on a 64-bit platform. The mpn interface may or may not have the same restrictions. This might impact code that runs correctly, but slowly, with Python's normal PyLong implementation.

GMP does not handle out-of-memory situations gracefully. When GMP encounters a memory allocation failure (exceeding the limits above or when running out of scratch space), it will just abort. It is easy in gmpy2 to trigger an abort that will crash the Python interpreter.
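
For context, GMP does let an application install its own allocators, but their documented contract is that they must not return on failure, so an allocation failure still cannot be turned into a Python MemoryError. A hedged sketch of that hook (mp_set_memory_functions is GMP API; the wrapper names are illustrative):

```c
#include <stdio.h>
#include <stdlib.h>
#include <gmp.h>

/* GMP replacement allocators may not return NULL; on failure the best a
 * host application can do is report and abort, with no way to unwind back
 * into the interpreter and raise MemoryError. */
static void *
xmalloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        fputs("GMP allocation failed\n", stderr);
        abort();
    }
    return p;
}

static void *
xrealloc(void *old, size_t old_size, size_t new_size)
{
    (void)old_size;                   /* old size is passed by GMP; unused here */
    void *p = realloc(old, new_size);
    if (p == NULL) {
        fputs("GMP reallocation failed\n", stderr);
        abort();
    }
    return p;
}

static void
xfree(void *p, size_t size)
{
    (void)size;
    free(p);
}

int
main(void)
{
    mp_set_memory_functions(xmalloc, xrealloc, xfree);
    /* ... any GMP use from here on; an allocation failure still aborts ... */
    return 0;
}
```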

    My main concern is tightly linking the Python interpreter to a specific version of GMP (i.e. whatever version is used for the Windows builds or the version provided by the distribution). As long as gmpy2 can continue to use another version of GMP, it shouldn't matter to me.

    GMP and MPIR are both licensed under LGPL v3+ (not v2+). I'll reserve any further licensing discussions for python-dev.

    I'll try to test the patch this weekend and that should answer some of my questions.

@rhettinger (Contributor) commented:

    I agree with all of Mark's objections. Unless there is a compelling win, Python is better-off without the maintenance, portability, and licensing issues.

    I have vague memories of this having been discussed a long time ago and it is unlikely that there have been any changes to the reasons for not doing it.

@casevh (mannequin) commented Jul 14, 2014

    I've successfully tested the patch. The patch works fine but there are a couple of issues:

1. The patch relies on several new low-level functions that were introduced in the latest GMP release, 6.0.0. Most Linux distributions are still providing older versions.

    2. The new functions are not available in any version of MPIR so I can't try a Windows build.

    I've done some basic tests but I'll wait until Hristo updates the patch with his improvements before I run detailed tests.

@mdickinson (Member) commented:

> When GMP encounters a memory allocation failure (exceeding the limits above or when running out of scratch space), it will just abort.

    Ouch; that's not friendly. Seems like this is part of the reason that Armin Rigo abandoned the attempt to use GMP in PyPy.

    https://bitbucket.org/pypy/pypy/issue/892/

@vstinner (Member) commented Oct 1, 2014

    Can we close this issue?

@ezio-melotti transferred this issue from another repository Apr 10, 2022