Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor algorithmic improvements for math.isqrt #90416

Closed
mdickinson opened this issue Jan 4, 2022 · 2 comments
Closed

Minor algorithmic improvements for math.isqrt #90416

mdickinson opened this issue Jan 4, 2022 · 2 comments
Labels
3.11 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage

Comments

@mdickinson
Copy link
Member

BPO 46258
Nosy @mdickinson
PRs
  • bpo-46258: Streamline isqrt fast path #30333
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-01-15.09:58:40.801>
    created_at = <Date 2022-01-04.17:30:08.804>
    labels = ['extension-modules', '3.11', 'performance']
    title = 'Minor algorithmic improvements for math.isqrt'
    updated_at = <Date 2022-01-15.09:58:40.800>
    user = 'https://github.com/mdickinson'

    bugs.python.org fields:

    activity = <Date 2022-01-15.09:58:40.800>
    actor = 'mark.dickinson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-01-15.09:58:40.801>
    closer = 'mark.dickinson'
    components = ['Extension Modules']
    creation = <Date 2022-01-04.17:30:08.804>
    creator = 'mark.dickinson'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46258
    keywords = ['patch']
    message_count = 2.0
    messages = ['409693', '410635']
    nosy_count = 1.0
    nosy_names = ['mark.dickinson']
    pr_nums = ['30333']
    priority = 'normal'
    resolution = None
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue46258'
    versions = ['Python 3.11']

    @mdickinson
    Copy link
    Member Author

    There are a couple of minor algorithmic improvements possible for the math.isqrt fast path (which is used for nonnegative integers smaller than 2**64). On my machine those improvements produce a little over a 10% speedup.

    The current algorithm for values under 2**64 involves exactly four division instructions, corresponding to four Newton steps. The proposal is to:

      1. Replace the first division with a table lookup. The necessary table is extremely small: 12 entries at one byte per entry.
      1. Arrange for the return type of the _approximate_sqrt helper function to be uint32_t rather than uint64_t. That means that the correction step only involves a 32-bit-by-32-bit multiplication, not a 64-bit-by-64-bit multiplication.

    The second part is a bit subtle: the input to _approximate_sqrt is a 64-bit integer n in the range [262, 264). Barring any overflow, the output u is guaranteed to satisfy (u-1)**2 < n < (u+1)**2. That implies that (u-1)**2 < 2**64, from which it follows that u <= 2**32. So the only possible case where u might overflow a uint32_t is when u == 2**32. But from the earlier inequality, that can only happen if n > (2**32 - 1)**2, and in that case the top 31 bits of n are completely determined and since the first steps of the algorithm only depend on the topmost bits of n, it's easy to follow through the algorithm and see that it's not possible for u to be 2**32 in that case. (We always get u = 128 from the lookup, followed by u = 255 after the first division, then u = 65536 after the second.)

    @mdickinson mdickinson added 3.11 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage labels Jan 4, 2022
    @mdickinson
    Copy link
    Member Author

    New changeset d02c5e9 by Mark Dickinson in branch 'main':
    bpo-46258: Streamline isqrt fast path (bpo-30333)
    d02c5e9

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant