Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using PyNumber_AsSsize_t in itertools.islice #74722

Closed
MSeifert04 mannequin opened this issue Jun 1, 2017 · 12 comments
Closed

Using PyNumber_AsSsize_t in itertools.islice #74722

MSeifert04 mannequin opened this issue Jun 1, 2017 · 12 comments
Assignees
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@MSeifert04
Copy link
Mannequin

MSeifert04 mannequin commented Jun 1, 2017

BPO 30537
Nosy @rhettinger, @mdickinson, @embray, @smarnach, @serhiy-storchaka, @jdemeyer, @MSeifert04, @wroberts
PRs
  • bpo-30537: use PyNumber in itertools.islice instead of PyLong #1918
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2017-06-08.06:41:03.998>
    created_at = <Date 2017-06-01.11:26:09.913>
    labels = ['3.7', 'type-feature', 'library']
    title = 'Using PyNumber_AsSsize_t in itertools.islice'
    updated_at = <Date 2018-01-12.15:09:24.449>
    user = 'https://github.com/MSeifert04'

    bugs.python.org fields:

    activity = <Date 2018-01-12.15:09:24.449>
    actor = 'erik.bray'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2017-06-08.06:41:03.998>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2017-06-01.11:26:09.913>
    creator = 'MSeifert'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30537
    keywords = []
    message_count = 12.0
    messages = ['294934', '295000', '295088', '295173', '295185', '295192', '295327', '295330', '295379', '295383', '309854', '309859']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'mark.dickinson', 'erik.bray', 'smarnach', 'serhiy.storchaka', 'jdemeyer', 'MSeifert', 'Will Roberts']
    pr_nums = ['1918']
    priority = 'low'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue30537'
    versions = ['Python 3.7']

    @MSeifert04
    Copy link
    Mannequin Author

    MSeifert04 mannequin commented Jun 1, 2017

    In a question on StackOverflow (https://stackoverflow.com/questions/44302946/itertools-does-not-recognize-numpy-ints-as-valid-inputs-on-python-3-6) it was mentioned that numpys scalars cannot be used as index for itertools.islice.

    The reason for this is because the numbers are converted with PyLong_AsSsize_t (which requires a PyLongObject [or subclass]) instead of PyNumber_AsSsize_t (which only requires an __index__ method).

    I'm not sure if there are many use-cases where numpy scalars make sense as inputs for islice but it's definetly unexpected that itertools.islice([1, 2, 3, 4, 5, 6], np.int32(3)) currently throws a ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize.

    @MSeifert04 MSeifert04 mannequin added type-bug An unexpected behavior, bug, or error 3.7 (EOL) end of life stdlib Python modules in the Lib dir labels Jun 1, 2017
    @wroberts
    Copy link
    Mannequin

    wroberts mannequin commented Jun 2, 2017

    Note that this issue also seems to affect other methods in the itertools package, such as permutations.

    @serhiy-storchaka
    Copy link
    Member

    Adding support of more general int-like objects in islice() looks reasonable to me. I'm not sure about permutations(), but if make this change, use PyIndex_Check() instead of PyNumber_Check(), or don't use the special check at all, allowing PyNumber_AsSsize_t() to fail.

    The __setstate__() methods shouldn't be changed. We know that the __reduce__() methods return exact ints, not general int-like objects.

    When replace PyLong_AsSsize_t with PyNumber_AsSsize_t take into account that PyLong_AsSsize_t is atomic and thread-safe, while PyNumber_AsSsize_t can call Python code and release the GIL.

    @rhettinger
    Copy link
    Contributor

    [Michael Seifert]

    I'm not sure if there are many use-cases where numpy scalars
    make sense as inputs for islice

    Likewise, I'm not sure there is any evidence of a use-case. The incompatible relationship between numpy ints and PyLong_AsSsize_t() has been around for a long time and no one seems to have even noticed until now.

    Also, you're right that the error message isn't very helpful unless you already understand that np.int32 isn't a subclass of int.

    [Serhiy]

    Adding support of more general int-like objects in islice()
    looks reasonable to me.

    Conceptually, anything with an __index__ method might make sense in the context of slicing, so the proposal doesn't sound unreasonable on the face of it.

    Practically though, there doesn't seem to be any evidence of a real use case here. It is unclear whether allowing np.int32 would ever benefit anyone and whether dropping the current requirement for an explicit int() cast would encourage weird and inefficient code.

    Either way, this should be categorized as a feature request. CPython itself is passing tests and isn't breaking any promises.

    [Will Roberts]

    Note that this issue also seems to affect other methods
    in the itertools package, such as permutations.

    If we do make some sort of change, it should be limited it to just islice(). To me, this only makes sense in the context of index arguments unless someone makes a broad and consequential executive decision that everything in Python that accepts an integer needs to also check for int-like objects as well.

    @rhettinger rhettinger added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Jun 5, 2017
    @wroberts
    Copy link
    Mannequin

    wroberts mannequin commented Jun 5, 2017

    Thanks for feedback, Serhiy and Raymond! Github PR now has reverted changes except to the calls in islice_new; I am happy to squash if you would like.

    Serhiy, this is my first time poking around in CPython code. What are the potential consequences of making itertools.islice less atomic/thread-safe, and/or possibly releasing the GIL? Are there any gotchas to watch out for in this patch specifically? I've modelled my changes on the code in listobject.c [list_subscript], but I would love to hear if there's a better way to do things.

    Raymond, I'd also be curious to learn about any code weirdness or inefficiency you have in mind. I agree with you that there might not be a compelling use-case here. The SO question looks to be a bit contrived; however, the interesting bits to me here are that the behaviour of numpy interacting with itertools has changed since py27, and also that the proposed workarounds/solutions seem ... inelegant? Need a numpy integer be explicitly coerced to int in this context when the type implements an __index__ method?

    @serhiy-storchaka
    Copy link
    Member

    My note was not directly related to your patch. It was a warning about general replacement of PyLong_AsSsize_t with PyNumber_AsSsize_t. There is a code for which this is dangerous.

    @smarnach
    Copy link
    Mannequin

    smarnach mannequin commented Jun 7, 2017

    The current behaviour of islice() seems inconsistent with the rest of Python. All other functions taking start, stop and step arguments like slice(), range() and itertools.count() do accept integer-like objects. The code given as "roughly equivalent" in the documentation of islice() accepts integer-like objects, and so does regular list slicing. In fact, the __index__() method was introduced in PEP-357 specifically for slicing. In Python 2, islice() supported it as well. I think the expectation that islice() in Python 3 also supports it is entirely reasonable, and I can't see any strong arguments for breaking that assumption.

    @wroberts
    Copy link
    Mannequin

    wroberts mannequin commented Jun 7, 2017

    Github PR adds simple test, as well as an entry in Misc/NEWS.

    @rhettinger
    Copy link
    Contributor

    New changeset 0ecdc52 by Raymond Hettinger (Will Roberts) in branch 'master':
    bpo-30537: use PyNumber in itertools.islice instead of PyLong (bpo-1918)
    0ecdc52

    @rhettinger
    Copy link
    Contributor

    Applied patch. In theory, this was reasonable. Since islice() deals with indices, checking __index__ seems reasonable.

    On the other hand, stable high-performance code was changed without any known reasonable use cases, and no one will ever likely see a benefit. A cheap atomic operation was replaced with a more complex and less thread-safe chain of calls (I'm no longer even sure that the test cases cover all the possible code paths).

    Users might be better off by not seeing an unhelpful error message when passing in a numpy.int32, or they might be worse-off by having lost a cue that they were writing inefficient code (which invisibly creates temporary integer objects on every call when presumably the whole reason for using numpy was a concern for efficiency).

    @jdemeyer
    Copy link
    Contributor

    Just to note that this bug affects SageMath too: https://trac.sagemath.org/ticket/24528

    Thanks for fixing!

    @embray
    Copy link
    Contributor

    embray commented Jan 12, 2018

    Users might be better off by not seeing an unhelpful error message when passing in a numpy.int32, or they might be worse-off by having lost a cue that they were writing inefficient code (which invisibly creates temporary integer objects on every call when presumably the whole reason for using numpy was a concern for efficiency).

    While it's true there are many Numpy users who don't understand what they're doing, any time they go into the Python interpreter they're losing the benefits of Numpy anyways (which are fast vectorized operations on arrays).

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants