Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate implicit truncating when convert Python numbers to C integers: use __index__, not __int__ #80229

Closed
serhiy-storchaka opened this issue Feb 20, 2019 · 20 comments
Labels
3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 36048
Nosy @gvanrossum, @mdickinson, @vstinner, @serhiy-storchaka, @hroncok, @remilapeyre, @arhadthedev
PRs
  • bpo-36048: Use __index__() instead of __int__() for implicit conversion if available. #11952
  • bpo-17576: Strict __int__ and __index__ return types; operator.index always uses __index__ #13740
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-02-25.16:13:58.711>
    created_at = <Date 2019-02-20.06:43:05.294>
    labels = ['interpreter-core', 'type-feature', '3.8']
    title = 'Deprecate implicit truncating when convert Python numbers to C integers: use __index__, not __int__'
    updated_at = <Date 2021-12-09.20:32:01.235>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2021-12-09.20:32:01.235>
    actor = 'calin.culianu'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-02-25.16:13:58.711>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2019-02-20.06:43:05.294>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36048
    keywords = ['patch']
    message_count = 20.0
    messages = ['336041', '336050', '336069', '336071', '336107', '336127', '336534', '340944', '358943', '360080', '408054', '408077', '408087', '408095', '408098', '408146', '408150', '408153', '408156', '408159']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'mark.dickinson', 'vstinner', 'serhiy.storchaka', 'hroncok', 'remi.lapeyre', 'arhadthedev', 'calin.culianu']
    pr_nums = ['11952', '13740']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue36048'
    versions = ['Python 3.8']

    @serhiy-storchaka
    Copy link
    Member Author

    Currently, C API functions that convert a Python number to a C integer like PyLong_AsLong() and argument parsing functions like PyArg_ParseTuple() with integer converting format units like 'i' use the __int__() special method for converting objects which are not instances of int or int subclasses. This leads to dropping the fractional part if the object is not integral number (e.g. float, Decimal or Fraction). In some cases, there is a special check for float, but it does not prevent truncation for Decimal, Fraction and other numeric types.

    For example:

    >>> chr(65.5)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: integer argument expected, got float
    >>> chr(Decimal('65.5'))
    'A'

    The proposed PR makes all these functions using __index__() instead of __int__() if available and emit a deprecation warning when __int__() is used for implicit conversion to a C integer.

    >>> chr(Decimal('65.5'))
    <stdin>:1: DeprecationWarning: an integer is required (got type decimal.Decimal)
    'A'

    In future versions only __index__() will be used for the implicit conversion, and __int__() will be used only in the int constructor.

    @serhiy-storchaka serhiy-storchaka added 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Feb 20, 2019
    @serhiy-storchaka
    Copy link
    Member Author

    Numerous explicit calls of PyNumber_Index() which are used to protect from passing non-integral types to PyLong_AsLong() and like can be removed after the end of the deprecation period. I tried to mark calls which can be removed with comments, but virtually all calls can be removed.

    @vstinner
    Copy link
    Member

    I like the idea. Rejecting float but not decimal.Decimal is inconsistent. __index__ has been written explicitly for this purpose.

    I'm always confused and lost in subtle details of the Python and C API in how they handle numbers, so I wrote some notes for myself:
    https://pythondev.readthedocs.io/numbers.html

    Some functions accept only int, others use __int__, others __index__. Some functions silently truncate into 32 or 64-bit signed integer. Some other raise a OverflowError or ValueError...

    @vstinner vstinner changed the title Deprecate implicit truncating when convert Python numbers to C integers Deprecate implicit truncating when convert Python numbers to C integers: use __index__, not __int__ Feb 20, 2019
    @vstinner
    Copy link
    Member

    See also bpo-34423: "Overflow when casting from double to time_t, and_PyTime_t" or "How to reduce precision loss when converting arbitrary number to int or float?".

    @serhiy-storchaka
    Copy link
    Member Author

    I am not sure what to with the int constructor. Should it try __index__ before __int__? Or just make __index__ without __int__ setting the nb_int slot as was proposed by Nick in bpo-33039?

    @remilapeyre
    Copy link
    Mannequin

    remilapeyre mannequin commented Feb 20, 2019

    I am not sure what to with the int constructor. Should it try __index__ before __int__? Or just make __index__ without __int__ setting the nb_int slot as was proposed by Nick in bpo-33039?

    Shouldn't it try only __int__ since this will default to __index__ once bpo-33039 is closed?

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 6a44f6e by Serhiy Storchaka in branch 'master':
    bpo-36048: Use __index__() instead of __int__() for implicit conversion if available. (GH-11952)
    6a44f6e

    @hroncok
    Copy link
    Mannequin

    hroncok mannequin commented Apr 26, 2019

    Relevant NumPy issue: numpy/numpy#13412

    @mdickinson
    Copy link
    Member

    Serhiy: any thoughts about what version should be targeted for eventual removal of the functionality deprecated in this PR? 3.10?

    @vstinner
    Copy link
    Member

    Serhiy: any thoughts about what version should be targeted for eventual removal of the functionality deprecated in this PR? 3.10?

    IMHO it should be done at the *very beginning* of a release cycle. Python 3.9.0 alpha 1 *and* alpha 2 versions have already been released. It's too late for the 3.9 cycle.

    I'm fine with 3.10.

    @arhadthedev
    Copy link
    Member

    Here is a report that this change breaks PyQt5 on Fedora:

    <https://github.com/python/cpython/pull/11952#issuecomment-989298404\>

    [...]

    Why do I care? This breaks tons of existing PyQt5 code out there, for example. I wasn't aware of this change to the language until Fedora shipped Python 3.10 and everything broke. So much stuff that uses PyQt5 is broken now. Good job guys!!

    [...]

    Even though the rest of comment is emotional, we need to check if the problem really has place and the PR needs to be retracted until PyQt5 is ported to newer Python C API.

    @arhadthedev
    Copy link
    Member

    The reporter gave more details (<https://github.com/python/cpython/pull/11952#issuecomment-989430968\>):

    Literally this is ok in C++ with Qt:

    float x = 2.3, y = 1.1;
    auto p = QPoint(x, y); // QPoint only takes 2 int params.. this works in C++; floats can be always be implicitly converted to int
    

    Same code, more or less, in Python3.10 is now broken:

    x = 2.3
    y = 1.1
    p = QPoint(x, y)  # This fails, where previously it worked on every Python version since the age of the dinosaurs...
    

    Note that most of the API for PyQt5 is auto-generated from the function signatures of the C++. So in this case QPoint takes 2 ints for its c'tor (just like in C++).. and breaks on Python 3.10 if given floats, when previously it worked just fine with either ints or floats. This is just 1 example. But many codebases that use PyQt5 are hit by breakages like this one now.

    @mdickinson
    Copy link
    Member

    @arhadthedev: Thanks for highlighting the issue.

    we need to check if the problem really has place and the PR needs to be retracted until PyQt5 is ported to newer Python C API

    I'm not particularly impressed by the arguments from cculianu. This was the right change to make: this is very general machinery, and we've seen many real issues over the years resulting from implicit acceptance of floats or Decimal objects where an integer is expected.

    It may well be that for some *specific* libraries like PyQt5 it makes sense to make a different choice. And indeed, PySide6 has done exactly that:

    Python 3.10.0 (default, Nov 12 2021, 12:32:57) [Clang 12.0.5 (clang-1205.0.22.11)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from PySide6.QtCore import QPoint
    >>> QPoint(2, 3)
    PySide6.QtCore.QPoint(2, 3)
    >>> QPoint(2.1, 3.3)
    PySide6.QtCore.QPoint(2, 3)

    So no, I don't believe this change should be reverted. At best, we could re-introduce the deprecation warnings and delay the full implementation of the change. But the deprecation warnings were present since Python 3.8, and so either the PyQt5 developers noticed them but didn't want to make the change, or didn't notice them. Either way, it's difficult to see what difference extending the deprecation warning period would make. Moreover, the new behaviour is already released, in Python 3.10.0 and 3.10.1, and the code churn would likely be more annoying than helpful.

    I would suggest to cculianu that they take this up with the PyQt5 developers.

    @serhiy-storchaka
    Copy link
    Member Author

    This issue was closed more than 2.5 years ago. The changes were made in 3.8, not in 3.10. BTW, 3.8 only accepts security fixes now. Please open a new issue for new discussion.

    @mdickinson
    Copy link
    Member

    For the record, bpo-37999 is the issue that turned the deprecation warnings into errors for Python 3.10. (But as Serhiy says, please open a new issue, or start a discussion on one of the mailing lists.)

    @calinculianu
    Copy link
    Mannequin

    calinculianu mannequin commented Dec 9, 2021

    Hi, I'm cculianu, the reporting user.

    May I get a link or some background for the motivation for this change? It seems to me that there are vague allusions to "Decimal -> int cause problems in past", or some such, and I'd like to read the arguments presented as to what problems in particular, and how this change fixes those problems.

    Mathematically, and by convention in computer science, conversions from float-like types -> int are always well defined. It seems strangely un-Pythonic to restrict things in such a way. I am very surprised by this change, quite frankly. It's the wrong direction to go in, for this language, is my intuitive feeling.

    Anyway, very curious about what the rationale was for this. Thanks.

    @serhiy-storchaka
    Copy link
    Member Author

    See bpo-660144 which made float values be rejected in most cases where an integer is expected rather of silently truncating them. It was at 2003. Guido mentioned that is an age-old problem, so perhaps you can find older discussions on the tracker or mailing lists.

    It was a partial solution. It caught unexpected floats, but not other non-integer numbers. Later, the __index__ special method was introduced specially to distinguish objects which can be implicitly converted to integer from these which can be explicitly converted to integer. And finally we fixed that age-old problem.

    If you have other questions, please ask them on mailing lists, forums or other resources.

    @calinculianu
    Copy link
    Mannequin

    calinculianu mannequin commented Dec 9, 2021

    Ok, well I found this in that issue you linked to:

    This is the original "problem":

    type __mul__ is wierd:
    
    >> 'a'.__mul__(3.4)
    'aaa'
    >>> [1].__mul__(3.4)
    [1, 1, 1]
    

    And this is Guido's response:

    The problem is that on the one hand you want "i" to accept
    other int-ish types that have an __int__ method. But on the
    other hand you don't want it to accept float, which also has
    an __int__ method. Or other float-ish types.


    You have to understand -- I don't see a problem here!

    Not once is there any discussion as to *why* this is really a problem. So what? Your float gets truncated and treated as an int. So?

    If you are calling into a C api -- why is this a problem exactly when **in the C Language itself** by convention and by compiler rules anyway -- this is **perfectly ok**.

    Restricting Python in this way -- why? What is the benefit? What problem is being solved? I did all the digging in the world here and cannot at all discern what, exactly, is being solved at all. It seems like some bizarre aesthetic thing, at best. Sorry.. maybe I am wrong. Please point me to where the real problem is...

    Again, I reiterate **mathematically**, by convention in computer science, the conversion from float -> int is well defined. Moreover, you have the legacy __int__ method already that *can define it* for specific types. And Python has worked this way since the age of the dinosaurs (2002!!).

    So.. in python this conversion is well defined. And has always been, by __int__.

    I think the introduction of __index__ versus __int__ may be a mistake, especially if it is designed to "solve" just this problem -- which as yet, has to be clearly argued for as to **why** it's a problem!

    This boggles my mind, guys. I.. am speechless. Please tell me I'm wrong. To me it seems you guys imagined a problem that doesn't exist, then "solved" it with extra API stuff, and the actual end result was -- real-world problems created ex-nihilo for real-world programs, manifesting themselves finally in 2021.

    You have to understand that .. to me as a C++ programmer.. I see no problem. I just see Python all of a sudden becoming less nice.

    @gvanrossum
    Copy link
    Member

    Claim, you need to tone down the rhetoric. Python is not C++ and user expectations are different. You are the one who is fighting windmills here.

    @calinculianu
    Copy link
    Mannequin

    calinculianu mannequin commented Dec 9, 2021

    Ok, Guido, thanks. Fair enough.

    So regardless: what is the problem exactly if a C function expects an int but gets given a float? Why does such strictness matter, in that it required a whole other set of __index__ etc to be added? I am having trouble seeing the actual problem, which was taken as a given in all the archaeology I have done on this ancient "issue" -- but is it a given? There is no problem... only the one created by this change, really.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants