This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: set.repr breaches docstring contract
Type: behavior Stage: resolved
Components: ctypes Versions: Python 3.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Ylem, josh.r, steven.daprano
Priority: normal Keywords:

Created on 2019-11-19 21:27 by Ylem, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg356991 - (view) Author: Cat Chenal (Ylem) Date: 2019-11-19 21:27
S = {19,8,-1,25,0,1,2,3,4,5,6,7}
print('Set S = {{19,8,-1,25,0,1,2,3,4,5,6,7}}')

The set is represented by a new string-ordered set:
print(f'Its repr is:\n{S}\n')
Output:
{0, 1, 2, 3, 4, 5, 6, 7, 8, 19, 25, -1}

This is a breach of the contract stated in the docstring: "Build an unordered collection of unique elements."
msg356997 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-11-19 22:13
I'm sorry, I don't understand what part of the documentation you think is violated here. The docs say that sets are unordered, but of course when printing the elements have to be given in *some* order.

The given output seems unordered to me: -1 is smaller than 25, but it appears last in the output.

The specific order you see will depend on the specific values in the set, as well as the order that they were inserted, deleted, and/or re-inserted in some arbitrary way. For example:

    >>> repr({4, 5, 2**31+1, 2, 2**31+2, 3, 2**31, 0})
    '{0, 2147483648, 2, 3, 2147483650, 2147483649, 5, 4}'


I don't think there is a bug here. Can you explain what you think the bug is, in detail please?
msg357024 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-11-20 03:49
To be clear, the docstring is explicitly disclaiming any ordering contract. If you're reading "unordered" as meaning "not reordered" (like a list or tuple, where the elements appear in insertion order), that's not what "unordered" means here. It means "arbitrary order". As it happens, the hashcodes of small integers correspond to their numerical values, (mostly, -1 is a special case), so if no collisions occur and the numbers are sequential, the ordering will often look like it was sorted in semi-numerical order, as in your case.

That doesn't mean it's performing sorting, it just means that's how the hashes happened to distribute themselves across the buckets in the set. A different test case with slightly more distributed numbers won't create the impression of sorting:

>>> print({-5, -1, 13, 17})
{17, -5, 13, -1}

For the record, I chose that case to use CPython implementation details to produce a really unordered result (all the numbers are bucketed mod 8 in a set that small, and this produces no collisions, with all values mod 8 different from the raw value). On other versions of CPython, or alternate interpreters, both your case and mine could easily come out differently.

Point is, this isn't a bug, just a quirk in the small int hash codes.

Steven: I think they thought it was sorted in some string-related way, explaining (to them) why -1 was out of place (mind you, if it were string sorted, -1 would come first since the minus sign is ASCIIbetically first, 19 would fall between 1 and 2, and 25 between 2 and 3, so it doesn't hold up).

There's no bug here.
msg357079 - (view) Author: Cat Chenal (Ylem) Date: 2019-11-20 15:36
Thank you for pointing out my lack of clarity: I apologize.
I probably should not have split this issue in two (issue38855).

My confusion stems from the fact that I expected the unpacking of a set to return the same output as that obtained from the unpacking of a list.
From my testing, I gather that the unpacking of a set is performed via its repr, which uses "some ordering".

In closing, I want to note two points:

1. repr is apparently platform-dependent (here: Python 3.6.7 [MSC v.1900 64 bit (AMD64)]):

The code given by steven.daprano run in a jupyter lab cell or in VS Code yields a different output:
>>> repr({4, 5, 2**31+1, 2, 2**31+2, 3, 2**31, 0})
'{2147483648, 2147483649, 2, 2147483650, 4, 5, 3, 0}'

2. Testing reviewer's assertion: "The specific order you see will depend on the specific values in the set, as well as the order that they were inserted, deleted, and/or re-inserted in some arbitrary way."
This counter example, where element 0 is moved to the second position, shows that there is not such order dependence: 
>>> repr({4, 0, 5, 2**31+1, 2, 2**31+2, 3, 2**31})
'{0, 2147483649, 2, 2147483650, 4, 5, 3, 2147483648}'
msg357102 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-11-20 20:31
> My confusion stems from the fact that I expected the unpacking of a 
> set to return the same output as that obtained from the unpacking of a 
> list.

Why did you expect that?

Sets aren't lists. Lists are ordered, so they hold their items in a 
specific order. Sets are unordered, so there is no guarantee what order 
you will see when you unpack them.

If you create the list [foo, bar, baz] then the output will always be 
[foo, bar, baz] on every platform. That's a guarantee.

Sets are unordered, as documented, so there are no guarantee about what 
order you will see: it might be {foo, baz, bar} or {bar, baz, foo} or 
{foo, bar, baz} or {baz, foo, bar}, any permutation is equally valid, 
regardless of what order you created the set.

> 1. repr is apparently platform-dependent

Quite likely. Since there's no guarantee what order you will see, 
there's no guarantee that the order won't change from platform to 
platform, or version to version.

> 2. Testing reviewer's assertion: "The specific order you see will 
> depend on the specific values in the set, as well as the order that 
> they were inserted, deleted, and/or re-inserted in some arbitrary 
> way."
> This counter example, where element 0 is moved to the second position, 
> shows that there is not such order dependence:

Your example shows that the output order changes when you change the 
input order, in an unpredicatable, arbitrary way, just like I said. 
That's not a counter-example.
History
Date User Action Args
2022-04-11 14:59:23adminsetgithub: 83034
2019-11-20 20:31:45steven.dapranosetmessages: + msg357102
2019-11-20 15:36:33Ylemsetmessages: + msg357079
2019-11-20 03:49:39josh.rsetstatus: open -> closed

nosy: + josh.r
messages: + msg357024

resolution: not a bug
stage: resolved
2019-11-19 22:13:28steven.dapranosetnosy: + steven.daprano
messages: + msg356997
2019-11-19 21:27:41Ylemcreate