Issue20630
Created on 2014-02-15 00:06 by ncoghlan, last changed 2014-02-25 05:50 by rhettinger.
Messages (9) | |||
---|---|---|---|
msg211251 - (view) | Author: Nick Coghlan (ncoghlan) |
Date: 2014-02-15 00:06 | |
Currently, it's a bit annoying to sort collections containing "None" values in Python 3. While the default behaviour isn't going to change, it would be good to offer standard "none_first" and "none_last" helpers (inspired by the SQL NULLS FIRST and NULLS LAST ordering control).

Suggested home: functools (since that is where the total_ordering class decorator already lives), but collections would also be a reasonable choice (as this feature mostly relates to sorting containers).

The minimal option (suggested by Peter Otten):

    def none_first(v):
        return v is not None, v

    def none_last(v):
        return v is None, v

A more complex alternative would be to provide general purpose SortsFirst and SortsLast singletons:

    @functools.total_ordering
    class _AlwaysLesser:
        def __eq__(self, other):
            return isinstance(other, _AlwaysLesser)
        def __lt__(self, other):
            return not isinstance(other, _AlwaysLesser)

    @functools.total_ordering
    class _AlwaysGreater:
        def __eq__(self, other):
            return isinstance(other, _AlwaysGreater)
        def __gt__(self, other):
            return not isinstance(other, _AlwaysGreater)

    SortsFirst = _AlwaysLesser()
    SortsLast = _AlwaysGreater()

    def none_first(v):
        return SortsFirst if v is None else v

    def none_last(v):
        return SortsLast if v is None else v

The advantage of the latter, more complex approach is that you can embed the SortsFirst and SortsLast values inside a tuple as part of a more complex key, whereas the simple solution only handles the case where the entire value is None.

(Inspired by Chris Withers's python-dev thread: https://mail.python.org/pipermail/python-dev/2014-February/132332.html) |
|||
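As a rough illustration of the two proposals above, the sketch below applies the minimal key functions to a flat list and embeds a sort-first sentinel inside a tuple key. The name `none_low` and the sample data are assumptions made for this example only; nothing here exists in the standard library.

```python
import functools


def none_first(v):
    # None maps to (False, None), which sorts before any (True, value).
    return v is not None, v


def none_last(v):
    # None maps to (True, None), which sorts after any (False, value).
    return v is None, v


@functools.total_ordering
class _AlwaysLesser:
    def __eq__(self, other):
        return isinstance(other, _AlwaysLesser)

    def __lt__(self, other):
        return not isinstance(other, _AlwaysLesser)


SortsFirst = _AlwaysLesser()


def none_low(v):
    # Hypothetical helper name: swap None for the sort-first sentinel.
    return SortsFirst if v is None else v


values = [3, None, 1]
flat_first = sorted(values, key=none_first)   # [None, 1, 3]
flat_last = sorted(values, key=none_last)     # [1, 3, None]

# The sentinel can sit inside a larger tuple key.
rows = [("b", 2), ("a", None), ("a", 1)]
by_name_then_value = sorted(rows, key=lambda r: (r[0], none_low(r[1])))
# [("a", None), ("a", 1), ("b", 2)]
```

The key functions leave the original values untouched: `sorted` returns the original `None` entries, not the tuples or the sentinel.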
msg212103 - (view) | Author: Raymond Hettinger (rhettinger) |
Date: 2014-02-24 14:45 | |
> Currently, it's a bit annoying to sort collections
> containing "None" values in Python 3

I think we should seriously consider whether to restore None's ability to compare with other entries. Removing this capability has been a major PITA and is an obstacle for people converting code to Python 3.

The need to create helper function work-arounds is a symptom, not a cure. |
|||
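For readers who have not run into the behaviour under discussion, here is a minimal sketch of what happens in Python 3 when a list mixing `None` and integers is sorted without a key function. The sample list is made up for illustration.

```python
values = [3, None, 1]

try:
    sorted(values)
except TypeError as exc:
    # Python 3 refuses ordering comparisons between None and int.
    failed = True
    print("sorting failed:", exc)
else:
    failed = False
```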
msg212121 - (view) | Author: Tim Peters (tim.peters) |
Date: 2014-02-24 17:29 | |
I haven't yet seen anyone complain about the inability to compare None except in the specific context of sorting. If it is in fact specific to sorting, then this specific symptom and "the problem" are in fact the same thing ;-) |
|||
msg212123 - (view) | Author: Antoine Pitrou (pitrou) |
Date: 2014-02-24 17:32 | |
Both Nick's proposals look ok to me. |
|||
msg212144 - (view) | Author: Nick Coghlan (ncoghlan) |
Date: 2014-02-24 21:29 | |
It occurred to me the current names are a bit misleading when using "reverse=True", so low/high is likely a better naming scheme than first/last. I think I'll propose a patch for six before doing anything to the standard library - this is already an issue for some forward ports, so at least adding a "none_low" sort key that is a no-op on Py2 makes sense. |
|||
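The naming concern above can be seen in a small sketch: with `reverse=True`, a key called `none_first` ends up placing `None` last, because the key really means "None compares lowest". The sample list is an assumption for illustration.

```python
def none_first(v):
    # None maps to (False, None), i.e. it compares lower than any real value.
    return v is not None, v


values = [3, None, 1]

ascending = sorted(values, key=none_first)                 # [None, 1, 3]
descending = sorted(values, key=none_first, reverse=True)  # [3, 1, None]
```

Since the key describes relative ordering, not final position, a low/high name stays accurate in both directions.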
msg212145 - (view) | Author: Nick Coghlan (ncoghlan) |
Date: 2014-02-24 21:33 | |
And in case that last comment worried anyone - I won't commit *anything* related to this to the standard library until after creating a PyPI "sortlib" module that includes both this and an "order_by_key" class decorator, and we have consensus that the proposed changes are reasonable. |
|||
msg212146 - (view) | Author: Raymond Hettinger (rhettinger) |
Date: 2014-02-24 21:38 | |
> If it is in fact specific to sorting, then this specific symptom
> and "the problem" are in fact the same thing ;-)

The first rule of tautology club is the first rule of tautology club ;-)

FWIW, we had to add a work-around for this in the pprint._safe_key class. Without that work-around, it was difficult to work with JSON-style data hierarchies:

    # wouldn't pprint() without the _safe_key() work-around:
    temperatures = {'Jan': 25.2, 'Feb': 22.3, 'Mar': None, 'Apr': 19.1,
                    'May': 22.2, 'Jun': None, 'July': 22.3}

I think this will be typical for the kind of issue people will encounter when using None as a placeholder for missing data.

FWIW, if None stays non-comparable, Nick's additions look fine to me. I just think it is easier for everyone to restore None's universal comparability rather than adding work-arounds for the problems caused by removing that capability. |
|||
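To make the "None as a placeholder for missing data" case concrete, the sketch below orders the months from the `temperatures` example by reading, with missing readings placed last. The inline key follows the minimal "None last" idea from the first message; it is an illustration only and not how pprint works internally.

```python
temperatures = {'Jan': 25.2, 'Feb': 22.3, 'Mar': None, 'Apr': 19.1,
                'May': 22.2, 'Jun': None, 'July': 22.3}

# Key on (is_missing, reading): real readings sort first, in ascending order.
by_reading = sorted(temperatures.items(),
                    key=lambda item: (item[1] is None, item[1]))

months = [month for month, _ in by_reading]
# ['Apr', 'May', 'Feb', 'July', 'Jan', 'Mar', 'Jun']
```

Ties (the two 22.3 readings and the two missing ones) keep their original dictionary order because Python's sort is stable.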
msg212147 - (view) | Author: Nick Coghlan (ncoghlan) |
Date: 2014-02-24 21:50 | |
I suspect if we'd thought of it back in the 3.0 or 3.1 time frame then giving the Py3 None a consistent "sorts low" behaviour would have been more likely. At this stage of the Py3 life cycle, though, it seems simpler overall to remain consistent with earlier Py3 releases. |
|||
msg212164 - (view) | Author: Raymond Hettinger (rhettinger) |
Date: 2014-02-25 05:50 | |
> At this stage of the Py3 life cycle, though,
> it seems simpler overall to remain consistent
> with earlier Py3 releases.

Given that so few users have converted, it is simpler to become consistent with Py2.7 and to not introduce compensating features. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2014-02-25 05:50:44 | rhettinger | set | messages: + msg212164 |
2014-02-24 23:37:17 | Arfrever | set | nosy: + Arfrever |
2014-02-24 21:50:46 | ncoghlan | set | messages: + msg212147 |
2014-02-24 21:38:58 | rhettinger | set | messages: + msg212146 |
2014-02-24 21:33:06 | ncoghlan | set | messages: + msg212145 |
2014-02-24 21:29:20 | ncoghlan | set | messages: + msg212144 |
2014-02-24 17:32:04 | pitrou | set | nosy: + pitrou; messages: + msg212123 |
2014-02-24 17:29:47 | tim.peters | set | nosy: + tim.peters; messages: + msg212121 |
2014-02-24 14:45:16 | rhettinger | set | nosy: + rhettinger; messages: + msg212103 |
2014-02-21 19:00:41 | cvrebert | set | nosy: + cvrebert |
2014-02-15 00:06:16 | ncoghlan | create |