classification
Title: Add sorting helpers for collections containing None values
Type: enhancement
Stage:
Components:
Versions: Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, cvrebert, ncoghlan, pitrou, rhettinger, tim.peters
Priority: normal
Keywords:

Created on 2014-02-15 00:06 by ncoghlan, last changed 2014-02-25 05:50 by rhettinger.

Messages (9)
msg211251 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2014-02-15 00:06
Currently, it's a bit annoying to sort collections containing "None" values in Python 3. While the default behaviour isn't going to change, it would be good to offer standard "none_first" and "none_last" helpers (inspired by the SQL NULLS FIRST and NULLS LAST ordering control).

Suggested home: functools (since that is where the total_ordering class decorator already lives), but collections would also be a reasonable choice (as this feature mostly relates to sorting containers)

The minimal option (suggested by Peter Otten):

    def none_first(v):
        return v is not None, v

    def none_last(v):
        return v is None, v
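
As an illustration (not part of the original proposal), the two key functions can be passed straight to sorted(). The sample data below is made up:

```python
def none_first(v):
    # False sorts before True, so None entries get the smallest key.
    return v is not None, v

def none_last(v):
    # True sorts after False, so None entries get the largest key.
    return v is None, v

data = [3, None, 1, None, 2]  # made-up sample data
first = sorted(data, key=none_first)
last = sorted(data, key=none_last)
print(first)  # [None, None, 1, 2, 3]
print(last)   # [1, 2, 3, None, None]
```

Two None entries produce equal key tuples, so they are never compared with "<" against each other and the sort stays stable.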

A more complex alternative would be to provide general purpose SortsFirst and SortsLast singletons:

    import functools

    @functools.total_ordering
    class _AlwaysLesser:
        # Equal only to other _AlwaysLesser instances; less than everything else.
        def __eq__(self, other):
            return isinstance(other, _AlwaysLesser)
        def __lt__(self, other):
            return not isinstance(other, _AlwaysLesser)

    @functools.total_ordering
    class _AlwaysGreater:
        # Equal only to other _AlwaysGreater instances; greater than everything else.
        def __eq__(self, other):
            return isinstance(other, _AlwaysGreater)
        def __gt__(self, other):
            return not isinstance(other, _AlwaysGreater)

    SortsFirst = _AlwaysLesser()
    SortsLast = _AlwaysGreater()

    def none_first(v):
        return SortsFirst if v is None else v
    def none_last(v):
        return SortsLast if v is None else v

The advantage of the latter more complex approach is that you can embed the SortsFirst and SortsLast values inside a tuple as part of a more complex key, whereas the simple solution only handles the case where the entire value is None.
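
A minimal sketch of that compound-key use, assuming made-up (name, score) records where the score may be missing. Only the SortsFirst half is repeated here so the example is self-contained:

```python
import functools

@functools.total_ordering
class _AlwaysLesser:
    def __eq__(self, other):
        return isinstance(other, _AlwaysLesser)
    def __lt__(self, other):
        return not isinstance(other, _AlwaysLesser)

SortsFirst = _AlwaysLesser()

def none_first(v):
    return SortsFirst if v is None else v

# Hypothetical records: (name, score), where score may be missing.
records = [("bob", 7), ("alice", None), ("alice", 3), ("bob", None)]

# Compound key: sort by name, then by score with missing scores first.
ordered = sorted(records, key=lambda r: (r[0], none_first(r[1])))
print(ordered)
# [('alice', None), ('alice', 3), ('bob', None), ('bob', 7)]
```

Using the raw records as keys would raise TypeError as soon as two entries with the same name forced a comparison between None and an int.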

(Inspired by Chris Withers's python-dev thread: https://mail.python.org/pipermail/python-dev/2014-February/132332.html)
msg212103 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2014-02-24 14:45
> Currently, it's a bit annoying to sort collections 
> containing "None" values in Python 3

I think we should seriously consider whether to restore None's ability to compare with other entries.   Removing this capability has been a major PITA and is an obstacle for people converting code to Python 3.

The need to create helper function work-arounds is a symptom, not a cure.
msg212121 - Author: Tim Peters (tim.peters) (Python committer) Date: 2014-02-24 17:29
I haven't yet seen anyone complain about the inability to compare None except in the specific context of sorting.  If it is in fact specific to sorting, then this specific symptom and "the problem" are in fact the same thing ;-)
msg212123 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2014-02-24 17:32
Both Nick's proposals look ok to me.
msg212144 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2014-02-24 21:29
It occurred to me the current names are a bit misleading when using
"reverse=True", so low/high is likely a better naming scheme than
first/last.
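
A short sketch of the naming problem, using the minimal none_first key from the first message and made-up data. With reverse=True the "first" helper ends up placing None last:

```python
def none_first(v):
    # None gets the lowest key, which is only "first" in ascending order.
    return v is not None, v

data = [3, None, 1]  # made-up sample data
ascending = sorted(data, key=none_first)
descending = sorted(data, key=none_first, reverse=True)
print(ascending)   # [None, 1, 3]
print(descending)  # [3, 1, None]
```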

I think I'll propose a patch for six before doing anything to the standard
library - this is already an issue for some forward ports, so at least
adding a "none_low" sort key that is a no-op on Py2 makes sense.
msg212145 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2014-02-24 21:33
And in case that last comment worried anyone - I won't commit *anything*
related to this to the standard library until after creating a PyPI
"sortlib" module that includes both this and an "order_by_key" class
decorator, and we have consensus that the proposed changes are reasonable.
msg212146 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2014-02-24 21:38
> If it is in fact specific to sorting, then this specific symptom
> and "the problem" are in fact the same thing ;-)

The first rule of tautology club is the first rule of tautology club ;-)

FWIW, we had to add a work-around for this in the pprint._safe_key class.  Without that work-around, it was difficult to work with JSON-style data hierarchies:

# wouldn't pprint() without the _safe_key() work-around:
temperatures = {'Jan': 25.2, 'Feb': 22.3, 'Mar': None, 'Apr': 19.1,
                'May': 22.2, 'Jun': None, 'July': 22.3}

I think this will be typical for the kind of issue people will encounter when using None as a placeholder for missing data.
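
A sketch of how the same data behaves under plain sorting in Python 3, and of one possible work-around using the none_last key proposed earlier in this issue (this is an illustration, not what pprint itself does):

```python
temperatures = {'Jan': 25.2, 'Feb': 22.3, 'Mar': None, 'Apr': 19.1,
                'May': 22.2, 'Jun': None, 'July': 22.3}

# Sorting the raw values fails because None and float are unorderable.
try:
    sorted(temperatures.values())
    failed = False
except TypeError:
    failed = True
print(failed)  # True on Python 3

def none_last(v):
    # Missing readings get the largest key and sort to the end.
    return v is None, v

# Months ordered by temperature, with missing readings last.
by_temp = sorted(temperatures, key=lambda m: none_last(temperatures[m]))
print(by_temp)
```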

FWIW, if None stays non-comparable, Nick's additions look fine to me.  I just think it is easier for everyone to restore None's universal comparability rather than adding work-arounds for the problems caused by removing that capability.
msg212147 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2014-02-24 21:50
I suspect if we'd thought of it back in the 3.0 or 3.1 time frame then
giving the Py3 None a consistent "sorts low" behaviour would have been more
likely.

At this stage of the Py3 life cycle, though, it seems simpler overall to
remain consistent with earlier Py3 releases.
msg212164 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2014-02-25 05:50
> At this stage of the Py3 life cycle, though,
> it seems simpler overall to remain consistent 
> with earlier Py3 releases.

Given that so few users have converted, it is
simpler to become consistent with Py2.7 and
to not introduce compensating features.
History
Date                 User         Action  Args
2014-02-25 05:50:44  rhettinger   set     messages: + msg212164
2014-02-24 23:37:17  Arfrever     set     nosy: + Arfrever
2014-02-24 21:50:46  ncoghlan     set     messages: + msg212147
2014-02-24 21:38:58  rhettinger   set     messages: + msg212146
2014-02-24 21:33:06  ncoghlan     set     messages: + msg212145
2014-02-24 21:29:20  ncoghlan     set     messages: + msg212144
2014-02-24 17:32:04  pitrou       set     nosy: + pitrou; messages: + msg212123
2014-02-24 17:29:47  tim.peters   set     nosy: + tim.peters; messages: + msg212121
2014-02-24 14:45:16  rhettinger   set     nosy: + rhettinger; messages: + msg212103
2014-02-21 19:00:41  cvrebert     set     nosy: + cvrebert
2014-02-15 00:06:16  ncoghlan     create