classification
Title: collections.Counter.least_common
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: davidcoallier, ezio.melotti, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2013-01-18 12:24 by davidcoallier, last changed 2017-10-28 18:56 by rhettinger. This issue is now closed.

Files
File name Uploaded Description
collections.Counter.least_common.patch davidcoallier, 2013-01-18 12:24 collections.Counter.least_common(n) method.
16994.patch davidcoallier, 2013-01-18 20:04 Fixed a typo in the documentation, this contains all changes from the review.
Messages (8)
msg180189 - (view) Author: David Coallier (davidcoallier) Date: 2013-01-18 12:24
The `collections.Counter` class contains very useful methods for playing with dicts and sets (mainly the most_common() function).

Even though it is fairly trivial to retrieve the least common elements in a Counter() by doing Counter(...).most_common()[:-n-1:-1], I believe that for the sake of consistency, the `least_common` method should also be available.
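The slicing idiom mentioned above can be sketched as follows. This is only an illustration with a made-up input string; it is not taken from the attached patch.

```python
from collections import Counter

c = Counter("abracadabra")  # a:5, b:2, r:2, c:1, d:1

n = 2
# most_common() with no argument returns every (element, count) pair,
# ordered from most to least common; the reversed slice takes the last n.
least = c.most_common()[:-n - 1:-1]
print(least)
```

Here `c` and `d` are the only elements with a count of 1, so both appear in the result.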

The attached patch contains the following:
- The method definition in Lib/__init__.py to support least_common;
- The tests to make sure least_common behaves as expected;
- The documentation for 2.7 and the general documentation.

Everywhere `most_common` was mentioned, you will now find `least_common` mentioned as well.
msg180204 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-18 18:17
I left a review on Rietveld.
Since this is a new feature it can't go on 2.7, and it should target 3.4 instead.
msg180211 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-18 18:57
What is the use case for this method? Actually, any element which is not in a collection is a least common element for this collection.
msg180213 - (view) Author: David Coallier (davidcoallier) Date: 2013-01-18 19:30
Latest patch after code review round #1
msg180217 - (view) Author: David Coallier (davidcoallier) Date: 2013-01-18 19:54
Hi there @serhiy.storchaka, 

Consider the case where one would calculate the k-combinations of a set S. When the set has `n` elements, the number of k-combinations is equal to the binomial coefficient, i.e. n!/(k!(n-k)!).
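As a rough illustration of the formula above, the following sketch computes the number of k-combinations directly from factorials. The function name `k_combinations` is hypothetical, and the set sizes are arbitrary examples.

```python
import math

def k_combinations(n: int, k: int) -> int:
    """Number of k-combinations of an n-element set: n! / (k! (n-k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# math.comb(n, k), available since Python 3.8, returns the same value.
print(k_combinations(10, 3))  # 120
print(k_combinations(8, 3))   # 56
```

Shrinking a 10-element set to 8 elements cuts the number of 3-combinations from 120 to 56, which shows how quickly the count falls as elements are dropped.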

One method of statistically optimising the computation is to remove n least common elements from the set S. 

I do agree that this new method is merely for consistency, because right now it is quite easy to simply do c.most_common(...)[:-(n+1):-1] to get the least common elements.

The goal of this patch is to make it intuitive to get the least common elements of a Counter, therefore making it easy to remove them from a collection.

Does it make any sense to you?
msg180238 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-19 09:49
> One method of statistically optimising the computation is to remove n least common elements from the set S.

Maybe you do not need to remove the least common elements from the set, but rather *get* a set of the (len(S)-n) most common elements?
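A minimal sketch of this alternative, assuming the data lives in a Counter named `c` and that `n` elements are to be dropped (both names and the input are illustrative):

```python
from collections import Counter

c = Counter("abracadabra")  # a:5, b:2, r:2, c:1, d:1
n = 2  # hypothetical number of elements to drop

# Keep the (len(c) - n) most common elements instead of
# removing the n least common ones.
kept = Counter(dict(c.most_common(len(c) - n)))
print(kept)
```

This uses only the existing most_common() method; no least_common() is needed.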

> Does it make any sense to you?

Frankly, not very much.

Note that the least common element is not well defined in most cases. Usually there are only a few most common elements (or even just one), but a lot of least common elements. The result of least_common(n) is practically random due to hash randomization.
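The ambiguity described above can be seen in a small example. The input string is made up for illustration; the point is only that several elements share the minimum count.

```python
from collections import Counter

c = Counter("aaabbcdefg")  # a:3, b:2, and c, d, e, f, g once each

lowest = min(c.values())
# Every element sharing the minimum count is equally "least common".
tied = sorted(elem for elem, count in c.items() if count == lowest)
print(tied)  # ['c', 'd', 'e', 'f', 'g']
```

There is a single most common element here, but a hypothetical least_common(1) would have to pick one of five tied candidates, and which one it returns would depend on the iteration order of the Counter.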

Another note is that most_common()[:-n] is in many cases faster than least_common(n) for n >> 1. This is true for most_common(n) too, but it is usually used with very small n (in particular n=1), where it makes more sense.
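To make the performance trade-off concrete, here is a sketch of how such a method might look if it mirrored most_common(), which in CPython uses heapq.nlargest when n is given. The helper `least_common` below is hypothetical and is not the code from the attached patch.

```python
import heapq
from collections import Counter
from operator import itemgetter

def least_common(counter, n=None):
    """Hypothetical helper, not part of collections.Counter."""
    if n is None:
        # Full sort of all (element, count) pairs, lowest counts first.
        return sorted(counter.items(), key=itemgetter(1))
    # Heap-based selection: worthwhile only when n is small
    # relative to the number of distinct elements.
    return heapq.nsmallest(n, counter.items(), key=itemgetter(1))

c = Counter("abracadabra")  # a:5, b:2, r:2, c:1, d:1
print(least_common(c, 2))
```

For large n, heap-based selection loses its advantage over sorting everything once and slicing, which is the case the comment above refers to.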
msg305156 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-28 11:25
This proposition was discussed also on Python-Ideas (https://mail.python.org/pipermail/python-ideas/2017-March/045215.html).

I think it should be rejected. In general this doesn't make sense. In the rare cases when you want to get the least common items in the Counter (I don't know of any practical examples), you can take a list of all items sorted by value and take the last few items.
msg305165 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-10-28 18:56
Thank you for the suggestion, but I'm going to mark it as rejected for the reasons listed in the other posts.
History
Date User Action Args
2017-10-28 18:56:50 rhettinger set status: pending -> closed; resolution: rejected; messages: + msg305165; stage: patch review -> resolved
2017-10-28 11:25:19 serhiy.storchaka set status: open -> pending; assignee: rhettinger; messages: + msg305156
2013-01-19 09:49:49 serhiy.storchaka set messages: + msg180238
2013-01-18 20:04:18 davidcoallier set files: + 16994.patch
2013-01-18 19:54:49 davidcoallier set messages: + msg180217
2013-01-18 19:32:12 davidcoallier set files: - collections.Counter.least_common.patch
2013-01-18 19:30:10 davidcoallier set files: + collections.Counter.least_common.patch; messages: + msg180213
2013-01-18 18:57:59 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg180211
2013-01-18 18:17:15 ezio.melotti set versions: + Python 3.4, - Python 2.7; nosy: + ezio.melotti; messages: + msg180204; stage: patch review
2013-01-18 13:12:19 mark.dickinson set nosy: + rhettinger
2013-01-18 12:24:33 davidcoallier create