Author rhettinger
Recipients SilentGhost, rhettinger
Date 2009-06-29.16:11:14
Message-id <1246291876.47.0.886916138579.issue6370@psf.upfronthosting.co.za>
In-reply-to
Content
Your proposed function assumes the input is a sequence, so it cannot
handle a general-purpose iterable (it gives the wrong answer when
one is submitted).
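
A minimal sketch of this failure mode follows. The two_pass_count() function is a hypothetical stand-in, not the code proposed in this issue; it only assumes a counter that walks its input more than once, which is safe for a sequence but not for a one-shot iterator. one_pass_count() is likewise an illustrative name.

```python
def two_pass_count(seq):
    # Hypothetical stand-in: pass 1 collects the distinct items,
    # pass 2 counts them. Not the function proposed in the issue.
    keys = set(seq)
    items = list(seq)  # second pass; empty if seq was a one-shot iterator
    return {k: items.count(k) for k in keys}


def one_pass_count(iterable):
    # Single pass, so it works for any iterable, including generators.
    counts = {}
    for elem in iterable:
        counts[elem] = counts.get(elem, 0) + 1
    return counts


word = 'Abracadabra'
from_list = two_pass_count([c.lower() for c in word])  # re-iterable input
from_gen = two_pass_count(c.lower() for c in word)     # generator input
single = one_pass_count(c.lower() for c in word)       # generator input
```

With the generator, the first pass consumes every item, so the second pass sees nothing and every count in from_gen comes out as zero, while single matches from_list.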

Also, it does two lookups per item, which can be expensive if the
elements have costly hash functions (such as Decimal objects, tuple
objects, and strings that haven't been interned).
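
One way to observe the cost of repeated lookups is to count calls to __hash__. The CountingKey class below is a hypothetical helper written for this illustration, and the loop is a generic get-then-store pattern rather than the proposed function itself.

```python
class CountingKey:
    """Hypothetical hashable wrapper that records each __hash__ call."""

    hash_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value


keys = [CountingKey(ch) for ch in 'abracadabra']

counts = {}
for key in keys:
    # One hash for the get() lookup, a second one for the store.
    counts[key] = counts.get(key, 0) + 1

total_hash_calls = CountingKey.hash_calls
```

Each element is hashed twice here, so the total grows as twice the input length; for objects whose hash is expensive to compute, that doubling is the overhead being described.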

And, it does not return a dict subclass, so it cannot provide all of the
methods currently offered by Counter (such as most_common() and
elements()), nor does it override regular dict methods that do not make
sense for counters (such as the update() method).

Test data:

# case that gives the *wrong* answer because the input isn't reiterable
>>> unique(c.lower() for c in 'Abracadabra')

# case that is slower because of expensive multiple lookups
>>> unique([abs(Decimal(x)/4) for x in range(-2000, 2000)])

# case that fails because update() was not overridden
>>> c = unique(range(10))
>>> c.update(range(5))

# cases that do not provide needed subclass methods
>>> unique('abracadabra').most_common()
>>> unique('abracadabra').elements()
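
For comparison, the sketch below runs the same kinds of calls against collections.Counter and against a plain dict. It assumes that unique() returns an ordinary dict, as described above; the plain dict here is built by hand and is only a stand-in for that result.

```python
from collections import Counter

# Stand-in for what a plain-dict result of unique(range(10)) would hold.
plain = dict.fromkeys(range(10), 1)
try:
    plain.update(range(5))  # dict.update() expects key/value pairs
    plain_error = None
except TypeError as exc:
    plain_error = exc

c = Counter(range(10))
c.update(range(5))          # Counter.update() adds to the existing counts

top = Counter('abracadabra').most_common(1)
expanded = ''.join(sorted(Counter('abracadabra').elements()))
```

The plain dict raises TypeError on update(), whereas the Counter ends up with a count of 2 for each of 0 through 4; most_common() and elements() exist only on the subclass.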

Though the code for unique() is hopeless, I will take a look at the
"self[elem] = self.get(elem, 0) + 1" approach.  That shows some promise.
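
A rough sketch of how that approach might look inside a dict subclass is given below. TallyDict is a hypothetical name used only for illustration; this is not the actual collections.Counter implementation.

```python
class TallyDict(dict):
    """Hypothetical counting dict; not the real collections.Counter."""

    def __init__(self, iterable=()):
        super().__init__()
        self.update(iterable)

    def update(self, iterable=()):
        # Single pass over the input, so one-shot iterators work too.
        for elem in iterable:
            self[elem] = self.get(elem, 0) + 1


t = TallyDict(c.lower() for c in 'Abracadabra')
t.update('ab')  # adds to the existing counts instead of replacing them
```

Because update() is overridden, a later call accumulates counts, and the single loop handles generators as well as sequences.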
History
Date                 User        Action  Args
2009-06-29 16:11:16  rhettinger  set     recipients: + rhettinger, SilentGhost
2009-06-29 16:11:16  rhettinger  set     messageid: <1246291876.47.0.886916138579.issue6370@psf.upfronthosting.co.za>
2009-06-29 16:11:15  rhettinger  link    issue6370 messages
2009-06-29 16:11:14  rhettinger  create