Message78015
My inclination is not to include this as a basic C-coded itertool,
because it potentially holds all of the data in memory (generally not a
characteristic of an itertool) and because I don't see it as a basic
building block (itertools are intended to be elemental, composable
components of an iterator algebra). Also, the pure Python equivalent of
dedup() is both easy to write and efficient (it gains little from
being recoded in C).
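To make the memory point concrete, here is a minimal sketch. The helper name dedup_tracking is hypothetical and is not part of the proposal. It only illustrates that any order-preserving dedup over an arbitrary iterable has to remember every distinct element it has already yielded, even though it produces results lazily:

```python
from itertools import count

def dedup_tracking(iterable):
    """Hypothetical dedup that also reports how many items it is holding."""
    seen = set()
    for elem in iterable:
        if elem not in seen:
            seen.add(elem)
            # Yield the element together with the current size of the set.
            yield elem, len(seen)

# An endless stream with 1000 distinct values: results arrive lazily,
# but every distinct value stays in `seen` for the life of the generator.
stream = dedup_tracking(n % 1000 for n in count())
first_1000 = [next(stream) for _ in range(1000)]
last_elem, held = first_1000[-1]
```

After the first 1000 results the set holds 1000 entries, so memory use is proportional to the number of unique elements rather than constant.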
Instead, I'm thinking of adding two recipes to the itertools docs:
def unique_everseen(iterable, key=None):
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for elem in iterable:
            if elem not in seen:
                seen_add(elem)
                yield elem
    else:
        for elem in iterable:
            k = key(elem)
            if k not in seen:
                seen_add(k)
                yield elem
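
from itertools import groupby, imap  # Python 2; on Python 3 use the built-in map
from operator import itemgetter

def unique_lastseen(iterable, key=None):
    # unique_lastseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_lastseen('ABBCcAD', str.lower) --> A B C A D
    return imap(next, imap(itemgetter(1), groupby(iterable, key)))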
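For readers on Python 3, the two recipes above can be exercised with the following sketch. It makes two assumptions that are not in the original message: the built-in map stands in for imap (which no longer exists in Python 3), and unique_everseen is condensed into a single loop, dropping the seen_add micro-optimization for brevity.

```python
from itertools import groupby
from operator import itemgetter

def unique_everseen(iterable, key=None):
    # Condensed variant: remembers every key seen so far.
    seen = set()
    for elem in iterable:
        k = elem if key is None else key(elem)
        if k not in seen:
            seen.add(k)
            yield elem

def unique_lastseen(iterable, key=None):
    # groupby yields (key, group) pairs; take the first item of each group.
    # Only consecutive duplicates are collapsed, so no set is needed.
    return map(next, map(itemgetter(1), groupby(iterable, key)))

everseen_plain = ''.join(unique_everseen('AAAABBBCCDAABBB'))     # 'ABCD'
everseen_keyed = ''.join(unique_everseen('ABBCcAD', str.lower))  # 'ABCD'
lastseen_plain = ''.join(unique_lastseen('AAAABBBCCDAABBB'))     # 'ABCDAB'
lastseen_keyed = ''.join(unique_lastseen('ABBCcAD', str.lower))  # 'ABCAD'
```

The difference in memory behaviour is visible in the code: unique_everseen keeps a growing set, while unique_lastseen only compares each element with its immediate predecessor.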

Date | User | Action | Args
2008-12-18 08:38:21 | rhettinger | set | recipients: + rhettinger, thomaspinckney3
2008-12-18 08:38:21 | rhettinger | set | messageid: <1229589501.39.0.58372422351.issue4615@psf.upfronthosting.co.za>
2008-12-18 08:38:20 | rhettinger | link | issue4615 messages
2008-12-18 08:38:19 | rhettinger | create |