
classification
Title: Groupby Is Roughly Equivalent To ...
Type: Stage:
Components: Documentation Versions: Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: docs@python, greg.solomon, mdk, rhettinger
Priority: low Keywords:

Created on 2016-12-11 20:16 by greg.solomon, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (3)
msg282943 - Author: Greg Solomon (greg.solomon) Date: 2016-12-11 20:16
https://docs.python.org/3/library/itertools.html#itertools.groupby

I found the "equivalent" code a bit hard to read. Is there any merit in changing it to something like the following?

Kind Rgds, Greg

class groupby:
    def __init__( self , iterable , key_function = ( lambda x: x ) ):
        self.iterable = iter( iterable )
        self.key_function = key_function
        self.FINISHED = object()
        try:
            self.next_value = next( self.iterable )
        except StopIteration: 
            self.next_value = self.FINISHED
    def __iter__( self ):
        return self
    def __next__( self ):
        if self.next_value == self.FINISHED:
            raise StopIteration
        self.group_key_value = self.key_function( self.next_value )
        return ( self.group_key_value , self._group() )
    def _group( self ):
        while self.next_value != self.FINISHED \
          and self.group_key_value == self.key_function( self.next_value ):
            yield self.next_value
            try:
                self.next_value = next( self.iterable )
            except StopIteration:
                self.next_value = self.FINISHED 
        return
msg282947 - Author: Julien Palard (mdk) (Python committer) Date: 2016-12-11 20:54
I renamed your class to groupby2 to compare it with itertools.groupby and tested it, but:

>>> print(list(groupby2(['A', 'B'])))

does not return. It looks like your implementation has a bug, so I tried:

>>> for k in groupby2(['A', 'B']):
...     print(k)

and I'm getting loads of:

('A', <generator object groupby2._group at 0x7f0476809f10>)
('A', <generator object groupby2._group at 0x7f0476851f68>)
('A', <generator object groupby2._group at 0x7f0476809f10>)
('A', <generator object groupby2._group at 0x7f0476851f68>)
('A', <generator object groupby2._group at 0x7f0476809f10>)
('A', <generator object groupby2._group at 0x7f0476851f68>)
('A', <generator object groupby2._group at 0x7f0476809f10>)
('A', <generator object groupby2._group at 0x7f0476851f68>)
('A', <generator object groupby2._group at 0x7f0476809f10>)
('A', <generator object groupby2._group at 0x7f0476851f68>)

You may also want to test your implementation against https://github.com/python/cpython/blob/master/Lib/test/test_itertools.py#L699
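
For comparison, here is a minimal check of how itertools.groupby itself behaves on the same input. It is only a sketch of the expected behaviour, not part of the linked test suite: the standard-library version moves on to the next key even when the caller never iterates over a group.

```python
from itertools import groupby

# The groups are never consumed here; groupby still advances to the next key.
keys = [k for k, _ in groupby(['A', 'B'])]
print(keys)  # ['A', 'B']

# Consuming each group, as a typical loop would.
groups = [(k, list(g)) for k, g in groupby('AAABBC')]
print(groups)  # [('A', ['A', 'A', 'A']), ('B', ['B', 'B']), ('C', ['C'])]
```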
msg282968 - Author: Greg Solomon (greg.solomon) Date: 2016-12-12 07:52
Oh, I get it. There needs to be a next(self.it) loop in __next__ as well as in _grouper, to skip the rest of the current group in case the user doesn't iterate over what _grouper returns.
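
The fix described above could be sketched roughly as follows, applied to the class posted in the first message. The name groupby2 follows msg282947, and the _advance helper is a hypothetical addition for illustration. This is an untested-against-CPython's-suite sketch, not the documented equivalent: unlike itertools.groupby, it does not invalidate a stale group iterator once a later group with the same key has started.

```python
class groupby2:
    """Sketch of the posted class with a skip loop added to __next__."""

    _FINISHED = object()  # sentinel marking an exhausted input

    def __init__(self, iterable, key_function=lambda x: x):
        self.iterable = iter(iterable)
        self.key_function = key_function
        # Key of the group most recently handed out (sentinel = none yet).
        self.group_key_value = self._FINISHED
        self._advance()

    def _advance(self):
        # Hypothetical helper: fetch the next item, or the sentinel at the end.
        self.next_value = next(self.iterable, self._FINISHED)

    def __iter__(self):
        return self

    def __next__(self):
        # Skip whatever remains of the previous group, in case the caller
        # never consumed it. This loop is what the original class lacked.
        while (self.next_value is not self._FINISHED
               and self.group_key_value is not self._FINISHED
               and self.key_function(self.next_value) == self.group_key_value):
            self._advance()
        if self.next_value is self._FINISHED:
            raise StopIteration
        self.group_key_value = self.key_function(self.next_value)
        return self.group_key_value, self._group(self.group_key_value)

    def _group(self, key):
        # Yield items while they still belong to the group identified by key.
        while (self.next_value is not self._FINISHED
               and self.key_function(self.next_value) == key):
            yield self.next_value
            self._advance()


# Terminates even though no group is consumed.
print([k for k, _ in groupby2(['A', 'B'])])  # ['A', 'B']
```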

My test was 
for ( k , g ) in groupby( L ):
    print ( k , len( list( g ) ) )
so I was executing _grouper on every row.

Thanks !!!
History
Date                 User          Action  Args
2022-04-11 14:58:40  admin         set     github: 73125
2016-12-12 07:52:05  greg.solomon  set     status: open -> closed; resolution: not a bug; messages: + msg282968
2016-12-12 06:02:41  rhettinger    set     priority: normal -> low; assignee: docs@python -> rhettinger; nosy: + rhettinger
2016-12-11 20:54:02  mdk           set     nosy: + mdk; messages: + msg282947
2016-12-11 20:16:23  greg.solomon  create