Message 256641 - Python tracker

Author	josh.r
Recipients	abarry, josh.r, rhettinger, seblin, socketpair
Date	2015-12-18.06:14:51
Message-id	<1450419292.03.0.880629333393.issue25898@psf.upfronthosting.co.za>

Content
A utility like this seems like it would belong in `itertools`, not `collections`. It should also ideally avoid fully realizing the sequence so it could work with iterators/generators as well; PySequence_Fast will force creation of a `list`/`tuple` of the whole sequence when in practice, a `deque` with a maxlen could be used to only maintain the necessary window into the "haystack".
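
To illustrate the `deque`-with-maxlen idea, here is a minimal sketch. The name `sliding_windows` is made up for illustration and is not an existing `itertools` function; it only ever holds `n` items of the input at a time:

```python
import collections
import itertools


def sliding_windows(iterable, n):
    """Yield successive length-n tuples from iterable, holding only n items."""
    if n <= 0:
        raise ValueError("n must be positive")
    it = iter(iterable)
    # Pre-fill with the first n-1 items; maxlen=n makes each later append
    # silently drop the oldest item from the left.
    window = collections.deque(itertools.islice(it, n - 1), maxlen=n)
    for x in it:
        window.append(x)
        yield tuple(window)
```

Because the input is consumed lazily, this should work on generators without ever building a `list`/`tuple` of the whole haystack; an input shorter than `n` simply yields nothing.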

It would also help to have a pure Python implementation, both for other Python distributions and to serve as a baseline for comparison to see if a C accelerator is justified; until you have one, it's probably overkill to write the C accelerator. Something like this might be a decent point of comparison:

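import collections
import itertools
import operator

def has_subsequence(it, searchseq, *, all=all, map=map, eq=operator.eq):
    it = iter(it)  # Single-pass iterator, so islice and the loop below share state
    searchseq = tuple(searchseq)
    if not searchseq:
        return True  # The empty sequence is found in everything
    # Pre-fill with len(searchseq)-1 items; maxlen drops the oldest item on each append
    window = collections.deque(itertools.islice(it, len(searchseq)-1), len(searchseq))
    for x in it:
        window.append(x)
        if all(map(eq, window, searchseq)):
            return True
    return False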
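
For cross-checking a windowed search such as `has_subsequence` above, a reference that deliberately realizes the whole haystack can be handy. This is only a sketch, and the name `has_subsequence_realized` is hypothetical:

```python
def has_subsequence_realized(seq, searchseq):
    """Reference check that realizes the whole haystack as a tuple first."""
    haystack = tuple(seq)
    needle = tuple(searchseq)
    n = len(needle)
    # An empty needle matches at offset 0, so it is found in everything.
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))
```

Note that tuple comparison treats identical objects as equal before calling `==`, so results may differ from an `operator.eq`-based check for values like NaN.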

History
Date	User	Action	Args
2015-12-18 06:14:52	josh.r	set	recipients: + josh.r, rhettinger, socketpair, abarry, seblin
2015-12-18 06:14:52	josh.r	set	messageid: <1450419292.03.0.880629333393.issue25898@psf.upfronthosting.co.za>
2015-12-18 06:14:52	josh.r	link	issue25898 messages
2015-12-18 06:14:51	josh.r	create