Author rhettinger
Recipients TFinley, mark.dickinson, rhettinger
Date 2009-01-05.08:20:08
SpamBayes Score 0.000577777
Marked as misclassified No
Message-id <1231143613.5.0.378833969324.issue4816@psf.upfronthosting.co.za>
In-reply-to
Content
Will spend a while mulling this over and taking it under advisement.

Initially, I am disinclined for several reasons.

1.  For many use cases, r>n is an error condition that should not pass
silently.

2.  While it's possible to make definitions of comb/perm that define the
r>n case as having an empty result, many definitions do not.  See the
wikipedia article on permutations:

"""
In general the number of permutations is denoted by P(n, r), nPr, or
sometimes P_n^r, where:

    * n  is the number of elements available for selection, and
    * r  is the number of elements to be selected (0 ≤ r ≤ n).

For the case where r = n  it has just been shown that P(n, r) = n!. The
general case is given by the formula:

    P(n, r) = n! / (n-r)!.
"""

That discussion is typical and I think it important the number of items
returned matches the formula (which fails when r>n because (n-r)! would
call for a negative factorial).

Also see the wikipedia article on combinations (the "choose" function)
which also expresses a factorial formula that fails for r>n.  No mention
is made for special handling for r>n.

3. For cases where you need a more inclusive definition that assigns a
zero length result to cases where r>n, the workaround is easy. 
Moreover, in such cases, there is some advantage to being explicit about
those cases being handled.  In the example provided by the OP, the
explicit workaround is:

   all(triangle_respected(*triple) for triple in
itertools.combinations(group, 3) if len(group)>=3)

or if you factor-out the unvarying constant expression in the inner-loop:

   len(group)>=3 or all(triangle_respected(*triple) for triple in
itertools.combinations(group, 3))

For other cases, it may be preferable to write your own wrapper:

   def myperm(pool, r=None):
      'custom version of perm returning an empty iterator when r>n'
      if r is not None and r > len(pool):
          return iter([])
      return itertools.permutations(pool, r)

I like this because it is explicit about its intended behavior but
doesn't slow-down the common case where some work actually gets done.

4. Don't want to break any existing code that relies on the ValueError
being thrown.  It's too late to change this for 2.6 and 3.0 and possibly
for 2.7 and 3.1 without having a deprecation period.  While this hasn't
been out long, it will have been once 2.7 is released.  Code that relies
on the ValueError may be hard to spot because it is implicitly relying
on the ValueError aborting the program for invalid input (invalid in the
sense of how the result is being used).

5. I personally find the r>n case to be weird and would have a hard time
explaining it to statistics students who are counting permutations and
relating the result back to the factorial formulas.

On the flip side, I do like Mark's thought that the r>n case being empty
doesn't muck-up the notion of combinations as returning all "r-length
subsequences of elements from the input iterable."  Also, I can see a
parallel in the string processing world where substring searches using
__contains__ simply return False instead of raising an exception when
the substring is longer that the string being searched.

Those are my initial thoughts.  Will ponder it a bit more this week.
History
Date User Action Args
2009-01-05 08:20:13rhettingersetrecipients: + rhettinger, mark.dickinson, TFinley
2009-01-05 08:20:13rhettingersetmessageid: <1231143613.5.0.378833969324.issue4816@psf.upfronthosting.co.za>
2009-01-05 08:20:12rhettingerlinkissue4816 messages
2009-01-05 08:20:08rhettingercreate