classification
Title: itertools: unlabelled balls in boxes
Type: enhancement Stage:
Components: Versions: Python 3.3
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Phillip.M.Feldman, Phillip.M.Feldman@gmail.com, cvrebert, mark.dickinson, rhettinger, terry.reedy
Priority: low Keywords:

Created on 2011-09-11 20:01 by Phillip.M.Feldman, last changed 2011-09-25 02:50 by Phillip.M.Feldman. This issue is now closed.

Files
File name Uploaded Description Edit
math-408-notes-b.pdf Phillip.M.Feldman, 2011-09-20 18:46
Messages (16)
msg143874 - (view) Author: Phillip Feldman (Phillip.M.Feldman) Date: 2011-09-11 20:01
The current set of combinatorial functions in `itertools` does not include unlabelled balls in labeled boxes and unlabelled balls in unlabelled boxes.  If the boxes have no capacity limits (i.e., can store an unlimited number of balls), then the unlabelled balls in labelled boxes can be handled with existing tools.  But, this is a special case.  Various people have developed working Python implementations for the "unlabelled balls in labelled boxes with capacity limits" problem.  I believe that "unlabelled balls in unlabelled boxes with capacity limits" can be handled with a minor variation on the same algorithm.  It would be a great benefit to have this capability in `itertools`.
msg143895 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-12 12:13
> "unlabelled balls in unlabelled boxes with capacity limits"

What does this mean?  If the boxes are unlabelled, how can they have individual capacity limits?  Or do you mean just a single limit that applies to all boxes?
msg143897 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-12 12:21
> "unlabelled balls in labelled boxes with capacity limits"

Out of curiosity, what was the application that you needed this for?

This one feels a bit too specialized to me to be worth adding to the itertools library;  I see itertools more as providing a collection of general-purpose tools that the user can combine for specific situations, and it's easy to do the above with itertools.combinations plus some filtering of the results to satisfy the capacity constraints.  Admittedly, that's less-efficient than a custom-built recursive solution.
msg143920 - (view) Author: Phillip M. Feldman (Phillip.M.Feldman@gmail.com) Date: 2011-09-12 16:41
Hello Mark,

This is a fair question.  Suppose that I have three boxes with capacity
limits of 3, 2, and 1, and that there are three balls in total.  Two of the
possible distributions are the following:

2, 0, 1
2, 1, 0

Capacity limits of the individual boxes must be observed when distributing
the balls.  Even though the second and third boxes have different
capacities, we must treat the above two distributions of balls as
equivalent.

Combinatorics problems involving boxes with capacity limits arise in such
application domains as physics and reliability.

Phillip

On Mon, Sep 12, 2011 at 5:13 AM, Mark Dickinson <report@bugs.python.org>wrote:

>
> Mark Dickinson <dickinsm@gmail.com> added the comment:
>
> > "unlabelled balls in unlabelled boxes with capacity limits"
>
> What does this mean?  If the boxes are unlabelled, how can they have
> individual capacity limits?  Or do you mean just a single limit that applies
> to all boxes?
>
> ----------
> nosy: +mark.dickinson
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue12961>
> _______________________________________
>
msg143935 - (view) Author: Phillip M. Feldman (Phillip.M.Feldman@gmail.com) Date: 2011-09-12 19:59
Here's an example of a problem from an entirely different domain:

An error control coding scheme can correct up to 3 errors in the header of a
packet and up to one error in the body of a packet. A given message is
divided into four consecutive packets. Find all possible correctible
distributions of 6 errors among the four packets, treating the order of the
four packets as significant.

Phillip

On Mon, Sep 12, 2011 at 9:41 AM, Phillip M. Feldman
<report@bugs.python.org>wrote:

>
> Phillip M. Feldman <Phillip.M.Feldman@gmail.com> added the comment:
>
> Hello Mark,
>
> This is a fair question.  Suppose that I have three boxes with capacity
> limits of 3, 2, and 1, and that there are three balls in total.  Two of the
> possible distributions are the following:
>
> 2, 0, 1
> 2, 1, 0
>
> Capacity limits of the individual boxes must be observed when distributing
> the balls.  Even though the second and third boxes have different
> capacities, we must treat the above two distributions of balls as
> equivalent.
>
> Combinatorics problems involving boxes with capacity limits arise in such
> application domains as physics and reliability.
>
> Phillip
>
> On Mon, Sep 12, 2011 at 5:13 AM, Mark Dickinson <report@bugs.python.org
> >wrote:
>
> >
> > Mark Dickinson <dickinsm@gmail.com> added the comment:
> >
> > > "unlabelled balls in unlabelled boxes with capacity limits"
> >
> > What does this mean?  If the boxes are unlabelled, how can they have
> > individual capacity limits?  Or do you mean just a single limit that
> applies
> > to all boxes?
> >
> > ----------
> > nosy: +mark.dickinson
> >
> > _______________________________________
> > Python tracker <report@bugs.python.org>
> > <http://bugs.python.org/issue12961>
> > _______________________________________
> >
>
> ----------
> nosy: +Phillip.M.Feldman@gmail.com
> Added file: http://bugs.python.org/file23132/unnamed
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue12961>
> _______________________________________
>
msg144147 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-09-16 18:21
Feature requests only apply to 3.3.

To echo Mark: there are an indefinite number of combinatorial generators that could be added. The itertools module should only have a few of the most generally useful, especially in combination with other tools. An extensive combinatorics package would be a different project. (And I have done a bit of work in this direction.)
msg144226 - (view) Author: Phillip Feldman (Phillip.M.Feldman) Date: 2011-09-18 03:30
"The itertools module should only have a few of the most generally useful, especially in combination with other tools."

Balls-in-boxes _is_ one of the most basic of the canonical combinatorial problems.  You can verify this by opening any text on combinatorics or combinatorial probability.

"An extensive combinatorics package would be a different project. (And I have done a bit of work in this direction.)"

It sounds as though your project is in its infancy, but I'd like to see what you've got if you are willing to share.
msg144267 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-19 13:14
> Balls-in-boxes _is_ one of the most basic of the canonical combinatorial problems.

Agreed.  But in its basic form, this is covered by itertools.combinations, isn't it?  It may be worth a doc recipe showing how to use itertools.combinations for this (especially since the naive implementation---e.g., the one I posted on comp.lang.python, doesn't get boundary cases right).
msg144273 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-19 13:41
And using combinations_with_replacement, it's even a one-liner:

>>> balls_in_boxes = lambda n, k: (tuple(map(c.count, range(k))) for c in combinations_with_replacement(range(k), n))
>>> for item in balls_in_boxes(5,3): print(item)
... 
(5, 0, 0)
(4, 1, 0)
(4, 0, 1)
(3, 2, 0)
(3, 1, 1)
(3, 0, 2)
(2, 3, 0)
(2, 2, 1)
(2, 1, 2)
(2, 0, 3)
(1, 4, 0)
(1, 3, 1)
(1, 2, 2)
(1, 1, 3)
(1, 0, 4)
(0, 5, 0)
(0, 4, 1)
(0, 3, 2)
(0, 2, 3)
(0, 1, 4)
(0, 0, 5)
msg144347 - (view) Author: Phillip Feldman (Phillip.M.Feldman) Date: 2011-09-20 18:46
Mark: I disagree with your claim that "in its basic form, this is covered by itertools.combinations".  If you open the attached text on elementary combinatorics and go to page 11, you will see a table that lays out six of the eight most basic types of occupancy problem.  Note that only two of the six are handled by itertools.combinations.
msg144349 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-20 19:01
> Mark: I disagree ...

Okay, I misunderstood.  I thought you were still talking about the unlabelled balls in labelled boxes problem, which is an isomorphic problem to that solved by combinations_with_replacement.

It looks as though you're now widening the scope of your proposal.  Please could you give clear descriptions of exactly what functionality you're proposing to add?
msg144355 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-09-20 22:43
Any additions would need to be motivated by real world problems.  The issue is that adding more generators makes the module harder to learn and remember, so tools are not usually added "for the sake of completeness".
msg144366 - (view) Author: Phillip Feldman (Phillip.M.Feldman) Date: 2011-09-21 04:33
Ideally, I'd like to see support for all combinations of the following occupancy problem features:

- Labeled and unlabeled boxes
- Labeled and unlabeled balls
- Empty boxes allowed and empty boxes forbidden
- Boxes with no capacity limits and with capacity limits

There are 2^4= 16 combinations in total.  As has been previously noted, some of these can be handled with the existing toolset, but the majority cannot.
msg144367 - (view) Author: Phillip Feldman (Phillip.M.Feldman) Date: 2011-09-21 04:44
With the exception of the "empty boxes forbidden" category, I've come across all of these at one time or another, many in the context of error control coding (data communications).  Much of the early work on occupancy problems was motivated by physics and the theory of probability, but if one does Internet searches on occupancy problems, it seems as though there is now more interest in such area as the biological sciences, genetic testing, wildlife monitoring, and the like.
msg144372 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-09-21 08:41
Sorry Phillip, I'm closing this feature request because the benefits would likely be outweighed by the costs (maintenance, learning curve, module complexity, etc).  It was not the goal of the module to become a complete combinatorics toolkit; rather, the goal was to provide a set of practical tools for reasonably common needs (i.e. generating combinations of test case inputs).

You are invited to create a full-featured third-party combinatorics module and put it on PyPi. If the quality of the module is high, I will provide links to it from the itertools docs.  If the external module becomes popular (demonstrable user demand), I'll re-open this feature request for further discussion.
msg144517 - (view) Author: Phillip Feldman (Phillip.M.Feldman) Date: 2011-09-25 02:50
Raymond-

I think that you may have overestimated the complexity of the problem.  In about 5 hours, I coded, debugged, and documented a set of generator functions to solve the general formulation (including box limits) for three occupancy problems:

- unlabeled balls in labeled boxes,
- unlabeled_balls_in_unlabeled_boxes, and
- labeled_balls_in_unlabeled_boxes.

(I will add labeled balls in labeled boxes, although this case is less interesting).

On the other hand, I agree with your comment that the general need for these has not yet been demonstrated.  I will post my code to PyPI and we'll see what happens.
History
Date User Action Args
2011-09-25 02:50:51Phillip.M.Feldmansetmessages: + msg144517
2011-09-21 08:41:36rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg144372
2011-09-21 04:44:57Phillip.M.Feldmansetmessages: + msg144367
2011-09-21 04:33:38Phillip.M.Feldmansetmessages: + msg144366
2011-09-20 22:43:04rhettingersetmessages: + msg144355
2011-09-20 19:01:24mark.dickinsonsetmessages: + msg144349
2011-09-20 18:46:54Phillip.M.Feldmansetfiles: + math-408-notes-b.pdf

messages: + msg144347
2011-09-19 13:41:03mark.dickinsonsetmessages: + msg144273
2011-09-19 13:14:46mark.dickinsonsetmessages: + msg144267
2011-09-18 03:30:25Phillip.M.Feldmansetmessages: + msg144226
2011-09-17 18:26:12rhettingersetpriority: normal -> low
2011-09-17 16:55:20georg.brandlsettitle: unlabelled balls in boxes -> itertools: unlabelled balls in boxes
2011-09-17 15:48:25ezio.melottisetfiles: - unnamed
2011-09-17 15:48:21ezio.melottisetfiles: - unnamed
2011-09-16 18:21:45terry.reedysetnosy: + terry.reedy

messages: + msg144147
versions: + Python 3.3, - Python 2.7
2011-09-13 04:21:53cvrebertsetnosy: + cvrebert
2011-09-12 19:59:26Phillip.M.Feldman@gmail.comsetfiles: + unnamed

messages: + msg143935
2011-09-12 16:41:02Phillip.M.Feldman@gmail.comsetfiles: + unnamed

messages: + msg143920
nosy: + Phillip.M.Feldman@gmail.com
2011-09-12 12:21:39mark.dickinsonsetassignee: rhettinger
messages: + msg143897
2011-09-12 12:13:35mark.dickinsonsetnosy: + mark.dickinson
messages: + msg143895
2011-09-11 20:01:50Phillip.M.Feldmancreate