Author cdleary
Recipients cdleary, georg.brandl
Date 2010-01-11.06:52:27
Message-id <1263192751.1.0.719382931138.issue7674@psf.upfronthosting.co.za>
Content
I was just reading through this ACM article that enumerates some of the issues with the select function in .NET: http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

select.select() currently suffers from the same documentation problem: the behavior with duplicate and/or out-of-range file descriptors in one of the sequences (e.g. rlist) is not described.

Given the current implementation of seq2set in trunk it appears that:

1. A ValueError is raised when a given file descriptor is out of range. (Typically a result of the programmer passing a non-fd value, since FD_SETSIZE is "normally at least equal to the maximum number of descriptors supported by the system.")

2. Duplicate file descriptor numbers are collapsed into the fd_set, and are therefore idempotent at a system API level.
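
Point 1 can be probed from Python. The following is a minimal sketch, assuming Python 3; rejects_fd is a hypothetical helper name, not part of the select module. A negative value is rejected by the generic file-descriptor conversion, while whether a large positive value is rejected depends on the platform's FD_SETSIZE, so no output is claimed for the second call.

```python
import select


def rejects_fd(fd):
    """Return True if select.select() raises ValueError for this fd value."""
    try:
        select.select([fd], [], [], 0)  # timeout 0: poll once, never block
    except ValueError:
        return True
    except OSError:
        # The value passed the range check but is not an open descriptor.
        return False
    return False


print(rejects_fd(-1))      # negative values are never valid descriptors
print(rejects_fd(10**6))   # platform-dependent: compared against FD_SETSIZE
```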

However, the language-level support code generally assumes no duplication, as there is a fixed-size array of (FD_SETSIZE + 1) pylist entries (one extra for a sentinel value). Although there is a TODO to size that array dynamically to the largest targeted file descriptor number, that would still assume one PyObject per file descriptor in the input sequences.

The set2list function used to produce a return value will, however, return duplicates: for each value in the input list, if the corresponding fd is set, that pyobject is added to the return list.
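
The duplicate-returning behavior can be observed directly. This is a small sketch, assuming Python 3 and a platform where socket.socketpair() is available; select_readable is a hypothetical wrapper introduced only for this illustration.

```python
import select
import socket


def select_readable(rlist, timeout=1.0):
    """Return only the 'ready for reading' list from select.select()."""
    readable, _, _ = select.select(rlist, [], [], timeout)
    return readable


a, b = socket.socketpair()
try:
    b.send(b"x")                     # make `a` readable
    ready = select_readable([a, a])  # the same object listed twice
    print(len(ready))                # one result entry per input entry
finally:
    a.close()
    b.close()
```

Each input entry whose fd is set contributes its own entry to the result, so the length of the returned list tracks the input list rather than the number of distinct descriptors.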


Proposed Changes
----------------

At a glance it would seem that the Right Thing to do is to collapse duplicates in the input, as if we created a set(AsFileDescriptor(o) for o in input_list), so that no duplicates are returned in the result. However, you *can* have a heterogeneous input list containing both an integer fd like 5 and a file-like object whose fileno() resolves to 5, in which case you don't want to arbitrarily choose only one of those PyObjects to return. Therefore, I'm thinking it's probably best to leave the behavior as-is and document it.
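
To make the heterogeneous case concrete, here is a rough sketch, assuming Python 3 and socket.socketpair(). The helpers fileno_of and group_by_fd are hypothetical names for this illustration; they only approximate the C-level conversion and are not how the module is implemented.

```python
import select
import socket


def fileno_of(obj):
    """Approximate select()'s input rule: an int is the fd, else call fileno()."""
    return obj if isinstance(obj, int) else obj.fileno()


def group_by_fd(objs):
    """Map each fd to every input object that resolves to it, in input order."""
    groups = {}
    for obj in objs:
        groups.setdefault(fileno_of(obj), []).append(obj)
    return groups


a, b = socket.socketpair()
try:
    fd = a.fileno()
    mixed = [a, fd]              # a socket object and its raw descriptor
    groups = group_by_fd(mixed)  # both entries land under the same fd
    b.send(b"x")                 # make that fd readable
    ready, _, _ = select.select(mixed, [], [], 1.0)
finally:
    a.close()
    b.close()
```

Collapsing by fd would force a choice between the socket object and the bare integer; with the current behavior both come back, each in the form the caller passed in.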

In any case, if we want to explicitly allow duplicates in the input list, we should probably turn the pylist arrays into dynamically sized structures whose sizes match the lengths of the corresponding input lists, for correctness.

If this all makes sense I'll be happy to come up with a module/documentation/unit test patch.
History
Date User Action Args
2010-01-11 06:52:31  cdleary  set  recipients: + cdleary, georg.brandl
2010-01-11 06:52:31  cdleary  set  messageid: <1263192751.1.0.719382931138.issue7674@psf.upfronthosting.co.za>
2010-01-11 06:52:29  cdleary  link  issue7674 messages
2010-01-11 06:52:27  cdleary  create