Issue 7674: select.select() corner cases: duplicate fds, out-of-range fds

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/51923

classification

Title:	select.select() corner cases: duplicate fds, out-of-range fds
Type:	behavior	Stage:
Components:	Extension Modules	Versions:	Python 3.2

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	berker.peksag, cdleary, docs@python, exarkun, georg.brandl, tshepang
Priority:	normal	Keywords:

Created on 2010-01-11 06:52 by cdleary, last changed 2022-04-11 14:56 by admin.

Messages (2)
msg97578	Author: Chris Leary (cdleary)	Date: 2010-01-11 06:52
I was just reading through this ACM article that enumerates some of the issues with the select function in .NET: http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

select.select() currently suffers from the same documentation problem: the behavior with duplicate and/or out-of-range file descriptors in one of the sequences (e.g. rlist) is not described. Given the current implementation of seq2set in trunk it appears that:

1. A ValueError is raised when a given file descriptor is out of range. (Typically a result of the programmer passing a non-fd value, since FD_SETSIZE is "normally at least equal to the maximum number of descriptors supported by the system.")

2. Duplicate file descriptor numbers are collapsed into the fd_set, and are therefore idempotent at a system API level. However, the language-level support code generally assumes no duplication, as there is a fixed-size array of (FD_SETSIZE + 1) pylist entries (one additional for a sentinel value). Although there is a TODO to dynamically size that to the largest targeted file descriptor number, that would still assume one PyObject per file descriptor in the input sequences.

The set2list function used to produce a return value will, however, return duplicates: for each value in the input list, if the corresponding fd is set, that PyObject is added to the return list.

Proposed Changes
----------------

At a glance it would seem that the Right Thing to do is to collapse duplicates in the input, as if we created a set(AsFileDescriptor(o) for o in input_list), so that no duplicates will be returned in the result; however, you can have a heterogeneous input list with an integer fileno like 5 and a file-like object whose fileno() resolves to 5, in which case you don't want to arbitrarily choose only one of those PyObjects to return. Therefore, I'm thinking it's probably best to leave it as-is and document it.

In any case, if we want to explicitly allow duplicates in the input list we should probably make the pylist arrays into dynamically sized structures in the sizes of the corresponding input lists for correctness. If this all makes sense I'll be happy to come up with a module/documentation/unit test patch.
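
For illustration, here is a minimal sketch of the two corner cases described in the message above. It is not part of the original report. The helper names ready_readers and rejects_fd are hypothetical, and the sketch assumes CPython on a POSIX platform where 1_000_000 is larger than FD_SETSIZE; other platforms may behave differently.

```python
import select
import socket


def ready_readers(entries):
    """Poll `entries` for readability without blocking (hypothetical helper)."""
    readable, _, _ = select.select(entries, [], [], 0)
    return readable


def rejects_fd(fd):
    """Return True if select.select() raises ValueError for `fd` (hypothetical helper)."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        return True
    return False


# A connected socket pair stands in for any readable file descriptor.
a, b = socket.socketpair()
try:
    b.send(b"x")  # make `a` readable
    fd = a.fileno()

    # Heterogeneous duplicates: the same descriptor as an int and as an object.
    # Every input entry whose descriptor is ready is returned, duplicates included.
    print(ready_readers([fd, a, fd]))

    # Assumption: this value is outside the range select() accepts here.
    print(rejects_fd(1_000_000))
finally:
    a.close()
    b.close()
```

Under those assumptions the first call returns one entry per input entry rather than one per distinct descriptor, which matches the set2list behavior described above.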
msg109992	Author: Mark Lawrence (BreamoreBoy)	Date: 2010-07-11 11:20
Chris, to me it's as clear as mud but please produce a doc patch anyway. :)

History
Date	User	Action	Args
2022-04-11 14:56:56	admin	set	github: 51923
2018-09-27 19:45:42	berker.peksag	set	nosy: + berker.peksag
2014-02-03 17:04:27	BreamoreBoy	set	nosy: - BreamoreBoy
2013-07-31 13:31:50	tshepang	set	nosy: + tshepang
2010-07-11 11:44:01	pitrou	set	assignee: docs@python -> versions: + Python 3.2 nosy: + exarkun components: - Documentation
2010-07-11 11:20:23	BreamoreBoy	set	assignee: georg.brandl -> docs@python messages: + msg109992 nosy: + BreamoreBoy, docs@python
2010-01-11 06:52:29	cdleary	create