This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: readline memory corruption when sys.stdin fd >= FD_SETSIZE for select()
Type: crash Stage: needs patch
Components: Extension Modules Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, gregory.p.smith, marienz, martin.panter, xiang.zhang
Priority: normal Keywords:

Created on 2017-03-02 19:48 by gregory.p.smith, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description Edit
crash_readline_fdset.py gregory.p.smith, 2017-03-02 19:48
Pull Requests
URL Status Linked Edit
PR 540 closed marienz, 2017-03-07 10:04
Messages (9)
msg288825 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-03-02 19:48
The readline module causes memory corruption (sometimes a crash) when the sys.stdin file descriptor is out of bounds for its FD_SET() call within readline.c's readline_until_enter_or_signal() function.

https://github.com/python/cpython/blob/master/Modules/readline.c#L1228

A tiny program reproducing this problem is attached.

FD_SET should not be used if the file descriptor is too large for use in select() (ie: >= FD_SETSIZE).  OTOH, we should probably just ditch select() entirely and use poll() here so that this issue does not exist.

On Python 2.7-3.6 we probably need to preserve both select and poll options for platform compatibility reasons since those shipped that way.

For Python 3.7 I suggest we stop supporting platforms that do not have poll() unless anyone knows of any that actually exist.
msg288826 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-03-02 19:55
Do you see a chance that the issue could be abused? IMO an attacker can't control FD number easily.
msg288829 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-03-02 20:28
It doesn't seem very abusable... Though of a server accepts enough remote
connections and uses input() and swaps out stdin after remote uses up fds
with connections...  That's a lot of circumstances at once. Rare
application.

We ran into it with an interactive program controlling a bunch of things so
it had lots of fds open when a tty stdin was swapped in for input.

On Thu, Mar 2, 2017, 11:55 AM Christian Heimes <report@bugs.python.org>
wrote:

>
> Christian Heimes added the comment:
>
> Do you see a chance that the issue could be abused? IMO an attacker can't
> control FD number easily.
>
> ----------
> nosy: +christian.heimes
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue29700>
> _______________________________________
>
msg288831 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-03-02 21:06
Be careful, some OSes have limited support for “poll”, “kqueue”, etc with terminals, and I ended up using “select” in the test suite:

https://bugs.python.org/issue26870#msg265604
https://github.com/python/cpython/commit/79f561d126d09d6d7ea1457a2a6ef267d93e6448
msg288832 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-03-02 21:08
Bummer. I wish select could die. :)

On Thu, Mar 2, 2017, 1:06 PM Martin Panter <report@bugs.python.org> wrote:

>
> Martin Panter added the comment:
>
> Be careful, some OSes have limited support for “poll”, “kqueue”, etc with
> terminals, and I ended up using “select” in the test suite:
>
> https://bugs.python.org/issue26870#msg265604
>
> https://github.com/python/cpython/commit/79f561d126d09d6d7ea1457a2a6ef267d93e6448
>
> ----------
> nosy: +martin.panter
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue29700>
> _______________________________________
>
msg289098 - (view) Author: Marien Zwart (marienz) * Date: 2017-03-06 13:04
On Python 3, crash_readline_fdset.py does not crash for me, because its input() contains a check documented as:

       /* We should only use (GNU) readline if Python's sys.stdin and
          sys.stdout are the same as C's stdin and stdout, because we
          need to pass it those. */

and calls sys.stdin.getline() instead.

I don't understand why this was added (eba769657a32cb08d96f021f40c79a54ade0bffc's commit message "Make input9) behave properly with the new I/O library" does not explain it).

PyOS_Readline does still take sys_stdin and sys_stdout arguments, but the only callers in CPython itself pass the actual stdin and stdout. Not sure if it's still worth fixing (maybe just turn it from a crash into an error if the fd is too high, but don't add an alternative implementation?).

On Python 2, I can fix it, but then I hit the same problem in readline itself (http://git.savannah.gnu.org/cgit/readline.git/tree/input.c#n518).

So I suppose the next step is reporting it there, and see if they're interested in fixing it (looks like readline isn't currently using anything more fancy than select() and pselect(), and there's a few more calls to those that would probably also need fixing...). Doesn't seem useful to fix it here first.
msg289133 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-03-06 21:01
“Input9)” is probably a typo for “input()”.

In Python 2, sys.stdin etc are by default wrappers around <stdio.h>’s “stdin” etc, and can easily be wrappers around other <stdio.h> FILE objects, so the PyOS_Readline API and Python’s “readline” module pass these <stdio.h> objects directly to rl_instream etc.

In Python 3, sys.stdin etc are not directly related to <stdio.h> objects. But the PyOS_Readline API was not changed, and the “readline” module still uses the rl_instream API which requires a <stdio.h> FILE object. So the new code decides if it is reasonable to substitute <stdio.h> “stdin” etc objects for the Python “sys” objects.
msg289836 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-03-19 00:40
Marien’s pull request is for 2.7 and adds two new paths when raw_input is called:

* On Linux (actually glibc), use “poll” rather than “select”
* In other cases, if sys.stdin cannot be used with “select”, raise ValueError

Marien admits that even in the best case, some versions of the Readline library still crash because they use “select” themselves. Marien: can you identify which versions? Gnu Readline, or the wrapper in Editline? New versions vs superseded versions?

Assuming we add some catch-all behaviour like the ValueError, any further parts like the Linux “poll” implementation seem like new features, and are only appropriate for the next version of Python (3.7 atm). But it seems the bug would be hard to trigger in Python 3, so I doubt the Linux-specific implementation is worthwhile.

IMO the simplest solution may be to use one of the existing fallbacks if “select” is not usable. Either the other readline_until_enter_or_signal implementation (which hooks SIGINT rather than using the callback API), or PyOS_StdioReadline, or the raw_input fallback for non-interactive mode.

I would only bother with the complication of “poll” if there is a strong use-case. Gregory: do you really care if the “readline” module is used, or would a fallback like PyOS_StdioReadline be fine in your case (which is used if you never import the “readline” module)?
msg292627 - (view) Author: Marien Zwart (marienz) * Date: 2017-04-30 12:14
GNU readline 7.0 introduces the problematic use of pselect(). GNU readline 6.3 is unaffected (and presumably older versions as well). I didn't check other implementations.

I'll see if I can get this fixed in GNU readline. The non-select code is still there, and the NEWS file says pselect is used because SIGWINCH interrupts it, while it doesn't interrupt read(). So maybe readline can fall through to the old code if a high FD is used, since it doesn't regress functionality that much (while adding poll support to readline would run into similar portability issues as we're seeing here).
History
Date User Action Args
2022-04-11 14:58:43adminsetgithub: 73886
2017-04-30 12:14:38marienzsetmessages: + msg292627
2017-03-19 00:40:35martin.pantersetmessages: + msg289836
2017-03-07 10:04:18marienzsetpull_requests: + pull_request448
2017-03-06 21:01:38martin.pantersetmessages: + msg289133
2017-03-06 13:12:17xiang.zhangsetnosy: + xiang.zhang
2017-03-06 13:04:48marienzsetmessages: + msg289098
2017-03-06 10:37:44marienzsetnosy: + marienz
2017-03-02 21:08:06gregory.p.smithsetmessages: + msg288832
2017-03-02 21:06:21martin.pantersetnosy: + martin.panter
messages: + msg288831
2017-03-02 20:28:54gregory.p.smithsetmessages: + msg288829
2017-03-02 19:55:20christian.heimessetnosy: + christian.heimes
messages: + msg288826
2017-03-02 19:48:15gregory.p.smithcreate