classification
Title: multiprocessing.process using os.close(sys.stdin.fileno) instead of sys.stdin.close()
Type: Stage:
Components: Versions: Python 3.0, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jnoller Nosy List: OG7, grahamd, i000, jnoller, ptn
Priority: low Keywords: patch

Created on 2009-02-19 05:21 by grahamd, last changed 2009-06-30 17:13 by jnoller. This issue is now closed.

Files
File name Uploaded Description Edit
papy_gui.py i000, 2009-04-16 12:11 InteractiveConsole inside TkinterText
stdin_patch_v2.diff jnoller, 2009-06-30 03:03
Messages (12)
msg82457 - (view) Author: Graham Dumpleton (grahamd) Date: 2009-02-19 05:21
In multiprocessing.process it contains the code:

    def _bootstrap(self):
       ....
            if sys.stdin is not None:
                try:
                    os.close(sys.stdin.fileno())
                except (OSError, ValueError):
                    pass

This code should probably be calling sys.stdin.close() and not 
os.close(sys.stdin.fileno()).

The code as is will fail if sys.stdin had been replaced with an alternate 
file like object, such as StringIO, which doesn't have a fileno() method.
msg82579 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-02-21 19:53
Joshua Judson Rosen to python-list
	
	
Jesse Noller <jnoller@gmail.com> writes:
>
> On Tue, Feb 17, 2009 at 10:34 PM, Graham Dumpleton
> <Graham.Dumpleton@gmail.com> wrote:
> > Why is the multiprocessing module, ie., multiprocessing/process.py, in
> > _bootstrap() doing:
> >
> >  os.close(sys.stdin.fileno())
> >
> > rather than:
> >
> >  sys.stdin.close()
> >
> > Technically it is feasible that stdin could have been replaced with
> > something other than a file object, where the replacement doesn't have
> > a fileno() method.
> >
> > In that sort of situation an AttributeError would be raised, which
> > isn't going to be caught as either OSError or ValueError, which is all
> > the code watches out for.
>
> I don't know why it was implemented that way. File an issue on the
> tracker and assign it to me (jnoller) please.

My guess would be: because it's also possible for sys.stdin to be a
file that's open in read+*write* mode, and for that file to have
pending output buffered (for example, in the case of a socketfile).

There's a general guideline, inherited from C, that one should ensure
that the higher-level close() routine is invoked on a given
file-descriptor in at most *one* process after that descriptor has
passed through a fork(); in the other (probably child) processes, the
lower-level close() routine should be called to avoid a
double-flush--whereby buffered data is flushed out of one process, and
then the *same* buffered data is flushed out of the (other)
child-/parent-process' copy of the file-object.

So, if you call sys.stdin.close() in the child-process in
_bootstrap(), then it could lead to a double-flush corrupting output
somewhere in the application that uses the multiprocessing module.

You can expect similar issues with just about /any/ `file-like objects'
that might have `file-like semantics' of buffering data and flushing
it on close, also--because you end up with multiple copies of the same
object in `pre-flush' state, and each copy tries to flush at some point.

As such, I'd recommend against just using .close(); you might use
something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
`else' clause unconditionally calls sys.stdin.close(), then you still
have double-flush problems if someone's set sys.stdin to a file-like
object with output-buffering.

I guess you could try calling that an `edge-case' and seeing if anyone
screams. It'd be sort-of nice if there was finer granularity in the
file API--maybe if file.close() took a boolean `flush' argument....
msg86024 - (view) Author: Marcin Cieslik (i000) Date: 2009-04-16 12:11
Hello,

I believe I am the edge-case. I've written a minimalist python
Tkinter-shell around Tkinter.Text and code.InteractiveConsole by
hi-jacking stdin, stdout and stderr. It "hangs" when using
multiprocessing pool.

to reproduce run papy_gui.py
and type:
>>> from math import sqrt
>>> from multiprocessing import Pool
>>> p = Pool()
>>> print p
<multiprocessing.pool.Pool object at 0xb723738c>
>>> p.map(sqrt, [1,2,3])
<<no more output ever>>
msg89195 - (view) Author: OG7 (OG7) Date: 2009-06-10 11:19
Using sys.stdin.close() fixes issues 5155 and 5331.
msg89198 - (view) Author: Graham Dumpleton (grahamd) Date: 2009-06-10 11:39
Worth noting is that Python documentation in:

http://docs.python.org/library/stdtypes.html

says:

"""
file.fileno()
Return the integer “file descriptor” that is used by the underlying 
implementation to request I/O operations from the operating system. This 
can be useful for other, lower level interfaces that use file 
descriptors, such as the fcntl module or os.read() and friends.

Note File-like objects which do not have a real file descriptor should 
not provide this method!
"""

So, if sys.stdin is replaced with a file like object, then the code 
should not be assuming that fileno() even exists.

At the moment the code doesn't check for AttributeError which is what is 
going to be raised for trying to access non existent fileno() method.

"""
>>> import StringIO
>>> s=StringIO.StringIO("xxx")
>>> s.fileno()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: StringIO instance has no attribute 'fileno'
"""

Thus error propagates. The better check though would be to use:

    def _bootstrap(self):
       ....
            if sys.stdin is not None and hasattr(sys.stdin, "fileno"):
                try:
                    os.close(sys.stdin.fileno())
                except (OSError, ValueError):
                    pass

That is, only actually call fileno() if it is present.

This would solve the problem for the original reported issue by making 
it actually adhere to what Python documentation says about existence of 
fileno().

This change wouldn't necessarily solve other peoples related issues 
though.
msg89199 - (view) Author: Pablo Torres Navarrete (ptn) Date: 2009-06-10 13:58
Wouldn't it be more pythonic to just try sys.stdin.fileno() and catch 
the AtributeError too?

def _bootstrap(self):
       ....
            try:
                os.close(sys.stdin.fileno())
            except AtributeError:
                sys.stdin.close()
            except (OSError, ValueError):
                pass
msg89205 - (view) Author: OG7 (OG7) Date: 2009-06-10 15:12
Closing the stdin fd without closing the stdin file is very dangerous.
It means that stdin will now access some random resource, for example, a
pipe created with os.pipe().

Closing stdin is useful to let the parent be alone in reading from it.
It can be achieved by replacing stdin by open('/dev/null'). The original
stdin can be closed or left to be garbage collected.

The "double flush" case is impossible to handle in general. It's the
libc's responsibility for standard file objects and sockets. But it
can't be helped (except by putting a warning in the documentation) if
someone combines multiprocessing with non-fork-safe code that keeps its
own buffers and doesn't check for a pid change.

So that code in _bootstrap should be:
sys.stdin.close()
sys.stdin = open(os.devnull)
msg89522 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-06-19 15:14
Attached is a patch which calls close() first, and then attempts to close 
the fd. In the case of an attribute errors (fileno doesn't exist) we 
simply set it to devnull. 

This is just a thought, feedback welcome - right now this allows this 
fixes issue 5155 and 5331 (and this issue)
msg89523 - (view) Author: OG7 (OG7) Date: 2009-06-19 15:55
Please do this:

--- a/multiprocessing/process.py
+++ b/multiprocessing/process.py
@@ -225,7 +225,8 @@ class Process(object):
             self._children = set()
             self._counter = itertools.count(1)
             try:
-                os.close(sys.stdin.fileno())
+                sys.stdin.close()
+                sys.stdin = open(os.devnull)
             except (OSError, ValueError):
                 pass
             _current_process = self

One shouldn't close the fileno after the file has been closed. The
stdlib raises an error, and the open(os.devnull) won't be reached. If no
error was thrown, it would be worse. This would be closing a fileno that
doesn't belong to sys.stdin anymore and may be used somewhere else.

The open(os.devnull) bit is both so that no one tries to do anything
with the old stdin anymore, and to let applications attempt to read from
stdin without an IOError.
msg89529 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-06-19 21:53
On Fri, Jun 19, 2009 at 11:55 AM, OG7<report@bugs.python.org> wrote:
>
> OG7 <onyxg7@users.sourceforge.net> added the comment:
>
> One shouldn't close the fileno after the file has been closed. The
> stdlib raises an error, and the open(os.devnull) won't be reached. If no
> error was thrown, it would be worse. This would be closing a fileno that
> doesn't belong to sys.stdin anymore and may be used somewhere else.
>
> The open(os.devnull) bit is both so that no one tries to do anything
> with the old stdin anymore, and to let applications attempt to read from
> stdin without an IOError.

Fair enough, I was worried a bit about skipping the
os.close(sys.stdin.fileno()) - the tests pass with both (because the
close fixes the basic problem). I need to come up with an appropriate
documentation note about the double flush issue, add in the tests I
gleaned from the other bugs and commit it. I'll post the full
doc/tests/code patch before doing so.
msg89900 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-06-30 03:03
Attached is a patch with docs, tests and the fix as suggested by OG7 (whom 
I can't add to the ACKS without a real name). Please check the additional 
doc note I added to the all platforms programming guidelines.
msg89943 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-06-30 17:13
Committed in r73708 on trunk
History
Date User Action Args
2011-01-08 11:33:02pitroulinkissue10174 superseder
2009-06-30 17:13:43jnollersetstatus: open -> closed
resolution: fixed
messages: + msg89943
2009-06-30 12:22:55jnollersetfiles: - 0001-Fix-issue-5313.patch
2009-06-30 12:22:43jnollersetfiles: - stdin-patchv1.patch
2009-06-30 03:03:46jnollersetfiles: + stdin_patch_v2.diff

messages: + msg89900
2009-06-19 21:53:44jnollersetmessages: + msg89529
2009-06-19 15:55:24OG7setfiles: + 0001-Fix-issue-5313.patch

messages: + msg89523
2009-06-19 15:14:55jnollersetfiles: + stdin-patchv1.patch
keywords: + patch
messages: + msg89522
2009-06-10 15:12:45OG7setmessages: + msg89205
2009-06-10 13:58:01ptnsetnosy: + ptn
messages: + msg89199
2009-06-10 11:39:21grahamdsetmessages: + msg89198
2009-06-10 11:19:38OG7setnosy: + OG7
messages: + msg89195
2009-04-16 12:11:18i000setfiles: + papy_gui.py
nosy: + i000
messages: + msg86024

2009-03-29 14:39:17jnollersetpriority: low
2009-02-21 19:53:44jnollersetmessages: + msg82579
2009-02-19 16:25:32jnollersetassignee: jnoller
2009-02-19 05:21:10grahamdcreate