classification
Title: SysLogHandler closes connection before using it
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Julien.Palard, joshpurvis, python-dev, sbt, vinay.sajip
Priority: normal Keywords:

Created on 2013-05-15 13:49 by Julien.Palard, last changed 2013-05-16 22:03 by python-dev. This issue is now closed.

Messages (11)
msg189283 - (view) Author: Julien Palard (Julien.Palard) Date: 2013-05-15 13:49
I have a script that close its socket to /dev/log immediatly before using it, causing it to fail, here is the code :
{{{
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging
import logging.handlers
import daemon
from daemon.pidlockfile import PIDLockFile

logger = logging.getLogger('twitterCounter')
logger.addHandler(logging.handlers.SysLogHandler(address='/dev/log'))
logger.setLevel(logging.DEBUG)

logger.info("Hello")

with daemon.DaemonContext():
    logger.info("World !")
}}}

and here is an strace :
{{{
strace -s999 -f  /tmp/test.py 2>&1 | grep -C2 ^connect
// Outside daemonContext, all is OK
// Note that inside daemonContext, all is OK if we do not log outside daemonContext.

close(3)                                = 0
socket(PF_FILE, SOCK_DGRAM, 0)          = 3
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 10) = 0
sendto(3, "<14>Hello\0", 10, 0, NULL, 0) = 10
getuid()                                = 1001
--

// Second log, inside daemonContext, with the erroneous "socket(); close()" :


socket(PF_FILE, SOCK_DGRAM, 0)          = 3
close(3)                                = 0
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 10) = -1 EBADF (Bad file descriptor)
close(3)                                = -1 EBADF (Bad file descriptor)

// As the previous try has failed, SysLogHandler seems to give another try with different parameters, failing too as expected socket type is DGRAM :
socket(PF_FILE, SOCK_STREAM, 0)         = 3
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 10) = -1 EPROTOTYPE (Protocol wrong type for socket)
}}}
msg189323 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2013-05-15 23:13
Isn't this a case of handles being closed in the child after a fork, by the daemon module? Have you investigated the files_preserve option in DaemonContext?
msg189326 - (view) Author: Josh Purvis (joshpurvis) Date: 2013-05-15 23:37
Ironically, I ran into this same exact issue today, and I have investigated the `files_preserve` param, with no luck. I'm not too familiar with the internals here, but from what I can tell it works with FileHandler, but not the SysLogHandler. 

If you try to add the syslog handler to the files_preserve list it has no effect. It seems to need a stream, and SysLogHandler doesn't have the stream attribute.

# This works for FileHandler's
log = logging.getLogger('MyLog')
fh = logging.FileHandler('/some/file')

with daemon.DaemonContext(files_preserve=[fh.stream, ]):
   log.warn("In the belly of the beast.")
msg189346 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2013-05-16 09:59
The python-daemon documentation states, about files_preserve:

"Elements of the list are file descriptors (as returned by a file object's `fileno()` method) or Python `file` objects. Each specifies a file that is not to be closed during daemon start."

Notice that file objects are just a convenience - filenos() can be passed. The following, slightly modified script works as expected:

import logging
import logging.handlers
import daemon

logger = logging.getLogger('twitterCounter')
handler = logging.handlers.SysLogHandler(address='/dev/log')
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

logger.info("Hello, ")

with daemon.DaemonContext(files_preserve=[handler.socket.fileno()]):
    logger.info("world!")

Output in syslog after running the above:

May 16 10:58:42 eta-oneiric64 Hello, 
May 16 10:58:42 eta-oneiric64 world!
msg189376 - (view) Author: Julien Palard (Julien.Palard) Date: 2013-05-16 13:20
I understand the files_preserve parameter, the bug I'm filling is the innability of SysLogHandler to reopen the socket, although it tries :

// DaemonContext closing all FDs:
close(3)                                = 0
close(2)                                = 0
close(1)                                = 0
close(0)                                = 0

[...]

// Trying to send "World !" to the closed socket (developper missusing files_preserve
sendto(3, "<14>World !\0", 12, 0, NULL, 0) = -1 EBADF (Bad file descriptor)

// Reopening socket, with good parameters
socket(PF_FILE, SOCK_DGRAM, 0)          = 3

// WTF ? For me, the bug is here, why do we close it ?
// That's not the DaemonContext that closes the file here, as we already are in daemonContext, all files were closed before by the DaemonContext, so for me it's SysLogHandler who's closing here and it's a bug :
close(3)                                = 0

// Trying to connect to a closed socket ... will fail )o:
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 10) = -1 EBADF (Bad file descriptor)

// Reclosing it ? ok ... why not as we don't know that it were closed.
close(3)                                = -1 EBADF (Bad file descriptor)

// Trying another socket type, cause first try failed, but failed cause the close(), it may have not been closed and have succeed. So this try may no apprear normally :
socket(PF_FILE, SOCK_STREAM, 0)         = 3
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 10) = -1 EPROTOTYPE (Protocol wrong type for socket)
msg189386 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2013-05-16 16:56
I see what you're saying now, but there's no explicit close in logging, so it's coming from somewhere lower down. Let's examine what happens when we try to emit the record:

-> def emit(self, record):
(Pdb) 
> /usr/lib/python2.7/logging/handlers.py(791)emit()
-> msg = self.format(record) + '\000'
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(796)emit()
-> prio = '<%d>' % self.encodePriority(self.facility,
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(797)emit()
-> self.mapPriority(record.levelname))
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(799)emit()
-> if type(msg) is unicode:
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(803)emit()
-> msg = prio + msg
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(804)emit()
-> try:
(Pdb) p msg
'<14>world!\x00'
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(805)emit()
-> if self.unixsocket:
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(806)emit()
-> try:
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(807)emit()

==============================================================
Here is where we try to send the message
==============================================================
-> self.socket.send(msg)
(Pdb) n
error: (9, 'Bad file descriptor')
> /usr/lib/python2.7/logging/handlers.py(807)emit()
-> self.socket.send(msg)
(Pdb) s
> /usr/lib/python2.7/logging/handlers.py(808)emit()
==============================================================
It failed, as expected, so we go to the exception handler.
==============================================================
-> except socket.error:
(Pdb) 
> /usr/lib/python2.7/logging/handlers.py(809)emit()
==============================================================
We are going to try and connect again.
==============================================================
-> self._connect_unixsocket(self.address)
(Pdb) s
--Call--
> /usr/lib/python2.7/logging/handlers.py(737)_connect_unixsocket()
-> def _connect_unixsocket(self, address):
(Pdb) 
> /usr/lib/python2.7/logging/handlers.py(738)_connect_unixsocket()
-> self.socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
==============================================================
Created the socket at line 738. Line 739 is a comment.
==============================================================
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(740)_connect_unixsocket()
-> try:
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(741)_connect_unixsocket()
-> self.socket.connect(address)
(Pdb) n
error: (9, 'Bad file descriptor')
==============================================================
Line 740 is a try: statement.
At line 741, called connect, got an EBADF, and there's no
intervening call to close().
==============================================================
> /usr/lib/python2.7/logging/handlers.py(741)_connect_unixsocket()
-> self.socket.connect(address)
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(742)_connect_unixsocket()
-> except socket.error:
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(743)_connect_unixsocket()
-> self.socket.close()
==============================================================
This close is just trying to tidy up in the exception handler.
==============================================================
(Pdb) n
> /usr/lib/python2.7/logging/handlers.py(744)_connect_unixsocket()
-> self.socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
==============================================================
We'll try to open using TCP, which will fail.
==============================================================

To summarise:

line 738 is a call to socket.socket(AF_UNIX, SOCK_DGRAM)
line 739 is a comment "# syslog may require either ..."
line 740 is a try:
line 741 is a call to socket.connect(address)

There is no close() called by logging between socket.socket() and socket.connect(), so the close seems to be coming from inside one of those two calls to the socket module.
msg189389 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2013-05-16 17:14
We'll try this with a simple script which doesn't use logging at all:

import os
import socket

MSG1 = '<14>Hi, \x00'
MSG2 = '<14>there!\x00'

sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
sock.connect('/dev/log')
sock.send(MSG1)
os.close(sock.fileno()) # what daemonizing does
try:
    sock.send(MSG2)
except socket.error as e:
    print(e)
    print('Trying to reconnect:')
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.connect('/dev/log')
    except socket.error as e:
        print('Oh look, reconnecting failed: %s' % e)

When we run this, we get:

[Errno 9] Bad file descriptor
Trying to reconnect:
Oh look, reconnecting failed: [Errno 9] Bad file descriptor

And the strace output looks like this:

socket(PF_FILE, SOCK_DGRAM, 0)          = 3
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 10) = 0
sendto(3, "<14>Hi, \0", 9, 0, NULL, 0)  = 9

===================================================
The next close() is the os.close(sock.fileno())
===================================================
close(3)                                = 0
sendto(3, "<14>there!\0", 11, 0, NULL, 0) = -1 EBADF (Bad file descriptor)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc4a90e5000
write(1, "[Errno 9] Bad file descriptor\n", 30) = 30
write(1, "Trying to reconnect:\n", 21)  = 21
socket(PF_FILE, SOCK_DGRAM, 0)          = 3
===================================================
Created the socket via a call to socket.socket().
===================================================
close(3)                                = 0
===================================================
No idea where this close() comes from, but it's
the same as in the logging case.
===================================================
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 10) = -1 EBADF (Bad file descriptor)
===================================================
connect() fails, just as in the logging case.
===================================================
write(1, "Oh look, reconnecting failed: [E"..., 60) = 60

So, while it seems to be a bug, it's not a logging bug. It seems to be
connected to the fact that os.close() closes the socket fd out from under the socket object, which somehow then causes a close() to be called...
msg189397 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-16 18:12
The line

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)

overwrites the old broken socket with a new one with the same fd.  The old socket's destructor closes the fd of the new socket.
msg189400 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2013-05-16 18:35
> The old socket's destructor closes the fd of the new socket.

Aha! Nice one. But what's the correct fix? I suppose a

self.sock = None

before every self.sock = socket.socket call would fix seem this, and while I can certainly make this change in SysLogHandler, isn't this a more general problem?
msg189403 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-16 19:17
Rather than

    self.sock = None

I would do

    self.sock.close()

which should work better for non-refcounted Pythons.

Of course it would be better to do this immediately after forking (i.e. before any more fds are created), otherwise you could still accidentally zap the fd of some other object.

If you can't do this immediately after forking then maybe it is better to move inherited potentially broken objects to a garbage list to prevent garbage collection.
msg189417 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-16 22:03
New changeset d91da96a55bf by Vinay Sajip in branch '2.7':
Issue #17981: Closed socket on error in SysLogHandler.
http://hg.python.org/cpython/rev/d91da96a55bf

New changeset 590b865aa73c by Vinay Sajip in branch '3.3':
Issue #17981: Closed socket on error in SysLogHandler.
http://hg.python.org/cpython/rev/590b865aa73c

New changeset f2809c7a7c3c by Vinay Sajip in branch 'default':
Closes #17981: Merged fix from 3.3.
http://hg.python.org/cpython/rev/f2809c7a7c3c
History
Date User Action Args
2013-05-16 22:03:12python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg189417

resolution: fixed
stage: resolved
2013-05-16 19:17:31sbtsetmessages: + msg189403
2013-05-16 18:35:49vinay.sajipsetmessages: + msg189400
2013-05-16 18:12:10sbtsetnosy: + sbt
messages: + msg189397
2013-05-16 17:14:57vinay.sajipsetmessages: + msg189389
2013-05-16 16:56:22vinay.sajipsetmessages: + msg189386
2013-05-16 13:20:03Julien.Palardsetstatus: closed -> open
resolution: not a bug -> (no value)
messages: + msg189376
2013-05-16 09:59:42vinay.sajipsetstatus: open -> closed

messages: + msg189346
2013-05-15 23:37:26joshpurvissetstatus: pending -> open

messages: + msg189326
2013-05-15 23:13:49vinay.sajipsetstatus: open -> pending
resolution: not a bug
messages: + msg189323
2013-05-15 21:41:02joshpurvissetnosy: + joshpurvis
2013-05-15 21:04:14terry.reedysetnosy: + vinay.sajip
2013-05-15 13:49:55Julien.Palardcreate