classification
Title: Fix for bug 30378 regressed SysLogHandler by making it resolve addresses at initialization instead of in `.emit()`
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: calcheng, iritkatriel, ngie, vinay.sajip
Priority: normal Keywords:

Created on 2019-10-02 21:32 by ngie, last changed 2021-10-18 23:26 by iritkatriel.

Messages (4)
msg353775 - (view) Author: Enji Cooper (ngie) * Date: 2019-10-02 21:32
The change made for bug 30378 caused a regression in our code by making
lookups for SysLogHandler addresses at init time, instead of making them more lazy. Example:

>>> import logging.handlers
>>> LOGGER = logging.getLogger("logger")
>>> LOGGER.addHandler(logging.handlers.SysLogHandler(address=('resolvessometimesbutnotthefirsttime.com', logging.handlers.SYSLOG_UDP_PORT)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/logging/handlers.py", line 767, in __init__
    ress = socket.getaddrinfo(host, port, 0, socktype)
socket.gaierror: [Errno -2] Name or service not known

This is exacerbated by the fact that a lot of code the organization I work for (and to be honest, I have written in the past) initializes global loggers once at import time. If the SysLogHandler is added to the global logger and the DNS resolution isn't possible (/etc/hosts is empty on Unix or does not contain the entry, and DNS out is to lunch), it will fall on its face at initialization time.

There needs to be a more graceful way of initializing loggers like this, or a way of delaying the host resolution, so it fails gracefully and doesn't crash the entire program (restoring the previous behavior).

Example: SMTPHandler doesn't do name resolution until it calls emit, which seems like a much more logical place to do the operation (especially since DNS records can change or become invalid).
msg353780 - (view) Author: Enji Cooper (ngie) * Date: 2019-10-02 22:15
Capturing more context here, based on internal discussion: other handlers are doing address resolution in `emit()` (HTTPHandler, SMTPHandler), which is expensive. In order for SysLogHandler to not regress behavior and not become expensive, performance-wise, it would probably be best to use `functools.lru_cache()`, using the address and a timeout as the key when resolving the addresses to avoid always doing address resolutions, e.g., DNS lookups: https://docs.python.org/3/library/functools.html#functools.lru_cache .
msg353907 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2019-10-04 08:40
This could perhaps be handled by just swallowing errors in the constructor (as is done for Unix sockets, for example) and trying to connect in emit() if there is no connection.

Not sure I want to go down the caching route - it complicates things for what is a rare use case, plus raises the question of allowing configurable timeouts for the cache, etc.

This seems a rare use case, so I have things that have a higher priority on my list of things to do, but I'll certainly review a patch to address this.
msg404247 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-10-18 23:26
Reproduced on 3.11.

Changing type because crash typically refers to segfault or hang in the interpreter rather than a valid exception.
History
Date User Action Args
2021-10-18 23:26:10iritkatrielsetversions: + Python 3.10, Python 3.11, - Python 3.7, Python 3.8
nosy: + iritkatriel

messages: + msg404247

type: crash -> behavior
2019-10-04 08:40:20vinay.sajipsetmessages: + msg353907
versions: - Python 2.7, Python 3.5, Python 3.6
2019-10-03 01:37:41xtreaksetnosy: + vinay.sajip
2019-10-02 22:15:33ngiesetmessages: + msg353780
2019-10-02 21:32:35ngiesettitle: Fix for bug 30378 regressed SysLogHandler -> Fix for bug 30378 regressed SysLogHandler by making it resolve addresses at initialization instead of in `.emit()`
2019-10-02 21:32:12ngiecreate