This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: DatagramHandler doing DNS lookup on every log message
Type: performance Stage:
Components: Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: bmerry, eric.smith
Priority: normal Keywords:

Created on 2022-03-29 08:47 by bmerry, last changed 2022-04-11 14:59 by admin.

Messages (9)
msg416247 - (view) Author: Bruce Merry (bmerry) * Date: 2022-03-29 08:47
logging.DatagramHandler uses socket.sendto to send the messages. If the given address is a hostname rather than an IP address, it will do a DNS lookup every time.

I suspect that fixing issue 14855 will also fix this, since fixing that issue requires resolving the hostname to determine whether it is an IPv4 or IPv6 address to create a suitable socket.

I've run into this on 3.8, but tagging 3.10 since the code still looks the same.
msg416270 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-03-29 16:01
If you don’t look it up every time, how do you deal with DNS timeouts?
msg416279 - (view) Author: Bruce Merry (bmerry) * Date: 2022-03-29 17:49
> If you don’t look it up every time, how do you deal with DNS timeouts?

Do you mean expiring the IP address when the TTL is reached? I suppose that could be an issue for a long-running service, and I don't have a good answer to that. Possibly these days with services meshes and load-balancers it is less of a concern since a logging server can move without changing its IP address.

But it's important for a logging system not to block the service doing the logging (which is one reason for using UDP in the first place). I only discovered this issue because of some flaky DNS servers that would occasionally take several seconds to answer a query, and block the whole asyncio event loop while it waited.

At a minimum it would be useful to document it, so that you know it's something to be concerned about when using DatagramHandler.
msg416286 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-03-29 19:27
> Do you mean expiring the IP address when the TTL is reached? 

Yes, that's what I mean. Isn't the resolver library smart enough to cache lookups and handle the TTL timeout by itself?
msg416287 - (view) Author: Bruce Merry (bmerry) * Date: 2022-03-29 19:44
> Yes, that's what I mean. Isn't the resolver library smart enough to cache lookups and handle the TTL timeout by itself?

Apparently not in this case - with tcpdump I can see the DNS requests being fired off several times a second. I'll need to check what the TTL actually is though.
msg416288 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-03-29 19:46
Hmm. I'm not sure we should try to work around a bad resolver issue. What's your platform, and how did you install Python?
msg416289 - (view) Author: Bruce Merry (bmerry) * Date: 2022-03-29 20:01
> Hmm. I'm not sure we should try to work around a bad resolver issue. What's your platform, and how did you install Python?

Fair point. It's Ubuntu 20.04, running inside Docker, with the default Python (3.8). I've also reproduced it outside Docker (again Ubuntu 20.04 with system Python). The TTL is 30s, so I'm not sure why systemd-resolved isn't caching it for messages logged several times a second.

Even if the system has a local cache though, it's not ideal to have logging block when the TTL expires, particularly in an event-driven (asyncio) service. Updating the address in a background thread while continuing to log to the old address might be better. But my use case is particularly real-time (even 10ms of latency is problematic), and maybe that shouldn't drive the default behaviour.

I blame the lack of standard POSIX functions for doing DNS lookups asynchronously and in a way that provides TTL information to the client.
msg416290 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-03-29 20:06
> I blame the lack of standard POSIX functions for doing DNS lookups asynchronously and in a way that provides TTL information to the client.

I totally agree with that!

And I agree it would be nice to have some way of doing non-blocking lookups when the TTL expires (or whenever it decides it needs to do a lookup). But it's going to be non-trivial, I fear.
msg416327 - (view) Author: Bruce Merry (bmerry) * Date: 2022-03-30 07:10
> But it's going to be non-trivial, I fear.

Yeah. Maybe some documentation is achievable in the short term though, so that users who care more about latency than changing DNS are aware that they should do the lookup themselves?
History
Date User Action Args
2022-04-11 14:59:57adminsetgithub: 91305
2022-03-30 07:10:36bmerrysetmessages: + msg416327
2022-03-29 20:06:38eric.smithsetmessages: + msg416290
2022-03-29 20:01:18bmerrysetmessages: + msg416289
2022-03-29 19:46:12eric.smithsetmessages: + msg416288
2022-03-29 19:44:11bmerrysetmessages: + msg416287
2022-03-29 19:27:57eric.smithsetmessages: + msg416286
2022-03-29 17:49:35bmerrysetmessages: + msg416279
2022-03-29 16:01:32eric.smithsetnosy: + eric.smith
messages: + msg416270
2022-03-29 08:47:39bmerrysettype: performance
2022-03-29 08:47:19bmerrycreate