New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
socket.inet_aton parsing issue on some libc versions #81676
Comments
The socket.inet_aton Python function uses libc's inet_aton function. This, on some implementations, for example the latest GNU C Library (glibc 2.29 as of today), accepts whitespace with trailing characters after a valid IP. An example can be seen below:
The problem is, some users might use this function to also validate whether a given host or ip string is valid, especially taking into account Python's "ask for forgiveness and not permission" motto (or a coding style?). This will introduce security vulnerabilities, like command injections, if it is used with insecure os.system or subprocess.* functions. For example, an ip of "1.1.1.1 ; whoami" will result in returning an output of both ping and whoami commands.
An initial report sent with this issue contains an example and potential security issue in ssl builtin module which has been already fixed (https://bugs.python.org/issue37463). It is also worth mentioning that the inet_aton documentation - https://docs.python.org/3/library/socket.html#socket.inet_aton - states that the validation depends on the underlying implementation:
Which is nice but:
Ideally, the problem should be fixed on the libc side, but I don't know if it is going to be. The first comment from a similar issue on glibc bugtracker - https://sourceware.org/bugzilla/show_bug.cgi?id=20018 - states:
However I am not sure if it won't be fixed or if it was simply forgotten during fixing the linked issue as it probably wasn't reported as a separate one. Because of that I think Python should add an additional validation of inet_aton input: raise a validation error when the input contains a whitespace. In case this solution would be rejected because of a need/want to be consistent with C's inet_aton, we should add a security warning that using this function to validate IP addresses is dangerous on some libc version, along with a description/example what exactly is wrong with it. --- man 3 inet, the relevant description part below ---
--- |
From : https://docs.python.org/3.9/library/subprocess.html#security-considerations Security Considerations Unlike some other popen functions, this implementation will never implicitly call a system shell. This means that all characters, including shell metacharacters, can safely be passed to child processes. If the shell is invoked explicitly, via shell=True, it is the application’s responsibility to ensure that all whitespace and metacharacters are quoted appropriately to avoid shell injection vulnerabilities. When using shell=True, the shlex.quote() function can be used to properly escape whitespace and shell metacharacters in strings that are going to be used to construct shell commands. |
indeed, can confirm that a string starting with valid ip address and a space followed with anything, will not trigger the exception. do_ping('1.1.1.1 1.256.300.1 ; whoami') => no exception |
January 2019, Florian Weimer created the issue "Deprecate inet_addr, inet_aton" in glibc: inet_aton() ignores extra string "for historic reasons". More info at: "For historic reasons, inet_addr and inet_aton accept trailing garbage. Some parsers rely on this (for example, libresolv when it parses “nameserver” directives in /etc/resolv.conf)." -- To clarify the complex situation: glibc getaddrinfo() has been fixed in glibc to ignore extra string, but not inet_aton(). getaddrinfo() fix: |
I take the freedom of assigning CVE-2016-10739 to this Python issue, even if CVE-2016-10739 was reported to the glibc (not to Python). "In the GNU C Library (aka glibc or libc6) through 2.28, the getaddrinfo function would successfully parse a string that contained an IPv4 address followed by whitespace and arbitrary characters, which could lead applications to incorrectly assume that it had parsed a valid string, without the possibility of embedded HTTP headers or other potentially dangerous substrings." |
CVE-2016-10739 was filed against glibc. We cannot re-use a CVE number from another product in CPython. You can only reference that a CVE causes a security bug. |
One solution would to be reimplement socket.inet_aton() with inet_pton() internally. inet_pton() is well specified and standard (POSIX). inet_aton() is not ("inet_aton() is not specified in POSIX.1, but is available on most systems." says its Linux manual page). >>> socket.inet_pton(socket.AF_INET, "1.2.3.4")
b'\x01\x02\x03\x04'
>>> socket.inet_pton(socket.AF_INET, "1.2.3.4 extra string")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: illegal IP address string passed to inet_pton Problems:
>>> socket.inet_aton("127")
b'\x00\x00\x00\x7f'
>>> socket.inet_pton(socket.AF_INET, "127")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: illegal IP address string passed to inet_pton |
No, that is not necessary. inet_aton() only accepts IPv4 addresses. Therefore an emulation with inet_pton() would look like this: >>> import functools, socket
>>> inet_aton = functools.partial(socket.inet_pton, socket.AF_INET)
>>> inet_aton('127.0.0.1')
b'\x7f\x00\x00\x01'
Yes, that might be a problem. |
Attached inet_aton_pton.py is a proof-of-concept which reimplements inet_aton() using inet_pton(). It supports a.b.c.d, a.b.c, a.b and a formats. I'm not sure which exception should be raised in case of parsing error. socket.inet_aton() raises OSError"illegal IP address string passed to inet_aton") using C code:
|
Its a while since this has been reported. I think inet_aton_pton.py is fine, though, a commit adding it should explain why we do it this way. @vstinner can you push this patch further or do you want me to do it? |
Regarding the exception case: seems like OSError, since that's what originally was done and we don't want to break users of this code. |
Ping @vstinner @tiran @websurfer5 @aldwinaldwin @ambv can we finally get some love here and prioritize this so it is fixed? This issue can cause security vulnerabilities if this function is used in a security relevant context and is assumed to raise an error on invalid IP strings; it should probably get a CVE too... |
So far, nobody proposed a fix, and it seems like nobody is available to work on a fix. Do you want to work on a fix? |
I suppose that the bare minimum that we can do is documentation: document this surprising behavior, and maybe give advices on how to avoid it. In the glibc, only getaddrinfo() was fixed. inet_aton() was not changed, likely for backward compatibility reasons. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: