This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: socket.getaddrinfo(host) doesn't ensure that host.encode() returns a byte string
Type: crash Stage:
Components: Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Rosuav, pkt, python-dev, vstinner
Priority: normal Keywords: patch

Created on 2015-07-22 07:02 by pkt, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
poc_getaddr.py pkt, 2015-07-22 07:02
dont_assert.patch Rosuav, 2015-09-11 09:30 review
dont_assert_with_test.patch Rosuav, 2015-09-11 09:39 review
Messages (8)
msg247094 - (view) Author: paul (pkt) Date: 2015-07-22 07:02
eck(idna));
# (gdb) 
# 
# Program received signal SIGABRT, Aborted.
# 0xb77a6d4c in __kernel_vsyscall ()
# 
# "host" argument can be set to a subclass of unicode with a custom "encode" 
# method. "encode" returns unexpected type. assert is not compiled in release
# mode, so this will lead to a type confusion later on.
msg247097 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-07-22 07:28
5513	        idna = _PyObject_CallMethodId(hobj, &PyId_encode, "s", "idna");
5514	        if (!idna)
5515	            return NULL;
5516	        assert(PyBytes_Check(idna));

The assertion fails because the custom string type in poc_getaddr.py returns an integer, not a byte string.

IMHO we should call PyUnicode_AsEncodedObject() instead of calling the encode() method.
msg247098 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-07-22 07:28
@paul: are you fuzzing Python?
msg247101 - (view) Author: paul (pkt) Date: 2015-07-22 08:56
@haypo:
I'd be happy to implement all my fuzzer ideas if my bugs were patched in a timely manner.

At this moment I have multiple bugs submitted over 2 months ago, which still aren't patched. Without patches, hackerone won't accept these issues, so my incentive to work on python is removed.
msg250456 - (view) Author: Chris Angelico (Rosuav) * Date: 2015-09-11 09:30
ISTM this is a case where Python's core shouldn't be using assert. It's possible for userland code to trigger an assertion failure, which means it should be a regular if(..) raise. Patch attached.

@haypo, what do you mean by "fuzzing"? Is there something I've missed here?
msg250457 - (view) Author: Chris Angelico (Rosuav) * Date: 2015-09-11 09:39
Oops, forgot to add a test. Using a variant of poc_getaddr.py to construct something which fails on current CPython tip, and passes with the patch.
msg250461 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-11 10:43
New changeset 2bff115e6ba0 by Victor Stinner in branch '3.4':
Issue #24684: socket.socket.getaddrinfo() now calls
https://hg.python.org/cpython/rev/2bff115e6ba0

New changeset 0c13674cf8b5 by Victor Stinner in branch '2.7':
Issue #24684: socket.socket.getaddrinfo() now calls
https://hg.python.org/cpython/rev/0c13674cf8b5
msg250462 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-11 10:45
Ok, I fixed the bug in Python 2.7, 3.4, 3.5 and 3.6. (Python 2.7 was also impacted for custom *unicode* strings.)

Thanks for your bug report paul!

> ISTM this is a case where Python's core shouldn't be using assert. It's possible for userland code to trigger an assertion failure, which means it should be a regular if(..) raise.

Right, this check is implemented in PyUnicode_AsEncodedString(). Moreover, PyUnicode_AsEncodedString() calls directly the codec, it doesn't call the encode() method of the input string.

(Sorry, I wrote PyUnicode_AsEncodedObject() which has a different purpose.)

> @haypo, what do you mean by "fuzzing"?

This: https://en.wikipedia.org/wiki/Fuzz_testing
History
Date User Action Args
2022-04-11 14:58:19adminsetgithub: 68872
2015-09-11 10:45:32vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg250462

versions: + Python 2.7
2015-09-11 10:43:09python-devsetnosy: + python-dev
messages: + msg250461
2015-09-11 09:39:08Rosuavsetfiles: + dont_assert_with_test.patch

messages: + msg250457
2015-09-11 09:30:14Rosuavsetfiles: + dont_assert.patch

nosy: + Rosuav
messages: + msg250456

keywords: + patch
2015-07-22 08:56:35pktsetmessages: + msg247101
2015-07-22 07:28:17vstinnersetmessages: + msg247098
2015-07-22 07:28:05vstinnersetnosy: + vstinner
title: Type confusion in socket module -> socket.getaddrinfo(host) doesn't ensure that host.encode() returns a byte string
messages: + msg247097

versions: + Python 3.4, Python 3.6
2015-07-22 07:02:52pktcreate