New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
io.open() doesn't check for embedded NUL characters #58056
Comments
>>> open("\x00abc")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '' Contrast with 2.x open(): >>> open("\x00abc")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: file() argument 1 must be encoded string without NULL bytes, not str |
I took a deep dive into parts of CPython that were unknown to me :) and dug up the following: Methods like os.stat or even os.open convert the file name using "et" in PyArg_ParseTuple[AndKeywords]. OTOH, open() and io.open() just hand through the object as "O" to the respective low-level io module. The result in 2.7 is that file() tries to convert it for it's own usage eventually – which fails as seen. While a more explicit error message wouldn't hurt, this seems safe to me insofar. In 3.3, file() aka Modules/_io/fileio.c , io_open does no such thing because it seems to handle fds as "nameobj" as well and does a wide range of checks on the argument. After io_open is certain that nameobj is a file name, it uses PyObject_AsCharBuffer()on bytes and PyUnicode_FromObject() + encoding magic on unicode to get an encoded string as a file name. Neither does a check for NUL bytes so the (w)open(er) that follows reacts as demonstrated by Antoine. I presume fixing/breaking PyObject_AsCharBuffer()/PyUnicode_FromObject() is out of question so the most obvious part to fix would be the conversion block inside io_open. Should I have a stab at it or do you disagree with my approach? |
Yes, fixing the conversion block is probably the right approach. >>> open(b"\x00")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: ''
) |
JFTR, file()'s C equivalent is fileio_init and not io_open, I lost track of all the opens. ;) |
So I have good news and bad news. The good is: I fixed it for non-Win platforms and the patch is truly beautiful: Lib/test/test_builtin.py | 6 ++++++ ;) Two problems:
It's just about the case when it's called with a Unicode path name. The current code looks like as following: if (PyUnicode_Check(nameobj)) {
widename = PyUnicode_AsUnicode(nameobj);
if (widename == NULL)
return -1;
} We can't use the nifty PyUnicode_FSConverter because we want to keep Unicode. So I assume the way to go would be the C equivalent of somthing like: if '\0' in widename:
raise TypeError() Right? I hope someone would be so kind to implement it, otherwise the patch attached passes all test on Linux and Mac (except for test_recursion_limit but that fails currently for me on the Mac even without my patch). |
Indeed, there seems to be no mechanism available to forbid NUL chars under Windows (for Python 3.x): >>> open("LICENSE\x00foobar")
<_io.TextIOWrapper name='LICENSE\x00foobar' mode='r' encoding='cp1252'>
>>> os.stat("LICENSE\x00foobar")
nt.stat_result(st_mode=33206, st_ino=2251799813779714, st_dev=0, st_nlink=1, st_uid=0, st_gid=0, st_size=15132, st_atime=8589934592, st_mtime=8589934592, st_ctime=1301169903) I think PyUnicode_AsUnicode should grow a NUL char check in Python 3.3, since it doesn't return the size anyway. I don't think we can do that in previous versions, though, so we need an alternate strategy. Scanning the unicode string for NUL characters is enough. That should be easy by using PyUnicode_AsUnicodeAndSize. As for the patch:
|
Since the NUL-scanning will be useful for Modules/posixmodule.c as well, perhaps it should be done as a private _PyUnicode_HasNULChars() function. |
See also bpo-13849 |
I have fixed the refleak, added a _PyUnicode_HasNULChars and integrated it into the Win32-unicode-if-branch. Couldn't test it due to lack of win32 – the function itself is tested though. |
With Georg's kind help I added some improvements:
In related news, I'm also adding tests for fileio since the last patch. |
The patch works under Windows here (on branch default). |
New changeset 572bb8c265c0 by Antoine Pitrou in branch '3.2': New changeset 6bb05ce1cd1f by Antoine Pitrou in branch 'default': |
I've made small changes and committed the patch in 3.2 and 3.3. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: