New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows: path_converter() leaks memory for Unicode filenames #72387
Comments
Memory leak spotted by the issue bpo-28195: path_converter() calls PyUnicode_AsWideCharString() which allocates a new buffer at each call, but this buffer is never released. On Python 3.5, PyUnicode_AsWideCharString() was used which handles internally the memory buffer and so release the memory later. Attached patch fixes the regression introduced in Python 3.6 beta 1 by the change e20c7d8a8187 ("Issue bpo-27781: Change file system encoding on Windows to UTF-8 (PEP-529)"). |
I didn't push the fix myself, because Steve Dower maybe made the change for a specific reason? |
It's not clear to me that Py_UNICODE is guaranteed to be wchar_t for all time, that's all. If it is, go ahead. Otherwise the path_t object has the ability to clean up after itself, so perhaps it should be used here? |
Deprecated functions and types of the C API The :c:type:`Py_UNICODE` has been deprecated by :pep:`393` and will be Unicode functions and methods using :c:type:`Py_UNICODE` and
(From Doc/whatsnew/3.3.rst) |
Steve Dower added the comment:
Right now, it's the case and path_converter() leaks memory, so I On Windows, it makes sense to continue to use the UTF-16 encoded
Maybe, but I'm not interested to write a more complex patch for The advantage of the old code (and my patch) is to only convert a |
Serhiy Storchaka added the comment:
Right... I tried to deprecate and remove all functions using
Sadly, it's not exactly the same: PyUnicode_AsWideCharString returns a Victor |
path_converter.patch LGTM for 3.6 (and 3.7, 3.8), but we should find better solution for future versions. Could you please add a comment that PyUnicode_AsUnicodeAndSize is deprecated, but used because it is the simplest and the most efficient way for now? |
New changeset 6232e610e310 by Victor Stinner in branch '3.6': |
Ok, I pushed my simple fix. Feel free to modify the code if you see a better way to encode paths on Windows ;-) But it's a much larger project, and I'm not really interested to jump again in this silly deprecated APIs :-) It seems like the status quo is that Py_UNICODE is going to stay longer than expected and since it "just works", nobody really cares :-) I close this issue since the initial bug (memory leak) is now fixed. |
Misc/NEWS
so that it is managed by towncrier #552Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: