ntpath.abspath fails for long str paths #48321
Comments
On my system (Windows Server 2008 SP1 - 64-bit, Python 2.5.2 - 32-bit),
simple actions like:
>>> help(help) # Or any function
or
>>> import tempfile
>>> f = tempfile.mktemp()
result in this (rather confusing) error:
TypeError: _getfullpathname() argument 1 must be (buffer overflow), not str
Apparently, _getfullpathname() chokes on certain paths if they are not |
The (buffer overflow) message indicates that the argument is too long to |
Running help() or mktemp() causes _getfullpathname to be called with the whole system path (791 characters). If you pass that to _getfullpathname as str it throws the aforementioned TypeError. If it's passed as unicode, it returns an empty string. |
According to http://msdn.microsoft.com/en-us/library/aa364963.aspx,
:-(
We should allocate new buffer with this size and retry like already And one decision problem... What should we do when too long str is |
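The allocate-and-retry idea mentioned above can be sketched in Python with ctypes. This is only an illustration, not the patch from this issue: the function name `get_full_path_name` and the `_api` test hook are hypothetical, and the sketch assumes the documented GetFullPathNameW contract (0 on failure, the length written on success, or the required size including the terminator when the buffer is too small).

```python
import ctypes


def get_full_path_name(path, initial_size=260, _api=None):
    """Resolve `path` with a GetFullPathNameW-style call, growing the buffer as needed."""
    # `_api` is a hypothetical hook so the loop can be exercised without Windows.
    # By default the real function is looked up lazily, which only works on Windows.
    api = _api if _api is not None else ctypes.windll.kernel32.GetFullPathNameW
    size = initial_size
    while True:
        buf = ctypes.create_unicode_buffer(size)
        needed = api(path, size, buf, None)
        if needed == 0:
            raise OSError("GetFullPathNameW-style call failed for %r" % (path,))
        if needed < size:
            # Success: `needed` is the number of characters written,
            # not counting the terminating null.
            return buf.value
        # Buffer too small: `needed` is the required size including the
        # terminating null. Always grow so the loop cannot spin forever.
        size = max(needed, size + 1)
```

The first attempt uses a MAX_PATH-sized buffer (260 characters), so short paths cost a single call and long ones cost two.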
I am not sure to understand. Do you mean the whole PATH environment |
I don't have it offhand, but it was the whole PATH environment variable, complete with semicolons. That's probably the *real* bug. Whatever was passing that into abspath didn't seem to mind getting back an empty string (although that may have been further processed in the function, I didn't follow past the call to _getfullpathname).
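A value like this, a semicolon-separated list where a single directory is expected, can be spotted with a small check. The sketch below is only a diagnostic aid under stated assumptions: the helper names are hypothetical, 260 is the Windows MAX_PATH limit, and TMPDIR, TEMP and TMP are the variables that tempfile consults.

```python
import os


def looks_like_search_path(value, pathsep=";", max_len=260):
    """Return True if `value` looks like a PATH-style list rather than one directory."""
    return pathsep in value or len(value) >= max_len


def suspicious_temp_vars(environ=os.environ, names=("TMPDIR", "TEMP", "TMP"), pathsep=";"):
    """List the temp-directory variables whose values look like search paths."""
    return [
        name
        for name in names
        if looks_like_search_path(environ.get(name, ""), pathsep=pathsep)
    ]
```

Calling `suspicious_temp_vars()` with no arguments inspects the current process environment; passing a plain dict makes it easy to test.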
abspath should be able to be called with str or unicode of arbitrary lengths. Consumers of it shouldn't have to be concerned with the platform implementation when it can be smoothed over by the module. Whether this is done in abspath or _getfullpathname probably isn't too important, since end-users generally shouldn't be calling _getfullpathname, directly. |
Indeed. Do you happen to have the complete traceback of the failing |
The problem was that somehow, on our systems, the TEMP environment variable had been overwritten with the value of PATH. Most likely some batch file tried to store a copy of PATH, without realizing the significance of TEMP. [groan] Anyway, I still think that it's a bug that abspath() can't be called with a perfectly good str path, because of limitations of the Windows API. I edited the bug title to reflect the actual bug. The str path length could be checked and upgraded to the Unicode version, if necessary (or the call could be retried with the unicode version, in the case of an exception). I think it's important to ensure that when abspath() is called with str, it returns str, even if it was upgraded to the unicode call.
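The "retry with the unicode version and still return the caller's type" idea could look roughly like the following. This is a sketch in Python 3 terms, where bytes plays the role of the Python 2 str discussed here; `narrow_call` and `wide_call` are hypothetical stand-ins for the ANSI and wide variants of _getfullpathname, not real APIs.

```python
import os


def abspath_same_type(path, narrow_call, wide_call):
    """Resolve `path`, falling back to the wide call, and return the input's type."""
    if isinstance(path, str):
        # Text input goes straight to the wide (Unicode) implementation.
        return wide_call(path)
    try:
        return narrow_call(path)
    except TypeError:
        # The narrow call rejected the argument (for example, it was too long):
        # decode, retry with the wide call, then encode back so that a bytes
        # input still produces a bytes result.
        return os.fsencode(wide_call(os.fsdecode(path)))
```

Catching TypeError mirrors the error reported at the top of this issue; a real implementation would need to decide which failures are safe to retry.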
I think attached patch "fix_getfullpathname.patch" will fix unicode
After some investigation, GetFullPathNameA fails if output size is more
This is test for unicode issue.
/////////////////////////////////////////////////////////
import unittest
import ntpath
import os

class TestCase(unittest.TestCase):
    def test_getfullpathname(self):
        for count in xrange(1, 1000):
            name = u"x" * count
            path = ntpath._getfullpathname(name)
            self.assertEqual(os.path.basename(path), name)

if __name__ == '__main__':
    unittest.main()
And the error number returned by GetLastError() is bogus: sometimes 0, sometimes other values.
Or, if PyArg_ParseTuple overflowed or GetFullPathNameA failed, (not
This inverts the flow
if (unicode_file_names()) {
    /* unicode */
}
/* ascii */
# Maybe it would be nice if convert_to_unicode() functionality is built
Be careful, this is a quick hack, so it may be buggy. I confirmed test_os and
/////////////////////////////////////////////////////
import unittest
import ntpath
import os

class TestCase(unittest.TestCase):
    def test_getfullpathname(self):
        for c in ('x', u'x'):
            for count in xrange(1, 1000):
                name = c * count
                path = ntpath._getfullpathname(name)
                self.assertEqual(os.path.basename(path), name)

if __name__ == '__main__':
    unittest.main()
Fixed unicode issue in r67154(trunk). I'm not sure how to handle long |
ocean-city, can you please backport this to the 2.5 branch? |
As a follow-up: please don't forget to add Misc/NEWS entries (both for |
Sorry, I need a clarification. Which should I backport, r67154 or |
As it is apparently not clear what change exactly needs to be applied to I would appreciate if somebody could summarize what still needs to be |
GetFullPathNameW may return the required buffer size (non-zero value) But original poster hopes abspath() should return correct result for -- |
I doubt this issue exists on Python >= 3.2. See also bpo-1776160. |
I don't see that any further work can be done here owing to limitations of the Windows str API. Note that the same argument can be applied to bpo-1776160. |
Can we close this as I'm not aware of any possible way to fix this? See also bpo-1776160. |
Windows system and C runtime calls that take paths could be restricted to wide-character APIs, such as calling GetFullPathNameW in this case, or _wexecve instead of execve (bpo-23462). Then for bytes paths an extension can call PyUnicode_FSDecoder (PyUnicode_DecodeMBCS). In posixmodule.c this can be handled in the path_converter function. path_converter could be moved to Python/fileutils.c to make it available for use by other modules such as io.
I'm closing this as a resolved issue. Python 2 is approaching end of life, and I don't see a pressing need for my suggestion in msg237007. We should be using unicode for paths in Windows anyway since its file systems are natively Unicode. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.