msg142583 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-20 23:47 |
We have our own quirky implementation of C realpath() in posixpath.py.
It would probably be simpler, safer and more efficient to simply call the POSIX function instead (but it most be exposed somewhere, by posixmodule.c I suppose).
|
msg142587 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-08-20 23:57 |
The Python implementation (os.path.realpath) was introduced 10 years ago by the issue #461781.
The Python implemtation has known bugs:
- #9949: os.path.realpath on Windows does not follow symbolic links
- #11397: os.path.realpath() may produce incorrect results
See also the issue #1298813: _Py_wrealpath() must use canonicalize_file_name() if available.
|
msg142591 - (view) |
Author: Ross Lagerwall (rosslagerwall) |
Date: 2011-08-21 05:04 |
An initial patch.
The first problem is that before, os.path.realpath('') == os.path.realpath('.')
but this is not true for realpath(3).
Thus, the test suite does not run because it relies on this behaviour.
|
msg142619 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2011-08-21 13:53 |
+ Return a string representing the canonicalized pathname of *path*.
This sounds a bit strange to me; “Return a string representing the canonical version of *path*”.
Would it be a good idea to use the full docstring of posixpath.realpath to os.realpath and then just do “realpath = os.realpath” in posixpath?
|
msg142909 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2011-08-24 20:11 |
> It would probably be simpler, safer and more efficient to simply call
> the POSIX function instead (but it most be exposed somewhere, by
> posixmodule.c I suppose).
Indeed.
> Would it be a good idea to use the full docstring of posixpath.realpath
> to os.realpath and then just do “realpath = os.realpath” in posixpath?
Yes. That would also be a way to add the '.' default argument.
> An initial patch.
I'm actually concerned because realpath doesn't seem as portable as I first thought. Here's what my man page says:
"""
OSIX.1-2001 says that the behavior if resolved_path is NULL is implementation-defined. POSIX.1-2008 specifies the behavior described in this page.
"""
So I'm not sure that all implementations support passing a NULL buffer as argument.
Also:
"""
Solaris may return a relative pathname when the path argument is relative.
"""
Well, I guess we can also commit this patch and see if a buildbot breaks...
|
msg142913 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-08-24 20:18 |
i12801.patch is not correct: on Windows, you should never encode a filename to bytes. Use PyUnicode_Decode() + _Py_wrealpath(), or a #ifdef (as many other posixmodule functions).
I prefer to reuse _Py_wrealpath because we will have to add special cases: realpath(NULL) and also maybe canonicalize_file_name().
|
msg142925 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-08-24 21:22 |
Patch to get #ifdef REALPATH_SUPPORT_NULL:
diff --git a/configure.in b/configure.in
--- a/configure.in
+++ b/configure.in
@@ -1539,6 +1539,17 @@ if test "$have_pthread_t" = yes ; then
#endif
])
fi
+
+AC_MSG_CHECKING(for realpath with NULL)
+realpath_support_null=no
+AC_COMPILE_IFELSE([
+ AC_LANG_PROGRAM([[#include <stdlib.h>]], [[char *p = realpath(__FILE__, NULL);]])
+],[realpath_support_null=yes],[])
+AC_MSG_RESULT($realpath_support_null)
+if test "$have_pthread_t" = yes ; then
+ AC_DEFINE(REALPATH_SUPPORT_NULL, 1,
+ [Define if the realpath() function supports NULL as the second argument])
+fi
CC="$ac_save_cc"
AC_SUBST(OTHER_LIBTOOL_OPT)
|
msg142929 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-08-24 21:28 |
> I prefer to reuse _Py_wrealpath because we will have
> to add special cases: realpath(NULL) and also maybe
> canonicalize_file_name().
I read that canonicalize_file_name(name) just calls realpath(name, NULL), so we can maybe avoid this GNU exception.
I realized that _Py_wrealpath() returns wchar_t*, so it's maybe better to take the first approach (#ifdef MS_WINDOWS).
|
msg142932 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2011-08-24 21:44 |
> Patch to get #ifdef REALPATH_SUPPORT_NULL:
I'm not really familiar with autotools, but I have the impression that
this will only check that the given code snippet compiles (and it
will), and we want to check that a NULL buffer is supported at
runtime.
Am I missing something?
|
msg142943 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-08-24 22:59 |
Le mercredi 24 août 2011 23:45:00, vous avez écrit :
> Charles-François Natali <neologix@free.fr> added the comment:
> > Patch to get #ifdef REALPATH_SUPPORT_NULL:
> I'm not really familiar with autotools, but I have the impression that
> this will only check that the given code snippet compiles (and it
> will), and we want to check that a NULL buffer is supported at
> runtime.
> Am I missing something?
Oh, you are right. I don't know autotools, I'm trying to learn.
|
msg143152 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-29 16:07 |
Well, if we use two different paths based on the libc version, it might not be a good idea, since behaviour can be different in some cases.
It would be nice to know if some modern platforms have a non-compliant realpath().
|
msg143163 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2011-08-29 17:22 |
> Well, if we use two different paths based on the libc version, it might not be a good idea, since behaviour can be different in some cases.
Indeed.
> It would be nice to know if some modern platforms have a non-compliant realpath().
Alas, it doesn't seem to hold for OpenBSD:
http://old.nabble.com/Make-realpath(3)-conform-to-SUSv4-td32031895.html
A patch supporting NULL was committed two months ago, which means we
probably can't push this forward.
I've been quite disappointed by POSIX lately...
|
msg143164 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-29 17:24 |
> Alas, it doesn't seem to hold for OpenBSD:
> http://old.nabble.com/Make-realpath(3)-conform-to-SUSv4-td32031895.html
>
> A patch supporting NULL was committed two months ago, which means we
> probably can't push this forward.
>
> I've been quite disappointed by POSIX lately...
:-)
|
msg143165 - (view) |
Author: Ross Lagerwall (rosslagerwall) |
Date: 2011-08-29 17:25 |
> I've been quite disappointed by POSIX lately...
POSIX the standard, or the implementers??
|
msg143167 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2011-08-29 17:55 |
> POSIX the standard, or the implementers??
>
Both :-)
For those wondering why we can't use PATH_MAX (ignoring the buffer
overallocation), here's why:
https://www.securecoding.cert.org/confluence/display/cplusplus/FIO02-CPP.+Canonicalize+path+names+originating+from+untrusted+sources
"""
Avoid using this function. It is broken by design since (unless using
the non-standard resolved_path == NULL feature) it is impossible to
determine a suitable size for the output buffer, resolved_path.
According to POSIX a buffer of size PATH_MAX suffices, but PATH_MAX
need not be a defined constant, and may have to be obtained using
pathconf(3). And asking pathconf(3) does not really help, since on the
one hand POSIX warns that the result of pathconf(3) may be huge and
unsuitable for mallocing memory. And on the other hand pathconf(3) may
return -1 to signify that PATH_MAX is not bounded.
The libc4 and libc5 implementation contains a buffer overflow (fixed
in libc-5.4.13). As a result, set-user-ID programs like mount(8) need
a private version."""
|
msg146640 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2011-10-30 14:57 |
Another problem with the POSIX realpath(3): it can fail with ENOENT if the target path doesn't exist, whereas the current Python implementation "succeeds" even if a component in the path doesn't exist.
Note the quotes around "succeeds", because I'm not sure whether this behavior is a good thing: the result is just plain wrong. I wonder whether we should break backwards compatibility on this particular point...
|
msg153916 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2012-02-21 23:51 |
realpath.patch:
- Add os.realpath()
- posixpath.realpath() uses os.realpath(), or os.getcwd() if the path is an empty string
- os.realpath() uses canonicalize_file_name() if available, or use realpath() with a buffer of MAXPATHLEN bytes
Don't use realpath(path, NULL) because it is not possible to test if realpath() supports NULL: OpenBSD did segfault if the buffer was NULL. Testing the support in configure is not a good idea because it depends on the version of the C library. You may use a more recent C library to compile Python for example.
genericpath, macpath and os2emxpath are not patched to use os.realpath(), even if it is available. I don't know if these modules should be patched. ntpath uses the native Windows function (if available): GetFullPathNameW().
|
msg154247 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2012-02-25 13:08 |
> - os.realpath() uses canonicalize_file_name() if available, or use realpath() with a buffer of MAXPATHLEN bytes
MAXPATHLEN is not necessarily defined (e.g. on the Hurd): if it's not
defined, it is set either to MAX_PATH (if it's defined an greater than
1024), or arbitrarily to 1024.
Thus, we can't pass it to realpath(3), since we don't know for sure
that realpath(3) won't return a string greater than MAXPATHLEN, which
could result in a buffer overflow.
Also, there's the problem that realpath(3) returns ENOENT, whereas
os.path.realpath() doesn't:
without patch:
$ python -c "import os; os.path.realpath('/nosuchfile')"
with patch:
$ ./python -c "import os; os.path.realpath('/nosuchfile')"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/cf/python/cpython/Lib/posixpath.py", line 385, in realpath
return os.realpath(filename)
FileNotFoundError: [Errno 2] No such file or directory: '/nosuchfile'
|
msg223878 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2014-07-24 19:42 |
Shall we close this, since realpath(3) is fundamentally broken, and pathlib now does The Right Thing?
|
msg223883 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2014-07-24 19:53 |
I agree, Python should not use the C function realpath() for all the reasons give in the issue.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:20 | admin | set | github: 57010 |
2014-07-24 19:53:55 | vstinner | set | status: open -> closed resolution: not a bug messages:
+ msg223883
|
2014-07-24 19:42:57 | neologix | set | messages:
+ msg223878 |
2012-02-26 08:07:21 | Arfrever | set | nosy:
+ Arfrever
|
2012-02-25 13:08:13 | neologix | set | messages:
+ msg154247 |
2012-02-21 23:51:28 | vstinner | set | files:
+ realpath.path
messages:
+ msg153916 |
2011-10-30 14:57:10 | neologix | set | messages:
+ msg146640 |
2011-08-29 17:55:44 | neologix | set | messages:
+ msg143167 |
2011-08-29 17:25:21 | rosslagerwall | set | messages:
+ msg143165 |
2011-08-29 17:24:01 | pitrou | set | messages:
+ msg143164 |
2011-08-29 17:22:18 | neologix | set | messages:
+ msg143163 |
2011-08-29 16:07:12 | pitrou | set | messages:
+ msg143152 |
2011-08-29 15:41:40 | matejcik | set | nosy:
+ matejcik
|
2011-08-24 22:59:21 | vstinner | set | messages:
+ msg142943 |
2011-08-24 21:45:00 | neologix | set | messages:
+ msg142932 |
2011-08-24 21:31:04 | santoso.wijaya | set | nosy:
+ santoso.wijaya
|
2011-08-24 21:28:33 | vstinner | set | messages:
+ msg142929 |
2011-08-24 21:22:15 | vstinner | set | messages:
+ msg142925 |
2011-08-24 20:18:44 | vstinner | set | messages:
+ msg142913 |
2011-08-24 20:11:58 | neologix | set | messages:
+ msg142909 |
2011-08-21 13:53:39 | eric.araujo | set | nosy:
+ eric.araujo messages:
+ msg142619
|
2011-08-21 05:04:04 | rosslagerwall | set | files:
+ i12801.patch keywords:
+ patch messages:
+ msg142591
|
2011-08-20 23:57:43 | vstinner | set | messages:
+ msg142587 |
2011-08-20 23:47:30 | pitrou | create | |