Author serhiy.storchaka
Recipients serhiy.storchaka
Date 2016-05-11.18:54:43
Message-id <1462992884.19.0.13993979036.issue27002@psf.upfronthosting.co.za>
In-reply-to
Content
Currently posixpath.realpath() does not raise an exception when it encounters a broken link. Instead, it leaves the broken link's name and the following path components unresolved. This is dangerous because the broken link's name can be collapsed with a following "..", and the resulting valid path can point to the wrong location. This may even be a security issue.
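
To illustrate the kind of hazard meant here, the sketch below shows how a purely textual collapse of "name/.." can land somewhere other than where the filesystem would. It uses a valid link and normpath() rather than the broken-link case itself, and the directory names and the helper `lexical_vs_resolved` are made up for the example. It assumes a POSIX system where symlinks can be created.

```python
import os
import posixpath
import tempfile


def lexical_vs_resolved(root):
    """Return (textually collapsed path, symlink-aware path) for link/../target."""
    # Layout: root/real/sub is a directory, root/link -> real/sub
    os.makedirs(os.path.join(root, "real", "sub"))
    os.symlink(os.path.join("real", "sub"), os.path.join(root, "link"))

    path = os.path.join(root, "link", "..", "target")
    # normpath() never looks at the filesystem: "link/.." simply disappears.
    lexical = posixpath.normpath(path)
    # realpath() follows the link first, so ".." is the parent of real/sub.
    resolved = os.path.realpath(path)
    return lexical, resolved


# realpath() on the temp dir itself, in case the temp location is a symlink.
root = os.path.realpath(tempfile.mkdtemp())
lexical, resolved = lexical_vs_resolved(root)
print(lexical)
print(resolved)
```

The two printed paths differ: the first ends in "target" directly under the temporary directory, the second in "real/target". If a link's name is left unresolved and then collapsed with "..", the result is the first kind of path.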

On the other hand, Path.resolve() raises an exception when it encounters a broken link. This is not always desirable, and there is a wish to make it more lenient. See issue19717 for more information.

The readlink utility from GNU coreutils has three modes for resolving a file path:

       -f, --canonicalize
              canonicalize by following every symlink in every component of the given name recursively; all but the last component must exist

       -e, --canonicalize-existing
              canonicalize by following every symlink in every component of the given name recursively, all components must exist

       -m, --canonicalize-missing
              canonicalize by following every symlink in every component of the given name recursively, without requirements on components existence

The current behavior of posixpath.realpath() matches `readlink -m` (apart from one minor detail). The behavior of Path.resolve() matches `readlink -e`.

The proposed preliminary patch implements support for all three modes in posixpath.realpath(): CAN_MISSING, CAN_ALL_BUT_LAST and CAN_EXISTING. It exactly matches the behavior of readlink. The default mode is CAN_MISSING.

There is a minor behavior difference in the default mode. If there is a file "file", a link "link" that points to "file", and a broken link "broken", then "broken/../link" was previously resolved to "link" and is now resolved to "file".
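
For illustration only, here is a rough sketch of how three such modes could behave when a path is resolved one component at a time. This is not the attached patch. The mode names come from this message, but their values, the function `canonicalize`, the `max_links` limit and the test layout are assumptions. The sketch does not check that intermediate components are directories, and it treats any lstat() failure as "missing".

```python
import errno
import os
import stat
import tempfile

# Mode names are taken from the message; the values are placeholders.
CAN_MISSING, CAN_ALL_BUT_LAST, CAN_EXISTING = range(3)


def _parts(path):
    """Split on "/" and drop empty and "." components."""
    return [p for p in path.split("/") if p and p != "."]


def canonicalize(path, mode=CAN_MISSING, max_links=40):
    """Hypothetical helper: resolve symlinks one component at a time."""
    if not path.startswith("/"):
        path = os.path.join(os.getcwd(), path)
    pending = _parts(path)[::-1]      # pop() returns the next component
    resolved = "/"
    links_followed = 0
    while pending:
        name = pending.pop()
        if name == "..":
            # Links in `resolved` were already followed, so step to its parent.
            resolved = os.path.dirname(resolved)
            continue
        candidate = os.path.join(resolved, name)
        try:
            st = os.lstat(candidate)
        except OSError:
            is_last = not pending
            if mode == CAN_EXISTING or (mode == CAN_ALL_BUT_LAST and not is_last):
                raise
            resolved = candidate      # keep the missing component as written
            continue
        if not stat.S_ISLNK(st.st_mode):
            resolved = candidate
            continue
        links_followed += 1
        if links_followed > max_links:
            raise OSError(errno.ELOOP, os.strerror(errno.ELOOP), path)
        target = os.readlink(candidate)
        if target.startswith("/"):
            resolved = "/"            # absolute target: restart from the root
        # Queue the target's components ahead of the remaining ones.
        pending.extend(_parts(target)[::-1])
    return resolved


# The scenario from the message: "file", "link" -> "file", and a broken link.
root = os.path.realpath(tempfile.mkdtemp())
open(os.path.join(root, "file"), "w").close()
os.symlink("file", os.path.join(root, "link"))
os.symlink("no-such-target", os.path.join(root, "broken"))
print(canonicalize(os.path.join(root, "broken", "..", "link")))
```

In the default mode of this sketch, "broken/../link" resolves to the path of "file", because the broken link is followed before ".." is applied. With CAN_ALL_BUT_LAST or CAN_EXISTING the same call raises OSError, since the link's missing target is not the last component.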

The patch lacks documentation. A ternary flag does not look like the best API; a binary flag would be better, but I don't know which mode can be dropped. CAN_MISSING is needed for compatibility, but it looks less useful and may be insecure (though no more so than normpath()). CAN_EXISTING and CAN_ALL_BUT_LAST are each needed in different cases. I think that in many cases CAN_ALL_BUT_LAST is actually what is needed instead of CAN_MISSING.

Once this issue is resolved, the solution will be adopted for Path.resolve().