Title: Add function to get common path prefix
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.3
Status: closed Resolution: duplicate
Dependencies: Superseder: new os.path function to extract common prefix based on path components
Assigned To: Nosy List: cmcqueen1975, eric.araujo, laxrulz777, loewis, martin.panter, ncoghlan, serhiy.storchaka, skip.montanaro, techtonik
Priority: normal Keywords: needs review, patch

Created on 2008-12-27 04:00 by skip.montanaro, last changed 2022-04-11 14:56 by admin. This issue is now closed.

msg78338 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2008-12-27 04:00
os.path.commonprefix returns the common prefix of a list of paths taken character-by-character.  This can 
return invalid paths.  For example, os.path.commonprefix(["/export/home/dave", "/etc/passwd"]) will return "/e", which likely has no meaning as a path, at least in the context of the input list.
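
A quick way to reproduce the behavior described above. This is only an illustration; posixpath is used instead of os.path so that the result does not depend on the host platform.

```python
import posixpath

paths = ["/export/home/dave", "/etc/passwd"]

# commonprefix compares character by character, so the result can stop
# in the middle of a path component.
prefix = posixpath.commonprefix(paths)
print(prefix)  # /e
```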

Ideally, os.path.commonprefix would operate component-by-component, but people rely on the existing 
character-by-character operation, so it has so far been impossible to change the semantics.  There are several 
possible ways to solve this problem.  One, change how commonprefix behaves.  Two, add a flag to 
commonprefix to allow it to operate component-by-component if desired.  Three, add a new function to 

I personally prefer the first option.  Aside from the semantic change though, it presents the problem of 
where to put the old definition of commonprefix.  It's clearly of some use or people wouldn't have co-
opted it for non-filesystem use.  It could go in the string module, but that's been living a life in limbo 
since the creation of string methods.  People have been loath to add new functionality there.  The second 
option seems to me like it would just be a hack on top of already broken behavior and would probably require the 
currently slightly broken behavior as the default to boot, so I won't go there.  Since option one is 
perhaps not going to be available to me, I've implemented the third option as a new function, 
commonpathprefix.  See the attached patch.  It includes test cases and documentation changes.
msg78339 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-12-27 04:24
A new function sounds like a good solution to me. How about just calling
it "os.path.commonpath" though?

I agree having a path component based prefix function in os.path is
highly desirable, particularly since the addition of relpath in 2.6:

base_dir = os.path.commonpath(paths)
rel_paths = [os.path.relpath(p, base_dir) for p in paths]
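
For reference, a runnable version of that idea. It assumes Python 3.5 or later, where a function named commonpath exists in os.path; the sample paths are made up for illustration, and posixpath is used so the result is the same on any platform.

```python
import posixpath

# Made-up sample paths, for illustration only.
paths = [
    "/export/home/dave/notes.txt",
    "/export/home/alice/todo.txt",
]

# Longest common sub-path, computed component by component.
base_dir = posixpath.commonpath(paths)

# Each path expressed relative to that common directory.
rel_paths = [posixpath.relpath(p, base_dir) for p in paths]

print(base_dir)   # /export/home
print(rel_paths)  # ['dave/notes.txt', 'alice/todo.txt']
```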
msg78529 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-12-30 13:24
The documentation should explain what a "common path prefix" is. It
can't be the path to a common parent directory, since the new function
doesn't allow mixing absolute and relative directories. As Phillip Eby
points out, it also doesn't account for case-insensitivity that some
file systems or operating systems implement, nor does it take into
account short file names on Windows.
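
To illustrate two of these points, here is how the os.path.commonpath that eventually shipped in Python 3.5 behaves. This is not the patch attached to this issue, whose behavior may differ; the sample paths are made up.

```python
import posixpath

# Mixing absolute and relative paths is rejected with ValueError.
try:
    posixpath.commonpath(["/export/home/dave", "home/dave"])
except ValueError as exc:
    mixed_error = exc
else:
    mixed_error = None

print(type(mixed_error).__name__)  # ValueError

# posixpath compares components case-sensitively, so only the root
# is shared here.
case_result = posixpath.commonpath(["/Export/home", "/export/home"])
print(case_result)  # /
```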
msg78530 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2008-12-30 13:51
I think we need to recognize the inherent limitations of what we can expect
to do.  It is perfectly reasonable for a user on Windows to import posixpath
and call posixpath.commonpathprefix.  The function won't have access to the
actual filesystems being manipulated.  Same for Unix folks importing ntpath
and manipulating Windows paths.  While we can make it handle
case-insensitivity, I'm not sure we can do much, if anything, about shortened

Also, as long as we are considering case sensitivity, what about HFS on Mac

msg78532 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-12-30 13:55
1. The discussion on python-dev shows that the current documentation of
os.path.commonprefix is incorrect - it technically works element by
element rather than character by character (since it will handle
sequences other than strings, such as lists of path components)

2. Splitting on os.sep is not the correct way to break a string into
path components. Instead, os.path.split needs to be applied repeatedly
until "head" is a single character (a single occurrence of os.sep or
os.altsep for an absolute path) or empty (for a relative path).
(Alternatively, but with additional effects on the result, the
separators can be normalised first with os.path.normpath or

  For Windows, os.path.splitunc and os.path.splitdrive should also be
invoked first, and if either returns a non-empty string, that should
become the first path component (with the remaining components filled in
as above)

3. Calling any or all of
abspath/expanduser/expandvars/normcase/normpath/realpath is the
responsibility of the library user as far as os.path.commonprefix is
concerned. Should that behaviour be retained for an os.path.commonpath
function, or should some of them (such as os.path.abspath) be called
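
Points 1 and 2 can be combined in a rough sketch: split each path into components by applying split repeatedly, then hand the component lists to commonprefix, which compares them element by element. The helper name split_components is hypothetical, only POSIX-style paths are handled (no drive or UNC handling), and this is not the code from the attached patch.

```python
import posixpath


def split_components(path):
    """Hypothetical helper: break a POSIX path into its components."""
    parts = []
    while True:
        head, tail = posixpath.split(path)
        if head == path:
            # split() can make no further progress: head is either the
            # root ("/") of an absolute path or the empty string.
            if head:
                parts.append(head)
            break
        if tail:
            parts.append(tail)
        if not head:
            # Relative path fully consumed.
            break
        path = head
    parts.reverse()
    return parts


abs_parts = split_components("/export/home/dave")
rel_parts = split_components("a/b/c")
print(abs_parts)  # ['/', 'export', 'home', 'dave']
print(rel_parts)  # ['a', 'b', 'c']

# commonprefix also accepts lists of components and then compares
# them element by element instead of character by character.
common = posixpath.commonprefix([abs_parts, split_components("/etc/passwd")])
print(common)  # ['/']
```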
msg78533 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-12-30 14:05
The regex based approach to the component splitting when os.altsep is
defined obviously works as well. Duplicating the values of sep and
altsep in the default regex that way grates a little though...
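
The regex from the patch is not reproduced in this thread. As a hypothetical sketch of how the duplication could be avoided, the pattern can be built from the module's own sep and altsep (ntpath is used here because its altsep is defined; the sample path is made up):

```python
import ntpath
import re

# Hypothetical: derive the separator pattern from ntpath.sep and
# ntpath.altsep rather than hard-coding both characters.
separators = re.escape(ntpath.sep) + re.escape(ntpath.altsep)
split_re = re.compile("[" + separators + "]+")

components = split_re.split("export\\home/dave")
print(components)  # ['export', 'home', 'dave']
```

A path that starts with a separator would yield an empty first element with this approach, so an absolute path would need extra handling.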
msg111589 - (view) Author: Craig McQueen (cmcqueen1975) Date: 2010-07-26 02:28
msg227699 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-27 16:45
There is a more developed patch in issue10395.
msg227707 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2014-09-27 18:28
Feel free to close this ticket. I long ago gave up on it.
msg293143 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-05-05 21:53
Issue 10395 added “os.path.commonpath” in 3.5.
