|
msg121038
Author: Ronald Oussoren (ronaldoussoren) *
Date: 2010-11-12 15:14
The documentation for os.path.commonprefix notes:
os.path.commonprefix(list)
Return the longest path prefix (taken character-by-character) that is a prefix of all paths in list. If list is empty, return the empty string (''). Note that this may return invalid paths because it works a character at a time.
And indeed:
>>> os.path.commonprefix(['/usr/bin', '/usr/bicycle'])
'/usr/bi'
This is IMHO useless behaviour for a function in the os.path namespace; I'd expect os.path.commonprefix to work with path elements (e.g. that the call above would have returned '/usr').
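The character-by-character behaviour quoted from the documentation can be reproduced with a short sketch. The name `char_prefix` is hypothetical, and the min/max shortcut is just one way to do the comparison; it is not claimed to be the standard library's exact code.

```python
def char_prefix(strings):
    """Longest common leading substring, compared one character at a time."""
    if not strings:
        return ""
    # Only the lexicographically smallest and largest strings need comparing:
    # every other string sorts between them and shares their common prefix.
    lo, hi = min(strings), max(strings)
    for i, ch in enumerate(lo):
        if ch != hi[i]:
            return lo[:i]
    return lo


result = char_prefix(["/usr/bin", "/usr/bicycle"])
print(result)
```

Nothing in this comparison knows about path separators, which is why the result can stop in the middle of a component.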
|
|
msg121039
Author: Eric V. Smith (eric.smith) *
Date: 2010-11-12 15:37
Indeed, that behavior seems completely useless.
I've verified that it works the same in 2.5.1.
|
|
msg121040
Author: Eric V. Smith (eric.smith) *
Date: 2010-11-12 15:48
Although there are test cases in test_genericpath that verify this behavior, so apparently it's intentional.
|
|
msg121043
Author: Ronald Oussoren (ronaldoussoren) *
Date: 2010-11-12 16:18
That's why I wrote 'broken by design' in the title.
A "fix" for this will have to be a new function, if one gets added. (I've written a Unix implementation that finds the longest shared path several times, and can provide an implementation and tests if others agree that this would be useful.)
|
|
msg121063
Author: Martin v. Löwis (loewis) *
Date: 2010-11-12 19:35
This goes back to issue400788 and
http://mail.python.org/pipermail/python-dev/2000-July/005897.html
http://mail.python.org/pipermail/python-dev/2000-August/008385.html
Skip changed it to do something meaningful (more than ten years ago), Mark Hammond complained that it was backwards incompatible, Tim Peters argued that you shouldn't change a function if the documented behavior matches the implementation, and Skip reverted the change and added more documentation to make the actual behavior more explicit.
It may be useless, but it's certainly not broken. In addition, it's very likely that applications of it rely on the very semantics that it has.
In any case, anybody proposing a change should go back and re-read the old threads.
|
|
msg121106
Author: R. David Murray (r.david.murray) *
Date: 2010-11-13 03:12
Indeed, as I remember it there are people using commonprefix as a string function in situations having nothing to do with os paths.
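As an illustration of that kind of non-path use (the word list here is made up for the example), commonprefix happily works on arbitrary strings:

```python
import os.path

# Strings that have nothing to do with the filesystem.
words = ["interstellar", "internet", "interval"]
prefix = os.path.commonprefix(words)
print(prefix)
```

Code written this way depends on the character-wise semantics, so changing commonprefix to compare path components would break it.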
I'm changing the title to reflect the fact that this is really a feature request for a new function. IMO it is a reasonable feature request. Finding a name for it ought to be an interesting exercise.
I think that this should only be accepted if there is also a Windows implementation.
|
|
msg141535
Author: Eric Snow (eric.snow) *
Date: 2011-08-01 21:21
You can already get the better prefix using os.path, albeit less efficiently. Here's an example:
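import os

def commondirname(paths):
    subpath = os.path.commonprefix(paths)
    # The character-wise prefix is itself a common directory only if every
    # path equals it or continues with a separator right after it.
    if all(p == subpath or p[len(subpath):].startswith(os.path.sep)
           for p in paths):
        return subpath
    # Otherwise cut the prefix back to its last complete component.
    return os.path.join(os.path.split(subpath)[0], "")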
However, would it be better to implicitly normalize paths first rather than doing a character-by-character comparison? Here is an unoptimized demonstration of what I mean:
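import os

def commondirname(paths):
    result = ""
    for path in paths:
        # Normalize first so that equivalent spellings compare equal.
        path = os.path.normcase(os.path.abspath(path))
        if not result:
            result = path
        else:
            # Walk result up one component at a time until it is the path
            # itself or one of its ancestor directories.
            while path != result and not path.startswith(result + os.path.sep):
                result, _ = os.path.split(result)
                # Reached the root of the drive: nothing shorter is possible.
                if os.path.splitdrive(result)[1] == os.path.sep:
                    return result
    return result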
|
|
msg174663
Author: Éric Araujo (eric.araujo) *
Date: 2012-11-03 18:34
Rafik is working on os.path.commonpath for the bug day.
|
|
msg174818
Author: Rafik Draoui (rafik)
Date: 2012-11-04 16:10
Here is a patch with an implementation of os.path.commonpath, along with tests and documentation. At the moment, this is only implemented for POSIX, as I don't feel like I know enough about Windows to tackle drive letters and UNC in paths without spending some more time on it.
This probably needs more tests for corner cases.
|
|
msg174819
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2012-11-04 16:44
> At the moment, this is only implemented for POSIX, as I don't feel like I know enough about Windows to tackle drive letters and UNC in paths without spending some more time on it.
Just use splitdrive(): first ensure that all drivespecs are the same, then find the common prefix of the pathspecs.
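A rough sketch of that suggestion, assuming the ntpath module so it can be tried on any platform. The name `common_windows_path` is hypothetical, and lowercasing via normcase() is a simplification made here for comparison purposes, not something the patch is known to do:

```python
import ntpath


def common_windows_path(paths):
    """Hypothetical helper: component-wise common prefix of Windows paths."""
    if not paths:
        raise ValueError("paths must not be empty")
    # normcase() lowercases and turns '/' into '\\' so spellings compare equal.
    drives, tails = zip(*(ntpath.splitdrive(ntpath.normcase(p)) for p in paths))
    if len(set(drives)) != 1:
        raise ValueError("paths are on different drives")
    rooted = {t.startswith("\\") for t in tails}
    if len(rooted) != 1:
        raise ValueError("cannot mix absolute and relative paths")
    # Compare the pathspecs one component at a time, ignoring empty components.
    split = [[c for c in t.split("\\") if c] for t in tails]
    common = []
    for parts in zip(*split):
        if len(set(parts)) != 1:
            break
        common.append(parts[0])
    prefix = drives[0] + ("\\" if rooted.pop() else "")
    return prefix + "\\".join(common)


shared = common_windows_path([r"C:\Users\alice\docs", r"C:\Users\bob"])
print(shared)
```

Because splitdrive() also treats a UNC share as a drivespec, the same check rejects paths on different shares, though the corner cases of UNC paths are not explored here.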
|
|
msg174941
Author: Rafik Draoui (rafik)
Date: 2012-11-05 20:45
Here is a new patch addressing some of storchaka's review comments, and implementing a version in ntpath.
For the Windows version, I did as proposed in msg174819, but as I am not familiar with the semantics and subtleties of paths in Windows, this version of ntpath.commonpath may be too simplistic and return wrong results in some cases. I would like someone more knowledgeable in Windows to take care of it, or maybe just provide a test suite with lots of different corner cases that I could use to provide a better implementation.
|
|
msg175493
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2012-11-13 09:24
Some conclusions from the discussion on Python-ideas (http://comments.gmane.org/gmane.comp.python.ideas/17719):
1. commonpath() should eat double slashes in input (['/usr/bin', '/usr//bin'] -> '/usr/bin'). In any case the current implementation eats slashes on output (['/usr//bin', '/usr//bin'] -> '/usr/bin', not '/usr//bin').
2. commonpath() should raise an exception instead of returning None on incompatible input.
3. Maybe commonpath() should also eat '.' components and return '.' instead of '' when relative paths have no common prefix. I am not sure.
In general the current patch looks good enough.
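Points 1 and 2 can be sketched for POSIX-style paths as follows. This is only an illustration under stated assumptions, not the patch under review: the name `commonpath_sketch` is hypothetical, ValueError is an assumed choice of exception, and point 3 is left out because it is still undecided.

```python
def commonpath_sketch(paths):
    """Hypothetical helper: component-wise common prefix of POSIX paths."""
    if not paths:
        raise ValueError("paths must not be empty")
    # Point 2: raise on incompatible input instead of returning None.
    if len({p.startswith("/") for p in paths}) != 1:
        raise ValueError("cannot mix absolute and relative paths")
    # Point 1: dropping empty components eats doubled slashes.
    split = [[c for c in p.split("/") if c] for p in paths]
    common = []
    for parts in zip(*split):
        if len(set(parts)) != 1:
            break
        common.append(parts[0])
    root = "/" if paths[0].startswith("/") else ""
    return root + "/".join(common)


merged = commonpath_sketch(["/usr/bin", "/usr//bin"])
print(merged)
```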
|
|
| Date | User | Action | Args |
| 2012-12-29 22:12:25 | serhiy.storchaka | set | assignee: serhiy.storchaka |
| 2012-11-13 09:24:17 | serhiy.storchaka | set | messages: + msg175493 |
| 2012-11-13 00:18:05 | rafik | set | files: + patch10395-3 |
| 2012-11-05 20:45:58 | rafik | set | files: + patch10395-2; messages: + msg174941 |
| 2012-11-04 16:44:13 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg174819 |
| 2012-11-04 16:13:09 | serhiy.storchaka | set | stage: needs patch -> patch review |
| 2012-11-04 16:10:38 | rafik | set | files: + patch10395; nosy: + rafik; messages: + msg174818 |
| 2012-11-03 18:34:30 | eric.araujo | set | messages: + msg174663; versions: + Python 3.4, - Python 3.3 |
| 2011-08-01 21:21:59 | eric.snow | set | nosy: + eric.snow; messages: + msg141535 |
| 2011-08-01 18:11:05 | santa4nt | set | nosy: + santa4nt; versions: + Python 3.3, - Python 3.2 |
| 2011-07-31 02:39:52 | Roman.Evstifeev | set | nosy: + Roman.Evstifeev |
| 2010-11-19 15:49:24 | eric.araujo | set | nosy: + eric.araujo |
| 2010-11-13 03:12:31 | r.david.murray | set | nosy: + r.david.murray; title: os.path.commonprefix broken by design -> new os.path function to extract common prefix based on path components; messages: + msg121106; type: enhancement; stage: needs patch |
| 2010-11-12 23:53:31 | ezio.melotti | set | nosy: + ezio.melotti |
| 2010-11-12 19:35:22 | loewis | set | nosy: + loewis; messages: + msg121063 |
| 2010-11-12 16:18:01 | ronaldoussoren | set | messages: + msg121043 |
| 2010-11-12 15:48:04 | eric.smith | set | messages: + msg121040 |
| 2010-11-12 15:37:44 | eric.smith | set | nosy: + eric.smith; messages: + msg121039 |
| 2010-11-12 15:14:04 | ronaldoussoren | create | |