classification
Title: Add os.path.splitpath(path) function
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Arfrever, blokeley, eric.araujo, giampaolo.rodola, martin.panter, paul.moore, pitrou, r.david.murray, rhettinger, santoso.wijaya, serhiy.storchaka, terry.reedy
Priority: normal Keywords: patch

Created on 2011-02-27 08:26 by blokeley, last changed 2015-04-20 09:04 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
issue11344.patch blokeley, 2011-04-29 14:23 os.path.splitpath() patch rev1
ospath_splitpath.patch serhiy.storchaka, 2012-12-02 20:46
ospath_splitpath_2.patch serhiy.storchaka, 2013-11-18 17:29
ospath_splitpath_3.patch serhiy.storchaka, 2014-07-13 19:13
ospath_splitpath_4.patch serhiy.storchaka, 2015-04-20 09:04
Messages (29)
msg129616 - Author: blokeley (blokeley) Date: 2011-02-27 08:26
It is a common need to find the grandparent or great-grandparent (etc.) of a given directory, which results in this:

>>> from os.path import dirname
>>> mydir = dirname(dirname(dirname(path)))

Could a "height" parameter be added to os.path.dirname so it becomes:

>>> def dirname(path, height=1):

Then we could have usage like:

>>> path = '/ggparent/gparent/parent/myfile.txt'
>>> from os.path import dirname

>>> dirname(path)
'/ggparent/gparent/parent'

>>> dirname(path, 2)
'/ggparent/gparent'

>>> dirname(path, 3)
'/ggparent'

Perhaps we should throw ValueErrors for invalid height values:

>>> dirname(path, 10)
ValueError

>>> dirname(path, -1)
ValueError

Perhaps a height of 0 should do nothing:

>>> dirname(path, 0)
'/ggparent/gparent/parent/myfile.txt'
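
A minimal sketch of the behaviour proposed above, assuming POSIX-style paths. The name `dirname_n` is hypothetical (the proposal was to extend `os.path.dirname` itself), and treating "no further parent" as the out-of-range condition is only one possible reading of the proposal:

```python
import posixpath


def dirname_n(path, height=1):
    """Hypothetical helper: apply dirname() `height` times."""
    if height < 0:
        raise ValueError("height must be non-negative")
    for _ in range(height):
        parent = posixpath.dirname(path)
        if parent == path:
            # dirname() made no progress, so there are no ancestors left.
            raise ValueError("height exceeds the depth of the path")
        path = parent
    return path


path = '/ggparent/gparent/parent/myfile.txt'
grandparent = dirname_n(path, 2)
unchanged = dirname_n(path, 0)
```

With these rules a height of 0 returns the path untouched, and heights of -1 or 10 raise ValueError for the example path.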

I can supply patches, unit tests and docs if you like.
msg129635 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-27 15:47
I'm -1 on this feature request.  I think it is an unnecessary complication of the API, especially since dirname corresponds to the unix shell 'dirname' command, which doesn't have such a feature.  If you need this feature in a particular application, it is easy to write a function to provide it.
msg129636 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-27 15:51
Well, on the other hand, it *is* a common need.
(and I don't think mimicking the shell is a design principle for Python)
msg129640 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-27 17:45
No, it isn't a design principle.  My point was that unix hasn't found it useful to add a level option to the dirname API.

I don't know that I personally have ever had occasion to peel off more than one directory level without also wanting to do something with the intermediate results, so perhaps I am not a good judge of how useful this would be.
msg130078 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-03-04 21:49
I am inclined to -1 also.

a. The proposed behavior is anti-obvious to me: the higher the height, the shorter the result. Calling param 'drop' would be better.

b. Not every one-liner should be wrapped.

>>> path.rsplit('/',0)[0]
'/ggparent/gparent/parent/myfile.txt'
>>> path.rsplit('/',1)[0]
'/ggparent/gparent/parent'
>>> path.rsplit('/',2)[0]
'/ggparent/gparent'
>>> path.rsplit('/',3)[0]
'/ggparent'

Note: above gives '' for maxsplit out of range, easily converted to exception in function wrapper.
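
The function wrapper mentioned in the note could look roughly like this. It is only a sketch: `drop_components` is a hypothetical name, it works on plain '/'-separated strings, and it does not handle trailing or repeated separators the way os.path.dirname does.

```python
def drop_components(path, drop=1):
    """Hypothetical wrapper: drop the last `drop` '/'-separated components."""
    if drop < 0:
        raise ValueError("drop must be non-negative")
    pieces = path.rsplit('/', drop)
    head = pieces[0]
    # Too few separators, or an empty head, means we ran past the start.
    if drop and (len(pieces) <= drop or head == ''):
        raise ValueError("not enough components to drop")
    return head


path = '/ggparent/gparent/parent/myfile.txt'
parent = drop_components(path, 1)
```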
msg130079 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-04 22:00
Except that dirname() isn't a one-liner, so you are giving rather bad advice here.
msg130081 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-04 22:04
As for use cases, I have used it quite commonly in test scripts in order to find out the base directory of the source tree (and/or other resources such as data files).

e.g.:

basepath = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
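
For comparison, the pathlib module (which comes up later in this thread) can express the same lookup by indexing `parents`. A small sketch, with a made-up path standing in for `__file__`:

```python
import posixpath
from pathlib import PurePosixPath

# Made-up location standing in for __file__.
test_file = '/src/project/tests/unit/test_thing.py'

# Three nested dirname() calls, as in the example above.
via_dirname = posixpath.dirname(posixpath.dirname(posixpath.dirname(test_file)))

# parents[0] is the containing directory, so parents[2] is three levels up.
via_parents = PurePosixPath(test_file).parents[2]
```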
msg130107 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-03-05 07:58
> My point was that unix hasn't found it useful 
> to add a level option to the dirname API.

ISTM, that is a strong indication that this isn't needed in the form it has been proposed.

> I don't know that I personally have ever had 
> occasion to peel off more than one directory level 
> without also wanting to do something with the 
> intermediate results, so perhaps I am not a good 
> judge of how useful this would be.

I think this only arises when a known directory structure has been attached at some arbitrary point on a tree, so you might use a relative path like ../../bin/command.py in the shell.  To serve that use case, it would be better to have a function that splits all the components of the path into a list that's easily manipulated:

>>> oldpath = os.path.splitpath('/ggparent/gparent/parent/')
>>> newpath = oldpath[:-2] + ['bin', 'command.py']
>>> os.path.join(*newpath)
'/ggparent/bin/command.py'
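
os.path.splitpath() was never added to the standard library, so the snippet above cannot be run as written. A rough equivalent, assuming POSIX-style paths and using `PurePosixPath.parts` as a stand-in for the proposed function (it returns a tuple rather than a list):

```python
import posixpath
from pathlib import PurePosixPath

# PurePosixPath.parts stands in for the proposed splitpath().
oldpath = PurePosixPath('/ggparent/gparent/parent/').parts
newpath = oldpath[:-2] + ('bin', 'command.py')
result = posixpath.join(*newpath)
```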
msg130119 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-05 13:26
Ah, yes, splitpath is a function I've occasionally wanted.  I also remember being surprised that os.path.split didn't return such a list.
msg130344 - Author: blokeley (blokeley) Date: 2011-03-08 17:47
os.path.splitpath() as described by rhettinger would solve the problem.

If I wrote the patches, tests and docs, what are the chances of it being accepted?
msg130345 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-08 17:53
> If I wrote the patches, tests and docs, what are the chances of it
> being accepted?

Rather high as far as I'm concerned. Be careful with semantics and implementation under Windows, though (you should probably take a look at existing functions in ntpath.py as a guideline).
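
To illustrate the kind of Windows semantics involved, here is what the existing ntpath functions do with drive letters and UNC prefixes. The paths are made up, and ntpath can be imported on any platform:

```python
import ntpath

# A drive letter is separated from the rest of the path.
drive, rest = ntpath.splitdrive('C:\\workspace\\cpython\\Lib')

# For a UNC path, the server and share together act as the "drive".
unc_drive, unc_rest = ntpath.splitdrive('\\\\server\\share\\dir\\file.txt')

# Splitting a drive root leaves the root as the head and an empty tail.
root_head, root_tail = ntpath.split('C:\\')
```

A splitpath() for Windows would presumably have to keep the drive (or server and share) together with the root as a single leading component.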
msg130373 - Author: blokeley (blokeley) Date: 2011-03-08 20:54
I started writing the patch against py2.7 but realised that 2.7 could be the last in the 2.x series. I'll write the patch against default tip.
msg134740 - Author: blokeley (blokeley) Date: 2011-04-29 08:59
The unit tests on the cpython tip revision fail even before applying my patches, and I'm afraid I haven't got the time to debug the threading module or the existing unit tests.

The traceback is:

C:\workspace\cpython\Lib\test> C:\Python32\python.exe test_ntpath.py

Traceback (most recent call last):
  File "test_ntpath.py", line 4, in <module>
    from test.support import TestFailed
  File "C:\workspace\cpython\Lib\test\support.py", line 14, in <module>
    import shutil
  File "C:\workspace\cpython\Lib\shutil.py", line 17, in <module>
    import bz2
  File "C:\workspace\cpython\Lib\bz2.py", line 13, in <module>
    import threading
  File "C:\workspace\cpython\Lib\threading.py", line 34, in <module>
    _info = _thread.info
AttributeError: 'module' object has no attribute 'info'


It happens with cpython hg rev 8eb794bbb967

If there's a quick fix for this, please advise and I'll get working. 

If not, I'll probably not have the time to fix it myself and then write the os.path.splitpath patches as well which would be a pity.
msg134749 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-04-29 11:32
Did you try a make distclean/configure/make?  _thread.info is a new attribute introduced by a relatively recent patch.
msg134770 - Author: blokeley (blokeley) Date: 2011-04-29 14:23
My runtime came from the Python32 Windows installer and I don't have a C compiler on this machine. Therefore I updated to the 3.2 branch in hg and worked on that. This patch is pretty simple so should work on 3.3 without modifications.

I have attached my first iteration of the patch (patched against hg rev 56c187b81d2b).

Disclaimers and suspected issues:

* A path given as a byte array is converted to a string so 
  splitpath() only returns lists of strings and never lists of 
  byte arrays. I don't know if splitpath() should return a list 
  of byte arrays if the path was a byte array. The way split() 
  is tested implies not. Please advise.
  
* We might need more tests to cover more path variations on Windows.

* I haven't implemented splitpath() in os2emxpath.py because 
  I couldn't find test/test_os2emxpath.py or the equivalent. 
  Please advise if there is one or if I should create one.
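
On the bytes question in the first point, it may help to see what the existing os.path functions do: they return the same type they are given. A quick check with posixpath.split (the path is made up):

```python
import posixpath

str_parts = posixpath.split('/gparent/parent/file.txt')
bytes_parts = posixpath.split(b'/gparent/parent/file.txt')

# Each call returns a (head, tail) pair of the same type as its argument.
str_types = {type(part) for part in str_parts}
bytes_types = {type(part) for part in bytes_parts}
```

By analogy, a splitpath() given bytes would be expected to return a list of bytes objects, though that is an inference from split() rather than something the patch does.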

Feedback and patches most welcome.
msg134873 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-04-30 15:55
To clarify one point: Python does not try to mimic the shell, but the os module exposes system calls as they are.

(Unrelated remark: pkgutil.get_data can replace a lot of uses of dirname(dirname(__file__)))
msg140271 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-13 15:27
I’m not sure this is correct for POSIX:
  splitpath('/gparent/parent/') returns ['gparent', 'parent']

/ is a real directory, it should be the ultimate parent of any path IIUC.

On a related note, using “parent” for the leaf file looks strange to me, I think something like this would make more sense:

 /gparent/parent/somedir/
msg176781 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-02 10:51
splitpath() should be equivalent to the following code (but non-recursive and more efficient):

from os.path import split

def splitpath(path):
    head, tail = split(path)
    if head == path:
        return [head]
    else:
        return splitpath(head) + [tail]
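
A non-recursive version of the same definition might look like the sketch below. It is not the code from the attached patches; it simply unrolls the recursion with a loop, and it uses posixpath.split so that the results are the same on every platform.

```python
import posixpath


def splitpath(path):
    """Iterative sketch of the recursive definition above (POSIX rules)."""
    parts = []
    while True:
        head, tail = posixpath.split(path)
        if head == path:
            # split() can make no further progress: path is a root or ''.
            parts.append(head)
            break
        parts.append(tail)
        path = head
    parts.reverse()
    return parts


absolute = splitpath('/gparent/parent/')
relative = splitpath('a/b')
```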
msg176807 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-02 20:45
The proposed patch adds efficient implementations of the algorithm mentioned above.

splitpath() can then be used to implement relpath() and commonpath().
msg178608 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-30 19:44
Please review. This function is very important for many applications (and it is hard to get right).
msg178610 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-30 19:51
> Please review. This function is very important for many applications
> (and it is hard to get right).

The pathlib module (PEP 428) has such functionality built-in.
msg203196 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-17 15:59
The pathlib module is not in the stdlib yet, while the patch for splitpath() has been waiting for review for almost a year.
msg203291 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-11-18 13:18
The ntpath.splitpath() version is easy to get lost in. It would probably help if you spelt out all the single-letter variable names, and explained that tri-state root/separator = None/True/False flag. Maybe there is a less convoluted way to write it too, I dunno.

Also, maybe it is worth clearly documenting a couple special properties of the result:

* The first element is always the root component (for an absolute path), or an empty string (for a relative path)
* The last element is an empty string if the path name ended in a directory separator, except when the path is a root directory
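
One consequence of these two conventions, at least for POSIX-style paths, seems to be that joining the components reproduces the original string. A short check with posixpath.join; the lists are written out by hand following the rules above, not produced by the patch:

```python
import posixpath

# Hand-written component lists following the two rules above.
absolute_dir = ['/', 'gparent', 'parent', '']   # from '/gparent/parent/'
relative_file = ['', 'docs', 'index.rst']       # from 'docs/index.rst'

rebuilt_dir = posixpath.join(*absolute_dir)
rebuilt_file = posixpath.join(*relative_file)
```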
msg203310 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-18 17:29
Added examples and Martin's notes to the documentation. The ntpath implementation was rewritten with regular expressions (it is now shorter and perhaps clearer).
msg222967 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-07-13 19:13
Updated patch. Added a private general implementation in genericpath; the specialized implementations are now tested to return the same result as the general implementation.
msg238564 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2015-03-19 21:55
pathlib is in the stdlib now (see previous comments), maybe this should be closed as obsolete.
msg239657 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-03-31 02:38
I think my use cases of splitpath() could be fulfilled by using Path.parts, Path.anchor, Path.relative_to(), etc. I am a bit sad that this never made it in, but I agree it is redundant with pathlib, and the issue should probably be closed.
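
A brief illustration of the pathlib attributes named here, using the pure path classes so that it behaves the same on any platform (the paths are made up):

```python
from pathlib import PurePosixPath, PureWindowsPath

p = PurePosixPath('/ggparent/gparent/parent/myfile.txt')
parts = p.parts                        # all components, root first
anchor = p.anchor                      # drive plus root
below = p.relative_to('/ggparent')     # path with a known prefix removed

w = PureWindowsPath('C:\\workspace\\cpython\\Lib')
win_parts = w.parts
win_anchor = w.anchor
```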
msg239676 - Author: Paul Moore (paul.moore) * (Python committer) Date: 2015-03-31 09:33
Assuming new code should be using pathlib, I agree this should probably be closed now as obsolete.

One proviso - pathlib objects don't take a bytestring path in the constructor. If there's a need for a low-level splitpath taking bytes objects, there may still be a benefit to this patch. Otherwise pathlib.Path(p).parts is a direct replacement AFAICT.
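
The proviso can be seen directly: constructing a pure path from bytes raises TypeError. One possible workaround, assuming the bytes are in the filesystem encoding, is to decode with os.fsdecode() first. A sketch with a made-up path:

```python
import os
from pathlib import PurePosixPath

raw = b'/ggparent/gparent/parent/myfile.txt'

try:
    PurePosixPath(raw)
    accepted_bytes = True
except TypeError:
    # pathlib refuses bytes arguments.
    accepted_bytes = False

# os.fsdecode() converts bytes to str using the filesystem encoding.
parts = PurePosixPath(os.fsdecode(raw)).parts
```

The resulting parts are str, so code that needs bytes components would have to encode them again with os.fsencode().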
msg241625 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-04-20 09:04
I thought that splitpath() could be used in implementations of realpath(), relpath(), commonpath(), and in user code. But it looks as though realpath(), relpath() and commonpath() should use specialized inlined versions for efficiency, and user code can use the more high-level pathlib. For now there are no strong arguments for adding splitpath().

The patch is updated to the tip (in case it is needed in the future) and the issue is closed.
History
Date User Action Args
2015-04-20 09:04:13 serhiy.storchaka set status: open -> closed
files: + ospath_splitpath_4.patch
messages: + msg241625
resolution: rejected
stage: patch review -> resolved
2015-03-31 09:33:05 paul.moore set nosy: + paul.moore
messages: + msg239676
2015-03-31 02:38:39 martin.panter set messages: + msg239657
2015-03-19 21:55:52 eric.araujo set messages: + msg238564
2014-07-13 19:32:33 brian.curtin set nosy: - brian.curtin
2014-07-13 19:13:37 serhiy.storchaka set files: + ospath_splitpath_3.patch
messages: + msg222967
versions: + Python 3.5, - Python 3.4
2014-02-11 01:42:14 r.david.murray link issue894936 superseder
2013-11-18 17:30:00 serhiy.storchaka set files: + ospath_splitpath_2.patch
messages: + msg203310
2013-11-18 13:18:04 martin.panter set messages: + msg203291
2013-11-17 15:59:18 serhiy.storchaka set messages: + msg203196
2013-09-28 16:34:45 giampaolo.rodola set nosy: + giampaolo.rodola
2013-09-15 04:20:15 martin.panter set nosy: + martin.panter
2012-12-30 19:51:19 pitrou set messages: + msg178610
2012-12-30 19:44:52 serhiy.storchaka set messages: + msg178608
2012-12-29 22:06:06 serhiy.storchaka set assignee: serhiy.storchaka
2012-12-03 08:01:42 Arfrever set nosy: + Arfrever
2012-12-02 20:46:24 serhiy.storchaka set files: + ospath_splitpath.patch
2012-12-02 20:45:37 serhiy.storchaka set messages: + msg176807
stage: patch review
2012-12-02 10:57:36 serhiy.storchaka set versions: + Python 3.4, - Python 3.3
2012-12-02 10:51:59 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg176781
2011-07-13 15:27:29 eric.araujo set messages: + msg140271
2011-04-30 15:55:23 eric.araujo set nosy: + eric.araujo
messages: + msg134873
2011-04-29 17:38:15 santoso.wijaya set nosy: + santoso.wijaya
2011-04-29 14:25:27 brian.curtin set nosy: + brian.curtin
2011-04-29 14:23:44 blokeley set files: + issue11344.patch
keywords: + patch
messages: + msg134770
2011-04-29 11:32:06 r.david.murray set messages: + msg134749
2011-04-29 08:59:38 blokeley set messages: + msg134740
2011-03-08 20:54:34 blokeley set nosy: rhettinger, terry.reedy, pitrou, r.david.murray, blokeley
messages: + msg130373
title: Add height argument to os.path.dirname() -> Add os.path.splitpath(path) function
2011-03-08 17:53:47 pitrou set nosy: rhettinger, terry.reedy, pitrou, r.david.murray, blokeley
messages: + msg130345
2011-03-08 17:47:55 blokeley set nosy: rhettinger, terry.reedy, pitrou, r.david.murray, blokeley
messages: + msg130344
2011-03-05 13:26:10 r.david.murray set nosy: rhettinger, terry.reedy, pitrou, r.david.murray, blokeley
messages: + msg130119
2011-03-05 07:58:59 rhettinger set nosy: + rhettinger
messages: + msg130107
2011-03-04 22:04:11 pitrou set nosy: terry.reedy, pitrou, r.david.murray, blokeley
messages: + msg130081
2011-03-04 22:00:00 pitrou set nosy: terry.reedy, pitrou, r.david.murray, blokeley
messages: + msg130079
2011-03-04 21:49:07 terry.reedy set nosy: + terry.reedy
messages: + msg130078
2011-02-27 17:45:01 r.david.murray set nosy: pitrou, r.david.murray, blokeley
messages: + msg129640
2011-02-27 15:51:40 pitrou set nosy: + pitrou
messages: + msg129636
2011-02-27 15:47:25 r.david.murray set nosy: + r.david.murray
messages: + msg129635
2011-02-27 08:26:48 blokeley create