
classification
Title: splitext of dotfiles, incl backwards compat and migration
Type: enhancement
Stage: patch review
Components: Library (Lib)
Versions: Python 3.2
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: BreamoreBoy, ajaksu2, aptshansen, georg.brandl, jimjjewett, loewis, michael.foord, ncoghlan, sonderblade
Priority: normal
Keywords: patch

Created on 2007-03-16 04:22 by aptshansen, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
splitext3.patch | aptshansen, 2007-03-16 16:35 | new patch w/ docs + tests for splitext
splitext-leading_dot+all_ext.patch | aptshansen, 2007-03-16 22:36 | New patch: adds all_ext, more tests, to splitext3.
Messages (15)
msg52245 - (view) Author: Stephen Hansen (aptshansen) Date: 2007-03-16 04:22
The attached patch is for the *path.py files, the associated test suites, and the documentation -- the last of which may need a second look, since I don't really know LaTeX very well. It's made against the HEAD of the trunk.

This is in response to issue #1462106, which has earned quite a bit of discussion on python-dev.

I am in complete agreement with the *intention* of that patch and its application, that the previous behavior was "wrong"; that splitext('.cshrc') should not return ('', '.cshrc').

However, the patch silently altered the semantics of the function instead of firmly failing, and doesn't allow for the fact that the previous documentation was ambiguous and as such people may (and apparently, sometimes did) actually consider the old behavior correct.

The attached patch adds a keyword parameter to splitext, "preserve_dotfiles", which at present defaults to False.

It might need a better name :P I suck at that.

When False, the behavior is to return ('', '.cshrc'), but also to issue a FutureWarning indicating that this will change in the future.

When True, the behavior is to return ('.cshrc', '').
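
The two behaviors described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: it handles only '/'-separated paths, and the warning text and function body are assumptions made for the example. Only the parameter name "preserve_dotfiles", the FutureWarning, and the two return values come from the description.

```python
import warnings


def splitext(path, preserve_dotfiles=False):
    """Illustrative sketch only; assumes '/' as the path separator."""
    name_start = path.rfind('/') + 1   # index where the final component begins
    dot = path.rfind('.')
    if dot < name_start:               # no dot in the final component
        return path, ''
    if dot == name_start:              # the only dot is the leading one: a dotfile
        if preserve_dotfiles:
            return path, ''
        # Old behavior is kept for now, but the caller is told it will change.
        warnings.warn(
            "splitext() will treat a leading dot as part of the root in the "
            "future; pass preserve_dotfiles=True to get that behavior now",
            FutureWarning,
            stacklevel=2,
        )
    return path[:dot], path[dot:]
```

With this sketch, splitext('.cshrc') warns and returns ('', '.cshrc'), while splitext('.cshrc', preserve_dotfiles=True) returns ('.cshrc', '') silently.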

The intention is to fix the 'error' while giving people time to migrate code that may previously have been faulty to the correct result. Also, those not of a deeply UNIX mindset can consider everything after the last dot -- regardless of how many there are -- an extension, even if it means there's no root. Viva la Windows Explorer.
msg52246 - (view) Author: Stephen Hansen (aptshansen) Date: 2007-03-16 08:00
File Added: splitext2.patch
msg52247 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-03-16 15:50
[ignoring the question whether this change is acceptable in the first place - I'm apparently not qualified to make a determination here]

This implementation makes splitext accept arbitrarily many keyword arguments. This is not good; it should only accept those that are documented.

It might be helpful if the warning was only issued if the keyword argument wasn't provided at all, assuming that whoever passes False knows what he does.
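
One possible way to warn only when the argument is omitted is a sentinel default, sketched below. The sentinel name "_UNSET", the warning text, and the '/'-only path handling are assumptions for illustration; this is not taken from any of the attached patches.

```python
import warnings

_UNSET = object()  # hypothetical sentinel: "the caller did not pass the argument"


def splitext(path, preserve_dotfiles=_UNSET):
    """Illustrative sketch only; assumes '/' as the path separator."""
    explicit = preserve_dotfiles is not _UNSET
    if not explicit:
        preserve_dotfiles = False      # fall back to the old default
    name_start = path.rfind('/') + 1   # index where the final component begins
    dot = path.rfind('.')
    if dot < name_start:               # no dot in the final component
        return path, ''
    if dot == name_start:              # dotfile
        if preserve_dotfiles:
            return path, ''
        if not explicit:
            # Only callers who never chose a behavior are warned.
            warnings.warn(
                "the default handling of dotfiles will change in the future",
                FutureWarning,
                stacklevel=2,
            )
    return path[:dot], path[dot:]
```

A caller who writes splitext(name, preserve_dotfiles=False) has opted in to the old result and gets no warning.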
msg52248 - (view) Author: Stephen Hansen (aptshansen) Date: 2007-03-16 16:35
File Added: splitext3.patch
msg52249 - (view) Author: Stephen Hansen (aptshansen) Date: 2007-03-16 16:36
That's a good point; updated patch to throw an error if other keyword arguments are given.
msg52250 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2007-03-16 16:48
I don't understand why you have to use **kwargs.
msg52251 - (view) Author: Stephen Hansen (aptshansen) Date: 2007-03-16 16:52
Well, because someone on the -dev list at some point raised an objection to taking an arbitrary positional argument in as a second argument and not throwing an error. Or something like that.

splitext('filename.ext', '.')

would make '.' head along into preserve_dotfiles

If that's not a concern, I could readily just type-check to make sure preserve_dotfiles is only one of (True, False) and if not raise an error.

It's a "keyword-only-argument", before such a thing is in :)
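
For comparison, here is a rough sketch of the **kwargs emulation next to the keyword-only syntax that Python 3 later gained (PEP 3102). The function names "splitext_kwargs" and "splitext_kwonly", the helper "_split", and the error message are hypothetical; the FutureWarning is left out to keep the focus on argument handling.

```python
def _split(path, preserve_dotfiles):
    """Shared helper for the sketch; assumes '/' as the path separator."""
    name_start = path.rfind('/') + 1
    dot = path.rfind('.')
    if dot < name_start or (dot == name_start and preserve_dotfiles):
        return path, ''
    return path[:dot], path[dot:]


def splitext_kwargs(path, **kwargs):
    # Emulation: pull out the one documented keyword, reject anything else.
    preserve_dotfiles = kwargs.pop('preserve_dotfiles', False)
    if kwargs:
        raise TypeError(
            "splitext() got unexpected keyword arguments: %s"
            % ', '.join(sorted(kwargs))
        )
    return _split(path, preserve_dotfiles)


def splitext_kwonly(path, *, preserve_dotfiles=False):
    # Python 3 spelling: the bare * makes preserve_dotfiles keyword-only.
    return _split(path, preserve_dotfiles)
```

In both versions, a call such as splitext_kwargs('filename.ext', '.') raises TypeError instead of letting '.' slide into the flag.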
msg52252 - (view) Author: Stephen Hansen (aptshansen) Date: 2007-03-16 22:36
I should probably not submit a patch in the *middle* of a conversation, huh? I thought it was at the point where people were just debating things like.. policy... and.. stuff. And not at a point where someone would come up with a whole new idea that ties directly in.

Someone did! Nick Coghlan came up with a better way to spell the whole issue, and a logical companion feature; so I just implemented the patch that way. splitext(path, ignore_leading_dot=False, all_ext=False)
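
A rough sketch of how that signature could behave is given below. The semantics are assumptions, since the thread does not spell them out: here ignore_leading_dot=True means leading dots never start an extension, and all_ext=True means the extension starts at the first eligible dot rather than the last. It handles only '/'-separated paths and is not the attached patch.

```python
def splitext(path, ignore_leading_dot=False, all_ext=False):
    """Illustrative sketch only; flag semantics are assumed, not confirmed."""
    search_from = path.rfind('/') + 1  # start of the final path component
    if ignore_leading_dot:
        # Skip leading dots so they are never taken as an extension separator.
        while search_from < len(path) and path[search_from] == '.':
            search_from += 1
    if all_ext:
        dot = path.find('.', search_from)    # first eligible dot
    else:
        dot = path.rfind('.', search_from)   # last eligible dot
    if dot == -1:
        return path, ''
    return path[:dot], path[dot:]
```

Under these assumptions, splitext('archive.tar.gz', all_ext=True) gives ('archive', '.tar.gz'), and splitext('.cshrc', ignore_leading_dot=True) gives ('.cshrc', '').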

Also added a bunch to the test just to be anal about being sure that no combination of options would screw up the 'default' behavior.
File Added: splitext-leading_dot+all_ext.patch
msg52253 - (view) Author: Björn Lindqvist (sonderblade) Date: 2007-03-18 21:29
Please no, there is no need to bastardize both the implementation and
the specification of splitext just to appease the 0.0001% of its
users who have ever splitext:ed a dotfile. For the other 99.99% of the
function's users, the extra flags are just dead and confusing
weight. Functions should not be kitchen sinks, they should do one
thing only. If there is a need to split Python-2.4.3.tar.bz2 into
("Python-2.4.3", ".tar.bz2") or other unix:y variants, such a function
would be much better added to shutil.
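
A separate helper of that kind might look like the sketch below. The name "split_archive_ext" and the suffix list are hypothetical and deliberately incomplete; nothing like this is claimed to exist in shutil.

```python
import os.path

# Illustrative, non-exhaustive list of compound suffixes (an assumption).
KNOWN_COMPOUND_SUFFIXES = ('.tar.gz', '.tar.bz2', '.tar.xz')


def split_archive_ext(path):
    """Split off a known compound suffix, else defer to os.path.splitext."""
    lowered = path.lower()
    for suffix in KNOWN_COMPOUND_SUFFIXES:
        # Require a non-empty root in front of the suffix.
        if lowered.endswith(suffix) and len(path) > len(suffix):
            return path[:-len(suffix)], path[-len(suffix):]
    return os.path.splitext(path)
```

Matching against a list avoids the problem that splitting at the first dot would cut "Python-2.4.3.tar.bz2" inside the version number.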

Please just choose whether splitext(".emacs") returns ("", ".emacs")
or (".emacs", "") and update the doc to be more explicit. Keep it
simple.
msg52254 - (view) Author: Jim Jewett (jimjjewett) Date: 2007-03-19 02:45
If it has to be a simple decision, we need to keep the current behavior;  test/test_ntpath.py has been explicitly verifying the current behavior since 2002.

The question is whether the spec is buggy enough to fix, either in 2.x or in 3.0.
msg52255 - (view) Author: Stephen Hansen (aptshansen) Date: 2007-03-19 19:43
As to Bastardization of Implementations; I think from the endless conversation it became obvious (to me at least) that just what "extension" means is actually somewhat domain-specific, and splitext doesn't really do its "one thing" very well.

I think "ignore_leading_dot" is necessary, regardless of which behavior is the default and regardless of whether warnings are eventually deemed bad. Dotfiles do exist: on the various *nixes, and on Macs too -- I have about a dozen in my home directory on my primary machine, which is a Mac. Taking them into consideration is important, and there doesn't seem to be a clear opinion on how to treat the dot. So let people decide. IMHO.

I think "all_ext" is quite useful in many situations; that'd be the one I'd actually have used on more than one occasion in the past. I never used splitext because it lacked that. And not on UNIX :) My company develops for Windows and OSX.

Either way, I'm curious what the pronouncement will be :) I'll probably update the patch to do whatever is decided is right (assuming that the status quo of the original modification isn't determined to be right). This doesn't actually affect me at all (since, as noted above, I never use splitext); I just made the patch because I had time, it was interesting, and I had an opinion on what was right -- so figured I'd back it up :) It's been more a test case on getting involved with contributing to Python. And an interesting one. :)
msg84678 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-03-30 22:10
Still needs a pronouncement. IMHO, should be included in the Cookbook or
PyPI if it doesn't get added to the standard lib.
msg98506 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-01-29 11:47
If I ever clear all the other issues off my list, I may get a chance to have a closer look at this one :)
msg110569 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-07-17 15:56
Add Michael Foord to nosy list as he raised #1462106 which refers to #1115886 where msg24154 states "fixed this in r54204".  Can this be closed?
msg110570 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-07-17 16:06
On Python 2.6.5:

>>> os.path.splitext('.cshrc')
('.cshrc', '')

I believe this can be closed.
History
Date | User | Action | Args
2022-04-11 14:56:23 | admin | set | github: 44728
2010-07-17 16:06:31 | michael.foord | set | status: open -> closed; resolution: out of date; messages: + msg110570
2010-07-17 15:56:09 | BreamoreBoy | set | nosy: + BreamoreBoy, michael.foord; messages: + msg110569; versions: + Python 3.2, - Python 3.1, Python 2.7
2010-01-29 11:47:33 | ncoghlan | set | nosy: + ncoghlan; messages: + msg98506
2009-03-30 22:10:03 | ajaksu2 | set | versions: + Python 3.1, Python 2.7, - Python 2.6; nosy: + ajaksu2; messages: + msg84678; type: enhancement; stage: patch review
2007-03-16 04:22:57 | aptshansen | create |