classification
Title: `pathlib.Path.expanduser()` does not call `os.path.expanduser()`
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: andrei.avk, barneygale, eryksun, kj, serhiy.storchaka, steve.dower
Priority: normal Keywords: patch

Created on 2020-03-08 05:06 by barneygale, last changed 2021-09-06 17:52 by andrei.avk. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 18841 merged barneygale, 2020-03-08 06:01
PR 25277 merged barneygale, 2021-04-08 11:46
Messages (19)
msg363635 - (view) Author: Barney Gale (barneygale) * Date: 2020-03-08 05:06
`pathlib.Path.expanduser()` does not call `os.path.expanduser()`, but instead re-implements it. The implementations look pretty similar and I can't see a good reason for the duplication. The only difference is that `pathlib.Path.expanduser()` raises `RuntimeError` when a home directory cannot be resolved, whereas `os.path.expanduser()` returns the path unchanged.
msg363636 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-08 05:37
There are two reasons:

1. os.path.expanduser() returns the path unchanged when a home directory cannot be resolved, pathlib.Path.expanduser() raises an error. The latter behavior looks more robust, but we can't change os.path.expanduser().

2. os.path.expanduser() needs to split the path on components while pathlib.Path.expanduser() already has ready components. In some cases it may be more efficient.
msg363637 - (view) Author: Barney Gale (barneygale) * Date: 2020-03-08 05:58
We can check whether `os.path.expanduser()` returned a path beginning with "~" and raise a RuntimeError if so, right? On point #2, I'm not sure this optimization alone justifies the duplication. PR incoming...
msg363639 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-08 08:20
I see no reason to change the current code.
msg363644 - (view) Author: Barney Gale (barneygale) * Date: 2020-03-08 09:46
I see no reason for the duplication, and I can point to one concrete bug affecting your re-implementation of `expanduser` that doesn't affect the original, i.e. that a `KeyError` is raised on Windows when `"USERNAME"` is not present in `os.environ`, whereas all similar cases raise `RuntimeError`. These sorts of issues sneak in when you duplicate code - better to stick with the battle-hardened version rather than an inherently risky rewrite.
msg363647 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-08 10:41
It was discussed in issue19776.

Having separate implementation we can avoid design flaws of os.path.expanduser().
msg363652 - (view) Author: Barney Gale (barneygale) * Date: 2020-03-08 11:18
The only design flaw mentioned in that thread is that `os.path.expanduser()` returns the input unchanged if expansion fails, which is not very pythonic. However, such a problem doesn't necessitate a rewrite of `os.path.expanduser()`. Checking `result[:1]` is enough.

I think there's also a difference in the Windows heuristic in that pathlib checks whether `basename(%HOMEPATH%) == %USERNAME%` whereas `ntpath.expanduser` doesn't. But if that's really an issue it should probably be fixed in `ntpath` IMO, rather than having divergent implementations.
msg390449 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-07 17:11
I think this is worth unifying, but I'm concerned about making expanduser() return the original path on Windows - "~name" is a valid filename/relative path, and so code that does mkdir(expanduser("~nonuser/dir")) could create garbage in the current directory. I'd rather just raise OSError (or I guess RuntimeError, for consistency).

Long term, I'd like to see it switch to calling GetProfilesDirectory on Windows, but that's separate from this change, and doesn't invalidate this one.

Reading through the discussion, it seems like the primary concern about the change is "change for change sake". I think the amount and kind of code that's being removed is a good thing, and it's better represented as an "expand" step in the accessor than a "get" from the path.

So let's get it merged, probably(?) with a stronger error for the unknown users, but happy to be talked out of that. And only for 3.10.
msg390452 - (view) Author: Barney Gale (barneygale) * Date: 2021-04-07 17:21
Thanks for taking a look, Steve.

A couple things maybe worth noting:

Firstly, `os.path.expanduser()` is already documented to return the path unchanged if the home directory can't be resolved:

> If the expansion fails or if the path does not begin with a tilde, the path is returned unchanged.

Secondly, `ntpath.expanduser()` already returns the path unchanged if neither USERPROFILE nor HOMEPATH are in the environment.

An alternative would be to leave `ntpath.expanduser()` method alone, and forgo the slightly-improved error checking in `WindowsPath.expanduser()` in the name of conformity. Or perhaps we could add a `stict` parameter to `expanduser()`?

I can understand why this could be seen as change for change's sake. In fact this code removal greatly aids my work towards addressing bpo-24132.

Thanks again
msg390461 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-07 18:25
> Firstly, `os.path.expanduser()` is already documented to return the path unchanged if the home directory can't be resolved:

Ah, too bad. Doesn't prevent us from changing it, but hopefully it means that everyone using it is already checking the result and not blindly using it.

> I can understand why this could be seen as change for change's sake. In fact this code removal greatly aids my work towards addressing bpo-24132.

In light of this, did you mean to close the issue? Or do you still want to pursue making the change?
msg390477 - (view) Author: Barney Gale (barneygale) * Date: 2021-04-07 21:17
Apologies, I think I started writing a comment before your reply was posted, which undid your changes.
msg390494 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-07 22:50
New changeset 3f3d82b84823eb28abeedf317bbe107bbe7f6492 by Barney Gale in branch 'master':
bpo-39899: os.path.expanduser(): don't guess other Windows users' home directories if the basename of the current user's home directory doesn't match their username. (GH-18841)
https://github.com/python/cpython/commit/3f3d82b84823eb28abeedf317bbe107bbe7f6492
msg390497 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-04-07 23:29
For a "~user" path, the value of userhome should always be used if target_user == current_user. If for some reason the USERPROFILE environment variable isn't defined, the fallback "%HOMEDRIVE%%HOMEPATH%" does not necessarily end in the user's name. Example rewrite:

    if i != 1: #~user
        target_user = path[1:i]
        if isinstance(target_user, bytes):
            target_user = os.fsdecode(target_user)
        current_user = os.environ.get('USERNAME')

        if target_user != current_user:
            # Try to guess user home directory.  By default all user
            # profile directories are located in the same place and are
            # named by corresponding usernames.  If userhome isn't a
            # normal profile directory, this guess is likely wrong,
            # so we bail out.
            if current_user != basename(userhome):
                return path
            userhome = join(dirname(userhome), target_user)
msg390504 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-04-08 07:15
AFAIK you can set arbitrary path as user home directory. So home directories of different users can even not be on the same disk, and the last component of the path can be different from the user name.

os.path.expanduser() has many flaws, and it just "guess" the home directory for other users. It is difficult to fix os.path.expanduser() due to backward compatibility, but we should do better in pathlib from start.
msg390507 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-04-08 08:01
> os.path.expanduser() has many flaws, and it just "guess" the 
> home directory for other users.

I'm fine with not guessing another user's profile directory (or home directory) in some cases, or even always. But the code that got committed bails out even if the target user is the same as the current user, according to the USERNAME environment variable. It shouldn't do that.

---

If we want something better than guessing, it's not very hard to get the real profile directory or home directory for another user on the current system. The profile directory is configured as the "ProfileImagePath" value in a subkey of "HKLM\Software\Microsoft\Windows NT\CurrentVersion\ProfileList". The subkey name is the user SID in string form. The SID can be looked up with LookupAccountNameW(NULL, target_user, &sid, ...) and converted to string form with ConvertSidToStringSidW(&sid, string_sid). If there's no local profile directory, try looking up the user's configured home_dir and/or home_dir_drive (a mapped drive for a remote home directory) via NetUserGetInfo(NULL, target_user, 4, &info).
msg390528 - (view) Author: Barney Gale (barneygale) * Date: 2021-04-08 11:53
Good spot Eryk - I've put in another PR to address it.
msg390669 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-09 21:28
New changeset ba1db571987c65672d9c06789e9852313ed2412a by Barney Gale in branch 'master':
bpo-39899: Don't double-check directory name if we're requesting the current user's home directory in ntpath.expanduser() (GH-25277)
https://github.com/python/cpython/commit/ba1db571987c65672d9c06789e9852313ed2412a
msg401167 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-09-06 17:41
Note this change also fixes https://bugs.python.org/issue41082 . I'm guessing it's too much of an edge case to backport this fix to 3.9, so I've put up a possible fix via docs update on that issue.
msg401168 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-09-06 17:52
To be more precise, this change fixes https://bugs.python.org/issue41082 by raising RuntimeError instead of KeyError and also by documenting it, which means matplotlib can fix it by either using os.path.expanduser or catching RuntimeError, whichever might work better in their case.

I will let them know once we sort this out.
History
Date User Action Args
2021-09-06 17:52:12andrei.avksetmessages: + msg401168
2021-09-06 17:41:03andrei.avksetnosy: + andrei.avk, kj
messages: + msg401167
2021-04-09 22:03:46steve.dowersetstatus: open -> closed
stage: patch review -> resolved
2021-04-09 21:28:22steve.dowersetmessages: + msg390669
2021-04-08 11:53:21barneygalesetmessages: + msg390528
2021-04-08 11:46:14barneygalesetstage: resolved -> patch review
pull_requests: + pull_request24014
2021-04-08 08:01:40eryksunsetmessages: + msg390507
2021-04-08 07:15:22serhiy.storchakasetmessages: + msg390504
2021-04-07 23:29:42eryksunsetstatus: closed -> open
nosy: + eryksun
messages: + msg390497

2021-04-07 22:50:43steve.dowersetstatus: open -> closed
resolution: fixed
2021-04-07 22:50:18steve.dowersetmessages: + msg390494
2021-04-07 21:17:20barneygalesetstatus: closed -> open
resolution: not a bug -> (no value)
messages: + msg390477

versions: + Python 3.10, - Python 3.9
2021-04-07 18:25:29steve.dowersetmessages: + msg390461
2021-04-07 17:21:14barneygalesetstatus: open -> closed
versions: + Python 3.9, - Python 3.10
messages: + msg390452

resolution: not a bug
stage: patch review -> resolved
2021-04-07 17:11:21steve.dowersetstatus: closed -> open

versions: + Python 3.10, - Python 3.9
nosy: + steve.dower

messages: + msg390449
resolution: not a bug -> (no value)
stage: resolved -> patch review
2020-03-08 11:18:25barneygalesetmessages: + msg363652
2020-03-08 10:41:39serhiy.storchakasetmessages: + msg363647
2020-03-08 09:47:00barneygalesetmessages: + msg363644
2020-03-08 08:20:29serhiy.storchakasetstatus: open -> closed
resolution: not a bug
messages: + msg363639

stage: patch review -> resolved
2020-03-08 06:01:45barneygalesetkeywords: + patch
stage: patch review
pull_requests: + pull_request18198
2020-03-08 05:58:22barneygalesetmessages: + msg363637
2020-03-08 05:37:54serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg363636
2020-03-08 05:06:12barneygalecreate