Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pathlib.Path.expanduser() does not call os.path.expanduser() #84080

Closed
barneygale mannequin opened this issue Mar 8, 2020 · 19 comments
Closed

pathlib.Path.expanduser() does not call os.path.expanduser() #84080

barneygale mannequin opened this issue Mar 8, 2020 · 19 comments
Labels
3.10 only security fixes stdlib Python modules in the Lib dir

Comments

@barneygale
Copy link
Mannequin

barneygale mannequin commented Mar 8, 2020

BPO 39899
Nosy @serhiy-storchaka, @eryksun, @zooba, @barneygale, @Fidget-Spinner, @akulakov
PRs
  • bpo-39899: Make pathlib use os.path.expanduser() to expand home directories #18841
  • bpo-39899: ntpath.expanduser(): don't check the basename of the directory matches the username if we're requesting the current user's home directory #25277
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-04-09.22:03:46.934>
    created_at = <Date 2020-03-08.05:06:12.759>
    labels = ['library', '3.10']
    title = '`pathlib.Path.expanduser()` does not call `os.path.expanduser()`'
    updated_at = <Date 2021-09-06.17:52:12.404>
    user = 'https://github.com/barneygale'

    bugs.python.org fields:

    activity = <Date 2021-09-06.17:52:12.404>
    actor = 'andrei.avk'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-04-09.22:03:46.934>
    closer = 'steve.dower'
    components = ['Library (Lib)']
    creation = <Date 2020-03-08.05:06:12.759>
    creator = 'barneygale'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 39899
    keywords = ['patch']
    message_count = 19.0
    messages = ['363635', '363636', '363637', '363639', '363644', '363647', '363652', '390449', '390452', '390461', '390477', '390494', '390497', '390504', '390507', '390528', '390669', '401167', '401168']
    nosy_count = 6.0
    nosy_names = ['serhiy.storchaka', 'eryksun', 'steve.dower', 'barneygale', 'kj', 'andrei.avk']
    pr_nums = ['18841', '25277']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue39899'
    versions = ['Python 3.10']

    @barneygale
    Copy link
    Mannequin Author

    barneygale mannequin commented Mar 8, 2020

    pathlib.Path.expanduser() does not call os.path.expanduser(), but instead re-implements it. The implementations look pretty similar and I can't see a good reason for the duplication. The only difference is that pathlib.Path.expanduser() raises RuntimeError when a home directory cannot be resolved, whereas os.path.expanduser() returns the path unchanged.

    @barneygale barneygale mannequin added 3.9 only security fixes stdlib Python modules in the Lib dir labels Mar 8, 2020
    @serhiy-storchaka
    Copy link
    Member

    There are two reasons:

    1. os.path.expanduser() returns the path unchanged when a home directory cannot be resolved, pathlib.Path.expanduser() raises an error. The latter behavior looks more robust, but we can't change os.path.expanduser().

    2. os.path.expanduser() needs to split the path on components while pathlib.Path.expanduser() already has ready components. In some cases it may be more efficient.

    @barneygale
    Copy link
    Mannequin Author

    barneygale mannequin commented Mar 8, 2020

    We can check whether os.path.expanduser() returned a path beginning with "~" and raise a RuntimeError if so, right? On point #2, I'm not sure this optimization alone justifies the duplication. PR incoming...

    @serhiy-storchaka
    Copy link
    Member

    I see no reason to change the current code.

    @barneygale
    Copy link
    Mannequin Author

    barneygale mannequin commented Mar 8, 2020

    I see no reason for the duplication, and I can point to one concrete bug affecting your re-implementation of expanduser that doesn't affect the original, i.e. that a KeyError is raised on Windows when "USERNAME" is not present in os.environ, whereas all similar cases raise RuntimeError. These sorts of issues sneak in when you duplicate code - better to stick with the battle-hardened version rather than an inherently risky rewrite.

    @serhiy-storchaka
    Copy link
    Member

    It was discussed in bpo-19776.

    Having separate implementation we can avoid design flaws of os.path.expanduser().

    @barneygale
    Copy link
    Mannequin Author

    barneygale mannequin commented Mar 8, 2020

    The only design flaw mentioned in that thread is that os.path.expanduser() returns the input unchanged if expansion fails, which is not very pythonic. However, such a problem doesn't necessitate a rewrite of os.path.expanduser(). Checking result[:1] is enough.

    I think there's also a difference in the Windows heuristic in that pathlib checks whether basename(%HOMEPATH%) == %USERNAME% whereas ntpath.expanduser doesn't. But if that's really an issue it should probably be fixed in ntpath IMO, rather than having divergent implementations.

    @zooba
    Copy link
    Member

    zooba commented Apr 7, 2021

    I think this is worth unifying, but I'm concerned about making expanduser() return the original path on Windows - "~name" is a valid filename/relative path, and so code that does mkdir(expanduser("~nonuser/dir")) could create garbage in the current directory. I'd rather just raise OSError (or I guess RuntimeError, for consistency).

    Long term, I'd like to see it switch to calling GetProfilesDirectory on Windows, but that's separate from this change, and doesn't invalidate this one.

    Reading through the discussion, it seems like the primary concern about the change is "change for change sake". I think the amount and kind of code that's being removed is a good thing, and it's better represented as an "expand" step in the accessor than a "get" from the path.

    So let's get it merged, probably(?) with a stronger error for the unknown users, but happy to be talked out of that. And only for 3.10.

    @zooba zooba added 3.10 only security fixes and removed 3.9 only security fixes labels Apr 7, 2021
    @zooba zooba reopened this Apr 7, 2021
    @zooba zooba added 3.10 only security fixes and removed invalid 3.9 only security fixes labels Apr 7, 2021
    @zooba zooba reopened this Apr 7, 2021
    @zooba zooba removed the invalid label Apr 7, 2021
    @barneygale
    Copy link
    Mannequin Author

    barneygale mannequin commented Apr 7, 2021

    Thanks for taking a look, Steve.

    A couple things maybe worth noting:

    Firstly, os.path.expanduser() is already documented to return the path unchanged if the home directory can't be resolved:

    If the expansion fails or if the path does not begin with a tilde, the path is returned unchanged.

    Secondly, ntpath.expanduser() already returns the path unchanged if neither USERPROFILE nor HOMEPATH are in the environment.

    An alternative would be to leave ntpath.expanduser() method alone, and forgo the slightly-improved error checking in WindowsPath.expanduser() in the name of conformity. Or perhaps we could add a stict parameter to expanduser()?

    I can understand why this could be seen as change for change's sake. In fact this code removal greatly aids my work towards addressing bpo-24132.

    Thanks again

    @barneygale barneygale mannequin added 3.9 only security fixes and removed 3.10 only security fixes labels Apr 7, 2021
    @barneygale barneygale mannequin closed this as completed Apr 7, 2021
    @barneygale barneygale mannequin added invalid 3.9 only security fixes labels Apr 7, 2021
    @barneygale barneygale mannequin removed the 3.10 only security fixes label Apr 7, 2021
    @barneygale barneygale mannequin closed this as completed Apr 7, 2021
    @barneygale barneygale mannequin added the invalid label Apr 7, 2021
    @zooba
    Copy link
    Member

    zooba commented Apr 7, 2021

    Firstly, os.path.expanduser() is already documented to return the path unchanged if the home directory can't be resolved:

    Ah, too bad. Doesn't prevent us from changing it, but hopefully it means that everyone using it is already checking the result and not blindly using it.

    I can understand why this could be seen as change for change's sake. In fact this code removal greatly aids my work towards addressing bpo-24132.

    In light of this, did you mean to close the issue? Or do you still want to pursue making the change?

    @barneygale
    Copy link
    Mannequin Author

    barneygale mannequin commented Apr 7, 2021

    Apologies, I think I started writing a comment before your reply was posted, which undid your changes.

    @barneygale barneygale mannequin added 3.10 only security fixes and removed 3.9 only security fixes labels Apr 7, 2021
    @barneygale barneygale mannequin reopened this Apr 7, 2021
    @barneygale barneygale mannequin added 3.10 only security fixes and removed invalid 3.9 only security fixes labels Apr 7, 2021
    @barneygale barneygale mannequin reopened this Apr 7, 2021
    @barneygale barneygale mannequin removed the invalid label Apr 7, 2021
    @zooba
    Copy link
    Member

    zooba commented Apr 7, 2021

    New changeset 3f3d82b by Barney Gale in branch 'master':
    bpo-39899: os.path.expanduser(): don't guess other Windows users' home directories if the basename of the current user's home directory doesn't match their username. (GH-18841)
    3f3d82b

    @zooba zooba closed this as completed Apr 7, 2021
    @zooba zooba closed this as completed Apr 7, 2021
    @eryksun
    Copy link
    Contributor

    eryksun commented Apr 7, 2021

    For a "~user" path, the value of userhome should always be used if target_user == current_user. If for some reason the USERPROFILE environment variable isn't defined, the fallback "%HOMEDRIVE%%HOMEPATH%" does not necessarily end in the user's name. Example rewrite:

        if i != 1: #~user
            target_user = path[1:i]
            if isinstance(target_user, bytes):
                target_user = os.fsdecode(target_user)
            current_user = os.environ.get('USERNAME')
    
            if target_user != current_user:
                # Try to guess user home directory.  By default all user
                # profile directories are located in the same place and are
                # named by corresponding usernames.  If userhome isn't a
                # normal profile directory, this guess is likely wrong,
                # so we bail out.
                if current_user != basename(userhome):
                    return path
                userhome = join(dirname(userhome), target_user)

    @eryksun eryksun reopened this Apr 7, 2021
    @eryksun eryksun reopened this Apr 7, 2021
    @serhiy-storchaka
    Copy link
    Member

    AFAIK you can set arbitrary path as user home directory. So home directories of different users can even not be on the same disk, and the last component of the path can be different from the user name.

    os.path.expanduser() has many flaws, and it just "guess" the home directory for other users. It is difficult to fix os.path.expanduser() due to backward compatibility, but we should do better in pathlib from start.

    @eryksun
    Copy link
    Contributor

    eryksun commented Apr 8, 2021

    os.path.expanduser() has many flaws, and it just "guess" the
    home directory for other users.

    I'm fine with not guessing another user's profile directory (or home directory) in some cases, or even always. But the code that got committed bails out even if the target user is the same as the current user, according to the USERNAME environment variable. It shouldn't do that.

    ---

    If we want something better than guessing, it's not very hard to get the real profile directory or home directory for another user on the current system. The profile directory is configured as the "ProfileImagePath" value in a subkey of "HKLM\Software\Microsoft\Windows NT\CurrentVersion\ProfileList". The subkey name is the user SID in string form. The SID can be looked up with LookupAccountNameW(NULL, target_user, &sid, ...) and converted to string form with ConvertSidToStringSidW(&sid, string_sid). If there's no local profile directory, try looking up the user's configured home_dir and/or home_dir_drive (a mapped drive for a remote home directory) via NetUserGetInfo(NULL, target_user, 4, &info).

    @barneygale
    Copy link
    Mannequin Author

    barneygale mannequin commented Apr 8, 2021

    Good spot Eryk - I've put in another PR to address it.

    @zooba
    Copy link
    Member

    zooba commented Apr 9, 2021

    New changeset ba1db57 by Barney Gale in branch 'master':
    bpo-39899: Don't double-check directory name if we're requesting the current user's home directory in ntpath.expanduser() (GH-25277)
    ba1db57

    @zooba zooba closed this as completed Apr 9, 2021
    @zooba zooba closed this as completed Apr 9, 2021
    @akulakov
    Copy link
    Contributor

    akulakov commented Sep 6, 2021

    Note this change also fixes https://bugs.python.org/issue41082 . I'm guessing it's too much of an edge case to backport this fix to 3.9, so I've put up a possible fix via docs update on that issue.

    @akulakov
    Copy link
    Contributor

    akulakov commented Sep 6, 2021

    To be more precise, this change fixes https://bugs.python.org/issue41082 by raising RuntimeError instead of KeyError and also by documenting it, which means matplotlib can fix it by either using os.path.expanduser or catching RuntimeError, whichever might work better in their case.

    I will let them know once we sort this out.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants