Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os.path.expanduser should not use HOME on windows #80445

Closed
asottile mannequin opened this issue Mar 11, 2019 · 27 comments
Closed

os.path.expanduser should not use HOME on windows #80445

asottile mannequin opened this issue Mar 11, 2019 · 27 comments
Assignees
Labels
3.8 only security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@asottile
Copy link
Mannequin

asottile mannequin commented Mar 11, 2019

BPO 36264
Nosy @pfmoore, @jaraco, @blueyed, @tjguk, @zware, @eryksun, @zooba, @lazka, @asottile, @miss-islington
PRs
  • bpo-36264: Don't honor POSIX HOME in os.path.expanduser on windows #12282
  • bpo-36264: Updates documentation for change to expanduser on Windows #12294
  • bpo-38883: Don't honor POSIX HOME in pathlib.Path.home/expanduser on Windows #17961
  • [3.8] bpo-38883: Don't use POSIX $HOME in pathlib.Path.home/expanduser on Windows (GH-17961) #18229
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/zooba'
    closed_at = <Date 2019-03-12.22:16:16.883>
    created_at = <Date 2019-03-11.16:09:57.334>
    labels = ['3.8', 'type-bug', 'library', 'OS-windows']
    title = 'os.path.expanduser should not use HOME on windows'
    updated_at = <Date 2020-05-16.17:18:51.381>
    user = 'https://github.com/asottile'

    bugs.python.org fields:

    activity = <Date 2020-05-16.17:18:51.381>
    actor = 'jaraco'
    assignee = 'steve.dower'
    closed = True
    closed_date = <Date 2019-03-12.22:16:16.883>
    closer = 'steve.dower'
    components = ['Library (Lib)', 'Windows']
    creation = <Date 2019-03-11.16:09:57.334>
    creator = 'Anthony Sottile'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36264
    keywords = ['patch']
    message_count = 27.0
    messages = ['337683', '337750', '337752', '337754', '337755', '337756', '337763', '337781', '337784', '337813', '338162', '357182', '357192', '357194', '357197', '360846', '360853', '368931', '368932', '368933', '368935', '368963', '369025', '369034', '369045', '369049', '369060']
    nosy_count = 10.0
    nosy_names = ['paul.moore', 'jaraco', 'blueyed', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'lazka', 'Anthony Sottile', 'miss-islington']
    pr_nums = ['12282', '12294', '17961', '18229']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36264'
    versions = ['Python 3.8']

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented Mar 11, 2019

    The current code for os.path.expanduser in ntpath uses HOME in preference to USERPROFILE / HOMEDRIVE\\HOMEPATH

    I can't find any documentation of HOME being relevant on windows (only on posix). For example, wikipedia only mentions USERPROFILE / HOMEDRIVE\\HOMEPATH options: https://en.wikipedia.org/wiki/Home_directory#Default_home_directory_per_operating_system

    Seems to be (one of) the direct causes of a downstream bug for me: pre-commit/pre-commit#960

    (msys sets HOME to a bogus value if HOMEDRIVE is set, then python uses it)

    My proposal is to remove these lines and change the elif to an if:

    cpython/Lib/ntpath.py

    Lines 302 to 304 in d9bd8ec

    if 'HOME' in os.environ:
    userhome = os.environ['HOME']
    elif 'USERPROFILE' in os.environ:

    @asottile asottile mannequin added 3.8 only security fixes stdlib Python modules in the Lib dir OS-windows type-bug An unexpected behavior, bug, or error labels Mar 11, 2019
    @zooba
    Copy link
    Member

    zooba commented Mar 12, 2019

    I like the patch, but I'm not sure all the tests are properly preserving the real value of USERPROFILE.

    Modifying this value could have a real impact on the rest of the process, so we should be very careful to undo it regardless of test result. (Modifying HOME is not a as big a deal since, as you point out, it's not "real" ;) )

    Also, it's probably better to get the current value and check against that, rather than setting it to a known value. One day we might do the right thing and switch expanduser() from the environment variables to the correct API, which would break the test.

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented Mar 12, 2019

    Modifying this value could have a real impact on the rest of the process, so we should be very careful to undo it regardless of test result. (Modifying HOME is not a as big a deal since, as you point out, it's not "real" ;) )

    Agreed, though I'd like to write it off as an existing problem (if it is a problem, it looks to me like the tests are doing proper environment cleanup). If they are broken, they'd already be leaking environment variables on posix since these tests run on both

    Auditing the tests I modified:

    • test_netrc: uses `with support.EnvironmentVarGuard() as environ:` (OK!)
    • test_ntpath: uses `with support.EnvironmentVarGuard() as env:` (OK!)
    • test_dist: `MetadataTestCase` extends `support.EnvironGuard` (OK!)
    • test_config: `BasePyPIRCCommandTestCase` extends `support.EnvironGuard` (OK!)

    so actually, looks like this is already good to go on that front!

    Definitely agree on using the true value in the long term -- I'd like to defer that until that's actually happening though as it'll be a much more involved patch!

    @zooba
    Copy link
    Member

    zooba commented Mar 12, 2019

    Great, thanks for checking! I'll merge.

    @zooba zooba self-assigned this Mar 12, 2019
    @zooba
    Copy link
    Member

    zooba commented Mar 12, 2019

    New changeset 25ec4a4 by Steve Dower (Anthony Sottile) in branch 'master':
    bpo-36264: Don't honor POSIX HOME in os.path.expanduser on Windows (GH-12282)
    25ec4a4

    @zware
    Copy link
    Member

    zware commented Mar 12, 2019

    If we're going ahead with this, it's worth a mention in whatsnew as this is going to break things for some users.

    @zooba
    Copy link
    Member

    zooba commented Mar 12, 2019

    If we're going ahead with this, it's worth a mention in whatsnew

    Good call. I'll do it

    @eryksun
    Copy link
    Contributor

    eryksun commented Mar 12, 2019

    os.path.expanduser in ntpath uses HOME in preference to
    USERPROFILE / HOMEDRIVE\\HOMEPATH

    Guido intentionally added support for HOME in ntpath.expanduser way back in Python 1.5 (circa 1997), and now we're removing it over 20 years later. I expect this to break some deployments, but it's not a serious problem.

    My feeling here is "meh". Practically all use of expanduser() is dubious in Windows, varying from going against convention to being completely wrong. We're long due for a module in the standard library that abstracts access to platform-specific configuration and special directories, but attempts to adopt one keep losing momentum. I guess it's too prone to bike shedding and arguments.

    @zooba
    Copy link
    Member

    zooba commented Mar 12, 2019

    Guido intentionally added support for HOME in ntpath.expanduser way back in Python 1.5 (circa 1997), and now we're removing it over 20 years later.

    Wouldn't be the first thing to be removed :)

    @zooba
    Copy link
    Member

    zooba commented Mar 12, 2019

    New changeset 8ef864d by Steve Dower in branch 'master':
    bpo-36264: Updates documentation for change to expanduser on Windows (GH-12294)
    8ef864d

    @zooba zooba closed this as completed Mar 12, 2019
    @blueyed
    Copy link
    Mannequin

    blueyed mannequin commented Mar 18, 2019

    Just as a sidenote: the other day I was checking Python's source to see how to override a user's home (in tests, when os.path.expanduser` is used in code), and found it easy to just set $HOME in the environment in the end.

    I guess this change will cause quite some regressions in this regard - even though $HOME might not be used in practice on Windows.

    @lazka
    Copy link
    Mannequin

    lazka mannequin commented Nov 21, 2019

    Was pathlib forgotten here? Pathlib.home() is documented to return the same as expanduser("~") but it still prefers HOME instead of USERPROFILE.

    Note that this change has some effect on cygwin/mingw environments which all set HOME and now potentially lead to programs no longer being able to find their config files. Not that I disagree with this change, just that it seems a bit rushed to me.

    @zooba
    Copy link
    Member

    zooba commented Nov 21, 2019

    Was pathlib forgotten here? Pathlib.home() is documented to return the same as expanduser("~") but it still prefers HOME instead of USERPROFILE.

    Yes, it was forgotten (why doesn't it just use expanduser?). We should file a new bug for that.

    Note that this change has some effect on cygwin/mingw environments which all set HOME and now potentially lead to programs no longer being able to find their config files.

    Firstly these are not supported environments, so it's not "rushed" for us to not preemptively consider them (though we'll happily merge most PRs that fix them without impacting supported environments).

    And I thought the idea was that they'd use posixpath as os.path rather than ntpath? Cygwin in particular, which provides the full environment. MinGW is a bit lighter to be closer to normal Windows behaviour, which would suggest that using the Windows variables is preferable.

    @lazka
    Copy link
    Mannequin

    lazka mannequin commented Nov 21, 2019

    Was pathlib forgotten here? Pathlib.home() is documented to return
    the same as expanduser("~") but it still prefers HOME instead of
    USERPROFILE.

    Yes, it was forgotten (why doesn't it just use expanduser?). We
    should file a new bug for that.

    I'll file one.

    > Note that this change has some effect on cygwin/mingw environments
    which all set HOME and now potentially lead to programs no longer
    being able to find their config files.

    Firstly these are not supported environments, so it's not "rushed"
    for us to not preemptively consider them (though we'll happily merge
    most PRs that fix them without impacting supported environments).

    Yeah, you're right, sorry, my comment wasn't appropriate.

    And I thought the idea was that they'd use posixpath as os.path
    rather than ntpath? Cygwin in particular, which provides the full
    environment. MinGW is a bit lighter to be closer to normal Windows
    behaviour, which would suggest that using the Windows variables is
    preferable.

    I meant executing Python scripts with official Python in bash on Windows which sets HOME. But I just checked with "git for Windows" and it sets HOME to USERPROFILE anyway, so that might only affect cygwin/msys2 which have their own user dir.

    @lazka
    Copy link
    Mannequin

    lazka mannequin commented Nov 21, 2019

    I've filed bpo-38883

    @zooba
    Copy link
    Member

    zooba commented Jan 28, 2020

    New changeset c45a2aa by Steve Dower (Christoph Reiter) in branch 'master':
    bpo-38883: Don't use POSIX $HOME in pathlib.Path.home/expanduser on Windows (GH-17961)
    c45a2aa

    @miss-islington
    Copy link
    Contributor

    New changeset 595b516 by Miss Islington (bot) in branch '3.8':
    bpo-38883: Don't use POSIX $HOME in pathlib.Path.home/expanduser on Windows (GH-17961)
    595b516

    @jaraco
    Copy link
    Member

    jaraco commented May 15, 2020

    Today I ran into an issue due to this change. I have a test that sets os.environ['HOME'] in order to ensure that os.path.expanduser(~) returns that value (https://github.com/pypa/setuptools/blob/f6f25adfc81df76e186bf6c3738a1baa0f92be05/setuptools/tests/test_packageindex.py#L288).

    I can't find any documentation of HOME being relevant on windows (only on posix).

    First, I'll nitpick that Windows is posix-compliant. What you probably mean here is "Unix".

    But the important thing here is that one of the beautiful things about Python is how it bridges the gap between Unix and Windows, in many ways providing a uniform interface across various platforms. By supporting HOME, Python provides a single mechanism that a caller can rely on to indicate a home directory.

    With this change, a user is now required to provide separate guidance for invocation on Windows.

    In addition to the Setuptools test above that failed, I know I've relied on this behavior before in uses where I've written documentation advising users to set "HOME" to override a location derived from ~ (perhaps for .pypirc, but probably for others as well). Now guidance like that changes from:

    Set HOME to the target directory.

    to:

    If on Windows and Python 3.8 or later, set USERPROFILE to the target directory. Otherwise, set HOME to the target directory.

    It also means that code like the tests in Setuptools need to implement this branching logic as well.

    Can we restore _some_ mechanism by which a caller can supply the home directory in a unified way?

    @jaraco
    Copy link
    Member

    jaraco commented May 15, 2020

    I should also point out that this was a documented feature of Python that was removed without any deprecation period.

    @jaraco
    Copy link
    Member

    jaraco commented May 15, 2020

    I addressed the setuptools test suite issue with this commit. What was a one-line elegant solution is now multiple lines requiring extra imports.

    @lazka
    Copy link
    Mannequin

    lazka mannequin commented May 15, 2020

    Config on Windows should go into APPDATA not USERPROFILE/HOME, on macOS it should go to "$HOME/Library/Application Support" and on Linux $XDG_CONFIG_HOME or $HOME/.config. So using using HOME on all platforms for config is not what those platforms recommend, though I understand why it's still done this way by many tools.

    What about supporting a MYFANCYTOOL_CONFIG_DIR env var that the user can override? That's what I do for example.

    @eryksun
    Copy link
    Contributor

    eryksun commented May 15, 2020

    set "HOME" to override a location derived from ~

    Using HOME in Windows is not reliable because it is not a system environment variable. That makes it fair game for administrators, users, and applications to use however they like.

    Platform differences can be papered over with functions. For example, add set_user_home(dir) and get_user_home(name=''). set_user_home sets a private global variable named _user_home, which defaults to None. get_user_home returns _user_home if it's not None and name is an empty string. Otherwise it gets the home directory per platform rules.

    If on Windows and Python 3.8 or later, set USERPROFILE to the
    target directory.

    This is unfortunate. Modifying USERPROFILE is highly unusual. USERPROFILE is the location of the user's "NTUSER.DAT" registry hive and local application data ("AppData\Local"), including "UsrClass.dat" (the "Software\Classes" registry hive). It's also the default location for a user's known shell folders and home directory. Modifying USERPROFILE shouldn't cause problems with any of this, but I'm not completely at ease with it.

    If a user profile defines a custom home directory (e.g. net user <username> /homedir:<dir>), it will be reflected in "%HOMEDRIVE%%HOMEPATH%", and not in USERPROFILE. In a more perfect world, in which no one ever depended on the default location of shell folders, I'd recommend flipping the order around in ntpath.expanduser to prefer "%HOMEDRIVE%%HOMEPATH%".

    That said, a major snag with HOMEDRIVE and HOMEPATH is that they're sometimes wrong or missing. CreateEnvironmentBlock() will enumerate but won't populate the user's "Volatile Environment" registry key, which is where HOMEDRIVE and HOMEPATH are normally set. This key gets populated during startup of a desktop session, and, since it's volatile, it gets deleted when the profile is unloaded (i.e. the user logs off). Consequently the Secondary Logon service (i.e. CreateProcessWithLogonW), which runas.exe uses, will only set the correct HOMEDRIVE and HOMEPATH values if the user is already logged on as a desktop session user. Otherwise it uses a default value of "%SystemRoot%\System32" as the home directory, which is spectacularly wrong. Similarly, a scheduled task that uses an S4U batch logon won't even have HOMEDRIVE and HOMEPATH defined if the task user isn't already logged on as a desktop session user. USERPROFILE will still be correct in these cases.

    Windows is posix-compliant. What you probably mean here is "Unix".

    The Windows subsystem has never been POSIX compliant. At most, Microsoft's C runtime library provides rudimentary POSIX API support (e.g. open, read).

    On the other hand, NT's original POSIX subsystem (psxss.exe) was POSIX.1 (1990) compliant. In 1999, Microsoft purchased the more capable OpenNT Unix system from Softway Systems and rebranded it as Interix. In Server 2003 R2, Interix was integrated in NT as the Subsystem for UNIX-based Applications (SUA). It was removed in Server 2012 R2. Microsoft abandoned the old conception of an NT subsystem in favor of virtualization concepts such as pico processes and pico providers (WSL 1) or hosting a kernel in a light-weight VM (WSL 2).

    @zooba
    Copy link
    Member

    zooba commented May 16, 2020

    Really, we shouldn't be using any environment variables on Windows here, because they open up too many security risks. There are API calls that are canonical, but the environment vars are compatibility helpers.

    Breakage due to HOME being overridden is serious because it won't show up in any other cases - Python will be the first to suffer the consequences, which means we are facing a targeted exploit. Not really much choice but to fix it (though there was a choice whether to release a security advisory or not... ;-) )

    The documentation was definitely updated, and it was in NEWS, but you're right there was no DeprecationWarning, not that we'd have been able to show it to most impacted library developers anyway.

    Perhaps the best approach for the sake of POSIX compatibility is to set HOME on startup to the correct value? It won't normally be set, so anyone using it is likely broken on Windows, but if we make it valid then everyone can just rely on it?

    @pfmoore
    Copy link
    Member

    pfmoore commented May 16, 2020

    Perhaps the best approach for the sake of POSIX compatibility is to set HOME on startup to the correct value?

    If Python starts setting HOME, that has the potential to affect programs called in a subprocess, possibly breaking them (making them not find the user's config, for example). Even if we only set HOME when it's not already set, we can get this issue. (Vim, for example, will break if you set HOME to somewhere other than where the user has their config stored).

    Certainly, it's arguable that such programs are not respecting Windows conventions correctly, but in practical terms that's not really that relevant. The user will just see it as "Python can't call my program correctly".

    This is a very real issue, that I used to hit a lot in the past, when native Windows support was a lot less common in open source code than it is now. And it was nearly always made worse by the programs that tried to be "too clever" about hiding the differences between Windows and Unix.

    So I'm a strong -1 on Python doing anything with HOME here.

    @jaraco
    Copy link
    Member

    jaraco commented May 16, 2020

    Platform differences can be papered over with functions.

    I’m not suggesting that from within Python there should be another way to detect a home directory. The expanduser functionality as written is just fine for that, though I agree with Steve that using an API on Windows would be even better.

    The issue is that from outside Python there’s now no platform-agnostic way to override the home directory, either for tests or to suppress user-local configuration. In addition to the test case, there is a user-facing use-case that's made more complicated:

    • A user needs to bypass the logged-in user's state for a Python command. For example, there was a buggy release (of something) that stored some broken state in the users’ home dir, and the workaround is to run some command with an alternate home dir. Communicating that technique to a multi-platform audience is now much more difficult.

    It was previously difficult enough because “set an environment variable” is a shell-dependent operation, so already requires the user to have some expertise on how to set environment variables in their environment.

    This change requires the guidance to be more complicated or for the expertise to be greater.

    Breakage due to HOME being overridden is serious because it won't show up in any other cases - Python will be the first to suffer the consequences

    Evidence is to the contrary.

    It seems other tools like 'git' honor the HOME directory:

    PS C:\Users\jaraco> $env:HOME = "C:\Users\jaraco\temp"
    PS C:\Users\jaraco> git config --global --list
    fatal: unable to read config file 'C:/Users/jaraco/temp/.gitconfig': No such file or directory

    Interestingly, I notice that Powershell does set the $HOME environment variable on Windows, reinforcing the concept that $HOME is a platform agnostic way to communicate a home directory.

    It does appear as if Ruby honors HOME and sets it if unset on Windows. I'm not a fan of setting the variable, but there is precedent.

    It seems to me Python 3.8 is the outlier and that honoring $HOME, while somewhat awkward on Windows, was a valuable unifying behavior, and its removal without any plans for a replacement is causing harm. Just searching through some of the projects I have checked out, I find a few.

    Here's another real-world example relying on Python honoring HOME to override home.

    Until recently, PDB used to look exclusively at $HOME and the backport still does.

    Devpi client relies on setting home for tests.

    we are facing a targeted exploit.

    What is the exploit? I don't think the downsides of honoring HOME on Windows were captured above.

    From what I could tell, there was one (fairly obscure) case where HOME was set unexpectedly. That hardly seems like a justification to reverse a long-standing documented feature that aligns with the expectation of other tools in the ecosystem.

    @eryksun
    Copy link
    Contributor

    eryksun commented May 16, 2020

    I’m not suggesting that from within Python there should be another way
    to detect a home directory. The expanduser functionality as written is
    just fine for that, though I agree with Steve that using an API on
    Windows would be even better.

    I thought it might be useful for testing purposes if os.path (i.e. ntpath and posixpath) had a way to set the home directory that it uses in way that wouldn't affect other libraries and child processes.

    It seems other tools like 'git' honor the HOME directory:

    Many cross-platform projects follow Unix conventions, which is simpler for development and documentation. These projects are often targeted at developers, who usually have a strong understanding of Unix conventions and system configuration. But Python is a general-purpose scripting language used for everything from system administration to major applications. It needs to follow platform conventions, but it should also abstract away platform differences as much as possible.

    Interestingly, I notice that Powershell does set the $HOME environment

    Not for me in PowerShell 5 and 7:

    PS C:\> test-path 'env:userprofile'
    True
    PS C:\> test-path 'env:home'
    False
    

    @jaraco
    Copy link
    Member

    jaraco commented May 16, 2020

    I thought it might be useful for testing purposes if os.path (i.e. ntpath and posixpath) had a way to set the home directory that it uses in way that wouldn't affect other libraries and child processes.

    I was thinking about this, and I'd argue that in the general case, you _want_ to be able to set the home directory in a way that affects child processes. If you care enough to try to isolate the behavior from the user-local state, you probably want other libraries and child processes to honor that isolation.

    This argument extends to there being a (preferably one) way override a home directory that's honored by tools and processes of all flavors (e.g. overriding USERPROFILE has no effect on git commands in a subprocess).

    I agree it may prove useful not to affect the global and subprocess state if possible, but I'd argue in that case, you could probably just patch the library under test directly.

    These processes patching $HOME really are attempting to suppress the user's local state (or simulate a specific state).

    Not for me in PowerShell 5 and 7:

    Weird. You're right, I get the same thing for test-path. And indeed it's not set in the Python process.

    PS C:\WINDOWS\system32> test-path env:home
    False
    PS C:\WINDOWS\system32> python -c "import os; print(os.environ.get('HOME'))"
    None

    But strangely, $HOME seems to have some effect.

    PS C:\WINDOWS\system32> echo $HOME C:\Users\jaraco
    PS C:\WINDOWS\system32> cd $HOME
    PS C:\Users\jaraco>

    Do you know what that's about?

    I found a couple Powershell users that seem to think setting $HOME is something you would want to do.

    And this Powershell documentation makes reference to $HOME.

    I see. Powershell defines $HOME as an Automatic Variable internally but doesn't expose it as an environment variable.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants