This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: [Security][Windows] webbrowser: WindowsDefault uses os.startfile() and so can be abused to run arbitrary commands
Type: security Stage: patch review
Components: Library (Lib), Windows Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eryksun, matrixise, mdk, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Priority: normal Keywords: patch

Created on 2019-02-18 11:00 by vstinner, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 11931 closed matrixise, 2019-02-19 10:33
Messages (25)
msg335805 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-18 11:00
The webbrowser module uses WindowsDefault which calls os.startfile() and so can be abused to run arbitrary commands.

WindowsDefault should do log a warning or raise an error if the URL is unsafe. I'm not sure how to build a list of safe URL schemes. At least, we can explicitly exclude "C:\WINDOWS\system32\calc.exe" which doesn't contain "://".

The union of all "uses_*" constants of urllib.parser give me this sorted list of URL schemes:

['', 'file', 'ftp', 'git', 'git+ssh', 'gopher', 'hdl', 'http', 'https', 'imap', 'mailto', 'mms', 'news', 'nfs', 'nntp', 'prospero', 'rsync', 'rtsp', 'rtspu', 'sftp', 'shttp', 'sip', 'sips', 'snews', 'svn', 'svn+ssh', 'tel', 'telnet', 'wais', 'ws', 'wss']

Would it make sense to ensure that urllib.parser can parse an email to check if the URL looks valid?
msg335806 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-18 11:01
Vulnerability reported by Nassim Asrir to the PSRT.
msg335813 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-18 11:12
@victor, the problem is with os.startfile or with WindowsDefault?

just to know if we need to fix os.startfile or WindowsDefault.
msg335827 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-18 12:36
> just to know if we need to fix os.startfile or WindowsDefault.

webbrowser shouldn't call os.startfile with a path to a program on the local hard drive.
msg335941 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-19 12:33
@vstinner, all the tests pass on AppVeyor and Travis,

I check if the resource is local (file://) or not, and if the given path is a file (c:\\windows\\system32\\calc.exe), I check if this one is an executable.
msg335942 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-19 12:43
After a small test, os.access() on a text file is True, and it's wrong in my case.
msg335948 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-19 14:05
@vstinner

Could we imagine to create a whitelist just for the html file?

like that, we could open an html file and reject an executable file.
msg335954 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-19 14:52
@vstinner

I have a whitelist where the HTML files (.html, .html, .dhtml, .xhtml, ...) are allowed. For the rest, I do not execute os.startfile()
msg335957 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-19 15:50
Parsing an URL and deciding if an URL is "safe" or not is hard.

For example, PR 11931 denies "file://" URLs, but I don't see the issue with opening such URL:
file:///home/vstinner/prog/GIT/github.io/output/index.html
(local path to a HTML file)

The problem here is that os.startfile() can be abused to run arbitrary command.

Another option would be to behave as Unix classes: run directly as specific browser like Chrome or Firefox.

Maybe the registry can help? I found interesting keys:

"HKEY_CURRENT_USER\Software\Classes\BSURL\shell\open\command"
"HKEY_CURRENT_USER\Software\Microsoft\Windows\Shell\Associations\UrlAssociations\http\UserChoice\Progid"
"HKEY_CURRENT_USER\Software\Microsoft\Windows\Shell\Associations\UrlAssociations\https\UserChoice\Progid"
"HKEY_CURRENT_USER\Software\Microsoft\Windows\Shell\Associations\UrlAssociations\ftp\UserChoice\Progid"
"HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\FileExts\.htm\UserChoice\Progid"
"HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\FileExts\.html\UserChoice\Progid"
"HKEY_CURRENT_USER\Software\Clients\StartmenuInternet\"
msg335959 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-19 15:52
PR 11931 experimented other tests:

         return os.path.isfile(path) and os.access(path, os.X_OK)

and:

        is_exe = False
        with open(path, 'rb') as fp:
            s = fp.read(2)
            is_exe = s != b'MZ'
         return os.path.isfile(path) and is_exe

I'm not sure that it's safe. Windows support a wide range of programs: .COM, .EXE, .VBS, .BAT, etc. It's hard to get a complete list.
msg335967 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-19 16:23
> After a small test, os.access() on a text file is True, and it's wrong in my case.

os.access(path, os.X_OK) is specific to Unix. It doesn't make sense on Windows.

os.access() is implemented with GetFileAttributesW() on Windows. The mode argument is more or less ignored.
msg335979 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-02-19 17:05
The most I'd be okay with doing here is filtering for "<scheme>://" in the webbrowser module, and not limiting "scheme" at all except that it must be a valid scheme.

Windows allows apps and programs to extend protocol handling in HKEY_CLASSES_ROOT\PROTOCOLS\Handler and os.startfile() should respect this list, even while some browsers may handle more protocols that are not registered here. So there's really no way to limit the scheme sensibly.

And yeah, anyone can take an arbitrary local or remote path and rewrite it as "file://<computer>/<path>". That's a feature :)

Perhaps we should add a warning to the docs that by default, webbrowser will open a URL with the user's associated program, and while this is generally the desirable behavior, if you want to enforce a particular browser then you should .get() it first?
msg336004 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-19 19:27
@vstinner

With the last commit, I don't test if the file is an executable but just if the extension is a .html file.

For example, if you rename calc.exe by calc.html and you try to execute it in the default browser (via os.startfile()), you will get the content of the file in the browser.
msg336039 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-02-20 05:24
> os.access(path, os.X_OK) is specific to Unix. It doesn't make sense 
> on Windows. 

It doesn't make sense with the current implementation of os.access, and not as Stéphane used it. Even if we completely implemented os.access, the problem is that most files grant execute access to the owner, "Users", or "Authenticated Users". Typically we have to override inheritance to prevent granting execute access, or add an entry that denies execute access.

However, it's not that it makes no sense in general. CreateProcess does require execute access on a file. This includes both DACL discretionary access and SACL mandatory access. ShellExecuteEx ultimately calls CreateProcess if it's running a "file:" URL, so execute access certainly matters in this case. For example, I've denied execute access on the following file:

    >>> os.startfile('file:///C:/Temp/test/test.exe')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    PermissionError: [WinError 5] Access is denied: 'file:///C:/Temp/test/test.exe'

On the other hand, if we're talking about data files and scripts, ShellExecute[Ex] doesn't check for execute access because it's generally used to "open" files. It wouldn't be a security barrier, anyway. It's easy enough for a program to call AssocQueryString to get the command-line template for a protocol or file type, manually replace template parameters, and execute the command directly via CreateProcess. 

> os.access() is implemented with GetFileAttributesW() on Windows. The
> mode argument is more or less ignored.

The readonly file attribute denies W_OK access, similar to how the [i]mmutable file attribute works in some Unix systems (e.g. Linux lsattr and chattr +i).
msg336042 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-20 06:45
Hi @eryk, @vstinner and @steve,

I think I could not work on this issue today, but will continue to fix it asap (tomorrow or on this evening).
msg336064 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-20 10:51
>     >>> os.startfile('file:///C:/Temp/test/test.exe')

Oh, startfile() also runs a program for an URL using file:// scheme? If yes, it becomes even more complex to fix this file :-/

How do you decide if an URL start with file:// is safe?
msg336079 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-20 12:34
Windows has the GetBinaryTypeW function, this one is used by pywin32, maybe I could develop a wrapper in os, like os.is_executable(path)

for Unix-like, os.is_executable(path) could use os.access(path, os.X_OK)
for Windows, the function would use GetBinaryTypeW.

my 2 cents.
msg336090 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-20 14:31
> Windows has the GetBinaryTypeW function, this one is used by pywin32, maybe I could develop a wrapper in os, like os.is_executable(path)

I don't think that it would detect .BAT or .VBS scripts as executable, whereas we don't want to execut such scripts with webbrowser.
msg336092 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-20 14:39
Sure we don't want to execute these kinds of scripts but we could mix with GetBinaryTypeW and add a check on the extensions.

for example and simplified protocode

if is_executable(path) or is_script(path):
   raise ...
os.startfile(path)
msg336094 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-20 14:48
> Sure we don't want to execute these kinds of scripts but we could mix with GetBinaryTypeW and add a check on the extensions.

Windows has a convenient feature: if you ask to run "program", Windows tries to run "program.exe", "program.bat", etc.

Example:
---
C:\vstinner>del hello.txt

C:\vstinner>type hello.bat
echo "Hello from hello.bat" > /vstinner/hello.txt

C:\vstinner>\vstinner\python\master\python
Python 3.8.0a0 (heads/master:8f59ee01be, Jan 25 2019, 01:16:59) [MSC v.1915 64 bit (AMD64)] on win32
>>> import os
>>> os.startfile(r"c:\vstinner\hello")
>>> with open(r"c:\vstinner\hello.txt") as fp: print(fp.read())
...
"Hello from hello.bat"
---

os.startfile(r"c:\vstinner\hello") <= "hello" filename has no extension
msg336096 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-20 15:02
Maybe webbrowser must be changed to become *very strict*. For example, raise an error if the URL doesn't start with "http://" or "https://". But add an option to opt-in for "unsafe" URLs with a warning in the doc to explain the risk on Windows?

Another option is to add an optional callback to validate the URL. As the 'verify' parameter of logging.config.listen():
https://docs.python.org/dev/library/logging.config.html#logging.config.listen

"pydoc -b" runs a local HTTP server but it uses regular "http://" URLs, it doesn't use file://.

Maybe only Windows should be modified, Unix is safe, no?
msg336097 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-20 15:09
+1
msg336098 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-02-20 15:14
> Windows has the GetBinaryTypeW function

ShellExecute[Ex] doesn't check for a PE image. It uses the file extension, and a tangled web of registry settings to determine what to execute. If a file should run directly via CreateProcess, you'll find the template command starts with the target file ("%1" or "%L"). For example:

    >>> AssocQueryStringW(ASSOCF_INIT_IGNOREUNKNOWN, ASSOCSTR_COMMAND,
    ...     '.exe', 'open', command, n)
    0
    >>> print(command.value)
    "%1" %*

OTOH, a script requires an interpreter, e.g for .py files:

    >>> AssocQueryStringW(ASSOCF_INIT_IGNOREUNKNOWN, ASSOCSTR_COMMAND,
    ...     '.py', 'open', command, n)
    0
    >>> print(command.value)
    "C:\WINDOWS\py.exe" "%L" %*

Except .bat and .cmd scripts are executed directly via the %ComSpec% interpreter:

    >>> AssocQueryStringW(ASSOCF_INIT_IGNOREUNKNOWN, ASSOCSTR_COMMAND,
    ...     '.bat', 'open', command, n)
    0
    >>> print(command.value)
    "%1" %*

One bit of metadata we can check is the file's "PerceivedType": unknown=0, text, image, audio, video, compressed, document, system, application, game-media, contacts. If a file's type is unknown (0), system (7), or application (8), or if getting the type fails, we probably don't want to run it. For example:

    >>> _ = AssocGetPerceivedType(".txt", ptype, flags, None); print(ptype[0])
    1
    >>> _ = AssocGetPerceivedType(".jpg", ptype, flags, None); print(ptype[0])
    2
    >>> _ = AssocGetPerceivedType(".mp3", ptype, flags, None); print(ptype[0])
    3
    >>> _ = AssocGetPerceivedType(".mp4", ptype, flags, None); print(ptype[0])
    4
    >>> _ = AssocGetPerceivedType(".zip", ptype, flags, None); print(ptype[0])
    5
    >>> _ = AssocGetPerceivedType(".html", ptype, flags, None); print(ptype[0])
    6
    >>> _ = AssocGetPerceivedType(".sys", ptype, flags, None); print(ptype[0])
    7
    >>> _ = AssocGetPerceivedType(".exe", ptype, flags, None); print(ptype[0])
    8
    >>> _ = AssocGetPerceivedType(".com", ptype, flags, None); print(ptype[0])
    8
    >>> _ = AssocGetPerceivedType(".bat", ptype, flags, None); print(ptype[0])
    8
    >>> _ = AssocGetPerceivedType(".cmd", ptype, flags, None); print(ptype[0])
    8

Except for a small number of hard-code definitions, the PerceivedType has to be defined for a filetype, and it's optional. It gets set either in the file-extension key under [HKCU|HKLM]\Software\Classes, or in the SystemFileAssocations subkey, or probably in 10 other locations sprawled across the registry. Python's installer doesn't set the PerceivedType of .py files to the application type (8), but it should. 

Another bit of metadata is the MIME "Content Type". This is also optional information provided in a filetype definition. For example:

    >>> AssocQueryStringW(ASSOCF_INIT_IGNOREUNKNOWN, ASSOCSTR_CONTENTTYPE,
    ...     '.exe', 'open', mtype, n)
    0
    >>> print(mtype.value)
    application/x-msdownload

    >>> AssocQueryStringW(ASSOCF_INIT_IGNOREUNKNOWN, ASSOCSTR_CONTENTTYPE,
    ...     '.html', 'open', mtype, n)
    0
    >>> print(mtype.value)
    text/html
msg336101 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-02-20 15:28
> Maybe webbrowser must be changed to become *very strict*.

This is the only acceptable suggestion I've seen (since my suggestion ;) )

I'd propose making it very strict by *requiring* a browser to be detected. So remove the os.startfile default and always require Chrome/Edge/etc. to be found. If they're not, you get an exception.

Personally, I'd hate this behaviour :) But for my cases I'd just switch to os.startfile unconditionally (as I only use this in my own scripts and not libraries).

One other thing to factor in is that if you use os.startfile to launch a malicious program, it will first be scanned by any anti-malware or antivirus software, and likely also by Windows SmartScreen. So you're not exactly getting arbitrary execution. It also only runs in the context of the current user, so there's not necessarily any escalation here.

All in all, I'd label this a vulnerability in applications that use webbrowser.open(), rather than in webbrowser.open() itself. The function is doing exactly what it is told, and if someone is passing untrusted input, then they'll get the exact untrusted output they expect.
msg338721 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-03-24 06:05
I remove my assignation to this issue, I wanted to work on it and learn the debugging on Windows, but after some weeks, I have no solution because no time for a real debugging session.

Please, feel free to work on this issue and I am really sorry for this delay.
History
Date User Action Args
2022-04-11 14:59:11adminsetgithub: 80202
2021-03-04 18:56:07eryksunsetversions: + Python 3.9, Python 3.10, - Python 2.7, Python 3.7
2019-03-24 06:05:52matrixisesetassignee: matrixise ->
messages: + msg338721
2019-02-20 15:28:31steve.dowersetmessages: + msg336101
2019-02-20 15:14:07eryksunsetmessages: + msg336098
2019-02-20 15:09:57matrixisesetmessages: + msg336097
2019-02-20 15:02:06vstinnersetmessages: + msg336096
2019-02-20 14:48:53vstinnersetmessages: + msg336094
2019-02-20 14:39:26matrixisesetmessages: + msg336092
2019-02-20 14:31:58vstinnersetmessages: + msg336090
2019-02-20 12:34:18matrixisesetmessages: + msg336079
2019-02-20 10:51:39vstinnersetmessages: + msg336064
2019-02-20 06:45:44matrixisesetmessages: + msg336042
2019-02-20 05:24:19eryksunsetnosy: + eryksun
messages: + msg336039
2019-02-19 19:27:43matrixisesetmessages: + msg336004
2019-02-19 17:05:00steve.dowersetmessages: + msg335979
2019-02-19 16:23:12vstinnersetmessages: + msg335967
2019-02-19 15:52:17vstinnersetmessages: + msg335959
2019-02-19 15:50:06vstinnersetnosy: + paul.moore, tim.golden, zach.ware, steve.dower
messages: + msg335957
components: + Windows
2019-02-19 14:52:02matrixisesetmessages: + msg335954
2019-02-19 14:05:07matrixisesetnosy: + mdk
2019-02-19 14:05:00matrixisesetmessages: + msg335948
2019-02-19 12:43:50matrixisesetmessages: + msg335942
2019-02-19 12:33:09matrixisesetmessages: + msg335941
2019-02-19 10:33:24matrixisesetkeywords: + patch
stage: patch review
pull_requests: + pull_request11955
2019-02-18 15:57:08matrixisesetassignee: matrixise
2019-02-18 12:36:15vstinnersetmessages: + msg335827
2019-02-18 11:12:10matrixisesetnosy: + matrixise
messages: + msg335813
2019-02-18 11:01:44vstinnersetmessages: + msg335806
2019-02-18 11:00:53vstinnercreate