classification
Title: Unnecessary URL scheme exists to allow file:// reading file in urllib
Type: security Stage: patch review
Components: Library (Lib) Versions: Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, martin.panter, matrixise, push0ebp
Priority: normal Keywords: patch

Created on 2019-02-06 08:19 by push0ebp, last changed 2019-02-13 17:11 by push0ebp.

Pull Requests
URL Status Linked Edit
PR 11842 open push0ebp, 2019-02-13 17:11
Messages (7)
msg334905 - (view) Author: Sihoon Lee (push0ebp) * Date: 2019-02-06 08:19
An unnecessary URL scheme exists in urllib's urlopen().

When people want to prevent urlopen() from reading the file system while handling HTTP requests, they often filter on the URL scheme to guard against SSRF, as in the request() function further down.

# Vulnerability PoC
import urllib
print urllib.urlopen('local_file:///etc/passwd').read()[:30]
The result is:
##
# User Database
# 
# Note t


However, urlparse() cannot extract the scheme from a URL like this, because '_' is not a valid scheme character.
This is the parsed result:
ParseResult(scheme='', netloc='', path='local_file:/etc/passwd', params='', query='', fragment='')


def request(url):
    from urllib import urlopen
    from urlparse import urlparse

    result = urlparse(url)
    scheme = result.scheme
    if not scheme:
        return False #raise Exception("Required scheme")
    if scheme == 'file':
        return False #raise Exception("Don't open file")
    res = urlopen(url)
    content = res.read()
    print url, content[:30]
    return True

assert request('file:///etc/passwd') == False
assert request(' file:///etc/passwd') == False
assert request('File:///etc/passwd') == False
assert request('http://www.google.com') != False

If they filter only file://, this SSRF mitigation can be bypassed in the following way:

assert request('local-file:/etc/passwd') == True
ParseResult(scheme='local-file', netloc='', path='/etc/passwd', params='', query='', fragment='') 
The URL-parsing check is passed as well, because 'local-file' is parsed as a valid scheme.
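
The difference between the two spellings can be reproduced without opening any file. The following is a minimal sketch in Python 3 syntax (in Python 2 the import would be `from urlparse import urlparse`); the helper name `blocklist_allows` is hypothetical and only mirrors the scheme checks in request() above.

```python
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse


def blocklist_allows(url):
    """Mirror the scheme checks of request() above without opening the URL."""
    scheme = urlparse(url).scheme
    if not scheme:
        return False  # no scheme could be parsed
    if scheme == 'file':
        return False  # the only scheme the blacklist knows about
    return True


for candidate in ('file:///etc/passwd',
                  'local_file:/etc/passwd',
                  'local-file:/etc/passwd'):
    # Show the parsed scheme next to the filter decision.
    print(candidate, repr(urlparse(candidate).scheme), blocklist_allows(candidate))
```

Only the hyphenated form gets through the filter, which is why the bypass uses 'local-file' rather than 'local_file'.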


# Attack scenario 
The unnecessary URL scheme is "local_file".
Even if an application filters URLs, an attacker can bypass the filter with this scheme and read arbitrary files.

# Root Cause

URLopener::open in urllib.py,
starting at line 203:

name = 'open_' + urltype
self.type = urltype
name = name.replace('-', '_')  # this also turns 'local-file' into 'local_file'
if not hasattr(self, name):  # not taken: hasattr(URLopener, 'open_local_file') is True
    if proxy:
        return self.open_unknown_proxy(proxy, fullurl, data)
    else:
        return self.open_unknown(fullurl, data)
try:
    if data is None:
        return getattr(self, name)(url)
    else:
        return getattr(self, name)(url, data)  # dispatches to URLopener::open_local_file
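
The effect of this name-based dispatch can be illustrated with a toy class. This is only a sketch in Python 3 syntax: `ToyOpener` and its methods are hypothetical stand-ins, not urllib code, and the URL splitting is simplified to a single `partition(':')`.

```python
class ToyOpener:
    """Hypothetical stand-in that mimics 'open_' + scheme dispatch."""

    def open(self, fullurl):
        # Split "scheme:rest" and build the handler name from the scheme.
        urltype, _, rest = fullurl.partition(':')
        name = 'open_' + urltype.lower().replace('-', '_')
        handler = getattr(self, name, None)
        if handler is None:
            return ('unknown', fullurl)
        return handler(rest)

    def open_file(self, url):
        return ('file', url)

    def open_local_file(self, url):
        # Meant as an internal helper, yet reachable as a "scheme" by name.
        return ('local_file', url)


opener = ToyOpener()
print(opener.open('local-file:/etc/passwd'))
print(opener.open('local_file:///etc/passwd'))
```

Any method whose name starts with `open_` becomes reachable as a scheme under this pattern, which is how an internal helper ends up exposed.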

This may be only a trick, because people usually use a whitelist (allowing only http or https).
However, anyone who uses a blacklist, such as filtering file://, is affected and exposed to SSRF.
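
For comparison, a whitelist check on the scheme could look like the sketch below (Python 3 syntax). The names `ALLOWED_SCHEMES` and `is_allowed` are hypothetical, and this covers only the scheme check discussed here, not other SSRF vectors such as redirects or internal hosts.

```python
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse

# Assumption: only plain web URLs are meant to be fetched.
ALLOWED_SCHEMES = frozenset(['http', 'https'])


def is_allowed(url):
    """Return True only if the URL's scheme is explicitly whitelisted."""
    try:
        scheme = urlparse(url).scheme  # urlparse lowercases the scheme
    except ValueError:
        return False  # malformed URLs are rejected outright
    return scheme in ALLOWED_SCHEMES


for candidate in ('http://www.google.com',
                  'file:///etc/passwd',
                  'local-file:/etc/passwd'):
    print(candidate, is_allowed(candidate))
```

Because unknown schemes are rejected by default, neither 'local-file' nor 'local_file' needs to be known to the filter in advance.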
msg334923 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2019-02-06 10:53
Thanks for your report. I'm having a hard time understanding your English. If I understand you correctly, your bug report is about the open_local_file() method and the surprising fact that urllib supports the local_file schema.

I agree, this looks like an implementation artefact. urllib should not expose the local_file scheme. Python 3 refuses local_file:// (tested with 3.4 to 3.7).

>>> import urllib.request
>>> urllib.request.urlopen('local_file:///etc/passwd').read()[:30]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/usr/lib64/python3.6/urllib/request.py", line 549, in _open
    'unknown_open', req)
  File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.6/urllib/request.py", line 1388, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: local_file>
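
The same check can be scripted for a given Python 3 interpreter. This is a minimal sketch; the helper name `scheme_is_rejected` is hypothetical, and it assumes the error message keeps the "unknown url type" wording shown in the traceback above.

```python
import urllib.error
import urllib.request


def scheme_is_rejected(url):
    """Return True if urlopen() refuses the URL as an unknown URL type."""
    try:
        urllib.request.urlopen(url).close()
    except urllib.error.URLError as exc:
        # The reason text is matched against the message seen in the traceback.
        return 'unknown url type' in str(exc.reason)
    return False  # the URL was opened, so the scheme is supported


print(scheme_is_rejected('local_file:///etc/passwd'))
print(scheme_is_rejected('local-file:/etc/passwd'))
```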
msg334925 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2019-02-06 10:55
Only the Python 2 urllib module is affected. Python 2.7's urllib2 also correctly fails with local_file://

>>> import urllib2
>>> urllib2.urlopen('local_file:///etc/passwd').read()[:30]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 452, in _open
    'unknown_open', req)
  File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 1266, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib2.URLError: <urlopen error unknown url type: local_file>
msg334927 - (view) Author: Sihoon Lee (push0ebp) * Date: 2019-02-06 11:28
Sorry for my bad English.
Yes, exactly. Only Python 2.7 is affected, not Python 3.
That is why I selected only the Python 2.7 version.
msg334928 - (view) Author: Sihoon Lee (push0ebp) * Date: 2019-02-06 11:29
and only urllib, not urllib2.
msg334929 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2019-02-06 11:33
I'm not a native English speaker either. I wasn't sure if I understood you correctly. Thanks!
msg334930 - (view) Author: Sihoon Lee (push0ebp) * Date: 2019-02-06 11:42
I am not a native English speaker either. It's OK. Thank you for reading my report.
History
Date User Action Args
2019-02-13 17:11:04 push0ebp set keywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request11872
2019-02-06 11:42:22 push0ebp set messages: + msg334930
2019-02-06 11:33:37 christian.heimes set messages: + msg334929
2019-02-06 11:29:51 push0ebp set messages: + msg334928
2019-02-06 11:28:42 push0ebp set messages: + msg334927
2019-02-06 10:55:51 christian.heimes set messages: + msg334925
2019-02-06 10:53:52 christian.heimes set messages: + msg334923
stage: needs patch
2019-02-06 08:57:57 matrixise set nosy: + christian.heimes, martin.panter, matrixise
2019-02-06 08:19:51 push0ebp create