Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[security][CVE-2019-9948] Unnecessary URL scheme exists to allow local_file:// reading file in urllib #80088

Closed
push0ebp mannequin opened this issue Feb 6, 2019 · 27 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-security A security issue

Comments

@push0ebp
Copy link
Mannequin

push0ebp mannequin commented Feb 6, 2019

BPO 35907
Nosy @vstinner, @larryhastings, @tiran, @ned-deily, @vadmium, @matrixise, @stratakis, @PetterS, @tirkarthi, @push0ebp, @ware
PRs
  • [2.7] bpo-35907: Avoid file reading as disallowing the unnecessary URL scheme in urllib (GH-11842) #11842
  • bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme #13474
  • [3.7] bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474) #13505
  • [2.7] bpo-35907: Complete test_urllib.test_local_file_open() #13506
  • [3.5] bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474) (GH-13505) #13510
  • [3.6] bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474) (GH-13505) #13513
  • bpo-35907: Clarify the NEWS entry #13523
  • [2.7] bpo-35907: Clarify the NEWS entry #13557
  • [3.7] bpo-35907: Clarify the NEWS entry #13558
  • bpo-35907: Fix typo in the NEWS entry #13559
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-05-18.14:16:27.258>
    created_at = <Date 2019-02-06.08:19:51.239>
    labels = ['type-security', '3.8', '3.7', 'library']
    title = '[security][CVE-2019-9948] Unnecessary URL scheme exists to allow local_file:// reading file  in urllib'
    updated_at = <Date 2020-05-18.21:31:43.010>
    user = 'https://github.com/push0ebp'

    bugs.python.org fields:

    activity = <Date 2020-05-18.21:31:43.010>
    actor = 'Petter S'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-05-18.14:16:27.258>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2019-02-06.08:19:51.239>
    creator = 'push0ebp'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35907
    keywords = ['patch']
    message_count = 27.0
    messages = ['334905', '334923', '334925', '334927', '334928', '334929', '334930', '339664', '342334', '342336', '342337', '342363', '343098', '343233', '343239', '343241', '343424', '343427', '343431', '343432', '343856', '347867', '368011', '368063', '368080', '369228', '369292']
    nosy_count = 11.0
    nosy_names = ['vstinner', 'larry', 'christian.heimes', 'ned.deily', 'martin.panter', 'matrixise', 'cstratak', 'Petter S', 'xtreak', 'push0ebp', 'ware']
    pr_nums = ['11842', '13474', '13505', '13506', '13510', '13513', '13523', '13557', '13558', '13559']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue35907'
    versions = ['Python 2.7', 'Python 3.7', 'Python 3.8']

    @push0ebp
    Copy link
    Mannequin Author

    push0ebp mannequin commented Feb 6, 2019

    The Unnecessary scheme exists in urlopen() urllib

    when people would protect to read file system in HTTP request of urlopen(), they often filter like this against SSRF.

    # Vulnerability PoC
    import urllib
    print urllib.urlopen('local_file:///etc/passwd').read()[:30]
    the result is
    ##
    # User Database
    # 
    # Note t

    but if we use a scheme like this, parsing URL cannot parse scheme with urlparse()
    this is the parsed result.
    ParseResult(scheme='', netloc='', path='local_file:/etc/passwd', params='', query='', fragment='')

    def request(url):
        from urllib import urlopen
        from urlparse import urlparse
    
        result = urlparse(url)
        scheme = result.scheme
        if not scheme:
            return False #raise Exception("Required scheme")
        if scheme == 'file':
            return False #raise Exception("Don't open file")
        res = urlopen(url)
        content = res.read()
        print url, content[:30]
        return True

    assert request('file:///etc/passwd') == False
    assert request(' file:///etc/passwd') == False
    assert request('File:///etc/passwd') == False
    assert request('http://www.google.com') != False

    if they filter only file://, this mitigation can be bypassed against SSRF. 
    with this way.

    assert request('local-file:/etc/passwd') == True
    ParseResult(scheme='local-file', netloc='', path='/etc/passwd', params='', query='', fragment='')
    parseing URL also can be passed.

    # Attack scenario
    this is the unnecessary URL scheme("local_file").
    even if it has filtering, An Attacker can read arbitrary files as bypassing with it.

    # Root Cause

    URLopener::open in urllib.py
    from 203 lin

    name = 'open_' + urltype
    self.type = urltype
    name = name.replace('-', '_') #it can also allows local-file
    if not hasattr(self, name): #passed here hasattr(URLopener, 'open_local_file')
        if proxy:
            return self.open_unknown_proxy(proxy, fullurl, data)
        else:
            return self.open_unknown(fullurl, data)
    try:
        if data is None:
            return getattr(self, name)(url)
        else:
            return getattr(self, name)(url, data) #return URLopener::open_local_file

    it may be just trick because people usually use whitelist (allow only http or https.
    Even if but anyone may use blacklist like filtering file://, they will be affected with triggering SSRF

    @push0ebp push0ebp mannequin added stdlib Python modules in the Lib dir type-security A security issue labels Feb 6, 2019
    @tiran
    Copy link
    Member

    tiran commented Feb 6, 2019

    Thanks for your report. I'm having a hard time understanding your English. If I understand you correctly, your bug report is about the open_local_file() method and the surprising fact that urllib supports the local_file schema.

    I agree, this looks like an implementation artefact. urllib should not expose the local_file schema. In Python 3 refuses local_file:// (tested with 3.4 to 3.7).

    >>> import urllib.request
    >>> urllib.request.urlopen('local_file:///etc/passwd').read()[:30]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
        response = self._open(req, data)
      File "/usr/lib64/python3.6/urllib/request.py", line 549, in _open
        'unknown_open', req)
      File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
        result = func(*args)
      File "/usr/lib64/python3.6/urllib/request.py", line 1388, in unknown_open
        raise URLError('unknown url type: %s' % type)
    urllib.error.URLError: <urlopen error unknown url type: local_file>

    @tiran
    Copy link
    Member

    tiran commented Feb 6, 2019

    Only the Python 2 urllib module is affected. Python 2.7's urllib2 also correctly fails with local_file://

    >>> import urllib2
    >>> urllib2.urlopen('local_file:///etc/passwd').read()[:30]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 154, in urlopen
        return opener.open(url, data, timeout)
      File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 429, in open
        response = self._open(req, data)
      File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 452, in _open
        'unknown_open', req)
      File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 407, in _call_chain
        result = func(*args)
      File "/home/heimes/dev/python/2.7/Lib/urllib2.py", line 1266, in unknown_open
        raise URLError('unknown url type: %s' % type)
    urllib2.URLError: <urlopen error unknown url type: local_file>

    @push0ebp
    Copy link
    Mannequin Author

    push0ebp mannequin commented Feb 6, 2019

    Sorry for my bad English.
    Yes, exactly. Only python 2.7 has been affected. not python3.
    So I chose only Python2.7 version.

    @push0ebp
    Copy link
    Mannequin Author

    push0ebp mannequin commented Feb 6, 2019

    and only urllib, not urllib2.

    @tiran
    Copy link
    Member

    tiran commented Feb 6, 2019

    I'm not a native English speaker either. I wasn't sure if I understood you correctly. Thanks!

    @push0ebp
    Copy link
    Mannequin Author

    push0ebp mannequin commented Feb 6, 2019

    I am not also native English speaker. It's OK. Thank you for reading my report

    @tirkarthi
    Copy link
    Member

    This issue seems to have been assigned CVE-2019-9948 (https://nvd.nist.gov/vuln/detail/CVE-2019-9948) as noted in #11842 (comment)

    @vstinner vstinner changed the title Unnecessary URL scheme exists to allow file:// reading file in urllib [security][CVE-2019-9948] Unnecessary URL scheme exists to allow file:// reading file in urllib May 13, 2019
    @vstinner
    Copy link
    Member

    Christian:

    I agree, this looks like an implementation artefact. urllib should not expose the local_file schema. In Python 3 refuses local_file:// (tested with 3.4 to 3.7).

    I'm not sure that I understand well the issue. urllib accepts various scheme by design: HTTP, HTTPS, FTP, FILE, etc.

    For example, file:// scheme is legit and works as expected. Python 3 example:
    ---

    import urllib.request
    req = urllib.request.Request('file:///etc/passwd')
    print(f"URL scheme: {req.type}")
    fp = urllib.request.urlopen(req)
    print(fp.read()[:30])
    fp.close()

    Output with Python 3:
    ---
    URL scheme: file
    b'root:x:0:0:root:/root:/bin/bas'
    ---

    I get a similar output with this Python 2 example:
    ---

    import urllib
    req = urllib.urlopen('file:///etc/passwd')
    print(req.read()[:30])
    req.close()

    Christian:

    I agree, this looks like an implementation artefact. urllib should not expose the local_file schema.

    I understand that Python 2 handles local_file://url as file://url. Ok. But is this a security issue? If you care of security, you ensure that the url scheme is HTTP or HTTPS, not only forbid FILE, no?

    I'm asking because of:

    Karthikeyan Singaravelan:

    This issue seems to have been assigned CVE-2019-9948 (https://nvd.nist.gov/vuln/detail/CVE-2019-9948) ...

    @tiran
    Copy link
    Member

    tiran commented May 13, 2019

    The issue is not about whether "file://" schema or not.

    It's about the fact that urllib on Python 2 has two schemas that allow local file access. There is the well-known "file://" schema and there is the implementation artifact "local_file://". A careful, security-minded developer knows about the file:// schema and also knows how to block it. But the "local_file://" schema is a surprising side-effect of the implementation.

    @vstinner
    Copy link
    Member

    If you use directly the URLopener class, Python 3 has a similar issue:
    ---

    import urllib.request
    req = urllib.request.URLopener().open('local_file:///etc/passwd')
    print(req.read()[:30])
    req.close()

    @vstinner vstinner added 3.7 (EOL) end of life 3.8 only security fixes labels May 13, 2019
    @tiran tiran changed the title [security][CVE-2019-9948] Unnecessary URL scheme exists to allow file:// reading file in urllib [security][CVE-2019-9948] Unnecessary URL scheme exists to allow local_file:// reading file in urllib May 13, 2019
    @push0ebp
    Copy link
    Mannequin Author

    push0ebp mannequin commented May 13, 2019

    If developers allow only http:// or https:// as whitelist, it has no problem.
    But, If someone blocks only one file://, attacker can bypass it.
    This issue may provides attacker with bypassing method as new scheme.

    @vstinner
    Copy link
    Member

    New changeset b15bde8 by Victor Stinner (SH) in branch '2.7':
    bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-11842)
    b15bde8

    @vstinner
    Copy link
    Member

    New changeset 0c2b6a3 by Victor Stinner in branch 'master':
    bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474)
    0c2b6a3

    @vstinner
    Copy link
    Member

    New changeset 942c31d by Victor Stinner in branch '2.7':
    bpo-35907: Complete test_urllib.test_local_file_open() (GH-13506)
    942c31d

    @vstinner
    Copy link
    Member

    New changeset 34bab21 by Victor Stinner in branch '3.7':
    bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474) (GH-13505)
    34bab21

    @vstinner
    Copy link
    Member

    New changeset deffee5 by Victor Stinner in branch 'master':
    bpo-35907: Clarify the NEWS entry (GH-13523)
    deffee5

    @vstinner
    Copy link
    Member

    New changeset 1c9debd by Victor Stinner in branch 'master':
    bpo-35907: Fix typo in the NEWS entry (GH-13559)
    1c9debd

    @vstinner
    Copy link
    Member

    New changeset d9d1045 by Victor Stinner in branch '2.7':
    bpo-35907: Clarify the NEWS entry (GH-13557)
    d9d1045

    @vstinner
    Copy link
    Member

    New changeset cee4ac8 by Victor Stinner in branch '3.7':
    bpo-35907: Clarify the NEWS entry (GH-13558)
    cee4ac8

    @ned-deily
    Copy link
    Member

    New changeset 4f06dae by Ned Deily (Victor Stinner) in branch '3.6':
    bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13513)
    4f06dae

    @larryhastings
    Copy link
    Contributor

    New changeset 4fe82a8 by larryhastings (Victor Stinner) in branch '3.5':
    bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474) (GH-13505) (bpo-13510)
    4fe82a8

    @PetterS
    Copy link
    Mannequin

    PetterS mannequin commented May 4, 2020

    We should whitelist the protocols. The current solution with getattr is really fragile.

    For example, this crashes with a TypeError: URLopener().open("unknown_proxy://test")

    @vstinner
    Copy link
    Member

    vstinner commented May 4, 2020

    We should whitelist the protocols. The current solution with getattr is really fragile. For example, this crashes with a TypeError: URLopener().open("unknown_proxy://test")

    Would you mind to elaborate why do you consider that the solution is incomplete? Your issue doesn't show that Python is vulnerable. TypeError *is* the expected behavior.

    Would you prefer another error message? If yes, please open a seperated issue.

    @PetterS
    Copy link
    Mannequin

    PetterS mannequin commented May 4, 2020

    The solution is incomplete because it fixes just this single security issue, not the inherent fragility of this file.

    If, in the future someone happens to add another method starting with open to this class, we are at risk of having the same problem again.

    As for the error message, it is of course a minor issue, but I don't think it is expected that "unknown_proxy://" and "something_else://" raise different exceptions, right?

    @vstinner
    Copy link
    Member

    The solution is incomplete because it fixes just this single security issue, not the inherent fragility of this file.

    If you want to propose a change to make the file "less fragile", please open a *new* separated issue.

    The issue is about an exact vulnerability, the "local_file://" scheme, which has been fixed. I close again the issue.

    @PetterS
    Copy link
    Mannequin

    PetterS mannequin commented May 18, 2020

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants