classification
Title: [security][CVE-2019-9740][CVE-2019-9947] HTTP Header Injection (follow-up of CVE-2016-5699)
Type: security Stage: patch review
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: gregory.p.smith, martin.panter, orange, serhiy.storchaka, vstinner, ware, xiang.zhang, xtreak
Priority: normal Keywords: patch

Created on 2017-05-24 15:01 by orange, last changed 2019-04-17 15:40 by vstinner.

Pull Requests
URL Status Linked Edit
PR 12755 open gregory.p.smith, 2019-04-10 09:11
Messages (21)
msg294360 - (view) Author: Orange (orange) Date: 2017-05-24 15:01
Hi, the patch in CVE-2016-5699 can be broke by an addition space.
http://www.cvedetails.com/cve/CVE-2016-5699/
https://hg.python.org/cpython/rev/bf3e1c9b80e9
https://hg.python.org/cpython/rev/1c45047c5102

import urllib, urllib2

urllib.urlopen('http://127.0.0.1\r\n\x20hihi\r\n :11211')
urllib2.urlopen('http://127.0.0.1\r\n\x20hihi\r\n :11211')
msg295026 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-06-02 15:36
Looking at the code and the previous issue #22928, CRLF immediately followed by a tab or space (obs-fold: CRLF 1*( SP / HTAB )) is a valid part of a header value so the regex deliberately ignore them.

So it looks right to me the url given doesn't raise the same exception as the url without spaces, though the given url seems malformed.
msg295067 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-06-03 07:01
You can also inject proper HTTP header fields (or do multiple requests) if you omit the space after the CRLF:

urlopen("http://localhost:8000/ HTTP/1.1\r\nHEADER: INJECTED\r\nIgnore:")

Data sent to the server:
>>> server = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)
>>> server.bind(("localhost", 8000))
>>> server.listen()
>>> [conn, addr] = server.accept()
>>> pprint(conn.recv(300).splitlines(keepends=True))
[b'GET / HTTP/1.1\r\n',
 b'HEADER: INJECTED\r\n',
 b'Ignore: HTTP/1.1\r\n',
 b'Accept-Encoding: identity\r\n',
 b'User-Agent: Python-urllib/3.5\r\n',
 b'Connection: close\r\n',
 b'Host: localhost:8000\r\n',
 b'\r\n']

Issue 14826 is already open about how “urlopen” handles spaces, and there is a patch in Issue 13359 that proposes to also encode newline characters. But if the CRLF or header injection is a security problem, then 2.7 etc could be changed to raise an exception (like Issue 22928), or to do percent encoding.
msg306981 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-11-26 01:04
Actually, the CRLF + space can be injected via percent encoding, so just dealing with literal CRLFs and spaces wouldn’t be enough. You would have to validate the hostname after it is decoded.

urlopen("http://127.0.0.1%0D%0A%20SLAVEOF . . . :6379/")

>>> pprint(conn.recv(300).splitlines(keepends=True))
[b'GET / HTTP/1.1\r\n',
 b'Accept-Encoding: identity\r\n',
 b'Host: 127.0.0.1\r\n',
 b' SLAVEOF . . . :6379\r\n',
 b'Connection: close\r\n',
 b'User-Agent: Python-urllib/2.7\r\n',
 b'\r\n']
msg337970 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2019-03-15 06:15
See also https://bugs.python.org/issue36276 for a similar report. I think it's better to raise an error instead of encoding CRLF characters in URL similar to headers.

I feel either of the issue and more preferably issue36276 closed as a duplicate of this one. Copy of msg337968 with reference to details about similar report in golang : 

For reference an exact report on golang repo : https://github.com/golang/go/issues/30794 . This seemed to have been fixed in latest golang release 1.12 and commit https://github.com/golang/go/commit/829c5df58694b3345cb5ea41206783c8ccf5c3ca . The commit introduces a check for CTL characters and throws an error for URLs something similar to Python does for headers now at bf3e1c9b80e9.

func isCTL(r rune) bool {
	return r < ' ' || 0x7f <= r && r <= 0x9f
}

if strings.IndexFunc(ruri, isCTL) != -1 {
	return errors.New("net/http: can't write control character in Request.URL")
}

So below program used to work before go 1.12 setting a key on Redis but now it throws error : 

package main

import "fmt"
import "net/http"

func main() {
	resp, err := http.Get("http://127.0.0.1:6379?\r\nSET test failure12\r\n:8080/test/?test=a")
	fmt.Println(resp)
	fmt.Println(err)
}


➜  go version
go version go1.12 darwin/amd64
➜  go run urllib_vulnerability.go
<nil>
parse http://127.0.0.1:6379?
SET test failure12
:8080/test/?test=a: net/url: invalid control character in URL

Looking more into the commit there seemed to be a solution towards escaping characters with https://github.com/golang/go/issues/22907 . The fix seemed to have broke Google's internal tests [0] and hence reverted to have the above commit where only CTL characters were checked and raises an error. I think this is a tricky bug upon reading code reviews in the golang repo that has around 2-3 reports with a fix committed to be reverted later for a more conservative fix and the issue was reopened to target go 1.13 .

Thanks a lot for the report @ragdoll.guo

[0] https://go-review.googlesource.com/c/go/+/159157/2#message-39c6be13a192bf760f6318ac641b432a6ab8fdc8
msg339754 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-09 14:28
The CVE-2019-9740 has been assigned to the bpo-36276:

* https://nvd.nist.gov/vuln/detail/CVE-2019-9740
* https://bugzilla.redhat.com/show_bug.cgi?id=1692984

... which has been marked as a duplicate of this issue.
msg339840 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-04-10 09:22
Martin claimed "Actually, the CRLF + space can be injected via percent encoding"

I am unable to reproduce that behavior using urllib.request.urlopen() or urllib.request.URLopener.open() in my master/3.8 tree.
msg339846 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 10:36
Oh, I didn't recall that this issue (this class of security vulnerabilities) has a so old history. I found *A LOT* of similar open issues. Here are my notes. Maybe most open issues should be closed as duplicate of this one to clarify the status of urllib in Python? :-)

Emails:

* 2019: https://mail.python.org/pipermail/python-dev/2019-April/157014.html
* 2017: https://mail.python.org/pipermail/python-dev/2017-July/148699.html

Open issues:

* 2011, bpo-13359: "urllib2 doesn't escape spaces in http requests"
  Not marked as a security issue.
* 2012, bpo-14826: "urlopen URL with unescaped space"
  Fix using quote(self.__original, safe="%/:=&?~#+!$,;'@()*[]|")
  ... and the changed has then be reverted because it broke buildbots.
  Still open.
* 2013, bpo-17322: "urllib.request add_header() currently allows trailing spaces (and other weird stuff)"
  Not marked as a security issue.
* 2014, bpo-22928: "HTTP header injection in urrlib2/urllib/httplib/http.client (CVE-2016-5699)"
  Marked as fixed, but user Orange explained in the first comment of  in
  bpo-30458 that the fix is incomplete.
* 2017, bpo-30458: "[CVE-2019-9740][security] CRLF Injection in httplib" (this issue)
* 2017, bpo-32085: "[Security] A New Era of SSRF - Exploiting URL Parser in Trending Programming Languages!"
* 2019, bpo-35906: "[CVE-2019-9947] Header Injection in urllib" (another CVE!)

Closed issues:

* 2004, bpo-918368: "urllib doesn't correct server returned urls" (urllib)
  FIXED BY: commit 7c2867fcb1ade429a41e030585332ea26e3f60e1
  Fix: fullurl = quote(fullurl, safe="%/:=&?~#+!$,;'@()*[]")
* 2005, bpo-1353433: "Http redirection error in urllib2.py" (urllib2)
  FIXED BY: commit ddb84d7c69addc5d5e2ab3e327260d97b52af3a7
  Fix: newurl = newurl.replace(' ', '%20')
* 2005, bpo-1153027: "http_error_302() crashes with 'HTTP/1.1 400 Bad Request"
  FIXED BY: commit 690ce9b353bc0a86d0886470adbaa50e813de3b8 (Lib/urllib/request.py)
  Fix: fullurl = quote(fullurl, safe="%/:=&?~#+!$,;'@()*[]")
* bpo-29606: "urllib FTP protocol stream injection"
  Duplicate of bpo-30119.
* bpo-30119: "(ftplib) A remote attacker could possibly attack by containing the newline characters"
  FIXED BY: commmit 8c2d4cf092c5f0335e7982392a33927579c4d512
  Fix: reject "\r" and "\n" in FTP.putline() (Lib/ftplib.py)
* bpo-36276: "[CVE-2019-9740] Python urllib CRLF injection vulnerability"
  Closed as duplicate of bpo-30458

Rejected pull requests:

* https://github.com/python/cpython/pull/1216/files
  bpo-29606: Reject "\n" in ftp_open() of Lib/urllib/request.py
* https://github.com/python/cpython/pull/2800/files
  bpo-29606: Reject "\n" in ftp_open() and open_ftp() of Lib/urllib/request.py
* https://github.com/python/cpython/pull/2301/files
  bpo-30713: The splittype(), splitport() and splithost() functions of the
  urllib.parse module now reject URLs which contain a newline character.
* https://github.com/python/cpython/pull/2303/files
  bpo-30713: The splittype(), splitport() and splithost() functions of the
  urllib.parse module now reject URLs which contain a newline character, but
  splittype() accepts newlines after the type.
msg339848 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 10:43
> * 2019, bpo-35906: "[CVE-2019-9947] Header Injection in urllib" (another CVE!)

Gregory P. Smith just marked bpo-35906 as a duplicate of this issue. Copy of his msg339842:

"""
my fix proposed in issue30458 fixes this issue.

i do not think this one deserved its own CVE; at least https://nvd.nist.gov/vuln/detail/CVE-2019-9947's current text also points to the other one.
"""

Until the status of CVE-2019-9947 is clarified, I added CVE-2019-9947 in the title of this issue to help to better track all CVEs :-)

Did someone contact the CVE organization to do something with CVE-2019-9947?
msg339850 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 10:45
> The CVE-2019-9740 has been assigned to the bpo-36276

I don't know how CVE are assigned. Since this issue started with "the patch in CVE-2016-5699 can be broke by an addition space", would it make sense to reuse CVE-2016-5699 rather than using a new CVE?
msg339851 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 10:48
> Closed issues:

I forgot:

* 2017, bpo-30713: "Reject newline character (U+000A) in URLs in urllib.parse"
  Rejected: the ftplib vulnerabilty has been fixed by bpo-30119
  with commmit 8c2d4cf092c5f0335e7982392a33927579c4d512.
msg339852 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2019-04-10 10:59
As @gregory.p.smith noted in GitHub [0] this fixes only protocol level bugs. There are some parsing ambiguities in urllib that are potential security issues still to be fixed.

issue20271 - urllib.urlparse('http://benign.com\[attacker.com]') returns attacker.com as hostname . A slightly related issue https://bugs.python.org/issue20271
issue35748 - urllib.urlparse(r'http://spam\eggs!cheese&aardvark@evil.com') returns evil.com as hostname
issue23505 - Urlparse insufficient validation leads to open redirect
issue33661 - urllib may leak sensitive HTTP headers to a third-party web site (Redirecting from https to http might also pass some headers in plain text. This behavior was changed in requests, golang, Curl that had their own respective CVEs)

As a fun side note this vulnerability was used by one of our own tests as a feature from 2012 to test another security issue (issue14001) [1] :) 

[0] https://github.com/python/cpython/pull/12755#issuecomment-481599611
[1] https://github.com/python/cpython/pull/12755#issuecomment-481618741
msg339853 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2019-04-10 11:28
Gregory, I haven’t tried recent Python code, but I expect the problem with percent decoding is still there. If you did try my example, what results did you see? Be aware that these techniques only work if the OS co-operates and connects to localhost when you give it the longer host string. At the moment I have glibc 2.26 on x86-64 Linux.

In the Python 3 master branch, the percent-encoding should be decoded in “urllib.request.Request._parse”:

def _parse(self):
    ...
    self.host, self.selector = _splithost(rest)
    if self.host:
        self.host = unquote(self.host)

Then in “AbstractHTTPHandler.do_request_” the decoded host string becomes the “Host” header field value, without any encoding:

def do_request_(self, request):
    host = request.host
    ...
    sel_host = host
    ...
    if not request.has_header('Host'):
        request.add_unredirected_header('Host', sel_host)

Perhaps one solution to both my version and Orange’s original version is to encode the “Host” header field value properly. This might also apply to the “http.client” code.
msg339857 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 12:32
bpo-36276 has been marked as a duplicate of this issue.

According to the following message, urllib3 is also vulnerable to HTTP Header Injection:
https://bugs.python.org/issue36276#msg337837

Copy of Alvin Chang's msg337837:

"""
I am also seeing the same issue with urllib3 

import urllib3

pool_manager = urllib3.PoolManager()

host = "localhost:7777?a=1 HTTP/1.1\r\nX-injected: header\r\nTEST: 123"
url = "http://" + host + ":8080/test/?test=a"

try:
    info = pool_manager.request('GET', url).info()
    print(info)
except Exception:
    pass

nc -l localhost 7777
GET /?a=1 HTTP/1.1
X-injected: header
TEST: 123:8080/test/?test=a HTTP/1.1
Host: localhost:7777
Accept-Encoding: identity
"""
msg339858 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 12:35
> According to the following message, urllib3 is also vulnerable to HTTP Header Injection: (...)

And the issue has been reported to urllib3:
https://github.com/urllib3/urllib3/issues/1553

Copy of the first message:

"""
At https://bugs.python.org/issue36276 there's an issue in Python's urllib that an attacker controlling the request parameter can inject headers by injecting CR/LF chars.

A commenter mentions that the same bug is present in urllib3:
https://bugs.python.org/issue36276#msg337837

So reporting it here to make sure it gets attention.
"""
msg339861 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 13:03
Since this issue has a long history and previously attempts to fix it failed, it seems like the Internet is a black or white world, more like a scale of gray... *Maybe* we need to provide a way to allow to pass junk characters in an URL? (disable URL validation)

Idea: add an optional parameter to urllib, httplib, maybe also ftplib, to allow arbitrary "invalid" URLs / FTP commands. It would be a parameter *per request*, not a global option.

I don't propose to have a global configuration option like an environment variable, urllib attribute or something else. A global option would be hard to control and would impact just too much code.

My PEP 433 has been rejected because of the sys.setdefaultcloexec(cloexec: bool) function which allowed to change globally the behavior of Python. The PEP 446 has been accepted with no *global* option to opt-in for the old behavior, but only "local" *per file descriptor*: os.set_inheritable(fd, inheritable).
msg339884 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-04-10 19:31
> *Maybe* we need to provide a way to allow to pass junk characters in an URL? (disable URL validation)

We should not do this in our http protocol stack code.  Anyone who _wants_ that is already intentionally violating the http protocol which defeats the entire purpose of our library and the parameter named "url".

Will this break something in the world other than our own test_xmlrpc test?  Probably.  Do they have a right to complain about it?  Not one we need listen to.  Such code is doing something that was clearly an abuse of the API.  The parameter was named url not raw_data_to_stuff_subversively_into_the_binary_protocol.  Its intent was clear.
msg339894 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 22:05
> Will this break something in the world other than our own test_xmlrpc test?  Probably. Do they have a right to complain about it?  Not one we need listen to.

I understand. But. Can we consider that for old Python versions like Python 2.7 and 3.5?

This change will be applied to all supported Python versions.

I recall that when Python 2.7 started to validate TLS certificate, the change broke some applications. Are these applications badly written? Yes! But well, "it worked well before". Sometimes, when you work in a private network, the security matters less, whereas it might be very expensive to fix a legacy application. At Red Hat, we developed a solution to let customers to opt-out from this fix (to no validate TLS certificates), because it is just too expensive for customers to fix their legacy code but they would like to be able to upgrade RHEL.

One option to not validate URLs is to downgrade Python, but I'm not sure that it's the best compromise :-/
msg340405 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-17 15:32
It seems like a change has been pushed into urllib3 to fix this issue, but that there is an issue with international URLs and that maybe RFC 3986 should be updated.

RFC 3986: "Uniform Resource Identifier (URI): Generic Syntax" (January 2005)
https://www.ietf.org/rfc/rfc3986.txt

"Without #1531 or IRI support in rfc3986 releasing master in it's current state will break backwards compatibility with international URLs."

https://github.com/urllib3/urllib3/issues/1553#issuecomment-474046652

=> where 1531 means https://github.com/urllib3/urllib3/pull/1531

"wave Hi! I've noticed that CVE-2019-11236 has been assigned to the CRLF injection issue described here. It seems that the library has been patched in GitHub, but no new release has been made to pypi. Will a new release containing the fix be made to pypi soon? Based on @theacodes comment it seems like a release was going to be made, but I also see her status has her perhaps unavailable. Is someone else perhaps able to cut a new release into pypi?"

https://github.com/urllib3/urllib3/issues/1553#issuecomment-484113222
msg340407 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-17 15:35
"wave Hi! I've noticed that CVE-2019-11236 has been assigned to the CRLF injection issue described here. It seems that the library has been patched in GitHub, but no new release has been made to pypi. (...)"

This urllib3 change:
https://github.com/urllib3/urllib3/commit/0aa3e24fcd75f1bb59ab159e9f8adb44055b2271

urllib3 now vendors a copy of the rfc3986 library:

https://pypi.org/project/rfc3986/
msg340408 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-17 15:40
> urllib3 now vendors a copy of the rfc3986 library:
> https://pypi.org/project/rfc3986/

There are multiple Python projects to validate URI:

* https://github.com/python-hyper/rfc3986/ -> https://pypi.org/project/rfc3986/
* https://github.com/dgerber/rfc3987 -> https://pypi.org/project/rfc3987/ (the name is confusing: the library implements the RFC 3986, not the RFC 3987)
* https://github.com/tkem/uritools/ -> https://pypi.org/project/uritools/
History
Date User Action Args
2019-04-17 15:40:34vstinnersetmessages: + msg340408
2019-04-17 15:35:49vstinnersetmessages: + msg340407
2019-04-17 15:32:37vstinnersetmessages: + msg340405
2019-04-10 22:05:48vstinnersetmessages: + msg339894
2019-04-10 19:31:06gregory.p.smithsetmessages: + msg339884
2019-04-10 13:03:09vstinnersetmessages: + msg339861
2019-04-10 12:35:01vstinnersetmessages: + msg339858
2019-04-10 12:32:58vstinnersetmessages: + msg339857
2019-04-10 11:28:11martin.pantersetmessages: + msg339853
2019-04-10 10:59:20xtreaksetmessages: + msg339852
2019-04-10 10:50:15vstinnersettitle: [CVE-2019-9740][CVE-2019-9947][security] CRLF Injection in httplib -> [security][CVE-2019-9740][CVE-2019-9947] HTTP Header Injection (follow-up of CVE-2016-5699)
2019-04-10 10:48:21vstinnersetmessages: + msg339851
2019-04-10 10:45:12vstinnersetmessages: + msg339850
2019-04-10 10:43:18vstinnersetmessages: + msg339848
title: [CVE-2019-9740][security] CRLF Injection in httplib -> [CVE-2019-9740][CVE-2019-9947][security] CRLF Injection in httplib
2019-04-10 10:36:30vstinnersetversions: + Python 3.5, Python 3.6, Python 3.7, Python 3.8
2019-04-10 10:36:15vstinnersetmessages: + msg339846
2019-04-10 09:32:36gregory.p.smithlinkissue35906 superseder
2019-04-10 09:23:06gregory.p.smithsetassignee: gregory.p.smith
2019-04-10 09:22:34gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg339840
2019-04-10 09:11:30gregory.p.smithsetkeywords: + patch
stage: patch review
pull_requests: + pull_request12688
2019-04-09 15:42:49waresetnosy: + ware
2019-04-09 14:28:59vstinnersetnosy: + vstinner

messages: + msg339754
title: CRLF Injection in httplib -> [CVE-2019-9740][security] CRLF Injection in httplib
2019-03-17 08:52:59xtreaklinkissue36276 superseder
2019-03-15 06:15:37xtreaksetnosy: + xtreak
messages: + msg337970
2017-11-26 01:04:36martin.pantersetmessages: + msg306981
2017-11-26 01:00:28martin.panterlinkissue32085 dependencies
2017-11-26 00:03:14martin.pantersettype: security
2017-06-03 07:01:33martin.pantersetmessages: + msg295067
2017-06-02 15:36:20xiang.zhangsetnosy: + xiang.zhang, serhiy.storchaka, martin.panter
messages: + msg295026
2017-05-24 15:01:31orangecreate