classification
Title: CVE-2019-18348: CRLF injection via the host part of the url passed to urlopen()
Type: security Stage: needs patch
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Anselmo Melo, b1tninja, cstratak, gregory.p.smith, kim, mcepl, rschiron, vstinner, xtreak
Priority: high Keywords:

Created on 2019-10-24 07:51 by rschiron, last changed 2019-12-10 16:24 by mcepl.

Messages (3)
msg355294 - Author: Riccardo Schirone (rschiron) Date: 2019-10-24 07:51
Copy-pasted from https://bugs.python.org/issue30458#msg347282

================
The commit b7378d77289c911ca6a0c0afaf513879002df7d5 is incomplete: it doesn't seem to check for control characters in the "host" part of the URL, only in the "path" part of the URL. Example:
---
try:
    from urllib import request as urllib_request
except ImportError:
    import urllib2 as urllib_request
import socket
def bug(*args):
    raise Exception(args)
# urlopen() must not call create_connection()
socket.create_connection = bug
urllib_request.urlopen('http://127.0.0.1\r\n\x20hihi\r\n :11211')
---

The URL comes from the first message of this issue:
https://bugs.python.org/issue30458#msg294360

Development branches 2.7 and master produce a similar output:
---
Traceback (most recent call last):
 ...
Exception: (('127.0.0.1\r\n hihi\r\n ', 11211), ..., None)
---

So urllib2/urllib.request actually attempts a real network connection (DNS query), whereas it should reject control characters in the "host" part of the URL.
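
A minimal sketch of the kind of check described above, assuming the host string has already been extracted from the URL. The helper name `validate_host` and the exact character set are illustrative assumptions, not the actual urllib or http.client code:

```python
import re

# Hypothetical helper (not an actual urllib or http.client function).
# Matches ASCII control characters: the C0 range (0x00-0x1F) and DEL (0x7F).
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def validate_host(host):
    """Return host unchanged, or raise ValueError if it contains a control character."""
    match = _CONTROL_CHARS.search(host)
    if match:
        raise ValueError(
            "host contains control character %r: %r" % (match.group(), host)
        )
    return host


# The host from the reproducer above is rejected before any connection attempt.
try:
    validate_host("127.0.0.1\r\n hihi\r\n ")
except ValueError as exc:
    print("rejected:", exc)
```

Running such a check before the call to create_connection() would make the reproducer fail early instead of reaching the resolver.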

***

A second problem comes into play. Some C libraries, like glibc, truncate the hostname at the first newline character, so HTTP header injection is still possible in this case:
https://bugzilla.redhat.com/show_bug.cgi?id=1673465
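
To illustrate why the truncation matters, here is a deliberately naive model. Both functions are hypothetical simplifications: `naive_request_head` is not how http.client builds requests, and `resolver_view` only approximates the truncation behaviour described above (it cuts at the first CR or LF):

```python
import re


def naive_request_head(host, port, path="/"):
    # Hypothetical, deliberately naive request builder: the host is copied
    # into the Host header without any validation.
    return "GET %s HTTP/1.1\r\nHost: %s:%d\r\n\r\n" % (path, host, port)


def resolver_view(host):
    # Rough model (an assumption) of a resolver that ignores everything
    # from the first CR or LF onwards.
    return re.split(r"[\r\n]", host, maxsplit=1)[0]


host = "127.0.0.1\r\n hihi\r\n "

# The resolver only sees a valid address, so the connection succeeds...
print(repr(resolver_view(host)))

# ...while the request sent over that connection carries the extra lines.
for line in naive_request_head(host, 11211).split("\r\n"):
    print(repr(line))
```

In this model the name lookup succeeds for 127.0.0.1 while the attacker-controlled lines still end up in the request head.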

***

According to RFC 3986, the "host" grammar doesn't allow any control characters. It looks like:

   host          = IP-literal / IPv4address / reg-name

   ALPHA (letters)
   DIGIT (decimal digits)
   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
   pct-encoded   = "%" HEXDIG HEXDIG
   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="
   reg-name      = *( unreserved / pct-encoded / sub-delims )

   IP-literal    = "[" ( IPv6address / IPvFuture  ) "]"
   IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
   IPv6address   =                            6( h16 ":" ) ls32
                 /                       "::" 5( h16 ":" ) ls32
                 / [               h16 ] "::" 4( h16 ":" ) ls32
                 / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                 / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                 / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                 / [ *4( h16 ":" ) h16 ] "::"              ls32
                 / [ *5( h16 ":" ) h16 ] "::"              h16
                 / [ *6( h16 ":" ) h16 ] "::"
   h16           = 1*4HEXDIG
   ls32          = ( h16 ":" h16 ) / IPv4address
   IPv4address   = dec-octet "." dec-octet "." dec-octet "." dec-octet
================
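
As a sketch, the reg-name rule quoted above can be transcribed into a regular expression. The names `REG_NAME` and `is_reg_name` are illustrative assumptions; IP-literal and IPv4address are not covered separately here (a dotted IPv4 address also happens to match reg-name):

```python
import re

# Direct transcription of the RFC 3986 rules quoted above.
UNRESERVED = r"A-Za-z0-9\-._~"      # ALPHA / DIGIT / "-" / "." / "_" / "~"
SUB_DELIMS = r"!$&'()*+,;="         # "!" / "$" / "&" / "'" / "(" / ")" / ...
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"    # "%" HEXDIG HEXDIG

# reg-name = *( unreserved / pct-encoded / sub-delims )
REG_NAME = re.compile(
    "(?:[" + UNRESERVED + SUB_DELIMS + "]|" + PCT_ENCODED + ")*"
)


def is_reg_name(host):
    """Return True if host matches the reg-name production in full."""
    return REG_NAME.fullmatch(host) is not None


print(is_reg_name("example.com"))
print(is_reg_name("127.0.0.1\r\n hihi\r\n "))
```

Because CR, LF and space appear in none of the three alternatives, the host from the reproducer does not match.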


CVE-2019-18348 was assigned to this flaw, which is similar to CVE-2019-9947 and CVE-2019-9740 but concerns the *host* part of a URL.
msg357073 - Author: Justin Capella (b1tninja) Date: 2019-11-20 13:52
I can't see the specifics of that "restricted" Red Hat bug, but this is an interesting bug, and I wanted to ask whether the domain in such cases should perhaps be IDN / punycode-encoded; for example, ://xn--n28h.ws/ is ://💩.la
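
For reference, a small sketch of what IDN encoding looks like with Python's built-in "idna" codec. The hostname `bücher.example` is an illustrative assumption, and whether such an encoding step would be an appropriate way to handle control characters is not established in this issue:

```python
# Hypothetical hostname, chosen only to illustrate the encoding.
unicode_host = "bücher.example"

# Each non-ASCII label is converted to its ASCII "xn--" form.
ascii_host = unicode_host.encode("idna")
print(ascii_host)

# Decoding restores the original Unicode form.
print(ascii_host.decode("idna"))
```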
msg357442 - Author: Riccardo Schirone (rschiron) Date: 2019-11-25 15:38
The glibc issue mentioned in the first comment is CVE-2016-10739.
History
Date                 User             Action  Args
2019-12-10 16:24:42  mcepl            set     nosy: + mcepl
2019-12-09 03:08:06  gregory.p.smith  set     priority: normal -> high
2019-11-25 15:38:02  rschiron         set     messages: + msg357442
2019-11-20 13:52:37  b1tninja         set     nosy: + b1tninja; messages: + msg357073
2019-11-20 12:04:45  kim              set     nosy: + kim
2019-11-19 14:21:31  vstinner         set     components: + Library (Lib); versions: + Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9
2019-10-30 22:11:08  Anselmo Melo     set     nosy: + Anselmo Melo
2019-10-24 16:55:24  gregory.p.smith  set     stage: needs patch
2019-10-24 16:55:12  gregory.p.smith  set     nosy: + gregory.p.smith
2019-10-24 13:27:28  cstratak         set     nosy: + cstratak
2019-10-24 10:47:17  vstinner         set     title: CVE-2019-18348 CRLF injection via the host part of the url passed to urlopen() -> CVE-2019-18348: CRLF injection via the host part of the url passed to urlopen()
2019-10-24 07:55:28  xtreak           set     nosy: + vstinner, xtreak
2019-10-24 07:51:18  rschiron         create