classification
Title: [security] CVE-2021-29921: ipaddress Should not reject IPv4 addresses with leading zeroes as ambiguously octal
Type: security Stage: resolved
Components: Documentation, Library (Lib) Versions: Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Joel Croteau, Julian, christian.heimes, docs@python, eric.smith, gc2, lukasz.langa, mgorny, miss-islington, ncoghlan, ned.deily, pablogsal, pmoody, serhiy.storchaka, steve.dower, vstinner
Priority: deferred blocker Keywords: 3.8regression, 3.9regression, patch

Created on 2019-03-20 23:40 by Joel Croteau, last changed 2021-05-25 20:01 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 12577 merged python-dev, 2019-03-27 02:05
PR 25099 merged christian.heimes, 2021-03-30 14:51
PR 25815 merged miss-islington, 2021-05-02 12:00
Messages (29)
msg338514 - (view) Author: Joel Croteau (Joel Croteau) * Date: 2019-03-20 23:40
I understand to a certain extent the logic in not allowing IPv4 octets that might ambiguously be octal, but in practice, it just seems like it creates additional parsing hassle needlessly. I have never in many years of working on many networked systems seen anyone use dotted octal format, it is actually specifically forbidden by RFC 3986 (https://tools.ietf.org/html/rfc3986#section-7.4), and it means that the ipaddress module throws exceptions on many perfectly valid IP addresses just because they have leading zeroes. Since the module doesn't support dotted octal or dotted hex anyway, this check seems a little pointless. If nothing else, there should be a way to disable this check by specifying that your IPs are in fact dotted decimal, otherwise it seems like it's just making you have to do extra parsing work or just write your own implementation.
msg338694 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-03-23 19:35
I agree that this is not a useful check.
msg339204 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-03-30 14:53
New changeset e653d4d8e820a7a004ad399530af0135b45db27a by Nick Coghlan (Joel Croteau) in branch 'master':
bpo-36384: Remove check for leading zeroes in IPv4 addresses (GH-12577)
https://github.com/python/cpython/commit/e653d4d8e820a7a004ad399530af0135b45db27a
msg339205 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-03-30 14:57
I've merged the change for Python 3.8 (thanks Joel!).

I'm not sure whether to classify it as an enhancement or as an interoperability bug fix, though, so I've put the status to pending and added Ned to the nosy list to get his thoughts as the Python 3.7 RM.
msg339206 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-03-30 15:44
ipaddress is behaving as documented:

"The following constitutes a valid IPv4 address:

A string in decimal-dot notation, consisting of four decimal integers in the inclusive range 0–255, separated by dots (e.g. 192.168.0.1). Each integer represents an octet (byte) in the address. Leading zeroes are tolerated only for values less than 8 (as there is no ambiguity between the decimal and octal interpretations of such strings). [...]"

https://docs.python.org/3/library/ipaddress.html

I can sort of understand imposing that restriction in a Python 2 world where leading zeros implied octal and Python 3 outright rejects such forms of integers to avoid the ambiguity.  That said, there's no particular reason why the components of an IPv4 string acceptable to ipaddress *have* to follow the same rules so I'm +0 on making the change at all.  It's a bit of a stretch to consider it a bug when it appears to be behaving as documented but I would expect such a change to fix more problems than causing them so I'm OK if you want to backport it.

But, in any case, the documentation for 3.8 and/or 3.7 needs to be updated.
msg339210 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-30 16:15
See also the article "Ping and FTP Resolve IP Address with Leading Zero as Octal" (https://web.archive.org/web/20061206211851/http://support.microsoft.com/kb/115388).

This is still true in Windows 10.

So it is safer to reject IPv4 addresses with leading zeros that can be ambiguously interpreted. Otherwise this can create a security hole.
msg339211 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-03-30 16:16
I think it should be 3.8 only, and the docs should be updated. Apologies for not catching that earlier: I searched via Google, which was a mistake.
msg339572 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-04-07 11:49
The recommended handling in the article that Serhiy mentions is to strip the leading zeroes, which the ipaddress module will still do - it's only being made more tolerant on input. That means it will become usable as a prefilter step (pass string with potentially leading zeroes to ipaddress, get string with no leading zeroes out).

So that means the one part we missed is the docs update (together with a versionchanged note in the module docs themselves)
msg389826 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-03-30 14:23
Serhiy was right, this is a security issue.

The patch should not have landed in 3.8. At a bare minimum the patch should have been postponed until documentation was updated. Since 3.8 the ipaddresss does not behave as documented. A similar security issue in NPM was published two days ago, CVE-2021-28918.

I proposed to not only revert the change, but also tighten the check for leading zeros so it behaves like glibc's inet_pton(). It refuses any IPv4 string with a leading zero.

>>> socket.inet_pton(socket.AF_INET, "01.1.1.1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: illegal IP address string passed to inet_pton
>>> socket.inet_pton(socket.AF_INET, "1.1.1.01")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: illegal IP address string passed to inet_pton
msg389847 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-30 19:35
> The patch should not have landed in 3.8. At a bare minimum the patch should have been postponed until documentation was updated. Since 3.8 the ipaddresss does not behave as documented. A similar security issue in NPM was published two days ago, CVE-2021-28918.

Link: https://sick.codes/sick-2021-011
msg390046 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-04-02 09:35
Deferred the blocker to the next regular release due to lack of activity in time for the current expedited releases.
msg390128 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-03 16:41
(Copied from my comment on the PR, following the one where I said this was ready to go.)

Withdrawing the readiness - @ambv and I would prefer to see this behind a flag (probably "strict" parsing), on by default for 3.10, and maybe on by default for 3.9/earlier.

The main reasoning being that this isn't our vulnerability, but an inconsistency with other vulnerable libraries. The current fix is the best it can be, but it doesn't prevent the vulnerability, it just causes Python to break first. So it ought to be relatively easy to retain the flexible (though admittedly non-sensical) behaviour for those who currently rely on it.
msg390331 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-06 13:41
> Withdrawing the readiness - @ambv and I would prefer to see this behind a flag (probably "strict" parsing), on by default for 3.10, and maybe on by default for 3.9/earlier.

Last time we added a new parameter in a stable branch, it didn't go well:
https://bugs.python.org/issue42967#msg387638
msg390340 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-06 14:29
The important quote from the linked issue seems to be:

> Our new separator= parameter does not allow one to achieve the previous behavior if mixing and matching & And ; was intended to be allowed, as it is a single separator rather than a set of separators.

So arguably, we added _the wrong_ parameter in that case, because it only allowed choosing between behaviours not including the "bad" behaviour. We should've added one that was "give me back the previous behaviour".

In this case, having it off by default goes further to prevent breakage, and I wouldn't be opposed to a process level opt-in (e.g. a module-level flag), so that _applications_ have a way to force their dependencies to use the safer behaviour without needing to patch them. Similarly, a process level opt-out also seems good enough if we were to have it on by default.
msg390353 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-06 17:19
> In this case, having it off by default goes further to prevent breakage

PyYAML was unsafe by default: it allowed to execute arbitary Python code by default. It took years to change the default to "safe". I don't think that adding a parameter for opt-in for security is a good approach. An application can use ipaddress internally without being aware of using it, if it's done by a third party module. It's hard to prevent security vulnerabilities if people have to "opt-in" for security.

I prefer to break code and force people to manually get back the old behavior. It's better to make 90% safe by default but make 10% of people unhappy.

It's uncommon to pass IPv4 addresses with leading zeros.

If you want to tolerate leading zeros, you don't have to modify the ipaddress for that, you can pre-process your inputs: it works on any Python version with or without the fix.

>>> def reformat_ip(address): return '.'.join(part.lstrip('0') if part != '0' else part for part in address.split('.'))
... 
>>> reformat_ip('0127.0.0.1')
'127.0.0.1'

Or with an explicit loop for readability:

def reformat_ip(address):
    parts = []
    for part in address.split('.'):
        if part != "0":
            part = part.lstrip('0')
        parts.append(part)
    return '.'.join(parts)
msg390361 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-06 19:03
> I don't think that adding a parameter for opt-in for security is a good approach.

I meant to have it set by default on 3.10, when we do not have to worry 
about breaking users.

If it takes years for users to get to 3.10, we should reevaluate our 
release cycle, not whether we aggressively break maintenance releases.
msg392423 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-30 11:53
The CVE-2021-29921 was assigned to this vulnerability.
msg392572 - (view) Author: Michał Górny (mgorny) * Date: 2021-05-01 07:36
> If it takes years for users to get to 3.10, we should reevaluate our 
> release cycle, not whether we aggressively break maintenance releases.

I don't really understand how that would help.  The problem is that users have major inertia for switching to newer Python versions.  A part of it is that a lot of people just don't care about deprecation warnings, and don't fix stuff until it's actually broken.  In the end, your projects are blocked from using new major Python version by broken dependencies with long release cycles.

I can't imagine deliberately leaving 3.8 and 3.9 vulnerable when 3.10 isn't going to reach final release in the next half year.  Gentoo stable is only switching to 3.9 next month.  I'm pretty sure some of our (few) corporate users are still on 3.7 or earlier.  Then, there are projects that literally include a vulnerable copy of Python 2.7 to get around distributions removing it.

I dare say this has less breakage potential than the &/; change.  It should be fixed on all affected versions.  If you don't do that, distributions will have to patch it anyway, and this will only lead to incompatibility between different Python package vendors.
msg392684 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-05-02 10:01
Due to the relative obscurity of the bug and potential disruption of the fix, I decided not to include it in 3.8.

However, Michał's argument about 3.10 not being released for another five months is resonating with me and so we will be backporting the change to 3.9.5, to be released tomorrow. Victor's argument about opt-ins being a bad way to fix security also makes sense, although let me point out that we've made decisions the other way in the past as well, for instance with hash randomization.

In any case, the issue will be solved in Python 3.10.0 Beta 1 and Python 3.9.5. Having the fixed behavior "in 3.9.5 and newer" makes for easy mechanical checks whether a given version is affected.
msg392692 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-05-02 12:00
New changeset 60ce8f0be6354ad565393ab449d8de5d713f35bc by Christian Heimes in branch 'master':
bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099)
https://github.com/python/cpython/commit/60ce8f0be6354ad565393ab449d8de5d713f35bc
msg392696 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-05-02 13:49
New changeset 5374fbc31446364bf5f12e5ab88c5493c35eaf04 by Miss Islington (bot) in branch '3.9':
bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099) (GH-25815)
https://github.com/python/cpython/commit/5374fbc31446364bf5f12e5ab88c5493c35eaf04
msg392697 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-05-02 13:53
Łukasz, thanks for pushing the PR over the finish line!
msg394162 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2021-05-21 22:22
Is there anything more to be done for this issue or can it be closed?
msg394316 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-25 10:26
I'm closing this, if someone thinks something is missing, please, reopen
msg394322 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-25 10:55
I created https://python-security.readthedocs.io/vuln/ipaddress-ipv4-leading-zeros.html to track this vulnerability.

Python 3.8 is left unchanged (accept leading zeros). Python 3.7 and older are not affected.
msg394327 - (view) Author: George-Cristian Bîrzan (gc2) Date: 2021-05-25 11:20
The timeline there is wrong. This issue's creation time isn't the disclosure time, it's when the bug was introduced. The disclosure was on 30th of May, when I emailed security@python.org and Christian Heimes commented here and made https://github.com/python/cpython/pull/25099. Even though Serhiy Storchaka commented that this could be a security issue back when the issue was new, the date would be 30th of March 2019, not 20th.
msg394379 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-25 16:44
George-Cristian Bîrzan: "The timeline there is wrong."

Fixed: https://python-security.readthedocs.io/vuln/ipaddress-ipv4-leading-zeros.html#timeline

The strange part is "2019-03-20 (-741 days): Python issue bpo-36384 reported by Joel Croteau".

The problem is that this issue was "reused" for two different things: the initial change and the vulnerability.

Maybe I can removed the reference to the bpo to remove it from the timeline (and put it in links).
msg394380 - (view) Author: George-Cristian Bîrzan (gc2) Date: 2021-05-25 17:10
I think the only thing I'd improve would be to mention that this issue is the one that introduced the bug, otherwise it looks a bit weird.
msg394390 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-25 20:01
> I think the only thing I'd improve would be to mention that this issue is the one that introduced the bug, otherwise it looks a bit weird.

Ok, done: https://python-security.readthedocs.io/vuln/ipaddress-ipv4-leading-zeros.html#timeline
History
Date User Action Args
2021-05-25 20:01:40vstinnersetmessages: + msg394390
2021-05-25 17:10:43gc2setmessages: + msg394380
2021-05-25 16:44:50vstinnersetmessages: + msg394379
2021-05-25 11:20:51gc2setmessages: + msg394327
2021-05-25 10:55:15vstinnersetmessages: + msg394322
2021-05-25 10:26:41pablogsalsetstatus: open -> closed

nosy: + pablogsal
messages: + msg394316

resolution: fixed
stage: patch review -> resolved
2021-05-21 22:22:30ned.deilysetmessages: + msg394162
2021-05-02 13:53:06christian.heimessetmessages: + msg392697
2021-05-02 13:49:09lukasz.langasetmessages: + msg392696
2021-05-02 12:00:52lukasz.langasetmessages: + msg392692
2021-05-02 12:00:50miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request24501
2021-05-02 10:01:10lukasz.langasetversions: - Python 3.8
2021-05-02 10:01:01lukasz.langasetassignee: docs@python ->
messages: + msg392684
2021-05-01 07:36:51mgornysetnosy: + mgorny
messages: + msg392572
2021-04-30 11:53:11vstinnersetmessages: + msg392423
title: ipaddress Should not reject IPv4 addresses with leading zeroes as ambiguously octal -> [security] CVE-2021-29921: ipaddress Should not reject IPv4 addresses with leading zeroes as ambiguously octal
2021-04-11 23:01:06Juliansetnosy: + Julian
2021-04-06 19:03:09steve.dowersetmessages: + msg390361
2021-04-06 17:19:01vstinnersetmessages: + msg390353
2021-04-06 14:29:48steve.dowersetmessages: + msg390340
2021-04-06 13:41:59vstinnersetmessages: + msg390331
2021-04-03 16:41:45steve.dowersetnosy: + steve.dower
messages: + msg390128
2021-04-02 09:35:18lukasz.langasetmessages: + msg390046
2021-04-02 09:34:39lukasz.langasetpriority: release blocker -> deferred blocker
2021-03-31 19:59:14christian.heimessetpriority: critical -> release blocker
nosy: + lukasz.langa
2021-03-31 08:54:04gc2setnosy: + gc2
2021-03-30 19:35:23vstinnersetnosy: + vstinner
messages: + msg389847
2021-03-30 14:51:56christian.heimessetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request23844
2021-03-30 14:23:11christian.heimessetpriority: normal -> critical

type: behavior -> security
components: + Library (Lib)
versions: + Python 3.9, Python 3.10
keywords: + 3.8regression, 3.9regression, - 3.2regression
nosy: + christian.heimes

messages: + msg389826
2019-04-07 11:49:32ncoghlansetnosy: + docs@python
messages: + msg339572

assignee: docs@python
components: + Documentation, - Library (Lib)
2019-03-30 16:16:17eric.smithsetmessages: + msg339211
2019-03-30 16:15:02serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg339210
2019-03-30 15:44:32ned.deilysetstatus: pending -> open
messages: + msg339206

keywords: + 3.2regression, - patch
resolution: fixed -> (no value)
stage: resolved -> needs patch
2019-03-30 15:12:39xtreaksetstatus: open -> pending
2019-03-30 15:12:26xtreaksetstatus: pending -> open
nosy: + ned.deily
2019-03-30 14:57:36ncoghlansetstatus: open -> pending
resolution: fixed
messages: + msg339205

stage: patch review -> resolved
2019-03-30 14:53:51ncoghlansetmessages: + msg339204
2019-03-27 06:32:51serhiy.storchakasetnosy: + ncoghlan
2019-03-27 02:05:56python-devsetkeywords: + patch
stage: patch review
pull_requests: + pull_request12521
2019-03-23 19:35:05eric.smithsetnosy: + eric.smith
messages: + msg338694
2019-03-21 08:16:51SilentGhostsetnosy: + pmoody

versions: - Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.9
2019-03-20 23:40:09Joel Croteaucreate