classification
Title: ipaddress Should not reject IPv4 addresses with leading zeroes as ambiguously octal
Type: security Stage: patch review
Components: Documentation, Library (Lib) Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Joel Croteau, Julian, christian.heimes, docs@python, eric.smith, gc2, lukasz.langa, ncoghlan, ned.deily, pmoody, serhiy.storchaka, steve.dower, vstinner
Priority: deferred blocker Keywords: 3.8regression, 3.9regression, patch

Created on 2019-03-20 23:40 by Joel Croteau, last changed 2021-04-11 23:01 by Julian.

Pull Requests
URL Status Linked Edit
PR 12577 merged python-dev, 2019-03-27 02:05
PR 25099 open christian.heimes, 2021-03-30 14:51
Messages (16)
msg338514 - (view) Author: Joel Croteau (Joel Croteau) * Date: 2019-03-20 23:40
I understand to a certain extent the logic in not allowing IPv4 octets that might ambiguously be octal, but in practice, it just seems like it creates additional parsing hassle needlessly. I have never in many years of working on many networked systems seen anyone use dotted octal format, it is actually specifically forbidden by RFC 3986 (https://tools.ietf.org/html/rfc3986#section-7.4), and it means that the ipaddress module throws exceptions on many perfectly valid IP addresses just because they have leading zeroes. Since the module doesn't support dotted octal or dotted hex anyway, this check seems a little pointless. If nothing else, there should be a way to disable this check by specifying that your IPs are in fact dotted decimal, otherwise it seems like it's just making you have to do extra parsing work or just write your own implementation.
msg338694 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-03-23 19:35
I agree that this is not a useful check.
msg339204 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-03-30 14:53
New changeset e653d4d8e820a7a004ad399530af0135b45db27a by Nick Coghlan (Joel Croteau) in branch 'master':
bpo-36384: Remove check for leading zeroes in IPv4 addresses (GH-12577)
https://github.com/python/cpython/commit/e653d4d8e820a7a004ad399530af0135b45db27a
msg339205 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-03-30 14:57
I've merged the change for Python 3.8 (thanks Joel!).

I'm not sure whether to classify it as an enhancement or as an interoperability bug fix, though, so I've put the status to pending and added Ned to the nosy list to get his thoughts as the Python 3.7 RM.
msg339206 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-03-30 15:44
ipaddress is behaving as documented:

"The following constitutes a valid IPv4 address:

A string in decimal-dot notation, consisting of four decimal integers in the inclusive range 0–255, separated by dots (e.g. 192.168.0.1). Each integer represents an octet (byte) in the address. Leading zeroes are tolerated only for values less than 8 (as there is no ambiguity between the decimal and octal interpretations of such strings). [...]"

https://docs.python.org/3/library/ipaddress.html

I can sort of understand imposing that restriction in a Python 2 world where leading zeros implied octal and Python 3 outright rejects such forms of integers to avoid the ambiguity.  That said, there's no particular reason why the components of an IPv4 string acceptable to ipaddress *have* to follow the same rules so I'm +0 on making the change at all.  It's a bit of a stretch to consider it a bug when it appears to be behaving as documented but I would expect such a change to fix more problems than causing them so I'm OK if you want to backport it.

But, in any case, the documentation for 3.8 and/or 3.7 needs to be updated.
msg339210 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-30 16:15
See also the article "Ping and FTP Resolve IP Address with Leading Zero as Octal" (https://web.archive.org/web/20061206211851/http://support.microsoft.com/kb/115388).

This is still true in Windows 10.

So it is safer to reject IPv4 addresses with leading zeros that can be ambiguously interpreted. Otherwise this can create a security hole.
msg339211 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-03-30 16:16
I think it should be 3.8 only, and the docs should be updated. Apologies for not catching that earlier: I searched via Google, which was a mistake.
msg339572 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-04-07 11:49
The recommended handling in the article that Serhiy mentions is to strip the leading zeroes, which the ipaddress module will still do - it's only being made more tolerant on input. That means it will become usable as a prefilter step (pass string with potentially leading zeroes to ipaddress, get string with no leading zeroes out).

So that means the one part we missed is the docs update (together with a versionchanged note in the module docs themselves)
msg389826 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-03-30 14:23
Serhiy was right, this is a security issue.

The patch should not have landed in 3.8. At a bare minimum the patch should have been postponed until documentation was updated. Since 3.8 the ipaddresss does not behave as documented. A similar security issue in NPM was published two days ago, CVE-2021-28918.

I proposed to not only revert the change, but also tighten the check for leading zeros so it behaves like glibc's inet_pton(). It refuses any IPv4 string with a leading zero.

>>> socket.inet_pton(socket.AF_INET, "01.1.1.1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: illegal IP address string passed to inet_pton
>>> socket.inet_pton(socket.AF_INET, "1.1.1.01")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: illegal IP address string passed to inet_pton
msg389847 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-30 19:35
> The patch should not have landed in 3.8. At a bare minimum the patch should have been postponed until documentation was updated. Since 3.8 the ipaddresss does not behave as documented. A similar security issue in NPM was published two days ago, CVE-2021-28918.

Link: https://sick.codes/sick-2021-011
msg390046 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-04-02 09:35
Deferred the blocker to the next regular release due to lack of activity in time for the current expedited releases.
msg390128 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-03 16:41
(Copied from my comment on the PR, following the one where I said this was ready to go.)

Withdrawing the readiness - @ambv and I would prefer to see this behind a flag (probably "strict" parsing), on by default for 3.10, and maybe on by default for 3.9/earlier.

The main reasoning being that this isn't our vulnerability, but an inconsistency with other vulnerable libraries. The current fix is the best it can be, but it doesn't prevent the vulnerability, it just causes Python to break first. So it ought to be relatively easy to retain the flexible (though admittedly non-sensical) behaviour for those who currently rely on it.
msg390331 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-06 13:41
> Withdrawing the readiness - @ambv and I would prefer to see this behind a flag (probably "strict" parsing), on by default for 3.10, and maybe on by default for 3.9/earlier.

Last time we added a new parameter in a stable branch, it didn't go well:
https://bugs.python.org/issue42967#msg387638
msg390340 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-06 14:29
The important quote from the linked issue seems to be:

> Our new separator= parameter does not allow one to achieve the previous behavior if mixing and matching & And ; was intended to be allowed, as it is a single separator rather than a set of separators.

So arguably, we added _the wrong_ parameter in that case, because it only allowed choosing between behaviours not including the "bad" behaviour. We should've added one that was "give me back the previous behaviour".

In this case, having it off by default goes further to prevent breakage, and I wouldn't be opposed to a process level opt-in (e.g. a module-level flag), so that _applications_ have a way to force their dependencies to use the safer behaviour without needing to patch them. Similarly, a process level opt-out also seems good enough if we were to have it on by default.
msg390353 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-06 17:19
> In this case, having it off by default goes further to prevent breakage

PyYAML was unsafe by default: it allowed to execute arbitary Python code by default. It took years to change the default to "safe". I don't think that adding a parameter for opt-in for security is a good approach. An application can use ipaddress internally without being aware of using it, if it's done by a third party module. It's hard to prevent security vulnerabilities if people have to "opt-in" for security.

I prefer to break code and force people to manually get back the old behavior. It's better to make 90% safe by default but make 10% of people unhappy.

It's uncommon to pass IPv4 addresses with leading zeros.

If you want to tolerate leading zeros, you don't have to modify the ipaddress for that, you can pre-process your inputs: it works on any Python version with or without the fix.

>>> def reformat_ip(address): return '.'.join(part.lstrip('0') if part != '0' else part for part in address.split('.'))
... 
>>> reformat_ip('0127.0.0.1')
'127.0.0.1'

Or with an explicit loop for readability:

def reformat_ip(address):
    parts = []
    for part in address.split('.'):
        if part != "0":
            part = part.lstrip('0')
        parts.append(part)
    return '.'.join(parts)
msg390361 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-06 19:03
> I don't think that adding a parameter for opt-in for security is a good approach.

I meant to have it set by default on 3.10, when we do not have to worry 
about breaking users.

If it takes years for users to get to 3.10, we should reevaluate our 
release cycle, not whether we aggressively break maintenance releases.
History
Date User Action Args
2021-04-11 23:01:06Juliansetnosy: + Julian
2021-04-06 19:03:09steve.dowersetmessages: + msg390361
2021-04-06 17:19:01vstinnersetmessages: + msg390353
2021-04-06 14:29:48steve.dowersetmessages: + msg390340
2021-04-06 13:41:59vstinnersetmessages: + msg390331
2021-04-03 16:41:45steve.dowersetnosy: + steve.dower
messages: + msg390128
2021-04-02 09:35:18lukasz.langasetmessages: + msg390046
2021-04-02 09:34:39lukasz.langasetpriority: release blocker -> deferred blocker
2021-03-31 19:59:14christian.heimessetpriority: critical -> release blocker
nosy: + lukasz.langa
2021-03-31 08:54:04gc2setnosy: + gc2
2021-03-30 19:35:23vstinnersetnosy: + vstinner
messages: + msg389847
2021-03-30 14:51:56christian.heimessetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request23844
2021-03-30 14:23:11christian.heimessetpriority: normal -> critical

type: behavior -> security
components: + Library (Lib)
versions: + Python 3.9, Python 3.10
keywords: + 3.8regression, 3.9regression, - 3.2regression
nosy: + christian.heimes

messages: + msg389826
2019-04-07 11:49:32ncoghlansetnosy: + docs@python
messages: + msg339572

assignee: docs@python
components: + Documentation, - Library (Lib)
2019-03-30 16:16:17eric.smithsetmessages: + msg339211
2019-03-30 16:15:02serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg339210
2019-03-30 15:44:32ned.deilysetstatus: pending -> open
messages: + msg339206

keywords: + 3.2regression, - patch
resolution: fixed ->
stage: resolved -> needs patch
2019-03-30 15:12:39xtreaksetstatus: open -> pending
2019-03-30 15:12:26xtreaksetstatus: pending -> open
nosy: + ned.deily
2019-03-30 14:57:36ncoghlansetstatus: open -> pending
resolution: fixed
messages: + msg339205

stage: patch review -> resolved
2019-03-30 14:53:51ncoghlansetmessages: + msg339204
2019-03-27 06:32:51serhiy.storchakasetnosy: + ncoghlan
2019-03-27 02:05:56python-devsetkeywords: + patch
stage: patch review
pull_requests: + pull_request12521
2019-03-23 19:35:05eric.smithsetnosy: + eric.smith
messages: + msg338694
2019-03-21 08:16:51SilentGhostsetnosy: + pmoody

versions: - Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.9
2019-03-20 23:40:09Joel Croteaucreate