msg385266 - (view) |
Author: Adam Goldschmidt (AdamGold) * |
Date: 2021-01-19 15:06 |
The urlparse module treats semicolon as a separator (https://github.com/python/cpython/blob/master/Lib/urllib/parse.py#L739) - whereas most proxies today only take ampersands as separators. Link to a blog post explaining this vulnerability: https://snyk.io/blog/cache-poisoning-in-popular-open-source-packages/
When the attacker can separate query parameters using a semicolon (;), they can cause a difference in the interpretation of the request between the proxy (running with default configuration) and the server. This can result in malicious requests being cached as completely safe ones, as the proxy would usually not see the semicolon as a separator, and therefore would not include it in a cache key of an unkeyed parameter - such as `utm_*` parameters, which are usually unkeyed. Let’s take the following example of a malicious request:
```
GET /?link=http://google.com&utm_content=1;link='><t>alert(1)</script> HTTP/1.1
Host: somesite.com
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Connection: close
```
urlparse sees 3 parameters here: `link`, `utm_content` and then `link` again. On the other hand, the proxy considers this full string: `1;link='><t>alert(1)</script>` as the value of `utm_content`, which is why the cache key would only contain `somesite.com/?link=http://google.com`.
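To make the mismatch concrete, here is a minimal sketch in Python. `split_pairs` is a hypothetical helper written for this illustration, not part of urllib; it emulates a parser that splits on a configurable set of separators, so both interpretations can be compared on any Python version.

```python
import re
from urllib.parse import unquote_plus


def split_pairs(query, separators):
    """Split a query string on any of the given single-character separators.

    Hypothetical helper for illustration only; it is not how any particular
    proxy or urllib itself is implemented.
    """
    pattern = "[" + re.escape(separators) + "]"
    pairs = []
    for field in re.split(pattern, query):
        if not field:
            continue
        name, _, value = field.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


query = "link=http://google.com&utm_content=1;link='><t>alert(1)</script>"

# A proxy that only knows '&' sees two parameters.
proxy_view = split_pairs(query, "&")

# A parser that also splits on ';' sees three, with `link` repeated.
legacy_view = split_pairs(query, "&;")

print(proxy_view)
print(legacy_view)
```

The second `link` only exists in the view that splits on both characters, which is the value the application would end up seeing while the proxy never noticed it.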
A possible solution could be to allow developers to specify a separator, like werkzeug does:
https://github.com/pallets/werkzeug/blob/6784c44673d25c91613c6bf2e614c84465ad135b/src/werkzeug/urls.py#L833
|
msg385332 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-01-20 11:07 |
Oops, I missed this issue. I just marked my bpo-42975 issue as a duplicate of this one.
My message:
urllib.parse.parse_qsl() uses "&" *and* ";" as separators:
>>> urllib.parse.parse_qsl("a=1&b=2&c=3")
[('a', '1'), ('b', '2'), ('c', '3')]
>>> urllib.parse.parse_qsl("a=1&b=2;c=3")
[('a', '1'), ('b', '2'), ('c', '3')]
But the W3C standards evolved and now suggest against considering semicolon (";") as a separator:
https://www.w3.org/TR/2014/REC-html5-20141028/forms.html#url-encoded-form-data
"This form data set encoding is in many ways an aberrant monstrosity, the result of many years of implementation accidents and compromises leading to a set of requirements necessary for interoperability, but in no way representing good design practices. In particular, readers are cautioned to pay close attention to the twisted details involving repeated (and in some cases nested) conversions between character encodings and byte sequences."
"To decode application/x-www-form-urlencoded payloads (...) Let strings be the result of strictly splitting the string payload on U+0026 AMPERSAND characters (&)."
Maybe we should even go further in Python 3.10 and only split at "&" by default, but let the caller opt in to ";" as a separator as well.
|
msg385337 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2021-01-20 12:02 |
On 20.01.2021 12:07, STINNER Victor wrote:
> Maybe we should even go further in Python 3.10 and only split at "&" by default, but let the caller opt in to ";" as a separator as well.
+1.
Personally, I've never seen URLs encoded with ";" as query parameter
separator in practice on the server side.
The use of ";" was recommended in the HTML4 spec, but only in an
implementation side note:
https://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.2.2
and not in the main reference:
https://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#h-17.13.4.1
Browsers are also pretty relaxed about seeing non-escaped ampersands in
link URLs and do the right thing, so the suggested work-around for
avoiding escaping is not really needed.
|
msg385341 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2021-01-20 14:15 |
Sorry for the title mess: It seems that when replying to a ticket, RoundUp uses the subject line as the new header regardless of what it was set to before.
|
msg385342 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-01-20 14:16 |
> Sorry for the title mess: It seems that when replying to a ticket, RoundUp uses the subject line as the new header regardless of what it was set to before.
Yeah, it's annoying :-( I like to put a module name in the issue title, to help bug triage.
|
msg385344 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2021-01-20 14:23 |
It looks to me, that this is an issue of proxies, not Python. Python implementation obeys contemporary standards, and they are not formally cancelled yet. If we add an option in parse_qsl() or change its default behavior, it should be considered as a new feature which helps to mitigate proxies' issues.
|
msg385346 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-01-20 15:17 |
> Python implementation obeys contemporary standards
The contemporary standard is HTML5 and HTML5 asks to only split at "&", no?
|
msg385352 - (view) |
Author: Ken Jin (kj) * |
Date: 2021-01-20 16:06 |
FWIW, a surprising amount of things rely on treating ';' as a valid separator in the standard test suite.
From just a cursory look:
test_cgi
test_urlparse
A change in the public API of urlparse will also require a change in cgi.py's FieldStorage, FieldStorage.read_multi, parse and parse_multipart to expose that parameter since those functions forward arguments directly to urllib.parse.parse_qs internally.
If we backport this, it seems that we will *also* need to backport all those changes to cgi's public API. Otherwise, just backporting the security fix part without allowing the user to switch would break existing code.
Just my 2 cents on the issue. I'm not too familiar with security fixes in cpython anyways ;).
|
msg385495 - (view) |
Author: Ken Jin (kj) * |
Date: 2021-01-22 12:53 |
Adam, I linked a PR 2 days ago here https://github.com/python/cpython/pull/24271 , it has the test suite passing and the appropriate changes to cgi.py. Would you like to review it? Or since you submitted a PR, would you prefer I close mine instead?
|
msg385496 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-01-22 12:58 |
Ken, Please don't close your PR. I will review it. It has a CLA signed
which is helpful.
|
msg385497 - (view) |
Author: Adam Goldschmidt (AdamGold) * |
Date: 2021-01-22 13:18 |
I haven't noticed, I'm sorry. I don't mind closing mine, just thought it could be a nice first contribution. Our PRs are different though - I feel like if we are to implement this, we should let the developer choose the separator and not limit to just `&` and `;` - but that discussion probably belongs in the PR.
|
msg385513 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2021-01-22 20:55 |
Too bad that semicolon is not recommended nowadays, it was a nice way to avoid ampersand HTML escape issues!
One server software that generates links using semicolons is debbugs: https://bugs.debian.org/cgi-bin/pkgreport.cgi?archive=both;package=gtk3-engines-xfce;package=gtk2-engines-xfce
|
msg385527 - (view) |
Author: Ken Jin (kj) * |
Date: 2021-01-23 10:13 |
@Adam:
>I haven't noticed, I'm sorry. I don't mind closing mine, just thought it could be a nice first contribution.
No worries :), please don't close yours.
> Our PRs are different though - I feel like if we are to implement this, we should let the developer choose the separator and not limit to just `&` and `;` - but that discussion probably belongs in the PR.
You're right, I think that's an elegant solution. In the unlikely event web standards change again in another 5 years, the user can change the arguments themselves and cpython won't have to change. And like Eric pointed out, some people do need ';'.
@senthil
I might make some changes soon, so it may not be ready for review yet. If I go ahead with the separator idea, I'll credit Adam as a co-author in the PR, which will require them to sign the CLA too.
|
msg385544 - (view) |
Author: Ken Jin (kj) * |
Date: 2021-01-23 16:41 |
I updated the PR to take in a sequence of separators from the user - eg:
>>> urllib.parse.parse_qsl('a=1&b=2;c=3', separators=('&', ';'))
[('a', '1'), ('b', '2'), ('c', '3')]
>>> urllib.parse.parse_qsl('a=1&b=2;c=3', separators=('&',))
[('a', '1'), ('b', '2;c=3')]
I _didn't_ change the default - it will allow both '&' and ';' still. Eric showed a link above that still uses semicolon. So I feel that it's strange to break backwards compatibility in a patch update. Maybe we can make just '&' the default in Python 3.10, while backporting the ability to specify separators to older versions so it's up to users?
I'm not sure, any thoughts on this? Opinions would be greatly appreciated.
|
msg385549 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2021-01-23 17:12 |
> I feel like if we are to implement this, we should let the developer choose the separator and not limit to just `&` and `;`
That doesn’t feel necessary to me. I suspect most links use &, some use ;, nothing else is valid at the moment and I don’t expect a new separator to suddenly appear. IMO the boolean parameter to also recognize ; was better.
> but that discussion probably belongs in the PR.
PR discussions are generally about how to achieve the goal (fix or new feature) and quality of implementation, but tickets are where we agree on what the goal is and how to fix it (big picture).
|
msg385565 - (view) |
Author: Adam Goldschmidt (AdamGold) * |
Date: 2021-01-23 22:22 |
> I _didn't_ change the default - it will allow both '&' and ';' still. Eric showed a link above that still uses semicolon. So I feel that it's strange to break backwards compatibility in a patch update. Maybe we can make just '&' the default in Python 3.10, while backporting the ability to specify separators to older versions so it's up to users?
I like this implementation. I definitely think we should not break backwards compatibility and only change the default in Python 3.10.
|
msg385566 - (view) |
Author: Adam Goldschmidt (AdamGold) * |
Date: 2021-01-23 22:25 |
> That doesn’t feel necessary to me. I suspect most links use &, some use ;, nothing else is valid at the moment and I don’t expect a new separator to suddenly appear. IMO the boolean parameter to also recognize ; was better.
That's reasonable. However, I think that we are making this change in order to treat the semicolon as a "custom" separator. In that case, why not let the developer decide on a different custom separator for their own use cases? What's the difference between a semicolon and something else?
|
msg385567 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2021-01-23 22:28 |
The difference is that semicolon is defined in a previous specification.
I don’t see this change as providing support for custom delimiters in URL parsing, but offering an option to pick between two specifications.
|
msg385582 - (view) |
Author: Ken Jin (kj) * |
Date: 2021-01-24 14:51 |
Dear all, now that Adam has signed the CLA, I have closed my PR in favor of Adam's because I think 2 open PRs might split everyone's attention. Instead, I'll focus on reviewing Adam's PR. Sorry for any inconvenience caused.
|
msg385585 - (view) |
Author: Adam Goldschmidt (AdamGold) * |
Date: 2021-01-24 17:15 |
> The difference is that semicolon is defined in a previous specification.
I understand, but this will limit us in the future if the spec changes - though I don't have strong feelings regarding this one.
> Dear all, now that Adam has signed the CLA, I have closed my PR in favor of Adam's because I think 2 open PRs might split everyone's attention. Instead, I'll focus on reviewing Adam's PR. Sorry for any inconvenience caused.
❤
|
msg385590 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2021-01-24 20:05 |
Senthil, what is your opinion here?
|
msg385865 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2021-01-28 14:45 |
Resolution of this issue is blocking 3.7.x and 3.6.x security releases and threatens to block upcoming maintenance releases.
|
msg386003 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-01-31 01:21 |
Ned, and others watching.
In future versions of Python, we can use only "&" based separator. But I am not certain what should be proposed for the older releases of Python.
Adam's patch is a good way to explicitly specify the separator, but it changes the expectations in our test cases and is not backwards compatible.
Victor / Marc-Andre: Need your recommendation here.
|
msg386785 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2021-02-10 15:14 |
Ping. This issue has been delaying 3.7.x and 3.6.x security releases. I would prefer to have it resolved before releasing.
|
msg386787 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-10 15:40 |
Sorry for that, Ned. I will take a decision on this by Saturday (13-Feb).
I did some research, but could come way conclusively. I have not heard any opinions (+ves or -ves) on this. This will be a breaking change, so necessary to support it with documentation, alerts etc.
|
msg386788 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-10 15:41 |
I meant, "I did some research, but couldn't come away conclusively".
|
msg386954 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-14 15:27 |
I finished reviewing this PR https://github.com/python/cpython/pull/24297
With the context given in the W3C recommendation, the Snyk.io security report, and the pattern of usage in libraries like werkzeug and bottle, addressing this in the stdlib as a safeguard seems like a much better choice to me than ignoring it and leaving this behavior to be handled at the proxy software level.
The change and the approach taken by Adam's patch looks good to me. I have requested for documentation updates and news entry and it will be merged for Python 3.10 and ported to earlier versions.
- Fixing this in 3.10 is going to break behavior of software which relied on both "&" and ";" as query parameter separator. Only a single separator will be allowed, and it will default to &. This will be mentioned in documentation.
- As we back-port this to security releases of python, a rationale can be added on this change. The documentation or news entry could help developers with their plans to upgrade.
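As a rough illustration of the behaviour described in the two points above, assuming a Python release that includes this change (that is, one where `parse_qsl` accepts the `separator` keyword):

```python
from urllib.parse import parse_qsl

# Default after the change: only '&' separates fields, so ';' stays in the value.
default_result = parse_qsl("a=1&b=2;c=3")

# Opting in to ';' replaces '&' as the separator; it does not add to it.
semicolon_result = parse_qsl("a=1;b=2;c=3", separator=";")

print(default_result)    # [('a', '1'), ('b', '2;c=3')]
print(semicolon_result)  # [('a', '1'), ('b', '2'), ('c', '3')]
```

On an interpreter without the change, the second call fails with a `TypeError` because the keyword is unknown.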
|
msg386957 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2021-02-14 17:35 |
I also have concerns about specifics of the implementation (see PR) and in general the behaviour change in point releases. Maybe have a thread on python-dev?
|
msg386960 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-14 18:26 |
Éric, I considered the possibility of bringing it in python-dev, but thought it could be decided in this ticket itself.
1. This was already brought up by multiple Release Managers in python-dev, and some conversation seems to have happened there previously, especially regarding backwards incompatibility. Of course, we didn't debate the implementation, but that debate seems better focused here and in the PR. For the wider group, we only need to acknowledge that a backwards incompatibility is introduced.
2. Other interested core devs seem to have shared their thoughts early in the bug too.
So, once I reviewed these, I thought it seemed okay for us to make a decision here. If there is anything particular you wanted to bring up, we could.
|
msg386968 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-14 22:42 |
New changeset fcbe0cb04d35189401c0c880ebfb4311e952d776 by Adam Goldschmidt in branch 'master':
bpo-42967: only use '&' as a query string separator (#24297)
https://github.com/python/cpython/commit/fcbe0cb04d35189401c0c880ebfb4311e952d776
|
msg386980 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-02-15 08:13 |
I agree with changing the default in Python 3.6-3.10.
|
msg387027 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-15 17:00 |
New changeset a2f0654b0a5b4c4f726155620002cc1f5f2d206a by Ken Jin in branch 'master':
bpo-42967: Fix urllib.parse docs and make logic clearer (GH-24536)
https://github.com/python/cpython/commit/a2f0654b0a5b4c4f726155620002cc1f5f2d206a
|
msg387037 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-15 18:03 |
New changeset c9f07813ab8e664d8c34413c4fc2d4f86c061a92 by Senthil Kumaran in branch '3.9':
[3.9] bpo-42967: only use '&' as a query string separator (GH-24297) (#24528)
https://github.com/python/cpython/commit/c9f07813ab8e664d8c34413c4fc2d4f86c061a92
|
msg387039 - (view) |
Author: Łukasz Langa (lukasz.langa) * |
Date: 2021-02-15 18:15 |
New changeset e3110c3cfbb7daa690d54d0eff6c264c870a71bf by Senthil Kumaran in branch '3.8':
[3.8] bpo-42967: only use '&' as a query string separator (GH-24297) (#24529)
https://github.com/python/cpython/commit/e3110c3cfbb7daa690d54d0eff6c264c870a71bf
|
msg387040 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2021-02-15 18:34 |
New changeset d0d4d30882fe3ab9b1badbecf5d15d94326fd13e by Senthil Kumaran in branch '3.7':
[3.7] bpo-42967: only use '&' as a query string separator (GH-24297) (GH-24531)
https://github.com/python/cpython/commit/d0d4d30882fe3ab9b1badbecf5d15d94326fd13e
|
msg387045 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2021-02-15 19:16 |
New changeset 5c17dfc5d70ce88be99bc5769b91ce79d7a90d61 by Senthil Kumaran in branch '3.6':
[3.6] bpo-42967: only use '&' as a query string separator (GH-24297) (GH-24532)
https://github.com/python/cpython/commit/5c17dfc5d70ce88be99bc5769b91ce79d7a90d61
|
msg387049 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-02-15 19:34 |
This is resolved in all versions of Python now.
Thank you all for your contributions!
|
msg387069 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-02-15 22:42 |
I created https://python-security.readthedocs.io/vuln/urllib-query-string-semicolon-separator.html to track fixes of this vulnerability.
|
msg387638 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2021-02-24 20:15 |
FYI - This was somewhat of an unfortunate API change. I'm coming across code that relies on ; also being treated as a separator by parse_qs(). That code is now broken with no easy way around it.
And I'm only seeing things lucky enough to have an explicit test that happens to rely in some way on that behavior. How much code doesn't?
It's been a mix of some clearly broken code (ex & appearing in the URI being parsed) and code where it is not so immediately obvious if there is a problem or not (up to the code owners to dive in and figure that out...).
The workarounds for people implementing "fixes" to previously-working-as-intended code, rather than "oops that was an html charref" code, are annoying. Our new separator= parameter does not allow one to achieve the previous behavior if mixing and matching & and ; was intended to be allowed, as it is a single separator rather than a set of separators.
For security fixes, a way for people to explicitly opt-in to now-deemed-undesirable-by-default behavior they got from the API is desirable. We failed to provide that here.
Just a heads up with no suggested remediation for now. I'm still unsure how big a problem this will turn out to be or not or if it is identifying actual worthwhile issues in code. It's certainly a headache for a few.
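For code that genuinely needs the old mixed behaviour, one possible workaround is to normalise the string before parsing. This is only a sketch: `parse_qsl_legacy` is a hypothetical name, not a stdlib function, and it deliberately reintroduces the ambiguity the fix removed, so it is only reasonable for input that does not pass through a cache that disagrees about separators.

```python
from urllib.parse import parse_qsl


def parse_qsl_legacy(qs, **kwargs):
    """Treat both '&' and ';' as separators (hypothetical helper).

    Raw ';' characters are rewritten to '&' before parsing. Percent-encoded
    semicolons (%3B) are untouched, because decoding happens afterwards.
    """
    return parse_qsl(qs.replace(";", "&"), **kwargs)


mixed = parse_qsl_legacy("a=1&b=2;c=3")
encoded = parse_qsl_legacy("a=1%3B2")

print(mixed)    # [('a', '1'), ('b', '2'), ('c', '3')]
print(encoded)  # [('a', '1;2')]
```

Because the helper never passes `separator=`, it behaves the same on interpreters with and without the change.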
|
msg387712 - (view) |
Author: Matej Cepl (mcepl) * |
Date: 2021-02-26 08:20 |
> FYI - This was somewhat of an unfortunate API change. I'm coming across code that relies on ; also being treated as a separator by parse_qs(). That code is now broken with no easy way around it.
So far, we at openSUSE had to package at least SQLAlchemy, Twisted, yarl and furl. The author of the first one acknowledged use of semicolon as a bug. I don't think it was so bad.
|
msg387735 - (view) |
Author: Matej Cepl (mcepl) * |
Date: 2021-02-26 18:05 |
Port of the patch to 2.7.18.
|
msg387756 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2021-02-27 00:59 |
An example code snippet to detect at runtime whether the API supports the new parameter, for code that wants to use something other than the default '&'.
```
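import inspect
import urllib.parse

qs = 'a=1;b=2'  # example query string that uses ';' as its separator
if 'separator' in inspect.signature(urllib.parse.parse_qs).parameters:
    result = urllib.parse.parse_qs(qs, separator=';')
else:
    result = urllib.parse.parse_qs(qs)  # older Python: ';' already splits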
```
calling it with the arg and catching TypeError if that fails would also work, but might not be preferred as catching things like TypeError is non-specific and could hide other problems, making it a code maintenance headache.
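For comparison, the try/except variant mentioned above could look roughly like this. `parse_semicolon_qs` is a hypothetical wrapper name; the body of the `try` is kept to the single call so that the `TypeError` handler hides as little as possible.

```python
from urllib.parse import parse_qs


def parse_semicolon_qs(qs):
    """Parse a ';'-separated query string on old and new Pythons (sketch)."""
    try:
        # Interpreters with the fix accept the keyword.
        return parse_qs(qs, separator=";")
    except TypeError:
        # Assumed to mean "unknown keyword": older interpreters split on
        # both '&' and ';' by default, so the plain call is the fallback.
        return parse_qs(qs)


result = parse_semicolon_qs("a=1;b=2")
print(result)  # {'a': ['1'], 'b': ['2']}
```

Note that the fallback is not strictly equivalent: on an older interpreter a literal '&' in the input would also split fields.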
|
msg388368 - (view) |
Author: Riccardo Schirone (rschiron) |
Date: 2021-03-09 16:04 |
This CVE was reported against Python, however it does not seem to be Python's fault for supporting the `;` separator, which was a valid separator for older standards.
@AdamGold for this issue to become a real security problem, it seems that the proxy has to be configured to ignore certain parameters in the query. For the NGINX and Varnish proxies mentioned in the article, it seems that by default they use the entire request path, host included, and other things as the cache key. For NGINX in particular I could find some snippets online to manipulate the query arguments and split them into arguments, so as to remove the "utm_*" arguments; however, this does not seem to be standard (or at least default) behaviour, nor something easily supported.
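The kind of non-default proxy behaviour in question can be sketched as follows. `cache_key` is a hypothetical function for illustration; it is not NGINX or Varnish configuration, only a model of a cache that splits on '&' and drops parameters whose names start with an unkeyed prefix.

```python
def cache_key(host, path, query, unkeyed_prefixes=("utm_",)):
    """Build a cache key that ignores 'unkeyed' parameters (hypothetical)."""
    kept = []
    for field in query.split("&"):  # this model only knows '&'
        name = field.split("=", 1)[0]
        if field and not name.startswith(unkeyed_prefixes):
            kept.append(field)
    suffix = "?" + "&".join(kept) if kept else ""
    return host + path + suffix


benign = "link=http://google.com"
malicious = "link=http://google.com&utm_content=1;link='><t>alert(1)</script>"

benign_key = cache_key("somesite.com", "/", benign)
malicious_key = cache_key("somesite.com", "/", malicious)

print(benign_key)
print(malicious_key)
```

Both requests map to the same key, so in this model a response generated for the second request would be served for the first. Without the step that strips unkeyed parameters, the two keys differ and the poisoning does not occur.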
I think that if that is the case and a user has to go out of his way to configure the (wrong) splitting of arguments in the proxy, it is not fair to blame python for accepting `;` as separator and assigning a CVE against it may cause confusion.
For distributions this is problematic as they have 2 choices:
1) "fix" python but with the risk of breaking user's programs/scripts relying on the previous API
2) keep older version/unpatched python so that user's programs still work, but with a python version "vulnerable" to this CVE.
None of these options is really ideal, especially if the problem is somewhere else.
@AdamGold Could you elaborate a bit more on how common it is and how much configuration is required for proxies to make `;` a problem in python?
|
msg388433 - (view) |
Author: Petr Viktorin (petr.viktorin) * |
Date: 2021-03-10 13:51 |
With the fix, parse_qs[l] doesn't handle bytes separators correctly.
There is an explicit type check for str/bytes:
    if not separator or (not isinstance(separator, (str, bytes))):
        raise ValueError("Separator must be of type string or bytes.")
but a bytes separator fails further down:
>>> import urllib.parse
>>> urllib.parse.parse_qs('a=1,b=2', separator=b',')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pviktori/dev/cpython/Lib/urllib/parse.py", line 695, in parse_qs
    pairs = parse_qsl(qs, keep_blank_values, strict_parsing,
  File "/home/pviktori/dev/cpython/Lib/urllib/parse.py", line 748, in parse_qsl
    pairs = [s1 for s1 in qs.split(separator)]
TypeError: must be str or None, not bytes
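Until the type handling is sorted out, a caller can sidestep the mismatch by making the separator the same type as the query string. A minimal sketch, where `normalize_separator` is a hypothetical helper and ASCII is an assumed encoding for the separator:

```python
from urllib.parse import parse_qs


def normalize_separator(separator, encoding="ascii"):
    """Return the separator as str so it matches a str query (hypothetical)."""
    if isinstance(separator, bytes):
        return separator.decode(encoding)
    return separator


result = parse_qs("a=1,b=2", separator=normalize_separator(b","))
print(result)  # {'a': ['1'], 'b': ['2']}
```

This assumes an interpreter where `parse_qs` accepts the `separator` keyword.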
|
msg388434 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-03-10 13:56 |
Petr, thank you. Let's treat it as a new issue linked to this.
|
msg388440 - (view) |
Author: Riccardo Schirone (rschiron) |
Date: 2021-03-10 15:57 |
> So far, we at openSUSE had to package at least SQLAlchemy, Twisted, yarl and furl. The author of the first one acknowledged use of semicolon as a bug. I don't think it was so bad.
Did you upstream fixes for those packages?
Asking because if this is considered a vulnerability in Python, it should be considered a vulnerability for every other tool/library that accepts `;` as a separator. For example, Twisted seems to have a parse_qs method in the web/http.py file that splits by both `;` and `&`.
Again, I feel like we are blaming the wrong piece of the stack, unless proxies are usually ignoring some arguments (e.g. utm_*) as part of the cache key, by default or in a very easy way.
|
msg388447 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2021-03-10 17:40 |
Riccardo - FWIW I agree, the wrong part of the stack was blamed and a CVE was wrongly sought for against CPython on this one.
It's sewage under the bridge at this point. The API change has shipped in several different stable releases and thus is something virtually all Python code must now deal with.
Why was this a bad change to make? Python's parse_qsl obeyed the prevailing HTML 4 standard at the time it was written:
https://www.w3.org/TR/html401/appendix/notes.html#ampersands-in-uris
'''
We recommend that HTTP server implementors, and in particular, CGI implementors support the use of ";" in place of "&"
'''
That turns out to have been bad advice in the standard. 15 years later the html5 standard quoted in Adam's snyk blog post links to its text on this which leaves no room for that interpretation.
In that light, the correct thing to do for this issue would be to:
* Make the default behavior change in 3.10 match the html5 standard [done].
* Document that it matches the html4 standard in 3.9 and earlier without changing their default behavior [oops, too late, not done].
* While adding the ability to allow applications to select the stricter behavior on those older versions. [only sort of done, and somewhat too late now that the strict version has already shipped as stable]
After all, the existence of html5 didn't magically fix all of the html and web applications written in the two decades of web that came before it. Ask any browser author...
|
msg388486 - (view) |
Author: Petr Viktorin (petr.viktorin) * |
Date: 2021-03-11 08:24 |
There's another part of the new implementation that looks a bit fishy: the `separator` argument now allows multi-character strings, so you can parse 'a=1<SPLIT>b=2' with separator='<SPLIT>'.
Was this intentional?
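Callers who want to rule out multi-character separators can check the length themselves before delegating. This is only a sketch: `parse_qs_strict` is a hypothetical wrapper, not a proposed stdlib change, and it assumes an interpreter where `parse_qs` accepts `separator`.

```python
from urllib.parse import parse_qs


def parse_qs_strict(qs, separator="&", **kwargs):
    """Like parse_qs, but reject separators that are not one character."""
    if not isinstance(separator, str) or len(separator) != 1:
        raise ValueError("separator must be a single-character string")
    return parse_qs(qs, separator=separator, **kwargs)


ok = parse_qs_strict("a=1;b=2", separator=";")
print(ok)  # {'a': ['1'], 'b': ['2']}

try:
    parse_qs_strict("a=1<SPLIT>b=2", separator="<SPLIT>")
except ValueError as exc:
    rejected = str(exc)
    print(rejected)
```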
|
msg388574 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-03-13 00:52 |
Petr,
On
> the `separator` argument now allows multi-character strings, so you can parse 'a=1<SPLIT>b=2' with separator='<SPLIT>'. Was this intentional?
No, this was not intentional. The separator arg was just a choice, for compatibility, in case someone wanted to use `;` as in some of the URLs that were shared as a use case. We didn't restrict what was allowed or the length of the separator.
|
msg390782 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-04-11 13:26 |
New changeset b38601d49675d90e1ee6faa47f7adaeca992d02d by Ken Jin in branch 'master':
bpo-42967: coerce bytes separator to string in urllib.parse_qs(l) (#24818)
https://github.com/python/cpython/commit/b38601d49675d90e1ee6faa47f7adaeca992d02d
|
msg390784 - (view) |
Author: miss-islington (miss-islington) |
Date: 2021-04-11 13:49 |
New changeset 6ec2fb42f93660810952388e5c4018c197c17c8c by Miss Islington (bot) in branch '3.9':
bpo-42967: coerce bytes separator to string in urllib.parse_qs(l) (GH-24818)
https://github.com/python/cpython/commit/6ec2fb42f93660810952388e5c4018c197c17c8c
|
msg390790 - (view) |
Author: Matej Cepl (mcepl) * |
Date: 2021-04-11 19:14 |
> Did you upstream fixes for those packages?
Of course we did. Upstream first!
|
msg391231 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2021-04-16 17:07 |
New changeset d5b80eb11b4812b4a579ce129ba4a10c5f5d27f6 by Miss Islington (bot) in branch '3.8':
bpo-42967: coerce bytes separator to string in urllib.parse_qs(l) (GH-24818) (#25345)
https://github.com/python/cpython/commit/d5b80eb11b4812b4a579ce129ba4a10c5f5d27f6
|
msg405721 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2021-11-04 14:47 |
erlendaasland, you’ve been editing closed issues today (got messages from at least 2). Maybe submitting old browser tabs with obsolete form data?
|
msg405723 - (view) |
Author: Erlend E. Aasland (erlendaasland) * |
Date: 2021-11-04 14:53 |
Yes, cleaning up ahmedsayeed1982 spam. I did my best to revert the nosy list, component, versions, and assigned to changes. What did I mess up?
|
msg405725 - (view) |
Author: Erlend E. Aasland (erlendaasland) * |
Date: 2021-11-04 15:01 |
See bpo-12168 for a similar cleanup by Eryk Sun. There was approx. 20 spammed issues. Eryk fixed most of them; I did a couple.
|
msg405728 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2021-11-04 15:42 |
See the changelog entry for 2021-11-04 10:31:24 (and the other ticket where Guido just commented)
(and thanks for cleaning spam!)
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:40 | admin | set | github: 87133 |
2021-11-08 16:47:04 | vstinner | set | nosy:
- vstinner
|
2021-11-04 15:42:23 | eric.araujo | set | messages:
+ msg405728 |
2021-11-04 15:01:03 | erlendaasland | set | messages:
+ msg405725 |
2021-11-04 14:53:51 | erlendaasland | set | messages:
+ msg405723 |
2021-11-04 14:47:40 | eric.araujo | set | nosy:
+ erlendaasland messages:
+ msg405721
|
2021-11-04 14:31:24 | erlendaasland | set | nosy:
+ vstinner, ned.deily, pablogsal, serhiy.storchaka, miss-islington, mcepl, petr.viktorin, rschiron, eric.araujo, lemburg, gregory.p.smith, kj, orsenthil, AdamGold, - ahmedsayeed1982
versions:
+ Python 3.6, Python 3.7, Python 3.9, Python 3.10 |
2021-11-04 14:30:16 | erlendaasland | set | messages:
- msg405709 |
2021-11-04 12:12:30 | ahmedsayeed1982 | set | nosy:
+ ahmedsayeed1982, - lemburg, gregory.p.smith, orsenthil, vstinner, ned.deily, mcepl, eric.araujo, petr.viktorin, serhiy.storchaka, miss-islington, rschiron, kj, AdamGold
messages:
+ msg405709 versions:
- Python 3.6, Python 3.7, Python 3.9, Python 3.10 |
2021-04-16 17:07:48 | orsenthil | set | messages:
+ msg391231 |
2021-04-11 19:14:37 | mcepl | set | messages:
+ msg390790 |
2021-04-11 13:49:42 | miss-islington | set | messages:
+ msg390784 |
2021-04-11 13:26:58 | miss-islington | set | pull_requests:
+ pull_request24079 |
2021-04-11 13:26:50 | miss-islington | set | nosy:
+ miss-islington
pull_requests:
+ pull_request24078 |
2021-04-11 13:26:15 | orsenthil | set | messages:
+ msg390782 |
2021-03-13 00:52:40 | orsenthil | set | messages:
+ msg388574 |
2021-03-11 08:24:52 | petr.viktorin | set | messages:
+ msg388486 |
2021-03-10 17:40:30 | gregory.p.smith | set | messages:
+ msg388447 |
2021-03-10 15:57:50 | rschiron | set | messages:
+ msg388440 |
2021-03-10 14:57:59 | kj | set | pull_requests:
+ pull_request23584 |
2021-03-10 13:56:13 | orsenthil | set | messages:
+ msg388434 |
2021-03-10 13:51:48 | petr.viktorin | set | nosy:
+ petr.viktorin messages:
+ msg388433
|
2021-03-09 16:04:50 | rschiron | set | nosy:
+ rschiron messages:
+ msg388368
|
2021-02-27 00:59:33 | gregory.p.smith | set | messages:
+ msg387756 |
2021-02-26 18:05:18 | mcepl | set | files:
+ CVE-2021-23336-only-amp-as-query-sep.patch
messages:
+ msg387735 |
2021-02-26 08:20:03 | mcepl | set | nosy:
+ mcepl messages:
+ msg387712
|
2021-02-24 20:15:53 | gregory.p.smith | set | nosy:
+ gregory.p.smith messages:
+ msg387638
|
2021-02-15 22:42:10 | vstinner | set | messages:
+ msg387069 |
2021-02-15 19:34:55 | orsenthil | set | status: open -> closed title: [security] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator -> [CVE-2021-23336] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator messages:
+ msg387049
resolution: fixed stage: patch review -> resolved |
2021-02-15 19:16:51 | ned.deily | set | messages:
+ msg387045 |
2021-02-15 18:34:20 | ned.deily | set | messages:
+ msg387040 |
2021-02-15 18:15:11 | lukasz.langa | set | messages:
+ msg387039 |
2021-02-15 18:03:42 | orsenthil | set | messages:
+ msg387037 |
2021-02-15 17:00:29 | orsenthil | set | messages:
+ msg387027 |
2021-02-15 15:11:10 | kj | set | pull_requests:
+ pull_request23323 |
2021-02-15 08:13:16 | vstinner | set | messages:
+ msg386980 |
2021-02-15 03:05:50 | orsenthil | set | pull_requests:
+ pull_request23319 |
2021-02-15 02:33:34 | orsenthil | set | pull_requests:
+ pull_request23318 |
2021-02-15 02:01:38 | orsenthil | set | pull_requests:
+ pull_request23316 |
2021-02-15 01:38:58 | orsenthil | set | pull_requests:
+ pull_request23315 |
2021-02-14 22:42:10 | orsenthil | set | messages:
+ msg386968 |
2021-02-14 18:26:29 | orsenthil | set | messages:
+ msg386960 |
2021-02-14 17:35:01 | eric.araujo | set | messages:
+ msg386957 |
2021-02-14 15:27:08 | orsenthil | set | messages:
+ msg386954 |
2021-02-10 15:41:52 | orsenthil | set | messages:
+ msg386788 |
2021-02-10 15:40:33 | orsenthil | set | messages:
+ msg386787 |
2021-02-10 15:14:53 | ned.deily | set | messages:
+ msg386785 |
2021-01-31 01:21:35 | orsenthil | set | assignee: orsenthil |
2021-01-31 01:21:25 | orsenthil | set | messages:
+ msg386003 |
2021-01-28 14:45:46 | ned.deily | set | priority: normal -> release blocker
nosy:
+ ned.deily, lukasz.langa
messages:
+ msg385865 |
2021-01-24 20:05:42 | eric.araujo | set | messages:
+ msg385590 |
2021-01-24 17:15:38 | AdamGold | set | messages:
+ msg385585 |
2021-01-24 14:51:09 | kj | set | messages:
+ msg385582 |
2021-01-23 22:28:26 | eric.araujo | set | messages:
+ msg385567 |
2021-01-23 22:25:20 | AdamGold | set | messages:
+ msg385566 |
2021-01-23 22:22:17 | AdamGold | set | messages:
+ msg385565 |
2021-01-23 17:12:21 | eric.araujo | set | messages:
+ msg385549 |
2021-01-23 16:42:00 | kj | set | messages:
+ msg385544 |
2021-01-23 10:13:39 | kj | set | messages:
+ msg385527 |
2021-01-22 20:55:57 | eric.araujo | set | nosy:
+ eric.araujo
messages:
+ msg385513
components:
+ Library (Lib), - C API |
2021-01-22 13:18:52 | AdamGold | set | messages:
+ msg385497 |
2021-01-22 12:58:27 | orsenthil | set | messages:
+ msg385496
title: [security] urllib.parse.parse_qsl(): Web cache poisoning - `;` as a query args separator -> [security] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator |
2021-01-22 12:53:33 | kj | set | messages:
+ msg385495 |
2021-01-22 12:34:49 | AdamGold | set | pull_requests:
+ pull_request23120 |
2021-01-21 04:06:30 | orsenthil | set | nosy:
+ orsenthil |
2021-01-20 16:06:30 | kj | set | messages:
+ msg385352 |
2021-01-20 15:20:51 | kj | set | keywords:
+ patch
nosy:
+ kj
pull_requests:
+ pull_request23094
stage: patch review |
2021-01-20 15:17:27 | vstinner | set | messages:
+ msg385346 |
2021-01-20 14:23:52 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka
messages:
+ msg385344 |
2021-01-20 14:16:33 | vstinner | set | messages:
+ msg385342 |
2021-01-20 14:15:31 | lemburg | set | messages:
+ msg385341 |
2021-01-20 14:12:38 | vstinner | set | title: [security] Web cache poisoning - `;` as a query args separator -> [security] urllib.parse.parse_qsl(): Web cache poisoning - `;` as a query args separator |
2021-01-20 12:04:10 | lemburg | set | title: Web cache poisoning - `;` as a query args separator -> [security] Web cache poisoning - `;` as a query args separator |
2021-01-20 12:02:44 | lemburg | set | nosy:
+ lemburg
messages:
+ msg385337
title: [security] urllib.parse.parse_qsl(): Web cache poisoning - `;` as a query args separator -> Web cache poisoning - `;` as a query args separator |
2021-01-20 11:09:07 | vstinner | set | title: urllib.parse.parse_qsl(): Web cache poisoning - `;` as a query args separator -> [security] urllib.parse.parse_qsl(): Web cache poisoning - `;` as a query args separator |
2021-01-20 11:08:56 | vstinner | set | title: Web cache poisoning - `;` as a query args separator -> urllib.parse.parse_qsl(): Web cache poisoning - `;` as a query args separator |
2021-01-20 11:07:32 | vstinner | set | nosy:
+ vstinner
messages:
+ msg385332 |
2021-01-20 11:06:54 | vstinner | link | issue42975 superseder |
2021-01-19 15:06:49 | AdamGold | create | |