Message 328786 - Python tracker

Author	Henry Zhu
Recipients	Henry Zhu
Date	2018-10-29.03:55:53
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1540785354.09.0.788709270274.issue35100@psf.upfronthosting.co.za>
In-reply-to

Content

`urllib.parse.unquote_to_bytes` should have an "escape plus" option that converts "+" into a space, the way `urllib.parse.unquote_plus` does.

This is needed in some cases:

```
# Say I have a URL string: 'a+%2b%c0'.
# In Python 2, urllib.unquote_plus parses it into b'a +\xc0'.
# Note that the first "+" was converted to a space, and the second "+" was decoded from "%2b".
# In Python 3, this can't be done directly with urllib.parse.unquote, urllib.parse.unquote_plus, or urllib.parse.unquote_to_bytes.
# Here is an example:

>>> from urllib import parse
>>> s = 'a+%2b%c0'
>>> parse.unquote(s)
'a++�'
>>> parse.unquote_plus(s)
'a +�'
>>> parse.unquote_to_bytes(s)
b'a++\xc0'
```

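PS: the character "�" is where I expected "À" (byte 0xC0 in Latin-1). "%c0" on its own is not valid UTF-8, so `unquote` and `unquote_plus` substitute the replacement character under their default `encoding='utf-8', errors='replace'`.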

The result of `urllib.parse.unquote_to_bytes` is almost what I want, except that it does not convert the first "+" into a space.
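
Until such an option exists, one possible workaround is to replace "+" with a space before calling `unquote_to_bytes`. The sketch below assumes that approach; the helper name `unquote_plus_to_bytes` is hypothetical and not part of `urllib.parse`.

```python
from urllib.parse import unquote_to_bytes


def unquote_plus_to_bytes(s):
    """Hypothetical helper: like unquote_to_bytes, but "+" becomes a space.

    The literal "+" characters are replaced first, so a "+" that was
    percent-encoded as "%2b" is still decoded to a real "+" afterwards.
    """
    if isinstance(s, str):
        s = s.replace('+', ' ')
    else:
        # unquote_to_bytes also accepts bytes input.
        s = s.replace(b'+', b' ')
    return unquote_to_bytes(s)


result = unquote_plus_to_bytes('a+%2b%c0')
print(result)
```

For the string in this report, this should give the same bytes that Python 2's `urllib.unquote_plus` returned.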

History
Date	User	Action	Args
2018-10-29 03:55:54	Henry Zhu	set	recipients: + Henry Zhu
2018-10-29 03:55:54	Henry Zhu	set	messageid: <1540785354.09.0.788709270274.issue35100@psf.upfronthosting.co.za>
2018-10-29 03:55:54	Henry Zhu	link	issue35100 messages
2018-10-29 03:55:53	Henry Zhu	create