classification
Title: 2to3 fixers for missing codecs
Type: enhancement Stage: resolved
Components: 2to3 (2.x to 3.x conversion tool) Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: benjamin.peterson, ezio.melotti, lemburg, martin.panter, meador.inge, ncoghlan, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2013-04-23 18:11 by serhiy.storchaka, last changed 2013-11-10 09:22 by ncoghlan. This issue is now closed.

Files
File name Uploaded Description
issue17823_lib2to3_fixer_for_binary_codecs.diff ncoghlan, 2013-11-06 14:06 Proof of concept that just implements the "encode" case
Messages (10)
msg187662 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-04-23 18:11
Quoting Victor Stinner from msg106669:

"""
It may be possible to write some 2to3 fixers for the following examples:

"...".encode("base64") => base64.b64encode("...")
"...".encode("rot13") => do nothing (but display a warning?)
"...".encode("zlib") => zlib.compress("...")
"...".encode("hex") => base64.b16encode("...")
"...".encode("bz2") => bz2.compress("...")

"...".decode("base64") => base64.b64decode("...")
"...".decode("rot13") => do nothing (but display a warning?)
"...".decode("zlib") => zlib.decompress("...")
"...".decode("hex") => base64.b16decode("...")
"...".decode("bz2") => bz2.decompress("...")
"""

Unfortunately, I don't know where the syntax for writing fixers is documented.
msg187766 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-04-25 08:14
A more consistent alternative conversion:

"...".encode("base64") => codecs.encode("...", "base64_codec")
"...".encode("rot13") => codecs.encode("...", "rot_13")
"...".encode("zlib") => codecs.encode("...", "zlib_codec")
"...".encode("hex") => codecs.encode("...", "hex_codec")
"...".encode("bz2") => codecs.encode("...", "bz2_codec")

"...".decode("base64") => codecs.decode("...", "base64_codec")
"...".decode("rot13") => codecs.decode("...", "rot_13")
"...".decode("zlib") => codecs.decode("...", "zlib_codec")
"...".decode("hex") => codecs.decode("...", "hex_codec")
"...".decode("bz2") => codecs.decode("...", "bz2_codec")
msg187767 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2013-04-25 08:18
On 25.04.2013 10:14, Nick Coghlan wrote:
> 
> Nick Coghlan added the comment:
> 
> A more consistent alternative conversion:
> 
> "...".encode("base64") => codecs.encode("...", "base64_codec")
> "...".encode("rot13") => codecs.encode("...", "rot_13")
> "...".encode("zlib") => codecs.encode("...", "zlib_codec")
> "...".encode("hex") => codecs.encode("...", "hex_codec")
> "...".encode("bz2") => codecs.encode("...", "bz2_codec")
> 
> "...".decode("base64") => codecs.decode("...", "base64_codec")
> "...".decode("rot13") => codecs.decode("...", "rot_13")
> "...".decode("zlib") => codecs.decode("...", "zlib_codec")
> "...".decode("hex") => codecs.decode("...", "hex_codec")
> "...".decode("bz2") => codecs.decode("...", "bz2_codec")

It would be better to add back the aliases we had for these codecs.
msg187768 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-04-25 08:28
Sure, that's what issue 7475 is about, and if we do that, then the fixers can be simplified to just replace the method with the function call for the known set of non-text-model related codecs.

However, I also wanted to note what the fixers should look like if they are to provide compatibility with 3.2+ rather than relying on the aliases being restored in 3.4.
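For reference, the conversion targets listed in msg187766 can be exercised directly with the module-level functions. The following is only a minimal sketch run under Python 3 with an arbitrary sample payload; it is not taken from any patch on this issue.

```python
import codecs

data = b"hello world"  # arbitrary sample payload

# Binary transforms: bytes in, bytes out, addressed by their full codec names.
hex_encoded = codecs.encode(data, "hex_codec")
base64_encoded = codecs.encode(data, "base64_codec")

# Each binary codec named in the list should round-trip the payload.
round_trips = {
    name: codecs.decode(codecs.encode(data, name), name) == data
    for name in ("base64_codec", "hex_codec", "zlib_codec", "bz2_codec")
}

# rot_13 is a text transform: str in, str out.
rot_encoded = codecs.encode("hello", "rot_13")
rot_decoded = codecs.decode(rot_encoded, "rot_13")
```

Note that the base64 codec appends a trailing newline to its output, which `base64.b64encode` does not.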
msg187772 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-04-25 11:00
> A more consistent alternative conversion:

What advantages does `codecs.encode("...", "base64_codec")` have compared with `base64.b64encode("...")`? The latter is at least more portable and powerful (it allows specifying altchars).

I think the main problem with issue 7475 is that people don't think about a different (actually the most obvious) way to do base64 encoding.
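To make the altchars point concrete, here is a small Python 3 sketch comparing the two spellings. The three sample bytes are chosen only because they encode to the characters that altchars replaces.

```python
import base64
import codecs

data = b"\xfb\xff\xfe"  # sample bytes that encode to '+' and '/' characters

# Standard alphabet via the base64 module.
standard = base64.b64encode(data)

# Alternative alphabet: '+' and '/' are replaced by the two altchars bytes.
urlsafe = base64.b64encode(data, altchars=b"-_")

# The codec offers no way to pick an alphabet and appends a newline.
via_codec = codecs.encode(data, "base64_codec")
```

The codec form has the advantage of a uniform call shape across all five transforms, while the `base64` functions expose options the codec cannot express.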
msg202263 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-11-06 12:40
Switch direction of dependency to make this fixer rely on restoring the codec aliases in issue 7475 first.
msg202266 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-11-06 14:06
Attached diff shows a proof of concept fixer for the encode case.

It could be adjusted fairly easily to also handle decode methods (by including an alternative in the pattern and also capturing the method name).

I'm sure how useful such a fixer would be in practice, though, since it only triggers when the codec name is passed as a literal - passing in a variable instead keeps it from firing.
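The literal-only limitation can be illustrated without lib2to3. The sketch below is not the attached patch: it uses the standard `ast` module as a stand-in for a fixer pattern, and the `BINARY_CODECS` set and the function name are hypothetical, chosen from the codec names listed earlier in this issue.

```python
import ast

# Hypothetical set of codec names, taken from the list in this issue.
BINARY_CODECS = {"base64", "hex", "zlib", "bz2", "rot13"}


def find_literal_codec_calls(source):
    """Return (lineno, method, codec) for each .encode()/.decode() call
    whose first positional argument is a string literal naming one of
    the codecs above."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in ("encode", "decode")
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
            and node.args[0].value in BINARY_CODECS
        ):
            hits.append((node.lineno, node.func.attr, node.args[0].value))
    return sorted(hits)


sample = '''
a = data.encode("hex")
name = "hex"
b = data.encode(name)
c = blob.decode("zlib")
'''

hits = find_literal_codec_calls(sample)
```

The call on the fourth line of `sample` passes the codec name through a variable, so a purely syntactic matcher like this one never sees it.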
msg202267 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-11-06 14:18
On 7 November 2013 00:06, Nick Coghlan <report@bugs.python.org> wrote:
> I'm sure how useful such a fixer would be in practice, though, since it only triggers when the codec name is passed as a literal - passing in a variable instead keeps it from firing.

Oops, that should say "I'm *not* sure" :)
msg202328 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-11-07 12:23
After thinking about this some more, perhaps a -3 warning in 2.7 would be a
better solution? That would be more robust, as it could complain any time
unicode.encode produced unicode and str.decode produced str and point users
to the codecs module level functions as a forward compatible alternative.

Producing Py3k warnings when calling unicode.decode and str.encode under -3
would also be appropriate (although those warnings may already exist).
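A runtime check of this kind looks at the type of the result rather than at the source code. The following Python 3 sketch only illustrates that idea; it is not the Python 2.7 implementation and not the change made in issue 19543, and the `encode_checked` helper and its warning text are hypothetical.

```python
import codecs
import warnings


def encode_checked(obj, encoding):
    """Encode via codecs.encode() and warn when text goes in and text
    comes out, which str.encode() in Python 3 does not allow."""
    result = codecs.encode(obj, encoding)
    if isinstance(obj, str) and isinstance(result, str):
        warnings.warn(
            "%r is a text-to-text transform; use codecs.encode() "
            "instead of the encode() method" % encoding,
            DeprecationWarning,
            stacklevel=2,
        )
    return result
```

Because the check runs on the actual result, it also fires when the codec name arrives through a variable, which is the case a literal-matching fixer misses.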
msg202514 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-11-10 09:22
Due to the data driven nature of this particular incompatibility, I'm rejecting this in favour of the Py3k warning based approach in issue 19543.
History
Date User Action Args
2013-11-10 09:22:10 ncoghlan set status: open -> closed
messages: + msg202514
dependencies: - codecs missing: base64 bz2 hex zlib hex_codec ...
resolution: rejected
stage: resolved
2013-11-08 05:24:57 martin.panter set nosy: + martin.panter
2013-11-07 12:37:06 ezio.melotti set nosy: + ezio.melotti
2013-11-07 12:23:12 ncoghlan set messages: + msg202328
2013-11-06 15:17:28 meador.inge set nosy: + meador.inge
2013-11-06 14:18:45 ncoghlan set messages: + msg202267
2013-11-06 14:06:21 ncoghlan set files: + issue17823_lib2to3_fixer_for_binary_codecs.diff
keywords: + patch
messages: + msg202266
2013-11-06 12:41:41 ncoghlan unlink issue7475 dependencies
2013-11-06 12:40:42 ncoghlan set dependencies: + codecs missing: base64 bz2 hex zlib hex_codec ...
messages: + msg202263
2013-11-06 12:23:43 ncoghlan set assignee: ncoghlan
2013-04-25 11:00:00 serhiy.storchaka set messages: + msg187772
2013-04-25 08:28:01 ncoghlan set messages: + msg187768
2013-04-25 08:18:04 lemburg set nosy: + lemburg
messages: + msg187767
2013-04-25 08:14:29 ncoghlan set nosy: + ncoghlan
messages: + msg187766
2013-04-25 07:53:34 serhiy.storchaka link issue7475 dependencies
2013-04-23 18:13:33 serhiy.storchaka set title: 2to3 fixers for -> 2to3 fixers for missing codecs
2013-04-23 18:12:25 serhiy.storchaka set nosy: + vstinner
2013-04-23 18:11:54 serhiy.storchaka create