2to3 fixers for missing codecs #62023

Closed
serhiy-storchaka opened this issue Apr 23, 2013 · 10 comments

@serhiy-storchaka
Member

BPO 17823
Nosy @malemburg, @ncoghlan, @vstinner, @benjaminp, @ezio-melotti, @meadori, @vadmium, @serhiy-storchaka
Files
  • issue17823_lib2to3_fixer_for_binary_codecs.diff: Proof of concept that just implements the "encode" case
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = <Date 2013-11-10.09:22:10.963>
    created_at = <Date 2013-04-23.18:11:54.352>
    labels = ['type-feature', 'expert-2to3']
    title = '2to3 fixers for missing codecs'
    updated_at = <Date 2013-11-10.09:22:10.961>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2013-11-10.09:22:10.961>
    actor = 'ncoghlan'
    assignee = 'ncoghlan'
    closed = True
    closed_date = <Date 2013-11-10.09:22:10.963>
    closer = 'ncoghlan'
    components = ['2to3 (2.x to 3.x conversion tool)']
    creation = <Date 2013-04-23.18:11:54.352>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['32515']
    hgrepos = []
    issue_num = 17823
    keywords = ['patch']
    message_count = 10.0
    messages = ['187662', '187766', '187767', '187768', '187772', '202263', '202266', '202267', '202328', '202514']
    nosy_count = 8.0
    nosy_names = ['lemburg', 'ncoghlan', 'vstinner', 'benjamin.peterson', 'ezio.melotti', 'meador.inge', 'martin.panter', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue17823'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @serhiy-storchaka
    Member Author

    Quoting Victor Stinner from msg106669:

    """
    It may be possible to write some 2to3 fixers for the following examples:

    "...".encode("base64") => base64.b64encode("...")
    "...".encode("rot13") => do nothing (but display a warning?)
    "...".encode("zlib") => zlib.compress("...")
    "...".encode("hex") => base64.b16encode("...")
    "...".encode("bz2") => bz2.compress("...")

    "...".decode("base64") => base64.b64decode("...")
    "...".decode("rot13") => do nothing (but display a warning?)
    "...".decode("zlib") => zlib.decompress("...")
    "...".decode("hex") => base64.b16decode("...")
    "...".decode("bz2") => bz2.decompress("...")
    """

    Unfortunately, I don't know where the syntax for writing fixers is documented.
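For reference, a minimal sketch of what the module-level replacements listed above look like when run on Python 3. The sample payload `data` is an assumption for illustration. Note that `base64.b16encode` emits uppercase hex digits, so it is not a byte-for-byte stand-in for every hex encoder; treat the mapping as approximate rather than exact.

```python
import base64
import bz2
import zlib

# Hypothetical sample payload; any bytes object works.
data = b"hello world"

# Module-level equivalents of the codec-based spellings.
encoded_b64 = base64.b64encode(data)   # instead of data.encode("base64")
encoded_hex = base64.b16encode(data)   # instead of data.encode("hex")
packed_zlib = zlib.compress(data)      # instead of data.encode("zlib")
packed_bz2 = bz2.compress(data)        # instead of data.encode("bz2")

# Each transform round-trips through its matching decode function.
assert base64.b64decode(encoded_b64) == data
assert base64.b16decode(encoded_hex) == data
assert zlib.decompress(packed_zlib) == data
assert bz2.decompress(packed_bz2) == data
```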

    @serhiy-storchaka serhiy-storchaka added topic-2to3 type-feature A feature request or enhancement labels Apr 23, 2013
    @serhiy-storchaka serhiy-storchaka changed the title 2to3 fixers for 2to3 fixers for missing codecs Apr 23, 2013
    @ncoghlan
    Contributor

    A more consistent alternative conversion:

    "...".encode("base64") => codecs.encode("...", "base64_codec")
    "...".encode("rot13") => codecs.encode("...", "rot_13")
    "...".encode("zlib") => codecs.encode("...", "zlib_codec")
    "...".encode("hex") => codecs.encode("...", "hex_codec")
    "...".encode("bz2") => codecs.encode("...", "bz2_codec")

    "...".decode("base64") => codecs.decode("...", "base64_codec")
    "...".decode("rot13") => codecs.decode("...", "rot_13")
    "...".decode("zlib") => codecs.decode("...", "zlib_codec")
    "...".decode("hex") => codecs.decode("...", "hex_codec")
    "...".decode("bz2") => codecs.decode("...", "bz2_codec")
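A short sketch of the `codecs`-based spelling on Python 3, using the full codec names from the list above. The sample values are assumptions for illustration. The bytes-to-bytes codecs take `bytes` input, while `rot_13` is a str-to-str transform.

```python
import codecs

# Hypothetical sample payload.
data = b"hello world"

# Bytes-to-bytes codecs, addressed by their full names.
b64 = codecs.encode(data, "base64_codec")
hexed = codecs.encode(data, "hex_codec")
zipped = codecs.encode(data, "zlib_codec")
bzipped = codecs.encode(data, "bz2_codec")

# rot_13 operates on text rather than bytes.
rotated = codecs.encode("hello", "rot_13")

# codecs.decode reverses each transform with the same codec name.
assert codecs.decode(b64, "base64_codec") == data
assert codecs.decode(hexed, "hex_codec") == data
assert codecs.decode(zipped, "zlib_codec") == data
assert codecs.decode(bzipped, "bz2_codec") == data
assert codecs.decode(rotated, "rot_13") == "hello"
```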

    @malemburg
    Member

    On 25.04.2013 10:14, Nick Coghlan wrote:

    > Nick Coghlan added the comment:
    >
    > A more consistent alternative conversion:
    >
    > "...".encode("base64") => codecs.encode("...", "base64_codec")
    > "...".encode("rot13") => codecs.encode("...", "rot_13")
    > "...".encode("zlib") => codecs.encode("...", "zlib_codec")
    > "...".encode("hex") => codecs.encode("...", "hex_codec")
    > "...".encode("bz2") => codecs.encode("...", "bz2_codec")
    >
    > "...".decode("base64") => codecs.decode("...", "base64_codec")
    > "...".decode("rot13") => codecs.decode("...", "rot_13")
    > "...".decode("zlib") => codecs.decode("...", "zlib_codec")
    > "...".decode("hex") => codecs.decode("...", "hex_codec")
    > "...".decode("bz2") => codecs.decode("...", "bz2_codec")

    It would be better to add back the aliases we had for these codecs.

    @ncoghlan
    Contributor

    Sure, that's what bpo-7475 is about, and if we do that, then the fixers can be simplified to just replace the method with the function call for the known set of non-text-model related codecs.

    However, I also wanted to note what the conversions should look like for a version of the fixer that stays compatible with 3.2+, rather than relying on the aliases being restored in 3.4.

    @serhiy-storchaka
    Member Author

    > A more consistent alternative conversion:

    What advantages does codecs.encode("...", "base64_codec") have compared with base64.b64encode("...")? The latter is at least more portable and more powerful (it allows you to specify altchars).

    I think the main problem with bpo-7475 is that people don't think about a different (and actually the most obvious) way to do base64 encoding.
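To make the comparison concrete, here is a small sketch on Python 3 of two observable differences between the spellings: the `base64_codec` output ends with a newline, while `base64.b64encode` produces none, and only the latter accepts `altchars`. The two-byte payload is an assumption chosen because it encodes to the `+` and `/` characters.

```python
import base64
import codecs

# Hypothetical payload whose standard encoding contains both '+' and '/'.
payload = b"\xfb\xff"

# The codec-based form appends a trailing newline to its output.
via_codec = codecs.encode(payload, "base64_codec")

# The function form returns the bare encoding.
via_function = base64.b64encode(payload)

# altchars swaps the two non-alphanumeric characters, here for '-' and '_'.
url_friendly = base64.b64encode(payload, altchars=b"-_")

assert via_codec == via_function + b"\n"
assert base64.b64decode(url_friendly, altchars=b"-_") == payload
```

The codec interface takes only the data and an error handler, so an option such as `altchars` has nowhere to go in that spelling.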

    @ncoghlan ncoghlan self-assigned this Nov 6, 2013
    @ncoghlan
    Contributor

    ncoghlan commented Nov 6, 2013

    Switch direction of dependency to make this fixer rely on restoring the codec aliases in bpo-7475 first.

    @ncoghlan
    Contributor

    ncoghlan commented Nov 6, 2013

    Attached diff shows a proof of concept fixer for the encode case.

    It could be adjusted fairly easily to also handle decode methods (by including an alternative in the pattern and also capturing the method name).

    I'm sure how useful such a fixer would be in practice, though, since it only triggers when the codec name is passed as a literal - passing in a variable instead keeps it from firing.
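The literal-versus-variable limitation can be illustrated without lib2to3. The sketch below is not the attached fixer: it uses the standard `ast` module instead of a lib2to3 pattern, and the function name `find_literal_codec_calls` and the sample source are hypothetical. It reports only calls whose codec name is a string literal, which is the same blind spot any purely syntactic tool has.

```python
import ast

# Codec names taken from the lists earlier in this thread.
BINARY_CODECS = {"base64", "rot13", "zlib", "hex", "bz2"}


def find_literal_codec_calls(source):
    """Return sorted (line, method, codec) tuples for .encode()/.decode()
    calls whose first argument is a string literal naming a binary codec."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only method-style calls such as obj.encode(...) are of interest.
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in ("encode", "decode") or not node.args:
            continue
        arg = node.args[0]
        # A variable shows up as ast.Name, not ast.Constant, so it is skipped.
        if (isinstance(arg, ast.Constant) and isinstance(arg.value, str)
                and arg.value in BINARY_CODECS):
            hits.append((node.lineno, node.func.attr, arg.value))
    return sorted(hits)


# Hypothetical input: line 3 passes the codec name through a variable.
sample = (
    'a = data.encode("base64")\n'
    'name = "base64"\n'
    'b = data.encode(name)\n'
    'c = blob.decode("zlib")\n'
)

found = find_literal_codec_calls(sample)
```

Here `found` contains the calls on lines 1 and 4 only; the call on line 3 goes undetected because the codec name is known only at run time.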

    @ncoghlan
    Contributor

    ncoghlan commented Nov 6, 2013

    On 7 November 2013 00:06, Nick Coghlan <report@bugs.python.org> wrote:

    > I'm sure how useful such a fixer would be in practice, though, since it only triggers when the codec name is passed as a literal - passing in a variable instead keeps it from firing.

    Oops, that should say "I'm *not* sure" :)

    @ncoghlan
    Contributor

    ncoghlan commented Nov 7, 2013

    After thinking about this some more, perhaps a -3 warning in 2.7 would be a
    better solution? That would be more robust, as it could complain any time
    unicode.encode produced unicode and str.decode produced str and point users
    to the codecs module level functions as a forward compatible alternative.

    Producing Py3k warnings when calling unicode.decode and str.encode under -3
    would also be appropriate (although those warnings may already exist).
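As a rough analogy for the run-time approach, the sketch below checks the type of the result rather than the spelling of the call. It is written for Python 3 and is not the 2.7 `-3` implementation; the helper name `checked_encode` and the warning text are hypothetical. Because it inspects actual data, it also catches codec names that arrive through variables.

```python
import codecs
import warnings


def checked_encode(text, encoding):
    """Encode via codecs.encode and warn when a str input does not
    produce bytes, i.e. when the codec is not a text encoding."""
    result = codecs.encode(text, encoding)
    if isinstance(text, str) and not isinstance(result, bytes):
        warnings.warn(
            "codec %r is not a text encoding; "
            "call codecs.encode() explicitly" % encoding,
            DeprecationWarning,
            stacklevel=2,
        )
    return result


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    codec_name = "rot_13"  # supplied through a variable, not a literal
    transformed = checked_encode("hello", codec_name)  # str -> str, warns
    as_bytes = checked_encode("hello", "utf-8")        # str -> bytes, silent

warning_count = len(caught)
```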

    @ncoghlan
    Contributor

    Due to the data driven nature of this particular incompatibility, I'm rejecting this in favour of the Py3k warning based approach in bpo-19543.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022