binascii b2a functions accept strings (unicode) as data #48637

Closed
terryjreedy opened this issue Nov 22, 2008 · 7 comments
Labels
extension-modules C modules in the Modules dir release-blocker

Comments

@terryjreedy
Member

BPO 4387
Nosy @loewis, @warsaw, @birkenfeld, @terryjreedy, @pitrou
Files
  • reqbytes.diff
    Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2008-12-02.06:00:34.774>
    created_at = <Date 2008-11-22.00:41:18.708>
    labels = ['extension-modules', 'release-blocker']
    title = 'binascii b2a functions accept strings (unicode) as data'
    updated_at = <Date 2008-12-02.06:00:34.735>
    user = 'https://github.com/terryjreedy'

    bugs.python.org fields:

    activity = <Date 2008-12-02.06:00:34.735>
    actor = 'loewis'
    assignee = 'none'
    closed = True
    closed_date = <Date 2008-12-02.06:00:34.774>
    closer = 'loewis'
    components = ['Extension Modules']
    creation = <Date 2008-11-22.00:41:18.708>
    creator = 'terry.reedy'
    dependencies = []
    files = ['12167']
    hgrepos = []
    issue_num = 4387
    keywords = ['patch', 'needs review']
    message_count = 7.0
    messages = ['76226', '76233', '76628', '76629', '76639', '76662', '76724']
    nosy_count = 5.0
    nosy_names = ['loewis', 'barry', 'georg.brandl', 'terry.reedy', 'pitrou']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue4387'
    versions = ['Python 3.0']

    @terryjreedy
    Member Author

    Binascii b2a_xxx functions accept 'binary data' and return ascii-encoded
    bytes. The corresponding a2b_xxx functions turn the ascii-encoded bytes
    back to 'binary data' (bytes). If the binary data is bytes, these
    should be inverses of each other.

    Somewhat surprisingly to me (because the corresponding base64 module
    functions raise "TypeError: expected bytes, not str") 3.0 strings
    (unicode) are accepted as 'binary data', though they will not 'round-trip'.

    Ascii chars almost do
    >>> import binascii as b
    >>> a='aaaa'
    >>> c=b.b2a_base64(a)
    >>> c
    b'YWFhYQ==\n'
    >>> d=b.a2b_base64(c)
    >>> d
    b'aaaa'
    
    But general unicode chars generate nonsense.
    >>> a='\u1000'
    >>> c=b.b2a_base64(a)
    >>> c
    b'4YCA\n'
    >>> d=b.a2b_base64(c)
    >>> d
    b'\xe1\x80\x80'

    I also tried b2a_uu.

    Is this a bug?
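For comparison, here is a minimal sketch of the same round trip with the text encoded to bytes explicitly, so that the b2a and a2b calls really are inverses. It assumes a current Python 3 interpreter; the helper name `roundtrip_text` and the choice of UTF-8 are illustrative assumptions, not part of binascii.

```python
import binascii


def roundtrip_text(text, encoding="utf-8"):
    """Hypothetical helper: encode text explicitly, base64 it, decode it back."""
    raw = text.encode(encoding)               # str -> bytes, chosen by the caller
    ascii_encoded = binascii.b2a_base64(raw)  # bytes -> ASCII-encoded bytes
    decoded = binascii.a2b_base64(ascii_encoded)  # back to the original bytes
    return decoded.decode(encoding)           # bytes -> str


if __name__ == "__main__":
    print(roundtrip_text("aaaa") == "aaaa")
    print(roundtrip_text("\u1000") == "\u1000")
```

With the encoding step made explicit, the intermediate value for '\u1000' is the same b'4YCA\n' shown above, but the caller decides which encoding is applied.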

    @terryjreedy terryjreedy added the extension-modules C modules in the Modules dir label Nov 22, 2008
    @birkenfeld
    Member

    I vote yes.

    @pitrou
    Member

    pitrou commented Nov 29, 2008

    It's not /exactly/ nonsense; it seems to assume that a utf8 encoding pass
    is necessary:

    >>> b'\xe1\x80\x80'.decode('utf8') == '\u1000'
    True

    IMO, while accepting unicode strings instead of bytes for the a2b_xx
    functions is understandable (because in practice only ASCII characters
    are allowed), it is not acceptable for b2a_xx functions to accept
    unicode strings instead of bytes.

    In other words, it might/should be ok for
    binascii.a2b_base64('YWFh\n') to return the same as
    binascii.a2b_base64('YWFh\n') (that is, b'aaa'), but
    binascii.b2a_base64('aaa') should raise a TypeError rather than
    applying a utf8 encoding pass before doing the actual b2a encoding.

    I think this must be fixed before 3.0 final, and is therefore a release
    blocker.

    @pitrou
    Member

    pitrou commented Nov 29, 2008

    Hmm, I obviously meant:

    [...] In other words, it might/should be ok for
    binascii.a2b_base64('YWFh\n') to return the same as
    binascii.a2b_base64(b'YWFh\n') (that is, b'aaa') [...]
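The asymmetry proposed above can be checked with a short sketch. It assumes a current Python 3 interpreter (the a2b functions are documented as accepting ASCII-only str since 3.3, which is later than this thread); the helper name `b2a_rejects` is hypothetical.

```python
import binascii


def b2a_rejects(data):
    """Hypothetical helper: True if b2a_base64 refuses the input with TypeError."""
    try:
        binascii.b2a_base64(data)
    except TypeError:
        return True
    return False


# a2b direction: ASCII-only str and bytes decode to the same payload.
same_payload = binascii.a2b_base64("YWFh\n") == binascii.a2b_base64(b"YWFh\n")

if __name__ == "__main__":
    print(same_payload)
    print(b2a_rejects("aaa"))   # str input to the b2a direction
    print(b2a_rejects(b"aaa"))  # bytes input to the b2a direction
```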

    @loewis
    Mannequin

    loewis mannequin commented Nov 30, 2008

    Here is a patch that fixes the problem.

    Notice that this affects the email API; base64mime.body_encode now also
    requires bytes (whereas quoprimime remains unchanged).

    There are probably more functions that still incorrectly accept strings,
    e.g. zlib.crc32.
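As an illustration of the same concern for zlib.crc32, the sketch below encodes text explicitly before checksumming instead of relying on any implicit conversion. It assumes a current Python 3 interpreter, where passing str raises TypeError; the helper names `crc32_of_text` and `rejects_str` are hypothetical and not part of zlib or of the patch.

```python
import zlib


def crc32_of_text(text, encoding="utf-8"):
    """Hypothetical helper: checksum text after an explicit encoding step."""
    return zlib.crc32(text.encode(encoding))


def rejects_str(func, value):
    """Hypothetical helper: True if func raises TypeError for the given value."""
    try:
        func(value)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(crc32_of_text("abc") == zlib.crc32(b"abc"))
    print(rejects_str(zlib.crc32, "abc"))
```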

    @warsaw
    Member

    warsaw commented Nov 30, 2008

    Martin, the patch looks okay to me. I vote for applying it.

    @loewis
    Mannequin

    loewis mannequin commented Dec 2, 2008

    Committed as r67472.

    @loewis loewis mannequin closed this as completed Dec 2, 2008
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022