Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

base64.encodestring does not actually accept strings #47863

Closed
ddvoinikov mannequin opened this issue Aug 20, 2008 · 12 comments
Closed

base64.encodestring does not actually accept strings #47863

ddvoinikov mannequin opened this issue Aug 20, 2008 · 12 comments
Assignees
Labels
docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@ddvoinikov
Copy link
Mannequin

ddvoinikov mannequin commented Aug 20, 2008

BPO 3613
Nosy @gvanrossum, @birkenfeld, @pitrou, @devdanzin
Files
  • get_host_info.diff: Use unquote_to_bytes to feed base64.encodestring, then decode
  • encodestring_rename.patch: base64.py with encodestring/decodestring renamed to encodebytes/decodebytes.
  • encodebytes_new_types.patch: encodestring/decodestring with new input/output types.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/birkenfeld'
    closed_at = <Date 2009-06-04.09:13:14.845>
    created_at = <Date 2008-08-20.06:00:33.285>
    labels = ['type-bug', 'library', 'docs']
    title = 'base64.encodestring does not actually accept strings'
    updated_at = <Date 2009-06-04.09:13:14.844>
    user = 'https://bugs.python.org/ddvoinikov'

    bugs.python.org fields:

    activity = <Date 2009-06-04.09:13:14.844>
    actor = 'georg.brandl'
    assignee = 'georg.brandl'
    closed = True
    closed_date = <Date 2009-06-04.09:13:14.845>
    closer = 'georg.brandl'
    components = ['Documentation', 'Library (Lib)']
    creation = <Date 2008-08-20.06:00:33.285>
    creator = 'ddvoinikov'
    dependencies = []
    files = ['12983', '13753', '13754']
    hgrepos = []
    issue_num = 3613
    keywords = ['patch']
    message_count = 12.0
    messages = ['71513', '71531', '71532', '71535', '71536', '71550', '81413', '81904', '86307', '86390', '86391', '88869']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'georg.brandl', 'pitrou', 'ajaksu2', 'kawai', 'ddvoinikov', 'mgiuca']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'test needed'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue3613'
    versions = ['Python 3.0']

    @ddvoinikov
    Copy link
    Mannequin Author

    ddvoinikov mannequin commented Aug 20, 2008

    This quote from base64.py:

    ---

    bytes_types = (bytes, bytearray)  # Types acceptable as binary data
    ...
    def encodestring(s):
        """Encode a string into multiple lines of base-64 data.
    Argument and return value are bytes.
    """
    if not isinstance(s, bytes_types):
        raise TypeError("expected bytes, not %s" % s.__class__.__name__)
    ...
    

    ---

    shows that encodestring method won't accept str for an argument, only
    bytes. Perhaps this is by design, but then wouldn't it make sense to
    change the name of the method ?

    Anyway, this behavior clashes in (the least I know) xmlrpc.client, line
    1168 when basic authentication is present:

    ---

    auth = base64.encodestring(urllib.parse.unquote(auth))

    because unquote() returns str, not bytes.

    @ddvoinikov ddvoinikov mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Aug 20, 2008
    @pitrou
    Copy link
    Member

    pitrou commented Aug 20, 2008

    The encodestring() function is refered to in the docs as "the legacy
    interface". Perhaps it should be simply deprecated in 3.0?

    @pitrou pitrou added the docs Documentation in the Doc dir label Aug 20, 2008
    @mgiuca
    Copy link
    Mannequin

    mgiuca mannequin commented Aug 20, 2008

    Hi Dmitry,

    RE the method behaviour: I think it probably is correct to NOT accept a
    string. Given that it's base64 encoding it, it only makes sense to
    encode bytes, not arbitrary Unicode characters which have no
    well-defined binary representation.

    RE the method name: I agree, it should be renamed to encodestring. I
    argued a similar case for the array.tostring and fromstring methods
    (which actually act on bytes in Python 3.0) - here:
    http://bugs.python.org/issue3565. So far nobody replied on that issue; I
    think it may be too late to rename them. Best we can do is document them.

    RE xmlrpc.client:1168. We just checked in a patch to urllib which adds
    an unquote_to_bytes function (see
    http://docs.python.org/dev/3.0/library/urllib.parse.html#urllib.parse.unquote_to_bytes).
    (Unquote itself still returns a string). It should be correct to just
    change xmlrpc.client:1168 to call urllib.parse.unquote_to_bytes. (Though
    I've not tested it).

    @ddvoinikov
    Copy link
    Mannequin Author

    ddvoinikov mannequin commented Aug 20, 2008

    I think it probably is correct to NOT accept a string

    I agree.

    it should be renamed to encodestring

    Huh ? It is already called that :) IMO it should be renamed to
    encodebytes or simply encode if the module is only (or most frequently)
    used to encode bytes.

    Best we can do is document them.

    Oh well.

    @mgiuca
    Copy link
    Mannequin

    mgiuca mannequin commented Aug 20, 2008

    > it should be renamed to encodestring
    Huh ? It is already called that :)

    Um ... yes. I mean encodebytes :)

    > Best we can do is document them.
    Oh well.

    But I don't know the rules. People are saying things like "no new
    features after beta3" but I take it that
    backwards-compatibility-breaking changes are included in this.

    But maybe it's still OK for us to break code after the beta. Perhaps
    someone involved in the release can comment on this issue (and hopefully
    with a view to my array patch - http://bugs.python.org/issue3565 - as well).

    @gvanrossum
    Copy link
    Member

    Did someone fix xmlrpc.client:1168 yet?

    IMO it's okay to add encodebytes(), but let's leave encodestring()
    around with a deprecation warning, since it's so late in the release cycle.

    @devdanzin
    Copy link
    Mannequin

    devdanzin mannequin commented Feb 8, 2009

    Here's a trivial patch for xmlrpc.client:1168. The testcase below
    doesn't seem to fit well in test_xmlrpc, should it just be hacked in?

    import xmlrpc.client
    transp = xmlrpc.client.Transport()
    transp.get_host_info("user@host.tld")

    @birkenfeld
    Copy link
    Member

    Applied the patch in r69575.

    @devdanzin
    Copy link
    Mannequin

    devdanzin mannequin commented Apr 22, 2009

    We still need to solve the encodebytes/encodestring stuff.

    @mgiuca
    Copy link
    Mannequin

    mgiuca mannequin commented Apr 24, 2009

    I've attached a patch which renames encodestring to encodebytes (keeping
    encodestring around as an alias). Updated test and documentation.

    I also renamed decodestring to decodebytes, because it also refuses to
    accept a string (only a bytes). I have an alternative suggestion, which
    I'll post in a separate comment (in a minute).

    @mgiuca
    Copy link
    Mannequin

    mgiuca mannequin commented Apr 24, 2009

    Now, base64.encodestring and decodestring seem a bit weird because the
    Base64 encoded string is also required to be a bytes.

    It seems to me that once something is Base64-encoded, it's considered to
    be ASCII text, not just some byte string, and therefore it should be a
    str, not a bytes. (For example, they end with a '\n'. That's something
    which strings do, not bytes).

    Hence, base64.encodestring (which should be "encodebytes") should take a
    bytes and return a str. base64.decodestring should take a str (required
    to be ASCII-only) and return a bytes.

    I've attached an alternative patch, encodebytes_new_types.patch (which,
    unlike my other patch, doesn't rename decodestring to decodebytes). This
    patch:

    • Renames encodestring to encodebytes.
    • Changes the output of encodebytes to return an ASCII str*, not a bytes.
    • Changes the input of decodestring to accept an ASCII str, not a bytes.
    • An ASCII str is a Unicode string with only ASCII characters.

    This isn't a proper patch (it breaks a lot of other code which I haven't
    bothered to fix). I'm just submitting it as an idea, in case this is
    something we want to do. Most likely not, due to the breakage. Also we
    have the same problem for the non-legacy functions, b64encode and
    b64decode, etc, so the problem is more widespread than just these two
    functions.

    @birkenfeld
    Copy link
    Member

    Applied a patch to rename (and keep old aliases) in r73204.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants