classification
Title: When using `python -bb`, `struct.calcsize` raises a warning when used with str argument after being used with bytes (might be a larger problem with dicts)
Type: behavior
Stage:
Components: Interpreter Core, Library (Lib)
Versions: Python 3.8
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: The Compiler, mark.dickinson, meador.inge, serhiy.storchaka, tadeu
Priority: normal
Keywords:

Created on 2020-09-13 13:23 by tadeu, last changed 2020-09-13 14:24 by The Compiler.

Messages (4)
msg376836 - Author: Edson Tadeu M. Manoel (tadeu) Date: 2020-09-13 13:23
Here is the inconsistent behavior, when running with `python -bb` (or just `python -b`), caused by an internal cache:

    >>> import struct
    >>> struct.calcsize(b'!d')  # cache for '!d' uses bytes
    8
    >>> struct.calcsize('!d')  # so there's a warning when trying to use str
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BytesWarning: Comparison between bytes and string
    >>> struct.calcsize('>d')  # cache for '>d' uses str
    8
    >>> struct.calcsize(b'>d')  # so now the warning is inverted, it shows up when trying to use bytes
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BytesWarning: Comparison between bytes and string
    >>> struct.calcsize('>d')  # no problem when using str
    8

Note that this might be caused by a possible larger problem when dealing with keys of different string types in dicts under `python -b` (or `python -bb`):

    $ python
    Python 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> d={}
    >>> d['a']=1
    >>> d[b'a']=2
    >>> d['a']
    1
    >>> d[b'a']
    2


    $ python -bb
    Python 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> d={}
    >>> d['a']=1
    >>> d[b'a']=2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BytesWarning: Comparison between bytes and string

I'm not sure if this warning is intentional, since in Python 3 there seems to be no special reason for dicts to try to compare 'a' with b'a' (other than possible implementation details).


Note: this is from an issue found here: https://github.com/pytest-dev/pytest-xdist/issues/596
msg376837 - Author: Florian Bruhin (The Compiler) Date: 2020-09-13 13:28
Taking the liberty of adding people involved in the `struct` module to the nosy list.
msg376838 - Author: Edson Tadeu M. Manoel (tadeu) Date: 2020-09-13 13:35
> I'm not sure if this warning is intentional, since in Python 3 there seems to be no special reason for dicts to try to compare 'a' with b'a' (other than possible implementation details).

Okay, there's one special reason, it's the fact that 'a' and b'a' have the same hash. I'm not sure about the expected behavior, though.
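
To make the hash point concrete, here is a small sketch. It assumes CPython, where an ASCII-only str and the matching bytes object hash to the same value; that is an implementation detail, not something the language guarantees. The variable names are only for illustration. Run it without `-b`/`-bb`; with `-bb` the explicit `==` line already raises the BytesWarning.

```python
# Sketch only: relies on a CPython implementation detail, not a language guarantee.
text_key = 'a'
bytes_key = b'a'

# On CPython, ASCII-only str and the matching bytes object share a hash value,
# so a dict has to compare them to tell the two keys apart.
same_hash = hash(text_key) == hash(bytes_key)

# The keys are never equal; this is the str/bytes comparison that -b / -bb reports.
keys_equal = text_key == bytes_key

# Without -b / -bb the dict silently keeps two separate entries.
d = {text_key: 1, bytes_key: 2}
entry_count = len(d)
```

So the comparison is a consequence of the hash collision: the dict needs an equality check to tell the two keys apart, and that check is what trips the warning.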
msg376839 - Author: Florian Bruhin (The Compiler) Date: 2020-09-13 14:24
Ah, also see https://bugs.python.org/issue21071#msg292409 where the same thing was mentioned as part of another issue as well.

After some discussions in the Python IRC channel, I guess it's acceptable for dicts to raise a BytesWarning here - after all, there *is* a comparison between str/bytes going on here. It might be an implementation detail, but so is e.g. `b'a' in ['a']`, and I'd certainly expect that to give me a warning/error with -b/-bb.

So I guess if struct continues to accept bytes as a format string, it should probably decode them as ASCII (or something similar) before interacting with the cache?
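
Until something like that happens inside `struct` itself, the same idea can be sketched on the caller's side. This is only a workaround under assumptions: `calcsize_str` is a hypothetical helper name, not part of the `struct` API, and it only avoids the mixed-type cache lookups if every caller in the process goes through it (a bytes format cached earlier by other code would still trigger the comparison).

```python
import struct


def calcsize_str(fmt):
    """Hypothetical helper: always pass struct a str format.

    Bytes formats are decoded as ASCII first, so the same format is
    never handed to struct once as bytes and once as str.
    """
    if isinstance(fmt, (bytes, bytearray)):
        fmt = fmt.decode('ascii')
    return struct.calcsize(fmt)


# Both spellings of the format resolve to the same str before reaching struct.
size_from_bytes = calcsize_str(b'!d')
size_from_str = calcsize_str('!d')
```

Decoding as ASCII seems like the natural choice here since struct format characters are all ASCII, but that is my assumption rather than anything decided in this issue.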
History
Date                 User          Action  Args
2020-09-13 14:24:57  The Compiler  set     nosy: + serhiy.storchaka; messages: + msg376839
2020-09-13 13:35:30  tadeu         set     messages: + msg376838
2020-09-13 13:28:21  The Compiler  set     nosy: + mark.dickinson, meador.inge, The Compiler; messages: + msg376837
2020-09-13 13:23:48  tadeu         create