This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: let code force str(bytes) to raise an exception
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, dholth, jwilk, martin.panter, pitrou, r.david.murray, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2013-07-05 22:58 by dholth, last changed 2022-04-11 14:57 by admin.

Files
File name Uploaded Description Edit
strbytes.patch dholth, 2016-05-27 12:54 review
Messages (20)
msg192372 - (view) Author: Daniel Holth (dholth) * Date: 2013-07-05 22:58
This patch lets the user get/set the byteswarningflag() to control whether warnings are emitted at runtime (pursuant to the configuration of the warnings module, of course).

The idea is that a user who is concerned with interacting with bytes correctly would use a context manager to set str(bytes) to raise exceptions during a critical serialization.

TODO the hypothetical context manager would also have to push/pop the warnings state to work correctly. Maybe str(bytes) can raise a regular exception if byteswarningflag() is high enough?

TODO don't know if this plays nicely when there are no threads
msg192383 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-07-06 04:15
Why wouldn't you just enable bytes warning and fix all the code to work correctly?
msg192427 - (view) Author: Daniel Holth (dholth) * Date: 2013-07-06 12:30
You would just enable this during your serialization code, and if
(perhaps someone is calling your library and passing it the wrong type)
they would be guarded against this common error.

On Sat, Jul 6, 2013, at 12:15 AM, R. David Murray wrote:
> 
> R. David Murray added the comment:
> 
> Why wouldn't you just enable bytes warning and fix all the code to work
> correctly?
> 
> ----------
> nosy: +r.david.murray
> 
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue18373>
> _______________________________________
msg192957 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-07-12 19:01
When I added os.environb, I had issues with the ByteWarning, in os.get_exec_path() if I remember correctly. I had to play with warmings.catch_warning(). A thread variable would simplify the code.
msg192960 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-07-12 19:41
Read only access to bytes warning flag already implemented as sys.flags.bytes_warning. How new functions are related to it?

And I concur with David. Why not just run your program with -b or -bb? Besides the testing purpose.
msg193270 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-07-18 08:21
> if
> (perhaps someone is calling your library and passing it the wrong type)
> they would be guarded against this common error.

OTOH, if your library is concerned about unwanted bytes objects, you can add an explicit type check.
That said, I don't see any counter-argument against this proposal; only that I'm not sure it's very useful.
msg193271 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-07-18 08:22
I don't know why your patch is putting this in the thread state, though...
msg193287 - (view) Author: Daniel Holth (dholth) * Date: 2013-07-18 12:55
On Thu, Jul 18, 2013, at 04:22 AM, Antoine Pitrou wrote:
> 
> Antoine Pitrou added the comment:
> 
> I don't know why your patch is putting this in the thread state,
> though...

If you have multiple threads one thread might want exceptions when
str(bytes), and the other might not.

str(bytes) has been an insidious problem for me. No exceptions,
seemingly working code, and b'' in serialized data. I am motivated to
turn it off.
msg193288 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-07-18 13:05
> > I don't know why your patch is putting this in the thread state,
> > though...
> 
> If you have multiple threads one thread might want exceptions when
> str(bytes), and the other might not.

That sounds too much of a special case to me. You can still catch
BytesWarnings inside a context manager or something.
msg193293 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-07-18 13:34
I presume you mean you are motivated to turn it on (to catch the bugs).  I still think that also fixing the bugs in the other places str(bytes) is used would be better.  Are they occurring in third party libraries?  How often do you run into it?
msg193330 - (view) Author: Daniel Holth (dholth) * Date: 2013-07-18 21:57
Python 3 is supposed to make it easier to do Unicode correctly. str(bytes) does not. I felt strongly enough about that to write this patch.

With this feature my library can have control in a way that is much more practical than ensuring a particular flag has been passed to the interpreter.

It might make sense to modify the patch so str(bytes) can throw a regular exception instead of a warning while under the influence of the new flag. The reason being that the warning will be suppressed the second time around but for this use case you wouldn't want that.
msg193331 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-07-18 22:04
> Python 3 is supposed to make it easier to do Unicode correctly. str(bytes) does not.

str(bytes) does not make sense, that's why an exception is raised when
the -bb command line option is used ;-)
msg208096 - (view) Author: Daniel Holth (dholth) * Date: 2014-01-14 13:57
Just mitigating the bug that -bb is not the default...
msg219620 - (view) Author: Daniel Holth (dholth) * Date: 2014-06-02 18:53
As an aside, why wouldn't I run my program with -bb?

One reason is that the following code won't work on Linux:

#!/usr/bin/env python -bb

Instead of passing -bb to Python, it will look for an executable called "python -bb", and it's not likely to find it.

The original reason was that I did not know about -bb.
msg266469 - (view) Author: Daniel Holth (dholth) * Date: 2016-05-27 02:38
Superceded by http://bugs.python.org/issue27134 , a simpler solution providing a with StrBytesRaises(): context manager.
msg266493 - (view) Author: Daniel Holth (dholth) * Date: 2016-05-27 12:54
Better to continue discussion here.

Attached is my second, simpler version of the feature. A context manager is included:

with string.StrBytesRaises():
    str(b'bytes')

# raises an exception

In a normal program, you might just set the flag to True, but in a library that has no hope of setting the -bb flag you might use it during serialization rather than checking isinstance() on all input.
msg266494 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-05-27 13:02
strbytes.patch of #27134: adding calls to PyThreadState_GetDict()+PyDict_GetItemString() seems inefficient for a rare use case

I would prefer to modify the existing Py_BytesWarningFlag flag.

The flag is process-wide. If it's an issue, we can use a thread-local storage (TLS) for the flag in C.

FYI get_exec_path() uses the following code to temporarely ignore the warning:

    # {b'PATH': ...}.get('PATH') and {'PATH': ...}.get(b'PATH') emit a
    # BytesWarning when using python -b or python -bb: ignore the warning
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", BytesWarning)
        ...
msg266495 - (view) Author: Daniel Holth (dholth) * Date: 2016-05-27 13:17
My reasoning was that str(bytes) would normally be called so rarely that
who cares.

On Fri, May 27, 2016, at 09:02 AM, STINNER Victor wrote:
> 
> STINNER Victor added the comment:
> 
> strbytes.patch of #27134: adding calls to
> PyThreadState_GetDict()+PyDict_GetItemString() seems inefficient for a
> rare use case
> 
> I would prefer to modify the existing Py_BytesWarningFlag flag.
> 
> The flag is process-wide. If it's an issue, we can use a thread-local
> storage (TLS) for the flag in C.
> 
> FYI get_exec_path() uses the following code to temporarely ignore the
> warning:
> 
>     # {b'PATH': ...}.get('PATH') and {'PATH': ...}.get(b'PATH') emit a
>     # BytesWarning when using python -b or python -bb: ignore the warning
>     with warnings.catch_warnings():
>         warnings.simplefilter("ignore", BytesWarning)
>         ...
> 
> ----------
> 
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue18373>
> _______________________________________
msg266500 - (view) Author: Daniel Holth (dholth) * Date: 2016-05-27 15:20
Then you prefer the older three year old patch that adds to pystate.h ?
http://bugs.python.org/review/18373/#ps8545

I remember not being happy with how complicated it turned out, and
perhaps having problems with not wanting the "warn once" behavior? It
seems more Pythonic to store the flag in a dict.
msg268725 - (view) Author: Daniel Holth (dholth) * Date: 2016-06-17 14:20
What if we changed it so that Python code could only disable str_bytes() process-wide, editing the original flag? Would that be fatal to debuggers and the repl?
History
Date User Action Args
2022-04-11 14:57:47adminsetgithub: 62573
2016-06-17 14:20:20dholthsetmessages: + msg268725
2016-05-27 15:20:26dholthsetmessages: + msg266500
2016-05-27 13:17:29dholthsetmessages: + msg266495
2016-05-27 13:02:02vstinnersetmessages: + msg266494
2016-05-27 12:54:45dholthsetfiles: + strbytes.patch

versions: + Python 3.6, - Python 3.5
messages: + msg266493
title: implement sys.get/setbyteswarningflag() -> let code force str(bytes) to raise an exception
2016-05-27 12:40:58dholthsetfiles: - byteswarningflag.patch
2016-05-27 02:38:38dholthsetmessages: + msg266469
2014-12-13 01:37:45martin.pantersetnosy: + martin.panter
2014-06-02 18:53:47dholthsetmessages: + msg219620
2014-01-15 23:22:29jwilksetnosy: + jwilk
2014-01-15 08:17:50Arfreversetnosy: + Arfrever
2014-01-14 13:57:13dholthsetmessages: + msg208096
2014-01-14 09:22:42serhiy.storchakasetstage: patch review
type: enhancement
components: + Interpreter Core
versions: + Python 3.5, - Python 3.4
2013-07-18 22:04:27vstinnersetmessages: + msg193331
2013-07-18 21:57:41dholthsetmessages: + msg193330
2013-07-18 13:34:57r.david.murraysetmessages: + msg193293
2013-07-18 13:05:08pitrousetmessages: + msg193288
2013-07-18 12:55:51dholthsetmessages: + msg193287
2013-07-18 08:22:11pitrousetmessages: + msg193271
2013-07-18 08:21:00pitrousetnosy: + pitrou
messages: + msg193270
2013-07-12 19:41:17serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg192960
2013-07-12 19:01:09vstinnersetnosy: + vstinner
messages: + msg192957
2013-07-06 12:30:36dholthsetmessages: + msg192427
2013-07-06 04:15:44r.david.murraysetnosy: + r.david.murray
messages: + msg192383
2013-07-05 22:58:44dholthcreate