msg388809 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-16 04:19 |
PEP 597 is accepted.
|
msg389092 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-03-19 14:16 |
I replied to INADA-san's message on bpo-43552:
https://bugs.python.org/issue43552#msg389091
> I forgot to consider UTF-8 mode while finishing PEP 597. If possible, I want `encoding="locale"` to ignore UTF-8 mode starting with Python 3.10.
In this case, the PEP 597 statement that open(filename, encoding="locale") is the same as open(filename) is wrong. Would it mean that users who have the UTF-8 Mode enabled (implicitly or explicitly) would switch to a legacy encoding like latin1 rather than UTF-8 if they add encoding="locale" to their open() calls?
Since the final goal is to move everybody towards UTF-8, I'm not sure that is a good thing.
|
msg389094 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-19 14:28 |
> Since the final goal is to move everybody towards UTF-8, I'm not sure that is a good thing.
The final goal (the third motivation of PEP 597) is changing the default encoding (i.e. the encoding used when none is specified) to UTF-8.
But forcing people to use UTF-8 even when they explicitly specify the locale encoding is not the goal. That's why I want `encoding="locale"` to ignore UTF-8 mode.
I think this is an almost Windows-only issue, and "mbcs" can already be used on Windows. It is documented in https://docs.python.org/3/using/windows.html#utf-8-mode
So this is not a blocker. Just my preference.
|
msg389095 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-03-19 14:31 |
I see different cases when open() is called with no encoding argument:
(A) User wants to use UTF-8: add encoding="utf-8"
(B) Windows user wants to use the ANSI code page of their computer, local file not intended to be shared with other computers: add encoding="mbcs". This makes the code specific to Windows ("mbcs" alias doesn't exist on Unix).
(C) User wants to use the locale encoding and is fine with the UTF-8 Mode: add encoding=getpreferredencoding(False)
(D) Unix user wants to use the locale encoding but not the UTF-8 Mode: encoding=get_current_locale_encoding() (function proposed in bpo-43552) or nl_langinfo(CODESET) (should work on any Python version). I don't know if nl_langinfo(CODESET) is available on Windows.
(E) User has no idea what they are doing and doesn't understand anything about Unicode: please trust us and explicitly specify UTF-8 :-)
Apart from the encoding="utf-8" case, I understand that there are two main complex cases:
(1) "UTF-8" in the UTF-8 Mode, or the locale encoding
(2) Always use the locale encoding, ignore the UTF-8 Mode
What I don't expect is the current behavior, before PEP 597. Who uses open() without specifying an encoding but always wants to use the locale encoding? (case 2) So this use case is already broken when the UTF-8 Mode is enabled explicitly?
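Cases (A) to (D) above can be sketched as a small helper. This is only an illustration under stated assumptions: `encoding_for` is a hypothetical name, not part of any library, and get_current_locale_encoding() from bpo-43552 is left out because it was only a proposal when this message was written.

```python
import locale
import sys


def encoding_for(case: str) -> str:
    """Return an encoding name for cases (A)-(D); hypothetical helper."""
    if case == "A":
        # (A) Always UTF-8, independent of platform and locale.
        return "utf-8"
    if case == "B":
        # (B) ANSI code page; the "mbcs" codec only exists on Windows.
        if sys.platform != "win32":
            raise LookupError('"mbcs" is only available on Windows')
        return "mbcs"
    if case == "C":
        # (C) Locale encoding, accepting whatever the UTF-8 Mode dictates.
        return locale.getpreferredencoding(False)
    if case == "D":
        # (D) Ask the C library directly; nl_langinfo() is not
        # available on every platform, so check before calling it.
        if hasattr(locale, "nl_langinfo"):
            return locale.nl_langinfo(locale.CODESET)
        raise LookupError("nl_langinfo(CODESET) is not available here")
    raise ValueError(f"unknown case: {case!r}")


if __name__ == "__main__":
    print("UTF-8 Mode enabled:", bool(sys.flags.utf8_mode))
    print("(A)", encoding_for("A"))
    print("(C)", encoding_for("C"))
```

Case (E) has no branch of its own because it reduces to case (A).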
|
msg389099 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-19 14:57 |
> (1) "UTF-8" in the UTF-8 Mode, or the locale encoding
> (2) Always use the locale encoding, ignore the UTF-8 Mode
>
> What I don't expect is the current behavior, before PEP 597. Who uses open() without specifying an encoding but always wants to use the locale encoding? (case 2) So this use case is already broken when the UTF-8 Mode is enabled explicitly?
Yes, it is broken already. So they cannot use UTF-8 mode.
If `encoding="locale"` ignores UTF-8 mode, it saves this use case. They can add `encoding="locale"` where they need the locale/GetACP encoding and still enable UTF-8 mode.
That's why it is important if we enable UTF-8 mode by default in the future.
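For readers unfamiliar with the option, here is a minimal sketch of opting in with `encoding="locale"`. It assumes Python 3.10 or later for the "locale" name and falls back to locale.getpreferredencoding(False) on older versions; `roundtrip_with_locale` is a hypothetical helper. The sketch makes no claim about whether UTF-8 mode is ignored, which is the open question in this thread.

```python
import locale
import os
import sys
import tempfile


def roundtrip_with_locale(text: str) -> str:
    """Write and re-read text using an explicitly requested locale encoding."""
    # encoding="locale" is accepted from Python 3.10 (PEP 597);
    # older versions need a concrete encoding name instead.
    if sys.version_info >= (3, 10):
        enc = "locale"
    else:
        enc = locale.getpreferredencoding(False)

    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=enc) as f:
            f.write(text)
        with open(path, encoding=enc) as f:
            return f.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("UTF-8 Mode enabled:", bool(sys.flags.utf8_mode))
    print(roundtrip_with_locale("plain ASCII text"))
```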
|
msg389656 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-29 03:28 |
New changeset 4827483f47906fecee6b5d9097df2a69a293a85c by Inada Naoki in branch 'master':
bpo-43510: Implement PEP 597 opt-in EncodingWarning. (GH-19481)
https://github.com/python/cpython/commit/4827483f47906fecee6b5d9097df2a69a293a85c
|
msg389685 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-03-29 11:53 |
Yeah! Congrats INADA-san for implementing your PEP!
|
msg389796 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-30 07:25 |
I created bpo-43651 to track fixing EncodingWarning in the Python stdlib.
I am closing this issue for now.
|
msg389822 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-30 12:07 |
In bpo-43651, I found a code pattern where it is difficult to use io.text_encoding():
    class OpenWrapper:
        def __new__(cls, *args, **kwargs):
            return open(*args, **kwargs)
`kwargs["encoding"] = text_encoding(kwargs.get("encoding"))` doesn't work because `open(filename, "rb", encoding="locale")` raises `ValueError: binary mode doesn't take an encoding argument`.
I think we should accept `encoding="locale"` even in binary mode. That would make it easy to use `text_encoding()` and `encoding="locale"`.
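The failure described above can be reproduced with a short script. This is a minimal sketch, assuming a released CPython, where binary mode rejects any encoding argument; `binary_open_error` is a hypothetical helper name.

```python
import os
import tempfile


def binary_open_error(path: str):
    """Return the ValueError message raised by a binary open() with an encoding."""
    try:
        # Binary mode plus an encoding argument is rejected by open().
        open(path, "rb", encoding="locale").close()
    except ValueError as exc:
        return str(exc)
    return None  # Reached only if open() accepted the call.


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        print(binary_open_error(tmp_path))
    finally:
        os.remove(tmp_path)
```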
|
msg389873 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-31 05:26 |
New changeset ff3c9739bd69aa8b58007e63c9e40e6708b4761e by Inada Naoki in branch 'master':
bpo-43510: PEP 597: Accept `encoding="locale"` in binary mode (GH-25103)
https://github.com/python/cpython/commit/ff3c9739bd69aa8b58007e63c9e40e6708b4761e
|
msg389877 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-31 06:23 |
I'm sorry, I was wrong. Allowing `encoding="locale"` didn't help OpenWrapper. See GH-25107.
If we use `encoding = text_encoding(encoding)` in binary mode, `open(filename, "rb")` will emit a warning. This doesn't make sense at all.
Adding a `mode` parameter to `text_encoding()` doesn't make sense either, because it is used by functions that wrap not only open() but also TextIOWrapper().
So we must not call `text_encoding()` in binary mode. Allowing `encoding="locale"` in binary mode doesn't make that any easier. I will revert GH-25103.
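One way to follow this conclusion in a wrapper is to call `text_encoding()` only when the mode is not binary. The sketch below is an assumption about how such a wrapper could look, not the change made in GH-25107; `open_wrapper` is a hypothetical name, and io.text_encoding() is assumed to be available only from Python 3.10.

```python
import io
import os
import tempfile


def open_wrapper(file, mode="r", encoding=None, **kwargs):
    """Wrap open() and resolve the encoding only for text mode."""
    # In binary mode the encoding is left untouched, so open(file, "rb")
    # never triggers an EncodingWarning and open() can still reject an
    # encoding the caller passed by mistake.
    if "b" not in mode and hasattr(io, "text_encoding"):
        # io.text_encoding() exists from Python 3.10. It emits the opt-in
        # EncodingWarning when encoding is None and warnings are enabled.
        encoding = io.text_encoding(encoding)
    return open(file, mode, encoding=encoding, **kwargs)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open_wrapper(path, "wb") as f:
            f.write(b"raw bytes")
        with open_wrapper(path, "rb") as f:
            print(f.read())
    finally:
        os.remove(path)
```

With this shape the wrapper itself, rather than its caller, is the place where a missing encoding is resolved, which is what io.text_encoding() is for.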
|
msg389879 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-03-31 09:30 |
To me, it sounds really weird to accept an encoding when a file is opened in binary mode. open(filename, "rb", encoding="locale") looks like a bug.
|
msg389881 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2021-03-31 09:36 |
On 31.03.2021 11:30, STINNER Victor wrote:
>
> To me, it sounds really weird to accept an encoding when a file is opened in binary mode. open(filename, "rb", encoding="locale") looks like a bug.
Same here.
If encoding is used as an argument and then not used, this is a bug,
not a feature :-)
|
msg389882 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-03-31 09:49 |
New changeset cfa176685a5e788bafc7749d7a93f43ea3e4de9f by Inada Naoki in branch 'master':
Revert "bpo-43510: PEP 597: Accept `encoding="locale"` in binary mode (GH-25103)" (#25108)
https://github.com/python/cpython/commit/cfa176685a5e788bafc7749d7a93f43ea3e4de9f
|
msg390044 - (view) |
Author: Inada Naoki (methane) * |
Date: 2021-04-02 08:39 |
New changeset bec8c787ec72d73b39011bde3f3a93e9bb1174b7 by Inada Naoki in branch 'master':
bpo-43510: Fix emitting EncodingWarning from _io module. (GH-25146)
https://github.com/python/cpython/commit/bec8c787ec72d73b39011bde3f3a93e9bb1174b7
|
|
Date | User | Action | Args |
2022-04-11 14:59:42 | admin | set | github: 87676 |
2021-04-06 03:46:15 | methane | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
2021-04-02 08:39:19 | methane | set | messages: + msg390044 |
2021-04-02 07:25:32 | methane | set | pull_requests: + pull_request23893 |
2021-03-31 09:49:47 | methane | set | messages: + msg389882 |
2021-03-31 09:36:33 | lemburg | set | nosy: + lemburg; messages: + msg389881 |
2021-03-31 09:30:53 | vstinner | set | messages: + msg389879 |
2021-03-31 07:11:56 | eryksun | set | messages: - msg389828 |
2021-03-31 06:24:20 | methane | set | pull_requests: + pull_request23852 |
2021-03-31 06:23:33 | methane | set | messages: + msg389877 |
2021-03-31 05:53:00 | methane | set | pull_requests: + pull_request23851 |
2021-03-31 05:26:15 | methane | set | messages: + msg389873 |
2021-03-31 04:21:53 | methane | set | stage: resolved -> patch review; pull_requests: + pull_request23848 |
2021-03-31 04:20:19 | methane | link | issue43651 superseder |
2021-03-30 14:38:08 | eryksun | set | nosy: + eryksun; messages: + msg389828 |
2021-03-30 12:07:35 | methane | set | status: closed -> open; resolution: fixed -> (no value); messages: + msg389822 |
2021-03-30 07:25:41 | methane | set | status: open -> closed; resolution: fixed; messages: + msg389796; stage: patch review -> resolved |
2021-03-29 11:53:04 | vstinner | set | messages: + msg389685 |
2021-03-29 03:28:22 | methane | set | messages: + msg389656 |
2021-03-19 14:57:12 | methane | set | messages: + msg389099 |
2021-03-19 14:31:58 | vstinner | set | messages: + msg389095 |
2021-03-19 14:28:46 | methane | set | messages: + msg389094 |
2021-03-19 14:16:06 | vstinner | set | nosy: + vstinner; messages: + msg389092 |
2021-03-16 04:25:21 | methane | set | keywords: + patch; stage: patch review; pull_requests: + pull_request23653 |
2021-03-16 04:19:49 | methane | create | |