Author eryksun
Recipients Carl Osterwisch, Gabi.Davar, John Florian, chary314, dabrahams, davide.rizzo, dlenski, eric.araujo, eric.smith, eryksun, ethan smith, jaraco, jwilk, martin.panter, ncoghlan, njs, paul.moore, piotr.dobrogost, pitrou, r.david.murray, sbt, steve.dower, tim.golden, zach.ware
Date 2020-09-11.07:57:19
Message-id <1599811039.95.0.862094569051.issue14243@roundup.psfhosted.org>
Content
> We'd CreateFile the file and then immediately pass it to 
> _open_osfhandle

Unlike _wopen, _open_osfhandle doesn't truncate a trailing Ctrl+Z (0x1A) byte from the end of the file when the flags value contains _O_TEXT | _O_RDWR. _wopen does this to allow appending data in text mode. Its implementation is based on GetFileType (to skip pipes and character devices), _lseeki64, _read, and _chsize[_s].
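
For illustration, the truncation step could be approximated at the Python level roughly as follows. This is only a sketch, not the CRT's code: the name strip_trailing_ctrl_z is hypothetical, stat.S_ISREG stands in for the GetFileType check, and restoring the original file offset afterwards is my own choice rather than documented CRT behavior. It also assumes the fd is in binary mode, since a text-mode read would treat Ctrl+Z as EOF.

```python
import os
import stat

CTRL_Z = b"\x1a"

def strip_trailing_ctrl_z(fd):
    """Remove one trailing Ctrl+Z byte from a regular file.

    Returns True if a byte was removed, else False.
    Hypothetical helper; assumes fd is readable, writable, and binary mode.
    """
    # Stand-in for the GetFileType check: skip pipes, character devices, etc.
    if not stat.S_ISREG(os.fstat(fd).st_mode):
        return False
    saved = os.lseek(fd, 0, os.SEEK_CUR)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        if size == 0:
            return False
        # Read the last byte of the file.
        os.lseek(fd, size - 1, os.SEEK_SET)
        if os.read(fd, 1) != CTRL_Z:
            return False
        # Shrink the file by one byte, dropping the Ctrl+Z.
        os.ftruncate(fd, size - 1)
        return True
    finally:
        # Assumption: put the file offset back where the caller left it.
        os.lseek(fd, saved, os.SEEK_SET)
```

A replacement for _wopen built on CreateFile and _open_osfhandle would need an equivalent step after opening the handle when both _O_TEXT and _O_RDWR are requested.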

O_TEXT (ANSI text mode) has to be supported for now, but it doesn't properly fit in Python 3. The io module opens files using the CRT's binary mode. It doesn't implement newline translation for bytes I/O. And io.TextIOWrapper doesn't support Ctrl+Z as a logical EOF marker. 

As long as it's supported, O_TEXT should be made the default in os.open (but not in msvcrt.open_osfhandle), independent of the CRT default fmode (i.e. _get_fmode and _set_fmode). Many callers already assume that's the case. For example, tempfile.mkstemp with text=True uses tempfile._text_openflags, which doesn't include os.O_TEXT. That assumption is currently wrong if _set_fmode(_O_BINARY) is called.
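
Until os.open behaves that way, a caller that wants text mode regardless of the CRT's fmode can pass the flag explicitly. A minimal sketch, assuming a hypothetical helper name (explicit_text_flags) and using getattr because os.O_TEXT and os.O_BINARY exist only on Windows:

```python
import os

def explicit_text_flags(flags):
    """Return flags with O_TEXT set explicitly where the platform defines it.

    Hypothetical helper: avoids depending on the CRT default fmode.
    """
    o_text = getattr(os, "O_TEXT", 0)      # 0 on platforms without the flag
    o_binary = getattr(os, "O_BINARY", 0)  # 0 on platforms without the flag
    if flags & o_binary:
        raise ValueError("O_BINARY and O_TEXT are mutually exclusive")
    return flags | o_text
```

For example, os.open(path, explicit_text_flags(os.O_RDWR | os.O_CREAT)) requests ANSI text mode on Windows even after _set_fmode(_O_BINARY), and is a no-op change elsewhere.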

Thankfully, Python has never documented support for the _O_WTEXT, _O_U16TEXT, and _O_U8TEXT Unicode text modes in os.open. To my knowledge, there is no reasonable way to reimplement these modes. The C runtime doesn't expose a public interface to modify a file's internal text and Unicode modes, and _open_osfhandle only supports ANSI text mode. If _Py_wopen is implemented, it will have to fail the Unicode (UTF-16 or UTF-8) modes with EINVAL. Even without _Py_wopen, I'd prefer to modify os.open to fail them because wrapping a Unicode-mode fd with io.FileIO doesn't function reliably. FileIO doesn't guarantee wchar_t aligned reads and writes, which the CRT requires in Unicode mode.