classification
Title: Add new io.FileIO using the native Windows API
Type: enhancement
Stage: needs patch
Components: IO, Unicode, Windows
Versions: Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: BreamoreBoy, amaury.forgeotdarc, ezio.melotti, gvanrossum, haypo, mmarkk, piotr.dobrogost, pitrou, rpetrov, santoso.wijaya, sbt, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2011-09-08 23:19 by haypo, last changed 2014-11-01 20:35 by rpetrov.

Files
File name | Uploaded
winfileio.patch | sbt, 2013-01-15 12:33
Messages (30)
msg143741 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-09-08 23:19
On Windows, Python uses the POSIX API (file descriptors) instead of the native API (file handles). Some features, such as setting security attributes, are not available through the POSIX API. It would be nice to have an io.FileIO that uses Windows file handles, to get access to all Windows features. It would help issue #12105 implement the "O_CLOEXEC" flag using the lpSecurityAttributes argument.

We could start with a prototype written in Python. Using _pyio.RawIOBase, only readinto(), write(), seek() and truncate() have to be implemented. But it would be better to also implement close() :-)
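
To illustrate how small the required surface is, here is a minimal sketch of a RawIOBase subclass that implements only those methods. It is not the prototype discussed in this issue: the class name MemoryRawIO is hypothetical, and a bytearray stands in for the file so that the sketch runs on any platform. The comments mark where a handle-based version would presumably call the native API instead.

```python
import io


class MemoryRawIO(io.RawIOBase):
    """Hypothetical raw stream over a bytearray (stand-in for a file handle)."""

    def __init__(self, initial=b""):
        super().__init__()
        self._buf = bytearray(initial)
        self._pos = 0

    def readable(self):
        return True

    def writable(self):
        return True

    def seekable(self):
        return True

    def readinto(self, b):
        # A handle-based version would call ReadFile() here.
        data = self._buf[self._pos:self._pos + len(b)]
        n = len(data)
        b[:n] = data
        self._pos += n
        return n

    def write(self, b):
        # A handle-based version would call WriteFile() here.
        data = bytes(b)
        if self._pos > len(self._buf):
            # Writing past the end: fill the gap with zero bytes.
            self._buf.extend(bytes(self._pos - len(self._buf)))
        self._buf[self._pos:self._pos + len(data)] = data
        self._pos += len(data)
        return len(data)

    def seek(self, offset, whence=io.SEEK_SET):
        # A handle-based version would call SetFilePointerEx() here.
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = len(self._buf) + offset
        else:
            raise ValueError("invalid whence: %r" % (whence,))
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def truncate(self, size=None):
        # A handle-based version would call SetEndOfFile() here.
        if size is None:
            size = self._pos
        if size < len(self._buf):
            del self._buf[size:]
        else:
            self._buf.extend(bytes(size - len(self._buf)))
        return size
```

Everything else (read(), readall(), tell(), and the buffered and text layers) comes from the io base classes, so io.BufferedReader(MemoryRawIO(b"...")) works without further code.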
msg143742 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-09-08 23:21
See also issue #1602: a prototype of a console object has been proposed to use the native Windows console API, instead of the POSIX API (read from fd 0, write to fd 1 or 2). The prototype is implemented in Python using ctypes.
msg146769 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-11-01 12:12
Instead of rewriting your own RawIO implementation, why not use _open_osfhandle?
This should be simple now with the "opener" argument.
http://msdn.microsoft.com/en-us/library/bdts1c9x.aspx
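
A rough sketch of what this suggestion could look like: an opener that creates a native handle with CreateFileW() (through ctypes) and wraps it in a CRT file descriptor with msvcrt.open_osfhandle(). The function name handle_opener, the flag translation, and the choice of share mode are assumptions made for illustration, not something proposed in this issue. On non-Windows platforms the sketch falls back to os.open() so that it can still be run.

```python
import os
import sys


def handle_opener(path, flags):
    """Opener for open(): on Windows, wrap a native handle in a CRT fd."""
    if sys.platform != "win32":
        # Fallback so the sketch also runs where there are no Windows handles.
        return os.open(path, flags, 0o666)

    import ctypes
    import msvcrt
    from ctypes import wintypes

    # Win32 constants, written out because this sketch uses ctypes directly.
    GENERIC_READ, GENERIC_WRITE = 0x80000000, 0x40000000
    FILE_SHARE_READ, FILE_SHARE_WRITE, FILE_SHARE_DELETE = 0x1, 0x2, 0x4
    CREATE_NEW, CREATE_ALWAYS, OPEN_EXISTING = 1, 2, 3
    OPEN_ALWAYS, TRUNCATE_EXISTING = 4, 5
    FILE_ATTRIBUTE_NORMAL = 0x80
    INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = wintypes.HANDLE
    kernel32.CreateFileW.argtypes = [
        wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD, wintypes.LPVOID,
        wintypes.DWORD, wintypes.DWORD, wintypes.HANDLE,
    ]

    # Translate the POSIX-style flags that open() passes to the opener.
    if flags & os.O_RDWR:
        access = GENERIC_READ | GENERIC_WRITE
    elif flags & os.O_WRONLY:
        access = GENERIC_WRITE
    else:
        access = GENERIC_READ

    if flags & os.O_CREAT and flags & os.O_EXCL:
        disposition = CREATE_NEW
    elif flags & os.O_CREAT and flags & os.O_TRUNC:
        disposition = CREATE_ALWAYS
    elif flags & os.O_CREAT:
        disposition = OPEN_ALWAYS
    elif flags & os.O_TRUNC:
        disposition = TRUNCATE_EXISTING
    else:
        disposition = OPEN_EXISTING

    # Assumption: share everything, including delete access.
    share = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE

    handle = kernel32.CreateFileW(os.fsdecode(path), access, share, None,
                                  disposition, FILE_ATTRIBUTE_NORMAL, None)
    if handle is None or handle == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())

    # Hand the handle to the CRT; the returned fd now owns it.
    return msvcrt.open_osfhandle(handle, flags & os.O_APPEND)
```

It would be used as open("somefile", "w", opener=handle_opener). The resulting file object still goes through the CRT for reads and writes; only the creation of the handle uses the native API.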
msg146788 - (view) Author: Марк Коренберг (mmarkk) Date: 2011-11-01 16:25
> why not use _open_osfhandle?

Because it is a wrapper around other Windows CRT functions, like close(). In other words, it is an emulation. I think Python should not create a wrapper around a wrapper around a wrapper...

For example, in Python 3, open() is implemented using open() and not fopen(). Why should we use another wrapper on the Windows platform?
msg146789 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-11-01 16:33
> why not use _open_osfhandle?
> 
> Because it is a wrapper around other Windows CRT functions, like
> close(). In other words, it is an emulation. I think Python should not
> create a wrapper around a wrapper around a wrapper...

Why do you think it makes a difference?
msg146792 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-11-01 16:49
An implementation of RawIO with the win32 API could be useful (and I'd be interested to compare the performance).
But maybe not for all usages: some Python interfaces are defined in terms of file descriptors, for example imp.load_module() and PyTokenizer_FindEncoding.
msg146795 - (view) Author: Марк Коренберг (mmarkk) Date: 2011-11-01 16:56
> Why do you think it makes a difference?
Because adding one more dependency on an unneeded library adds pain. It also limits us to the very restricted API of that wrapper. The native Windows API is stable, so it's OK to rely on its documented implementation.

Suppose we receive a file descriptor from _open_osfhandle, and that it is, for example, a socket. It is still unusable for stdin. Many standard functions do not work with such a handle. The list of available functions: http://msdn.microsoft.com/en-us/library/kdfaxaay.aspx . As we can see, it is very narrow, and the function names are not POSIX-compatible (_chsize vs ftruncate). The documentation is very poor. For example, _close() is documented only here: http://msdn.microsoft.com/en-US/library/40bbyw78(v=VS.80).aspx .
msg146796 - (view) Author: Марк Коренберг (mmarkk) Date: 2011-11-01 17:02
A good example of an implementation is Qt. I think we can borrow some ideas from this library, like using "\\?\" for long paths (or always, as Qt does).
msg146797 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-11-01 17:11
> > Why do you think it makes a difference?
> Because adding one more dependency on an unneeded library adds pain.

MSVCRT is unneeded?? What are you talking about?

> It also limits us to the very restricted API of that wrapper. The
> native Windows API is stable, so it's OK to rely on its documented
> implementation.

Please provide a patch to demonstrate the claimed improvement.

> Like using "\\?\" for long paths

That's completely unrelated to this issue.
msg146837 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-11-02 12:54
> Instead of rewriting your own RawIO implementation, why not use
> _open_osfhandle?

I don't know yet what the best approach is. One important point is to keep the
HANDLE, to be able to manipulate the open file using the Windows API (e.g. call
WriteFile instead of write).

I propose a RawIO that calls the Windows API directly: file.write() would call
WriteFile() directly instead of the POSIX wrapper (write()).

We may use it to implement new features like async I/O on Windows (e.g.
WriteFileEx).

Modules/_multiprocessing/win32_functions.c already exposes some low-level
Windows functions like WriteFile(). We may reuse the RawIO to offer an
object-oriented API for the Windows multiprocessing pipes. Such a RawIO would
have extra methods, maybe a file.WriteFile() method to give access to extra
options like "overlapped".
msg178900 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-03 02:12
I'm not really motivated to work on a Windows-specific issue. The issue has no patch, not even a proof of concept implemented in Python. Can I close the issue, or is someone interested in working on this topic?

Python 3.3 adds an opener argument to open(). Might it be enough to work around this issue?
msg178923 - (view) Author: Марк Коренберг (mmarkk) Date: 2013-01-03 08:13
Yes, rewriting Windows IO to use the native API directly, without an intermediate layer, is still needed.

Please don't close the bug. Maybe someone will implement this.
msg178943 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-03 13:11
A while ago I did write a PipeIO class which subclasses io.RawIOBase and works for overlapped pipe handles.  (It was intended for multiprocessing and doing asynchronous IO with subprocess.)

As it stands, it would not work with normal files, because when you do overlapped IO on files you must track the file position manually.
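
The position-tracking problem can be sketched with a POSIX analogue. Overlapped ReadFile()/WriteFile() calls on a file take an explicit offset, much like os.pread() and os.pwrite() do, so the stream object has to own the current position itself. The class below is hypothetical (it is not PipeIO or WinFileIO), and os.pread()/os.pwrite() are not available on Windows; they are used here only as a stand-in for offset-based native calls.

```python
import io
import os


class PositionalRawIO(io.RawIOBase):
    """Hypothetical raw stream that passes an explicit offset on every call."""

    def __init__(self, fd):
        super().__init__()
        self._fd = fd
        self._pos = 0  # tracked by the stream, not by the OS

    def readable(self):
        return True

    def writable(self):
        return True

    def seekable(self):
        return True

    def readinto(self, b):
        # Offset-based read; the OS-level file position is not used.
        data = os.pread(self._fd, len(b), self._pos)
        n = len(data)
        b[:n] = data
        self._pos += n
        return n

    def write(self, b):
        # Offset-based write; advance our own position afterwards.
        n = os.pwrite(self._fd, b, self._pos)
        self._pos += n
        return n

    def seek(self, offset, whence=os.SEEK_SET):
        if whence == os.SEEK_SET:
            new_pos = offset
        elif whence == os.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == os.SEEK_END:
            new_pos = os.fstat(self._fd).st_size + offset
        else:
            raise ValueError("invalid whence: %r" % (whence,))
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def close(self):
        if not self.closed:
            try:
                super().close()
            finally:
                os.close(self._fd)
```

Note that seek() never touches the descriptor: it only updates the stored offset, which is then passed to the next read or write.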

> Yes, rewriting Windows IO to use the native API directly, without an
> intermediate layer, is still needed.

What are the expected benefits?

> It would help feature #12105 to implement "O_CLOEXEC" flag using the 
> lpSecurityAttributes argument.

Isn't O_NOINHERIT the Windows equivalent of O_CLOEXEC?
msg178969 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-03 17:36
Attached is a module for Python 3.3+ which subclasses io.RawIOBase.  The constructor signature is

    WinFileIO(handle, mode="r", closehandle=True)

where mode is "r", "w", "r+" or "w+".  Handles can be created using _winapi.CreateFile().

Issues:
- No support for append mode.
- Truncate is not atomic.  (Is atomicity supposed to be guaranteed?)
- Not properly tested.
msg179181 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-06 13:59
Attached is a patch which adds a winio module which is a replacement for io, but uses Windows handles instead of fds.

It reimplements FileIO and open(), and provides openhandle() and closehandle() as replacements for os.open() and os.close().

test_io has been modified to exercise winio (in addition to _io and _pyio) and all the tests pass.

Note that some of the implementation (openhandle(), open(), FileIO.__init__()) is still done in Python rather than C.
msg179229 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-06 21:30
I don't like the idea of a specific I/O module for an OS. Is the public API
different? Can't you reuse the io module?
msg179232 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-06 21:43
> I don't like the idea of a specific I/O module for an OS. Is the public API
> different?

It was partly to make integration with the existing tests easier: _io, _pyio and winio are tested in parallel.

> Can't you reuse the io module?

In what sense?

I don't really know how the API should be exposed.
msg179269 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-07 16:01
Hum, _get_osfhandle() was not mentioned in this issue. This function may be used to retrieve the internal file handle from a file descriptor.
http://msdn.microsoft.com/en-us/library/ks2530z6%28v=vs.100%29.aspx

There is also the opposite: _open_osfhandle(). This function may be used for the fileno() method of the Windows implementation of FileIO.
http://msdn.microsoft.com/en-us/library/bdts1c9x%28v=vs.100%29.aspx
msg179809 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-12 15:13
Attached is a new patch which is implemented completely in C.

It adds a WinFileIO class to the io module, which has the same API 
as FileIO except that:

* It has a handle attribute instead of a fileno() method.

* It has staticmethods openhandle() and closehandle() which are
  analogues of os.open() and os.close().

The patch also adds a keyword-only "rawfiletype" argument to
io.open() so that you can write

    f = open("somefile", "w", rawfiletype=WinFileIO)
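
The "rawfiletype" argument exists only in the attached patch, not in released versions of io.open(). As a rough illustration of the idea, the hypothetical helper below (open_with_raw is an invented name) builds the usual buffered and text layers on top of whatever raw class it is given. It uses io.FileIO as the default raw class so that it runs without the patch, and it is not how the patch itself is implemented.

```python
import io


def open_with_raw(path, mode="r", rawfiletype=io.FileIO, encoding=None):
    """Hypothetical helper: stack the io layers on a caller-chosen raw class."""
    binary = "b" in mode
    raw_mode = mode.replace("b", "").replace("t", "")

    # The raw layer is the only part that depends on fds versus handles.
    raw = rawfiletype(path, raw_mode)

    # Pick the buffered layer from what the raw stream supports.
    if raw.readable() and raw.writable():
        buffered = io.BufferedRandom(raw)
    elif raw.writable():
        buffered = io.BufferedWriter(raw)
    else:
        buffered = io.BufferedReader(raw)

    if binary:
        return buffered
    return io.TextIOWrapper(buffered, encoding=encoding)
```

With the patch applied, passing rawfiletype=WinFileIO to such a helper would presumably give the same stack as the open() call above.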
msg179811 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-12 15:19
Forgot to mention, the handles are non-inheritable.

You can use _winapi.DuplicateHandle() to create an inheritable duplicate handle if you really need to.
msg179943 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-01-14 14:36
Added some comments on Rietveld.
The .fileno() method is missing. Can this cause a problem when the file is passed to stdlib functions? subprocess for example?
msg179944 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-14 14:42
What does this proposal bring exactly?
msg179973 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-14 19:55
> Added some comments on Rietveld.
> The .fileno() method is missing. Can this cause a problem when the file 
> is passed to stdlib functions? subprocess for example?

Thanks.  An older version of the patch had a fileno() method which returned the handle -- but that would have confused anything that expects fileno() to return a true fd.

It would be possible to make fileno() lazily create an fd using open_osfhandle().
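
A minimal sketch of the lazy approach, assuming a hypothetical HandleStream class with a handle attribute. On Windows it calls msvcrt.open_osfhandle(); elsewhere os.dup() is used purely as a stand-in so that the caching logic can be run, with an ordinary fd playing the role of the handle.

```python
import os
import sys


def _handle_to_fd(handle):
    """Create an fd for a handle (Windows) or duplicate an fd (stand-in)."""
    if sys.platform == "win32":
        import msvcrt
        return msvcrt.open_osfhandle(handle, 0)
    return os.dup(handle)


class HandleStream:
    """Hypothetical handle-based stream that creates an fd only on demand."""

    def __init__(self, handle):
        self.handle = handle
        self._fd = None  # no fd exists until someone asks for one

    def fileno(self):
        if self._fd is None:
            self._fd = _handle_to_fd(self.handle)
        return self._fd
```

One consequence on Windows: once an fd wraps the handle, closing the fd also closes the handle, so the close() logic of a real implementation would have to close one or the other, never both.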
msg179975 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-14 20:17
> What does this proposal bring exactly?

Unless we are willing to completely replace fds with handles on Windows, perhaps not too much.  (At one point I had assumed that that was the plan for py3k.)

Although not advertised, openhandle() does have a share_flags parameter to control the share mode of the file.  This makes it possible to delete files for which there are open handles.  Mercurial needs a C extension to support this.  regrtest could certainly benefit from such a feature.

But one thing that I would at least like to do is create a FileIO replacement for overlapped pipe/socket handles.  Then multiprocessing.Connection could be a simple wrapper round a file object, and most of the platform-specific code in multiprocessing.connection could go away.

The current patch does not support overlapped IO, but that could be added easily enough.  (Overlapped IO for normal files might be more complicated.)
msg180014 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-15 12:33
New patch reflecting Amaury's comments.
msg180043 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-15 19:31
> The current patch does not support overlapped IO, but that could be
> added easily enough.  (Overlapped IO for normal files might be more
> complicated.)

That could be cool. Wouldn't it belong in the fabled winapi module?
msg180102 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-01-16 19:04
Just a note of support for Richard -- having I/O use the native APIs directly rather than via emulation or other wrappers is a good idea, because the emulations / wrappers usually add restrictions that are not present in the native API.  This is also the reason why we dropped going through stdio -- it did not give enough control over buffering and made integration with (UNIX native) file descriptors complex.  Integration with native I/O primitives on Windows seems to make sense in light of e.g. PEP 3156.
msg199413 - (view) Author: Piotr Dobrogost (piotr.dobrogost) Date: 2013-10-10 21:14
I guess extracting Richard's patch to a package and placing it on PyPI would be a good move. I recalled reading this bug after I saw the "Does Python IO allow opened file to be deleted/renamed on Windows?" question on Stack Overflow (http://stackoverflow.com/q/19280836/95735)
msg222661 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-07-10 12:32
It strikes me that it makes far more sense to use the native API, so how do we take this forward: formal patch review, put it on PyPI, or what?
msg222663 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-07-10 12:51
Serhiy implemented the FileIO class in pure Python: see issue #21859 (patch under review). Using the Python class, it becomes easier to reimplement FileIO using the Windows API, at least to play with a prototype in pure Python.
History
Date | User | Action | Args
2014-11-01 20:35:07 | rpetrov | set | nosy: + rpetrov
2014-07-10 12:51:18 | haypo | set | nosy: + serhiy.storchaka; messages: + msg222663
2014-07-10 12:32:56 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg222661; versions: + Python 3.5, - Python 3.4
2013-10-10 21:14:49 | piotr.dobrogost | set | messages: + msg199413
2013-08-13 21:29:09 | piotr.dobrogost | set | nosy: + piotr.dobrogost
2013-01-16 19:04:21 | gvanrossum | set | nosy: + gvanrossum; messages: + msg180102
2013-01-15 19:31:13 | pitrou | set | messages: + msg180043
2013-01-15 12:34:12 | sbt | set | files: - winfileio.patch
2013-01-15 12:34:01 | sbt | set | files: + winfileio.patch; messages: + msg180014
2013-01-14 20:17:11 | sbt | set | messages: + msg179975
2013-01-14 19:55:51 | sbt | set | messages: + msg179973
2013-01-14 14:42:27 | pitrou | set | messages: + msg179944
2013-01-14 14:36:10 | amaury.forgeotdarc | set | messages: + msg179943
2013-01-12 15:19:58 | sbt | set | messages: + msg179811
2013-01-12 15:13:59 | sbt | set | files: - winfileio.patch
2013-01-12 15:13:50 | sbt | set | files: - test_winfileio.py
2013-01-12 15:13:44 | sbt | set | files: - winfileio.c
2013-01-12 15:13:31 | sbt | set | files: + winfileio.patch; messages: + msg179809
2013-01-07 16:01:55 | haypo | set | messages: + msg179269
2013-01-06 21:43:02 | sbt | set | messages: + msg179232
2013-01-06 21:30:46 | haypo | set | messages: + msg179229
2013-01-06 13:59:18 | sbt | set | files: + winfileio.patch; keywords: + patch; messages: + msg179181
2013-01-03 17:36:21 | sbt | set | files: + test_winfileio.py
2013-01-03 17:36:03 | sbt | set | files: + winfileio.c; messages: + msg178969
2013-01-03 13:11:57 | sbt | set | messages: + msg178943
2013-01-03 09:54:20 | serhiy.storchaka | set | versions: + Python 3.4, - Python 3.3; nosy: + ezio.melotti; components: + Windows; type: enhancement; stage: needs patch
2013-01-03 08:13:37 | mmarkk | set | messages: + msg178923
2013-01-03 07:50:32 | pitrou | set | nosy: + sbt
2013-01-03 02:12:33 | haypo | set | messages: + msg178900
2011-11-02 12:54:22 | haypo | set | messages: + msg146837
2011-11-01 17:11:09 | pitrou | set | messages: + msg146797
2011-11-01 17:02:44 | mmarkk | set | messages: + msg146796
2011-11-01 16:56:43 | mmarkk | set | messages: + msg146795
2011-11-01 16:49:26 | amaury.forgeotdarc | set | messages: + msg146792
2011-11-01 16:33:08 | pitrou | set | messages: + msg146789
2011-11-01 16:25:59 | mmarkk | set | messages: + msg146788
2011-11-01 12:12:35 | pitrou | set | nosy: + pitrou; messages: + msg146769
2011-09-12 18:38:21 | santoso.wijaya | set | nosy: + santoso.wijaya
2011-09-08 23:21:22 | haypo | set | messages: + msg143742
2011-09-08 23:19:00 | haypo | create |