Message 205541 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	vstinner
Recipients	gregory.p.smith, neologix, pitrou, sbt, serhiy.storchaka, vstinner
Date	2013-12-08.11:03:55
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1386500636.65.0.382598172751.issue19929@psf.upfronthosting.co.za>
In-reply-to

Content
Oh, Charles-François Natali replied to my review by email, and it's not archived on Rietveld. Copy of him message: > http://bugs.python.org/review/18923/diff/9757/Lib/subprocess.py#newcode420 > Lib/subprocess.py:420: _PopenSelector = selectors.SelectSelector > This code should be factorized, but I opened a new issue for that: > #19465. > http://bugs.python.org/issue19465 OK, so we'll see in the other issue :-) > http://bugs.python.org/review/18923/diff/9757/Lib/subprocess.py#newcode1583 > Lib/subprocess.py:1583: self._input_offset + _PIPE_BUF] > (Unrelated to the selector issue) It may be more efficient to use a > memory view instead of a slice here. I also noticed, but that's also a different issue. > http://bugs.python.org/review/18923/diff/9757/Lib/subprocess.py#newcode1597 > Lib/subprocess.py:1597: data = os.read(key.fd, 4096) > In the old code, poll() uses 4096 whereas select() uses 1024 bytes... Do > you know the reason? Why 4096 and not 1 byte or 1 MB? I would prefer a > global constant rather than an harded limit. When writing to the pipe, we should definitely write less than PIPE_BUF, to avoid blocking after select()/poll() returns write-ready. When reading, it's not so important, in the sense that any value will work. The only difference will be a difference speed. Logically, I would say that the ideal size is the pipe buffer size, which varies between 4K and 64K: and indeed, I wrote a quick benchmark, and a 64K value yields the best result. > Use maybe os.stat(fd).st_blksize? IMO that's not worth it, also, IIRC, some OS don't set it for pipes. > Modules/_io/fileio.c uses SMALLCHUNK to read the whole content of a > file: > > #if BUFSIZ < (81024) > #define SMALLCHUNK (81024) > #elif (BUFSIZ >= (2 << 25)) > #error "unreasonable BUFSIZ > 64MB defined" > #else > #define SMALLCHUNK BUFSIZ > #endif > > The io module uses a default buffer size of 8 KB to read. Well, that's just a heuristic: the ideal size depends on the OS, the filesystem, backing device, etc (for an on-disk file)? Anyway, this too belongs to another issue. Please try to only make comments relevant to the issue at hand! > http://bugs.python.org/review/18923/diff/9757/Lib/test/test_subprocess.py > File Lib/test/test_subprocess.py (left): > > http://bugs.python.org/review/18923/diff/9757/Lib/test/test_subprocess.py#oldcode2191 > Lib/test/test_subprocess.py:2191: class > ProcessTestCaseNoPoll(ProcessTestCase): > Oh oh, removing a test is not a good idea. You should test select and > poll selectors when poll is used by default. Well, those distinct test cases made sense before, because there were two different code paths depending on whether we were using select() or poll(). Now, the code path is exactly the same, and the different selectors implementations have their own tests, so IMO that's not necessary anymore. But I re-enabled them anyway...

Oh, Charles-François Natali replied to my review by email, and it's not archived on Rietveld. Copy of him message:

> http://bugs.python.org/review/18923/diff/9757/Lib/subprocess.py#newcode420
> Lib/subprocess.py:420: _PopenSelector = selectors.SelectSelector
> This code should be factorized, but I opened a new issue for that:
> #19465.
> http://bugs.python.org/issue19465

OK, so we'll see in the other issue :-)

> http://bugs.python.org/review/18923/diff/9757/Lib/subprocess.py#newcode1583
> Lib/subprocess.py:1583: self._input_offset + _PIPE_BUF]
> (Unrelated to the selector issue) It may be more efficient to use a
> memory view instead of a slice here.

I also noticed, but that's also a different issue.

> http://bugs.python.org/review/18923/diff/9757/Lib/subprocess.py#newcode1597
> Lib/subprocess.py:1597: data = os.read(key.fd, 4096)
> In the old code, poll() uses 4096 whereas select() uses 1024 bytes... Do
> you know the reason? Why 4096 and not 1 byte or 1 MB? I would prefer a
> global constant rather than an harded limit.

When writing to the pipe, we should definitely write less than
PIPE_BUF, to avoid blocking after select()/poll() returns write-ready.
When reading, it's not so important, in the sense that any value will work.
The only difference will be a difference speed.

Logically, I would say that the ideal size is the pipe buffer size,
which varies between 4K and 64K: and indeed, I wrote a quick
benchmark, and a 64K value yields the best result.

> Use maybe os.stat(fd).st_blksize?

IMO that's not worth it, also, IIRC, some OS don't set it for pipes.

> Modules/_io/fileio.c uses SMALLCHUNK to read the whole content of a
> file:
>
> #if BUFSIZ < (8*1024)
> #define SMALLCHUNK (8*1024)
> #elif (BUFSIZ >= (2 << 25))
> #error "unreasonable BUFSIZ > 64MB defined"
> #else
> #define SMALLCHUNK BUFSIZ
> #endif
>
> The io module uses a default buffer size of 8 KB to read.

Well, that's just a heuristic: the ideal size depends on the OS, the
filesystem, backing device, etc (for an on-disk file)?

Anyway, this too belongs to another issue.

Please try to only make comments relevant to the issue at hand!

> http://bugs.python.org/review/18923/diff/9757/Lib/test/test_subprocess.py
> File Lib/test/test_subprocess.py (left):
>
> http://bugs.python.org/review/18923/diff/9757/Lib/test/test_subprocess.py#oldcode2191
> Lib/test/test_subprocess.py:2191: class
> ProcessTestCaseNoPoll(ProcessTestCase):
> Oh oh, removing a test is not a good idea. You should test select and
> poll selectors when poll is used by default.

Well, those distinct test cases made sense before, because there were
two different code paths depending on whether we were using select()
or poll(). Now, the code path is exactly the same, and the different
selectors implementations have their own tests, so IMO that's not
necessary anymore.

But I re-enabled them anyway...

History
Date	User	Action	Args
2013-12-08 11:03:56	vstinner	set	recipients: + vstinner, gregory.p.smith, pitrou, neologix, sbt, serhiy.storchaka
2013-12-08 11:03:56	vstinner	set	messageid: <1386500636.65.0.382598172751.issue19929@psf.upfronthosting.co.za>
2013-12-08 11:03:56	vstinner	link	issue19929 messages
2013-12-08 11:03:55	vstinner	create