msg246878 - (view) |
Author: Eric O. LEBIGOT (lebigot) |
Date: 2015-07-18 02:59 |
On OS X, the Homebrew and MacPorts versions of Python 3.4.3 raise an exception when writing a 4 GB bytearray:
>>> open('/dev/null', 'wb').write(bytearray(2**31-1))
2147483647
>>> open('/dev/null', 'wb').write(bytearray(2**31))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument
This has an impact on pickle, in particular (http://stackoverflow.com/questions/31468117/python-3-can-pickle-handle-byte-objects-larger-than-4gb).
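Until a fixed interpreter is available, one way around the failure is to split a large write into pieces that each stay below the 2**31-byte limit. The following is only a minimal sketch: the helper name `write_in_chunks` and the `MAX_CHUNK` constant are hypothetical, and it assumes `f` is a blocking binary file object whose `write()` returns the number of bytes written.

```python
import io

# Assumed safe upper bound: the largest value of a signed 32-bit integer.
MAX_CHUNK = 2**31 - 1


def write_in_chunks(f, data, chunk_size=MAX_CHUNK):
    """Write bytes-like `data` to `f` in slices of at most `chunk_size` bytes.

    Hypothetical helper, not part of the standard library.
    """
    view = memoryview(data)  # slicing a memoryview does not copy the data
    written = 0
    while written < len(view):
        # write() reports how many bytes it accepted; advance by that amount.
        written += f.write(view[written:written + chunk_size])
    return written


# Small-scale demonstration with an in-memory file and a tiny chunk size.
buf = io.BytesIO()
count = write_in_chunks(buf, bytearray(b"abcdefghij"), chunk_size=3)
```

For pickle, the same idea would apply by first calling `pickle.dumps()` and then passing the resulting bytes to the helper, at the cost of holding the serialized data in memory.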
|
msg246879 - (view) |
Author: Eric O. LEBIGOT (lebigot) |
Date: 2015-07-18 03:02 |
PS: I should have written "2 GB" bytearray (so this looks like a signed 32 bit integer issue).
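To illustrate why 2**31 is the suspicious threshold: 2**31 - 1 is the largest value a signed 32-bit integer can hold, and 2**31 wraps around to a negative number. The snippet below only demonstrates that arithmetic with `ctypes`. It does not show what the kernel actually does, and the helper name `as_int32` is hypothetical.

```python
import ctypes


def as_int32(n):
    """Reinterpret the low 32 bits of `n` as a signed 32-bit integer."""
    return ctypes.c_int32(n).value


largest_ok = as_int32(2**31 - 1)   # still fits in a signed 32-bit integer
wrapped = as_int32(2**31)          # one byte more wraps to a negative value
```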
|
msg246979 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2015-07-20 12:18 |
This is likely a platform bug: it fails with os.write as well. Interestingly enough, file.write works fine on Python 2.7 (which uses stdio), which apparently works around this kernel misfeature.
A possible partial workaround is to recognise this error in the implementation of os.write and then perform a partial write. The problem is that, while write(2) is documented as possibly writing less data than expected, most users writing to normal files (as opposed to sockets) probably don’t expect that behavior. On the other hand, os.write already limits writes to INT_MAX on Windows (see _Py_write in Python/fileutils.c).
Because of this I’m in favour of adding a similar workaround on OSX (and can provide a patch).
BTW, the manpage for write says that writev(2) might fail with EINVAL:
[EINVAL] The sum of the iov_len values in the iov array overflows a 32-bit integer.
I wouldn’t be surprised if write(2) is implemented using writev(2) and that this explains the problem.
|
msg246983 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2015-07-20 12:50 |
The attached patch is a first stab at a workaround. It will unconditionally limit the write size in os.write to INT_MAX on OSX.
I haven't tested yet if this actually fixes the problem mentioned on stack overflow.
|
msg246985 - (view) |
Author: Eric O. LEBIGOT (lebigot) |
Date: 2015-07-20 12:57 |
Thank you for looking into this, Ronald.
What does your patch do, exactly? Does it only limit the returned byte count, or does it really limit the size of the data written by truncating it?
In any case, it would be very useful to have a warning from the Python interpreter. If the data is truncated, I would even prefer an explicit exception (e.g. "data too big for this platform (>= 2 GB)"), along with an explicit mention of it in the documentation. What do you think?
|
msg246987 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2015-07-20 13:05 |
The patch limits os.write to writing at most INT_MAX bytes on OSX. Buffered I/O using open("/some/file", "wb") should still write all data (at least according to the limited tests I've done so far).
The same limitation is already present on Windows.
And as I wrote before: according to the manpage for write(2), os.write may already write fewer bytes than requested.
I'm -1 on using an explicit exception or printing a warning about this.
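Since os.write may write fewer bytes than requested, code that calls it directly usually has to loop until everything has been written. A minimal sketch of such a loop follows. The helper name `write_all` and its `max_count` parameter are hypothetical, and the sketch assumes a blocking file descriptor.

```python
import os


def write_all(fd, data, max_count=2**31 - 1):
    """Write all of bytes-like `data` to `fd`, tolerating short writes.

    Hypothetical helper: each os.write() call is given at most `max_count`
    bytes, and the loop continues from wherever the previous call stopped.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        total += os.write(fd, view[total:total + max_count])
    return total


# Small-scale demonstration over a pipe, with a tiny per-call limit.
read_fd, write_fd = os.pipe()
sent = write_all(write_fd, b"hello world", max_count=4)
os.close(write_fd)
received = os.read(read_fd, 100)
os.close(read_fd)
```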
|
msg246993 - (view) |
Author: Eric O. LEBIGOT (lebigot) |
Date: 2015-07-20 13:33 |
I see, thanks.
This sounds good to me too: no need for a warning or exception, indeed, since file.write() should work and the behavior of os.write() is documented.
|
msg246994 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2015-07-20 13:40 |
The Windows limit to INT_MAX is applied in many functions:
* os.write()
* io.FileIO.write()
* hmm, maybe others, I don't remember
In the default branch, there is now _Py_write(), so only one place should need fixing.
See issue #11395, which fixed the bug on Windows.
If it's a bug, it should be fixed on Python 2.7, 3.4, 3.5 and default branches.
|
msg246999 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2015-07-20 16:25 |
The patch I attached earlier is for the default branch. More work is needed for the other active branches.
|
msg247007 - (view) |
Author: Mali Akmanalp (Mali Akmanalp) |
Date: 2015-07-20 22:41 |
I don't know how helpful it is at this point, but the issue happens while reading also.
Here's some related discussion in the numpy tracker:
https://github.com/numpy/numpy/issues/3858 (The claim was that OSX Mavericks fixed this issue, but it didn't. There is an Apple bug ID in there somewhere, plus a link to a patch the torch folks used.)
and also in pandas: https://github.com/pydata/pandas/issues/10641
I'd be happy to try to test patches out.
|
msg247122 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2015-07-22 14:45 |
Indeed, read(2) has the same problem. I just tested this with a small C program.
I'll rework the patch for this, and will work on patches for 3.4/3.5 and 2.7 as well.
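On the read side, the same kind of loop can keep each individual request under the limit. The following is a minimal sketch with a hypothetical helper name, `read_exact`, assuming a blocking file descriptor. It stops early if end of file is reached.

```python
import os


def read_exact(fd, size, max_count=2**31 - 1):
    """Read up to `size` bytes from `fd`, asking for at most `max_count` per call.

    Hypothetical helper: returns fewer than `size` bytes only at end of file.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, min(remaining, max_count))
        if not chunk:  # an empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Small-scale demonstration over a pipe, with a tiny per-call limit.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"0123456789")
os.close(write_fd)
payload = read_exact(read_fd, 10, max_count=3)
after_eof = read_exact(read_fd, 5)
os.close(read_fd)
```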
|
msg256882 - (view) |
Author: Ian Carroll (Ian Carroll) |
Date: 2015-12-22 23:42 |
Write still fails on 3.5.1 and OS X 10.11.2. I'm no dev, so can someone explain how to use the patch while it's under review?
|
msg272030 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2016-08-05 13:57 |
Here is my patch for 3.6. I am going to provide the patch for 3.5.
|
msg272044 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2016-08-05 17:25 |
Sorry, I was busy with a task, but here is my patch for 3.5. In fact, it's just the same as for 3.6.
|
msg278672 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2016-10-14 22:28 |
ping
|
msg278724 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2016-10-15 12:54 |
Ned Deily, I added you because you are the expert for the OSX platform.
|
msg279132 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2016-10-21 14:25 |
Victor, could you check the new patch?
|
msg279159 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2016-10-21 21:38 |
Uploaded a new version.
|
msg294113 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2017-05-21 21:16 |
Hello....
I just updated this ticket with a PR on Github.
|
msg294122 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-05-22 05:35 |
I see that we have other clamps on Windows using INT_MAX:
* sock_setsockopt()
* sock_sendto_impl()
Are these functions ok on macOS? If not, a new issue should be opened ;-)
|
msg294160 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2017-05-22 16:24 |
1. In the case of Windows, maybe we could open a new issue, because this fix is only for macOS.
2. The issue was only about files, not sockets.
What do you suggest?
|
msg294195 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-05-22 22:16 |
I'm not saying that something is broken, just that it would be nice if someone could test the socket methods.
On Windows, the bug was obvious: the function takes a C int...
|
msg327912 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2018-10-17 19:30 |
Hi all,
Could you test the PR with Windows? I don't have a Windows computer.
Thank you,
Stéphane
|
msg327916 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-10-17 23:05 |
New changeset 74a8b6ea7e0a8508b13a1c75ec9b91febd8b5557 by Victor Stinner (Stéphane Wirtel) in branch 'master':
bpo-24658: Fix read/write greater than 2 GiB on macOS (GH-1705)
https://github.com/python/cpython/commit/74a8b6ea7e0a8508b13a1c75ec9b91febd8b5557
|
msg327918 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-10-17 23:52 |
New changeset a5ebc205beea2bf1501e4ac33ed6e81732dd0604 by Victor Stinner (Stéphane Wirtel) in branch '3.6':
[3.6] bpo-24658: Fix read/write greater than 2 GiB on macOS (GH-1705) (GH-9937)
https://github.com/python/cpython/commit/a5ebc205beea2bf1501e4ac33ed6e81732dd0604
|
msg327940 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-10-18 06:58 |
New changeset 178d1c07778553bf66e09fe0bb13796be3fb9abf by Miss Islington (bot) in branch '3.7':
bpo-24658: Fix read/write greater than 2 GiB on macOS (GH-1705)
https://github.com/python/cpython/commit/178d1c07778553bf66e09fe0bb13796be3fb9abf
|
msg330259 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-11-22 14:03 |
New changeset 9a0d7a7648547ffb77144bf2480155f6d7940dea by Victor Stinner in branch 'master':
bpo-24658: os.read() reuses _PY_READ_MAX (GH-10657)
https://github.com/python/cpython/commit/9a0d7a7648547ffb77144bf2480155f6d7940dea
|
msg330260 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-11-22 14:17 |
New changeset 18f3327d9a99163a658697465eb00c31f86535eb by Miss Islington (bot) in branch '3.7':
bpo-24658: os.read() reuses _PY_READ_MAX (GH-10657)
https://github.com/python/cpython/commit/18f3327d9a99163a658697465eb00c31f86535eb
|
msg330262 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-11-22 14:25 |
New changeset 0c15e508baec7e542933db2b31ea950a646cd968 by Miss Islington (bot) in branch '3.6':
bpo-24658: os.read() reuses _PY_READ_MAX (GH-10657)
https://github.com/python/cpython/commit/0c15e508baec7e542933db2b31ea950a646cd968
|
msg335566 - (view) |
Author: Barry A. Warsaw (barry) * |
Date: 2019-02-14 21:18 |
Nosying myself since I just landed here based on an internal $work bug report. We're seeing it with reads. I'll try to set aside some work time to review the PRs.
|
msg335569 - (view) |
Author: Stéphane Wirtel (matrixise) * |
Date: 2019-02-14 21:40 |
Hi @barry
Normally this issue is fixed for 3.x, but I need to finish my PR for 2.7.
I plan to fix 2.7 in the coming weeks.
|
msg360264 - (view) |
Author: Zachary Ware (zach.ware) * |
Date: 2020-01-19 18:17 |
Since 3.x is fixed and 2.7 has reached EOL, I'm closing the issue. Thanks for getting it fixed in 3.x, Stephane and Victor!
|
|
Date | User | Action | Args |
2022-04-11 14:58:19 | admin | set | github: 68846 |
2020-01-19 18:17:57 | zach.ware | set | keywords:
- needs review |
2020-01-19 18:17:44 | zach.ware | set | status: open -> closed versions:
- Python 2.7 messages:
+ msg360264
resolution: fixed stage: patch review -> resolved |
2019-02-16 07:33:22 | ned.deily | set | assignee: ned.deily -> |
2019-02-14 21:40:24 | matrixise | set | messages:
+ msg335569 |
2019-02-14 21:18:55 | barry | set | title: open().write() fails on 2 GB+ data (OS X) -> open().write() and .read() fails on 2 GB+ data (OS X) |
2019-02-14 21:18:18 | barry | set | nosy:
+ barry messages:
+ msg335566
|
2018-11-22 14:25:28 | miss-islington | set | messages:
+ msg330262 |
2018-11-22 14:17:37 | miss-islington | set | messages:
+ msg330260 |
2018-11-22 14:04:07 | miss-islington | set | pull_requests:
+ pull_request9912 |
2018-11-22 14:03:57 | miss-islington | set | pull_requests:
+ pull_request9911 |
2018-11-22 14:03:45 | vstinner | set | messages:
+ msg330259 |
2018-11-22 12:58:56 | vstinner | set | pull_requests:
+ pull_request9910 |
2018-10-18 06:58:44 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg327940
|
2018-10-18 00:25:45 | matrixise | set | pull_requests:
+ pull_request9289 |
2018-10-17 23:52:27 | vstinner | set | messages:
+ msg327918 |
2018-10-17 23:27:19 | matrixise | set | pull_requests:
+ pull_request9287 |
2018-10-17 23:09:15 | vstinner | set | versions:
+ Python 2.7, Python 3.7, Python 3.8, - Python 3.5 |
2018-10-17 23:06:01 | miss-islington | set | pull_requests:
+ pull_request9286 |
2018-10-17 23:05:09 | vstinner | set | messages:
+ msg327916 |
2018-10-17 19:30:39 | matrixise | set | messages:
+ msg327912 |
2017-05-22 22:16:22 | vstinner | set | messages:
+ msg294195 |
2017-05-22 16:24:26 | matrixise | set | messages:
+ msg294160 |
2017-05-22 05:39:39 | zach.ware | set | nosy:
+ zach.ware
|
2017-05-22 05:35:38 | vstinner | set | messages:
+ msg294122 |
2017-05-21 21:16:49 | matrixise | set | messages:
+ msg294113 |
2017-05-21 21:15:26 | matrixise | set | pull_requests:
+ pull_request1798 |
2016-11-06 23:27:24 | Harry Li | set | nosy:
+ Harry Li
|
2016-10-21 21:38:27 | matrixise | set | files:
+ issue24658-3-3.6.diff
messages:
+ msg279159 |
2016-10-21 14:25:03 | matrixise | set | files:
+ issue24658-2-3.6.diff
messages:
+ msg279132 |
2016-10-15 12:54:08 | matrixise | set | assignee: ned.deily messages:
+ msg278724 |
2016-10-14 22:28:06 | matrixise | set | messages:
+ msg278672 |
2016-08-05 17:28:43 | matrixise | set | files:
+ issue24658-3.5.diff |
2016-08-05 17:28:33 | matrixise | set | files:
- issue24658-3.5.diff |
2016-08-05 17:25:07 | matrixise | set | files:
+ issue24658-3.5.diff
messages:
+ msg272044 |
2016-08-05 13:57:08 | matrixise | set | files:
+ issue24658-3.6.diff nosy:
+ matrixise messages:
+ msg272030
|
2016-08-04 20:48:40 | zach.ware | set | versions:
+ Python 3.6, - Python 3.4 |
2015-12-22 23:42:26 | Ian Carroll | set | nosy:
+ Ian Carroll messages:
+ msg256882
|
2015-07-22 14:45:16 | ronaldoussoren | set | messages:
+ msg247122 |
2015-07-20 22:41:41 | Mali Akmanalp | set | nosy:
+ Mali Akmanalp messages:
+ msg247007
|
2015-07-20 16:25:52 | ronaldoussoren | set | messages:
+ msg246999 |
2015-07-20 13:40:42 | vstinner | set | messages:
+ msg246994 |
2015-07-20 13:33:35 | lebigot | set | messages:
+ msg246993 |
2015-07-20 13:05:20 | ronaldoussoren | set | messages:
+ msg246987 |
2015-07-20 12:57:44 | lebigot | set | messages:
+ msg246985 |
2015-07-20 12:50:28 | ronaldoussoren | set | keywords:
+ patch, needs review files:
+ issue24658.txt messages:
+ msg246983
stage: patch review |
2015-07-20 12:19:00 | ronaldoussoren | set | messages:
+ msg246979 |
2015-07-18 04:05:12 | serhiy.storchaka | set | nosy:
+ ronaldoussoren, vstinner, ned.deily components:
+ Extension Modules, IO, - Interpreter Core
|
2015-07-18 03:02:19 | lebigot | set | messages:
+ msg246879 title: open().write() fails on 4 GB+ data (OS X) -> open().write() fails on 2 GB+ data (OS X) |
2015-07-18 02:59:28 | lebigot | create | |