file read preallocs 'size' bytes which can cause memory problems #47781
Comments
I wrote a buggy PNG parser which ended up doing several file.read(large [...]
I tracked it down to the implementation of read(). When given a size [...]
Here's a reproducible [...]

    BLOCKSIZE = 10*1024*1024
    f = open("empty.txt", "w")
    f.close()
    f = open("empty.txt")
    data = []
    for i in range(10000):
        s = f.read(BLOCKSIZE)
        assert len(s) == 0
        data.append(s)

I wasn't sure if this is properly a bug, but since the MemoryError [...]
I can't reproduce; your code snippet works fine. What Python version is it?
I tested it with Python 2.5 on a Mac, Python 2.5 on FreeBSD, and Python [...]
Perhaps the memory allocator on your machine is making a promise it can't [...]
Perhaps. I'm under Linux. However, at the end of the file_read() implementation in fileobject.c, [...]

    if (bytesread != buffersize) [...]

Which means that the string *is* resized at the end.
You're right. I mistook the string implementation for the list one [...]
I tracked the memory allocation all the way down to [...]
That was enough to be able to find the python-dev thread "Darwin's [...]
Mind you, I also get the problem on FreeBSD 2.6 so it isn't Darwin [...]
On Saturday 9 August 2008 at 11:26 +0000, Andrew Dalke wrote:
Darwin and the BSDs supposedly share a lot of common stuff.
FreeBSD is what my hosting provider uses. Freebsd.org calls 2.6 "legacy" [...]
There is shared history with Macs. I don't know the details though. I [...]
FWIW: [...]
Why don't you use a sensible buffer size, e.g. 1MB? Reading data in [...]
Andrew, as for memory reallocation issues, you may take a look at bpo-3526 [...]
If nobody objects, I will close the present bug as invalid.
On Thursday 11 September 2008 at 16:01 +0200, Anthon van der Neut wrote:
It's too complicated. Just use chunks in all cases (even small files) [...]
Using fixed-size chunks to read binary data from a file of an unknown [...]
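The chunked approach suggested here could look roughly like the sketch below. It is an illustration only, not code from this thread: `read_up_to` and `CHUNK_SIZE` are hypothetical names, the 1 MB value just follows the buffer size suggested earlier, and the file is assumed to be opened in binary mode.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MB; assumed value, taken from the suggestion in the thread


def read_up_to(f, size, chunk_size=CHUNK_SIZE):
    """Read at most `size` bytes from binary file `f` in bounded chunks.

    Each individual read() asks for at most `chunk_size` bytes, so a huge
    or bogus `size` never becomes a single huge read request.
    """
    parts = []
    remaining = size
    while remaining > 0:
        chunk = f.read(min(remaining, chunk_size))
        if not chunk:  # EOF reached before `size` bytes were available
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
```

With this helper, asking for 10 MB from an empty file issues one 1 MB request and returns an empty result, rather than requesting the full 10 MB up front.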
I'm still undecided on whether this is a bug or not. The problem occurs even [...]
The problem is Python's implementation is "alloc the requested bytes and [...]
I looked a little for real-world cases that could cause a denial-of- [...]
If there is a problem, it will occur very rarely. Go ahead and mark it [...]
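For a parser that takes a length field from untrusted input, one defensive option is to compare the requested size against what the file can actually supply before calling read(). The sketch below assumes a seekable file opened in binary mode; `remaining_bytes` and `checked_read` are hypothetical helpers, not part of Python or of any fix discussed in this issue.

```python
import io
import os


def remaining_bytes(f):
    """Return the number of bytes between the current position and EOF.

    Assumes `f` is seekable and opened in binary mode; the original
    position is restored before returning.
    """
    pos = f.tell()
    f.seek(0, os.SEEK_END)
    end = f.tell()
    f.seek(pos)
    return end - pos


def checked_read(f, size):
    """Read exactly `size` bytes, rejecting sizes the file cannot satisfy."""
    if size < 0:
        raise ValueError("negative size")
    available = remaining_bytes(f)
    if size > available:
        raise ValueError(
            "requested %d bytes but only %d remain" % (size, available)
        )
    return f.read(size)
```

A corrupt length field then fails fast with a ValueError instead of reaching read() with an arbitrarily large size.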