
Title: fix bug in StringIO.truncate - length not changed
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.3
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: arigo, davidfraser, rhettinger
Priority: high
Keywords: patch

Created on 2004-05-11 14:07 by davidfraser, last changed 2022-04-11 14:56 by admin. This issue is now closed.

File name                       Uploaded                       Description
StringIO-truncate-length.patch  davidfraser, 2004-05-11 14:07  patch to correctly update the length of a StringIO object when truncated
StringIO_2.diff                 arigo, 2004-05-12 16:59
Messages (8)
msg45962 - (view) Author: David Fraser (davidfraser) Date: 2004-05-11 14:07
If truncate() is called on a StringIO object, the
length is not changed, so that seek(0, 2) calls will go
beyond the correct end of the file.
This patch adds a line to update the length, and a test
to the test method that checks that it works.
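
The symptom described above can be reproduced with a small model. This is only a sketch: `ToyBuffer` and its `update_len_on_truncate` flag are hypothetical names made up for illustration, not the actual code in Lib. It assumes the buffer caches its length in a separate attribute, which is the situation the report describes.

```python
class ToyBuffer:
    """Hypothetical stand-in for a pure-Python string buffer (not the real Lib code)."""

    def __init__(self, text="", update_len_on_truncate=True):
        self.buf = text
        self.len = len(text)  # cached length, stored separately from buf
        self.pos = 0
        self.update_len_on_truncate = update_len_on_truncate

    def truncate(self, size):
        self.buf = self.buf[:size]
        if self.update_len_on_truncate:
            self.len = size  # the kind of one-line update the report asks for

    def seek(self, offset, whence=0):
        if whence == 2:  # offset is relative to the cached end of file
            offset += self.len
        self.pos = max(0, offset)
        return self.pos


# Without the length update, seek(0, 2) lands past the real end of the data.
buggy = ToyBuffer("hello world", update_len_on_truncate=False)
buggy.truncate(5)
buggy_end = buggy.seek(0, 2)

# With the length update, seek(0, 2) lands at the truncated end.
fixed = ToyBuffer("hello world")
fixed.truncate(5)
fixed_end = fixed.seek(0, 2)
```

Here `buggy_end` is 11 (the stale length) while `fixed_end` is 5.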
msg45963 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-05-12 15:34
needs to be reviewed.  I could spot several other (though more minor) problems in a couple of minutes.
msg45964 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-05-12 16:59
David, it seems to me that f.truncate(huge_value) would incorrectly set f.len to huge_value with your patch.  Here is another patch fixing this and the other details I mentioned.  I also put the new test into instead.  Perhaps we should remove the if __name__=='__main__' bit, although it is nice as a quick example.
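
The f.truncate(huge_value) concern can be illustrated in isolation. The two helper functions below are hypothetical and are not taken from either patch; they only contrast taking the cached length from the argument with taking it from the data that actually remains, which is one possible way to avoid the problem.

```python
def truncate_naive(buf, size):
    """Hypothetical: cached length is copied from the requested size."""
    buf = buf[:size]
    return buf, size


def truncate_clamped(buf, size):
    """Hypothetical: cached length is taken from what is left in the buffer."""
    buf = buf[:size]
    return buf, len(buf)


huge_value = 10 ** 9

naive_buf, naive_len = truncate_naive("abc", huge_value)
clamped_buf, clamped_len = truncate_clamped("abc", huge_value)
```

With the naive version the recorded length becomes `huge_value` although only three characters exist; the clamped version records 3.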

This makes me wonder if there is any reason left for which cStringIOs aren't subclassable, or if we care.

Alternatively, it makes me wonder if there wouldn't be a more efficient implementation of that would entirely avoid concatenating large strings, or if we care.  This might make StringIO at least as efficient as cStringIO for some cases, e.g. when writing a lot of strings a few kb each, by avoiding the copy overhead.
msg45965 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2004-06-09 18:16
To avoid repeated string concatenation, the underlying data 
structure could be changed to an array.array object with a 
character typecode.
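
A rough sketch of the array-based idea follows. It is written for Python 3, where the Python 2 character typecode 'c' no longer exists, so it assumes a byte array with typecode 'B' instead; the variable names are illustrative only.

```python
from array import array

# Growable byte buffer; writes extend it in place instead of
# building a new string on every concatenation.
buf = array("B")
for chunk in (b"hello", b" ", b"world"):
    buf.frombytes(chunk)

# Truncation is an in-place slice deletion, and the length is
# always whatever the array reports, so it cannot go stale.
del buf[5:]
result = buf.tobytes()
length = len(buf)
```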
msg45966 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-12-20 09:25
The current implementation already delays concatenation and typically only uses a single ''.join() to build the final buffer, so it's reasonably efficient.
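
The delayed-concatenation approach mentioned here can be sketched as follows. `LazyJoinBuffer` is a hypothetical class written for illustration and is not a copy of the library code; it assumes writes are collected in a list and joined only when the full value is requested.

```python
class LazyJoinBuffer:
    """Hypothetical sketch: collect writes, join once on demand."""

    def __init__(self):
        self.buf = ""
        self.buflist = []  # pieces written since the last join

    def write(self, s):
        self.buflist.append(s)  # no string concatenation here

    def getvalue(self):
        if self.buflist:
            # A single join folds all pending pieces into the buffer.
            self.buf += "".join(self.buflist)
            self.buflist = []
        return self.buf


b = LazyJoinBuffer()
for word in ("a", "b", "c"):
    b.write(word)
value = b.getvalue()
```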

What I was musing about in my previous post is a way to entirely avoid the copy overhead and let all operations deal directly with a list of strings as the basic data structure (or maybe a tree of strings, or some more advanced structure).  Repeated writes of very small strings should probably still be consolidated into larger strings (e.g. of a few KBs) but no single huge string (or array) needs to be built at all.
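
One way to picture the list-of-strings idea is the sketch below. `ChunkedBuffer`, its `chunk_size` parameter, and `iter_chunks` are hypothetical names; the sketch only covers appending and reading back, and assumes small writes are consolidated once they add up to a threshold.

```python
class ChunkedBuffer:
    """Hypothetical sketch: keep data as a list of medium-sized chunks."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = []      # consolidated pieces, never joined into one string
        self.pending = []     # small writes waiting to be consolidated
        self.pending_len = 0

    def write(self, s):
        self.pending.append(s)
        self.pending_len += len(s)
        if self.pending_len >= self.chunk_size:
            self._flush()

    def _flush(self):
        if self.pending:
            self.chunks.append("".join(self.pending))
            self.pending = []
            self.pending_len = 0

    def iter_chunks(self):
        self._flush()
        return iter(self.chunks)

    def __len__(self):
        return sum(len(c) for c in self.chunks) + self.pending_len


cb = ChunkedBuffer(chunk_size=8)
for _ in range(10):
    cb.write("ab")
total = len(cb)
pieces = list(cb.iter_chunks())
```

With a threshold of 8, ten two-character writes end up as two 8-character chunks plus one 4-character remainder, and no 20-character string is ever built.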

But this is just musing aloud, and this thread is about a bug in, so we should probably focus on fixing it...
msg45967 - (view) Author: David Fraser (davidfraser) Date: 2004-12-20 09:58
Armin, are there any problems with your refined patch that
you are aware of? Once that is applied the other issues
could be looked at.
msg45968 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-12-20 12:55
The patch should be reviewed by someone and then applied if no further issue is discovered, I guess.
msg45969 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2004-12-20 23:57
Applied as Lib/ 1.38 and
Date                 User         Action  Args
2022-04-11 14:56:04  admin        set     github: 40244
2004-05-11 14:07:48  davidfraser  create