
Author pitrou
Recipients amaury.forgeotdarc, georg.brandl, hunteke, pitrou
Date 2010-09-25.14:31:35
Message-id <1285425088.3192.11.camel@localhost.localdomain>
In-reply-to <1285424764.69.0.831923069805.issue9942@psf.upfronthosting.co.za>
Content
> > Well, first, this would only work for large objects. [...]
> > Why do you think you might have such duplication in your workload?
> 
> Some of the projects with which I work involve multiple manipulations
> of large datasets.  Often, we use Python scripts as "first and third"
> stages in a pipeline.  For example, in one current workflow, we read a
> large file into a cStringIO object, do a few manipulations with it,
> pass it off to a second process, and await the results.

Why do you read it into a cStringIO? A cStringIO has the same interface
as a file, so you could simply operate on the file directly.
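As a rough illustration of that point, here is a minimal Python 3 sketch in which the same routine accepts either a real file object or an in-memory buffer, so the intermediate copy is optional. It uses io.BytesIO, the Python 3 counterpart of cStringIO. The names process_stream and demo, the chunk size, and the newline-counting "manipulation" are all hypothetical stand-ins, not taken from the workflow described above.

```python
import io
import os
import tempfile


def process_stream(stream, chunk_size=64 * 1024):
    """Hypothetical manipulation: count newline bytes, one chunk at a time."""
    newlines = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        newlines += chunk.count(b"\n")
    return newlines


def demo():
    # Small stand-in for the "large file" (assumed sample data).
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"alpha\nbeta\ngamma\n")
        # Operate on the file object directly: only one chunk is held in memory.
        with open(path, "rb") as f:
            from_file = process_stream(f)
        # For comparison, the same call on an in-memory copy of the whole file.
        with open(path, "rb") as f:
            from_memory = process_stream(io.BytesIO(f.read()))
        return from_file, from_memory
    finally:
        os.remove(path)
```

Both calls give the same result; the difference is that the first never holds more than one chunk of the file in memory at a time.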

(You could also try mmap if you need quick random access to various
portions of the file.)
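
A minimal Python 3 sketch of the mmap suggestion follows, assuming a read-only use case. The helper names read_slice and demo and the sample data are hypothetical. Note that mmap raises ValueError for an empty file, so this sketch assumes the file is non-empty.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Return `length` bytes starting at `offset` via a read-only memory map."""
    with open(path, "rb") as f:
        # A length of 0 maps the whole file; ACCESS_READ makes it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing copies only the requested bytes out of the mapping.
            return mm[offset:offset + length]


def demo():
    # Small stand-in for the large file (assumed sample data).
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"0123456789abcdef")
        return read_slice(path, 10, 4), read_slice(path, 0, 3)
    finally:
        os.remove(path)
```

Here the operating system pages in only the regions that are actually touched, so random access to a few portions of a large file does not require loading all of it.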
History
Date User Action Args
2010-09-25 14:31:37 pitrou set recipients: + pitrou, georg.brandl, amaury.forgeotdarc, hunteke
2010-09-25 14:31:35 pitrou link issue9942 messages
2010-09-25 14:31:35 pitrou create