
Author hunteke
Recipients amaury.forgeotdarc, georg.brandl, hunteke, pitrou
Date 2010-09-25.14:26:03
Message-id <1285424764.69.0.831923069805.issue9942@psf.upfronthosting.co.za>
Content
> Well, first, this would only work for large objects. [...]
> Why do you think you might have such duplication in your workload?

Some of the projects I work with involve multiple manipulations of large datasets.  Often, we use Python scripts as the "first and third" stages in a pipeline.  For example, in one current workflow, we read a large file into a cStringIO object, do a few manipulations with it, pass it off to a second process, and await the results.  Meanwhile, the large file sits in memory because we need to do more manipulations after we get results back from the second application in the pipeline.  "Graphically":

Python Script A    ->    External App    ->    Python Script A
read large data          process data          more manipulations
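A minimal sketch of this pattern, using Python 3's io.BytesIO in place of the cStringIO object mentioned above; the external application is hypothetical and is simulated here with "cat" (so this assumes a Unix-like system):

```python
import io
import subprocess

# Stage 1: read the "large" data into an in-memory buffer
# (io.BytesIO is the Python 3 analogue of cStringIO).
buf = io.BytesIO(b"large data set\n" * 3)  # stand-in for a real file read

# A first manipulation on the in-memory copy.
payload = buf.getvalue().upper()

# Stage 2: hand the data to the external app and wait for results
# ("cat" stands in for the real second-stage application).
result = subprocess.run(["cat"], input=payload,
                        capture_output=True, check=True)

# Stage 3: the original buffer is still held in memory, because the
# third stage needs it after the external app returns.
combined = buf.getvalue() + result.stdout
print(len(combined))
```

The point is that `buf` stays resident for the entire lifetime of the pipeline, even while the external app holds its own copy of the same bytes.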

Within a single process, I don't see any gain to be had.  However, in this use case, several copies of this pipeline run concurrently, each with slightly different command-line parameters.