classification
Title: Allow deepcopying paused generators
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.5
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: alexandre.vassalotti, bruno.dupuis, cool-RR, mont29, pitrou, serhiy.storchaka, terry.reedy
Priority: normal
Keywords:

Created on 2011-02-23 16:15 by cool-RR, last changed 2017-02-19 15:01 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
test_live_generator.py cool-RR, 2011-02-25 21:54
Messages (17)
msg129213 - (view) Author: Ram Rachum (cool-RR) * Date: 2011-02-23 16:15
Please allow paused generators to be deepcopied and pickled, including all of their state.

This is implemented in PyPy:

	Python 2.5.2 (335e875cb0fe, Dec 28 2010, 20:31:56)
	[PyPy 1.4.1] on win32
	Type "copyright", "credits" or "license()" for more information.
	DreamPie 1.1.1
	>>> import pickle, copy
	>>> def next(thing): # For compatibility
	...     return thing.next()
	>>> def g():
	...     for i in range(4):
	...         yield i
	>>> list(g())
	[0, 1, 2, 3]
	>>> live_generator = g()
	>>> next(live_generator)
	0
	>>> next(live_generator)
	1
	
Now `live_generator` holds a generator which is in the middle of its operation. It went through 0 and 1, and it still has 2 and 3 to yield.

We deepcopy it:	
	
	>>> live_generator_deepcopy = copy.deepcopy(live_generator)
	
The deepcopied generator assumes the same state as the original one. Let's exhaust it:
	
	>>> list(live_generator_deepcopy)
	[2, 3]
	>>> list(live_generator_deepcopy)
	[]
	
PyPy also lets us pickle and unpickle the live generator:
	
	>>> live_generator_pickled = pickle.dumps(live_generator)
	>>> live_generator_unpickled = pickle.loads(live_generator_pickled)
	>>> list(live_generator_unpickled)
	[2, 3]
	>>> list(live_generator_unpickled)
	[]
	
And the original live generator was unchanged by all these operations:
	
	>>> list(live_generator)
	[2, 3]
	>>> list(live_generator)
	[]
	
All of the above was demonstrated in PyPy. In Python 3.2, trying to pickle a live generator raises this exception:

	>>> pickle.dumps(live_generator)
	Traceback (most recent call last):
	  File "<stdin>", line 1, in <module>
	_pickle.PicklingError: Can't pickle <class 'generator'>: attribute lookup builtins.generator failed

And trying to deepcopy one raises this exception:
	
	>>> copy.deepcopy(live_generator)
	Traceback (most recent call last):
	  File "<stdin>", line 1, in <module>
	  File "c:\Python32\lib\copy.py", line 174, in deepcopy
		y = _reconstruct(x, rv, 1, memo)
	  File "c:\Python32\lib\copy.py", line 285, in _reconstruct
		y = callable(*args)
	  File "c:\Python32\lib\copyreg.py", line 88, in __newobj__
		return cls.__new__(cls, *args)
	TypeError: object.__new__(generator) is not safe, use generator.__new__()

It would be nice if Python 3.2 could pickle and deepcopy live generators.
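
For readers on a current CPython 3 release, a minimal sketch of the same check follows. It uses only the standard library; `try_call` is a helper introduced here for illustration. The exact exception type and message vary between versions, so the sketch reports whatever is raised rather than expecting one specific error.

```python
import copy
import pickle


def g():
    for i in range(4):
        yield i


def try_call(operation, value):
    """Return the name of the exception raised by operation(value), or None."""
    try:
        operation(value)
    except (TypeError, pickle.PicklingError) as exc:
        return type(exc).__name__
    return None


live_generator = g()
next(live_generator)
next(live_generator)

# Both operations are expected to fail on a paused generator in CPython.
print("pickle.dumps:", try_call(pickle.dumps, live_generator))
print("copy.deepcopy:", try_call(copy.deepcopy, live_generator))

# The failed attempts do not advance the generator.
remaining = list(live_generator)
print("remaining:", remaining)
```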
msg129214 - (view) Author: Ram Rachum (cool-RR) * Date: 2011-02-23 16:21
P.S. I'm willing to write a test case if that would help.
msg129419 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-02-25 20:31
Test cases always help when appropriate.
A link to the Pypy code that does this might also help.
Or perhaps ask them to submit a patch to this issue.
msg129438 - (view) Author: Ram Rachum (cool-RR) * Date: 2011-02-25 21:54
Tests attached.
msg129442 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2011-02-25 22:02
Although I would really like to see support for pickling generators, it is not really possible in CPython. This is a recurrent request; I even wrote a small article about it.

http://peadrop.com/blog/2009/12/29/why-you-cannot-pickle-generators/

Looking at how PyPy did it, I see portability problems, because it dumps the byte-code of the generator to disk. Python's documentation clearly states that the byte-code is an implementation detail that can (and does) change between releases. Hence, this method is not really suitable for pickle, which needs to remain compatible across releases.
msg129516 - (view) Author: Ram Rachum (cool-RR) * Date: 2011-02-26 09:59
Hi Alexandre,

I read your blog post, but I don't understand: why does bytecode need to be pickled in order to pickle live generators? I understand that the local variables need to be pickled (let's assume they're all picklable), and that a pointer to the current instruction needs to be pickled, but why the bytecode? When you pickle a normal function or a class, no bytecode gets pickled, so why does it have to be pickled here?
msg129565 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2011-02-26 16:03
The value of the instruction pointer depends on the byte-code. So it's not portable either.

But the bigger issue is that generator objects do not have names we can refer to, unlike the top-level functions and classes that pickle supports. We can't pickle lambdas and nested functions for this exact reason.

Personally, I haven't found a way around this. But that doesn't mean there isn't one. If you find one, I will be more than pleased to review it.
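
To illustrate the naming point: pickle stores functions and classes as a reference to an importable module and name, not as code. The sketch below uses only the standard library. It shows that the payload for `copy.deepcopy` carries its name and that unpickling returns the very same function object, while a lambda, which has no importable name, is refused. The exception type differs between Python versions, so several are caught here.

```python
import copy
import pickle

# A module-level function is pickled by reference: module name plus attribute name.
payload = pickle.dumps(copy.deepcopy)
name_in_payload = b"deepcopy" in payload
same_object = pickle.loads(payload) is copy.deepcopy
print(name_in_payload, same_object)

# A lambda has no name that the unpickling side could look up.
anonymous = lambda x: x + 1
try:
    pickle.dumps(anonymous)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_pickled = False
print(lambda_pickled)
```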
msg129572 - (view) Author: Ram Rachum (cool-RR) * Date: 2011-02-26 16:45
*"generator objects do not have names we can refer to"*

How about referring to the generator function that created them and to all the arguments?

Regarding the instruction pointer, I really don't know the internals of how this works. Could we arrange things so that a line number (and possibly a column number too) serves as a cross-interpreter instruction pointer?
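
One way to read the "generator function plus arguments" idea is as a replay wrapper. This is a workaround sketch, not a copy of the paused frame and not what PyPy does. `ReplayableGenerator` and `count_to_four` are hypothetical names introduced here. The wrapper records the function, its arguments, and how many items were consumed, then rebuilds a copy by re-running the generator from the start. That is only valid when the generator function is deterministic and free of side effects.

```python
import copy


class ReplayableGenerator:
    """Hypothetical wrapper: remembers how a generator was made and how far it got."""

    def __init__(self, func, args=(), kwargs=None, consumed=0):
        self._func = func
        self._args = tuple(args)
        self._kwargs = dict(kwargs or {})
        self._consumed = 0
        self._gen = func(*self._args, **self._kwargs)
        # Replay: fast-forward the fresh generator to the recorded position.
        for _ in range(consumed):
            next(self)

    def __iter__(self):
        return self

    def __next__(self):
        value = next(self._gen)
        self._consumed += 1
        return value

    def __reduce__(self):
        # copy and pickle both consult this hook to learn how to rebuild the object.
        return (type(self), (self._func, self._args, self._kwargs, self._consumed))


def count_to_four():
    for i in range(4):
        yield i


live = ReplayableGenerator(count_to_four)
next(live)
next(live)
clone = copy.deepcopy(live)
print(list(clone), list(live))
```

Because the position is stored as an item count rather than a bytecode offset, it does not depend on interpreter internals. The cost is that every copy repeats all the work done so far.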
msg129576 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-02-26 17:28
Alexandre,
Do the considerations against pickling apply to deep copying? It would seem that copying the bytecode and instruction pointer within a single run should be OK.
msg204968 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013-12-01 21:34
Allowing generators to be deepcopied via their code object should be fine.
msg205596 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-08 19:25
Instead of copy.deepcopy, why not call itertools.tee?

For the record, pickling a live generator implies pickling a frame object. We wouldn't be able to guarantee cross-version compatibility for such pickled objects.
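
The state held by that frame can be inspected from Python, which gives a rough idea of what would need to be captured. The sketch below uses the generator from the original report and only documented attributes (`gi_frame`, `f_locals`, `f_lasti`, `f_lineno`) plus `inspect.getgeneratorstate`. Note that in CPython the frame also holds an evaluation stack (here, the `range` iterator driving the loop) that does not appear in `f_locals`.

```python
import inspect


def g():
    for i in range(4):
        yield i


live_generator = g()
next(live_generator)
next(live_generator)

frame = live_generator.gi_frame                     # the paused frame object
state = inspect.getgeneratorstate(live_generator)   # suspended at a yield
local_i = frame.f_locals["i"]                       # current value of the loop variable
offset = frame.f_lasti                              # bytecode offset; version-specific
line = frame.f_lineno                               # source line where execution paused

print(state, local_i, offset, line)

# Once the generator finishes, the frame is released.
list(live_generator)
print(live_generator.gi_frame)
```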
msg205598 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013-12-08 19:37
The issue here is that copy.deepcopy will raise an exception whenever it encounters a generator. We would like to do better here. Unfortunately, using itertools.tee is not a solution here because it does not preserve the type of the object.
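
A short sketch of what itertools.tee does with the paused generator from the original report may make the trade-off concrete. It uses only the standard library. The two results yield the same remaining values, but they are tee iterators, not generators, and the original generator should no longer be advanced directly once it has been passed to tee.

```python
import itertools
import types


def g():
    for i in range(4):
        yield i


live_generator = g()
next(live_generator)
next(live_generator)

# Fork the paused generator into two independent iterators.
fork_a, fork_b = itertools.tee(live_generator)

# The forks are tee objects, so the generator type is not preserved.
is_generator = isinstance(fork_a, types.GeneratorType)

# Consuming fork_a pulls values from the source and buffers them for fork_b.
first = list(fork_a)
second = list(fork_b)
print(is_generator, first, second)
```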
msg205599 - (view) Author: Ram Rachum (cool-RR) * Date: 2013-12-08 19:38
"Instead of copy.deepcopy, why not call itertools.tee?"

It's hard for me to give you a good answer because I submitted this ticket 2 years ago, and nowadays I don't have a personal interest in it anymore.

But I think `itertools.tee` wouldn't have worked for me, because it just saves the values rather than really duplicating the generator.
msg205600 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-08 19:59
> The issue here is copy.deepcopy will raise an exception whenever it
> encounters a generator. We would like to do better here.
> Unfortunately, using itertools.tee is not a solution here because it
> does not preserve the type of the object.

Indeed, itertools.tee is not a general solution for copy.deepcopy, but
it's a good solution to *avoid* calling copy.deepcopy when you simply
want to "fork" a generator.

IMHO supporting live generators (and therefore frame objects) in
copy.deepcopy would be a waste of effort.
msg205601 - (view) Author: Bastien Montagne (mont29) Date: 2013-12-08 20:23
Yes, itertools.tee just keeps in memory the elements produced by the "most advanced" iterator until the "least advanced" iterator consumes them. It may not be a big issue in most cases, but I can assure you that when you have to iterate several times over a million vertices, this is not a good solution… ;)

Fortunately, in this case I can just create the same generator several times, but it would still be nicer (at least on the “beauty of the code” aspect) if there were a way to really duplicate generators.

Unless I misunderstood things, and deepcopying a generator would also imply copying its whole source of data?
msg205602 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-08 20:29
> Unless I misunderstood things, and deepcopying a generator would imply
> to also copy its whole source of data?

deepcopying is "deep", and so would have to recursively deepcopy the
generator's local variables...
msg288141 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-19 15:01
I concur with Antoine.

copy.deepcopy() should be used with care, since it recursively copies all referenced data. In the case of generators, the data can be referenced implicitly: every global value cached in a local variable, every passed argument, and every nonlocal variable would have to be copied. This may not just waste memory and CPU time, but also change the semantics.
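
The change in semantics can be sketched with a hand-written iterator standing in for a paused generator, since CPython cannot deepcopy a real one. `Walker` is a hypothetical class introduced here for illustration. The original iterator keeps observing the shared list, while the deep copy silently works on its own private copy of it.

```python
import copy


class Walker:
    """Hypothetical stand-in for a paused generator iterating over shared data."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos >= len(self.data):
            raise StopIteration
        value = self.data[self.pos]
        self.pos += 1
        return value


shared = [0, 1, 2]
walker = Walker(shared)
next(walker)

clone = copy.deepcopy(walker)    # also copies the list that walker refers to
shares_data = clone.data is shared

shared.append(3)                 # visible to the original, not to the clone
original_rest = list(walker)
clone_rest = list(clone)
print(shares_data, original_rest, clone_rest)
```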
History
Date User Action Args
2017-02-19 15:01:18 serhiy.storchaka set status: open -> closed
nosy: + serhiy.storchaka
messages: + msg288141
resolution: rejected
stage: needs patch -> resolved
2013-12-08 20:29:08 pitrou set messages: + msg205602
2013-12-08 20:23:26 mont29 set messages: + msg205601
2013-12-08 19:59:02 pitrou set messages: + msg205600
2013-12-08 19:38:29 cool-RR set messages: + msg205599
2013-12-08 19:37:40 alexandre.vassalotti set messages: + msg205598
2013-12-08 19:25:52 pitrou set messages: + msg205596
2013-12-08 14:46:07 mont29 set nosy: + mont29
2013-12-01 21:34:23 alexandre.vassalotti set title: Allow deepcopying and pickling paused generators -> Allow deepcopying paused generators
stage: test needed -> needs patch
messages: + msg204968
versions: + Python 3.5, - Python 3.3
2012-12-06 01:38:11 bruno.dupuis set nosy: + bruno.dupuis
2011-02-26 17:28:41 terry.reedy set nosy: terry.reedy, pitrou, alexandre.vassalotti, cool-RR
messages: + msg129576
2011-02-26 16:45:21 cool-RR set nosy: terry.reedy, pitrou, alexandre.vassalotti, cool-RR
messages: + msg129572
2011-02-26 16:03:21 alexandre.vassalotti set nosy: terry.reedy, pitrou, alexandre.vassalotti, cool-RR
messages: + msg129565
2011-02-26 09:59:18 cool-RR set nosy: terry.reedy, pitrou, alexandre.vassalotti, cool-RR
messages: + msg129516
2011-02-25 22:02:08 alexandre.vassalotti set nosy: terry.reedy, pitrou, alexandre.vassalotti, cool-RR
messages: + msg129442
2011-02-25 21:54:36 cool-RR set files: + test_live_generator.py
nosy: terry.reedy, pitrou, alexandre.vassalotti, cool-RR
messages: + msg129438
2011-02-25 20:31:29 terry.reedy set nosy: + alexandre.vassalotti, terry.reedy, pitrou
messages: + msg129419
stage: test needed
2011-02-23 16:21:52 cool-RR set messages: + msg129214
2011-02-23 16:15:38 cool-RR create