
Author mstefanro
Recipients Arfrever, alexandre.vassalotti, asvetlov, mstefanro, neologix, pitrou, rhettinger, serhiy.storchaka
Date 2013-05-11.00:09:06
Message-id <518D8C1C.5080209@gmail.com>
In-reply-to <1368218790.01.0.562055588211.issue17810@psf.upfronthosting.co.za>
Content
Hello. I worked on implementing PEP 3154 as part of GSoC 2012.
My work is available in a repo at [1].
The blog I used to report my progress is at [2] and contains some useful
information.

Here is a list of features that were implemented as part of GSoC:

* Pickling of very large bytes and strings
* Better pickling of small strings and bytes (+ tests)
* Native pickling of sets and frozensets (+ tests)
* Self-referential sets and frozensets (+ tests)
* Implicit memoization (BINPUT is implicit for certain opcodes)
   - The argument against this was that pickletools.optimize would
     not be able to prevent memoization of objects that are not
     referred to later. For such situations, a special flag could be
     added at the beginning, indicating whether implicit BINPUT is
     enabled. This flag could be one of the higher-order bits of the
     protocol version. For instance:
         PROTO \x04 + BINUNICODE ".."
         and
         PROTO \x84 + BINUNICODE ".." + BINPUT 1
     would be equivalent. Then pickletools.optimize could choose whether
     it wants implicit BINPUT or not. Sure, this would complicate matters,
     and it's not for me to decide whether it's worth it.
     In my midterm report at [3] there are some examples of what a
     pickled string looks like in v4 without implicit memoization, and
     some size comparisons to v3.
* Pickling of nested globals, methods etc. (+ tests)
* Pickling calls to __new__ with keyword args (+ tests)
* A BAIL_OUT opcode was always emitted when pickling failed, so that
   the Pickler and Unpickler can both be run at once on different ends
   of a stream. The Pickler could guarantee that it always sends a
   correct pickle on the stream. The Unpickler would never end up hanging
   when pickling failed mid-work.
   -  At the time, Alexandre suggested this would probably not be a great
      idea because it should be the responsibility of the protocol used
      to ensure some consistency. However, this does not appear to be
      a trivial task to achieve. The size of the pickle is not known in
      advance, and waiting for the Pickler to complete before sending
      the data via the stream is not as efficient, because the Unpickler
      would not be able to run at the same time.
      The write and read methods of the stream would have to be wrapped
      and some escape sequence used. This would probably increase the
      size of the pickled string in the worst case of the escape
      sequence. My thought was that it would be beneficial for the
      average user to have the guarantee that the Pickler always outputs
      a correct pickle to a stream, even if it raises an exception.
* Other minor changes that I can't really remember.
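
To make the memoization point in the list above concrete, here is a minimal
sketch of what pickletools.optimize does today with explicit BINPUT opcodes.
It uses protocol 3 from the standard library, not the GSoC branch; the
implicit-BINPUT behaviour and the \x84 flag discussed above are not
implemented here. The helper count_ops and the sample data are made up for
illustration.

```python
import pickle
import pickletools

# One string object referenced twice, plus a string referenced only once.
shared = "spam" * 4
data = [shared, shared, "used once"]

# Protocol 3 writes an explicit BINPUT after every memoized object.
raw = pickle.dumps(data, protocol=3)

# optimize() drops the BINPUT opcodes whose memo slot is never fetched again.
optimized = pickletools.optimize(raw)


def count_ops(pickled, name):
    """Count how many times the opcode called `name` occurs in a pickle."""
    return sum(1 for op, _arg, _pos in pickletools.genops(pickled)
               if op.name == name)


puts_before = count_ops(raw, "BINPUT")
puts_after = count_ops(optimized, "BINPUT")
gets = count_ops(raw, "BINGET")

print("BINPUT before optimize:", puts_before)
print("BINPUT after optimize: ", puts_after)
print("BINGET in the pickle:  ", gets)
print("bytes saved:           ", len(raw) - len(optimized))

# The optimized pickle still restores the same data, sharing included.
restored = pickle.loads(optimized)
```

With implicit memoization there would be no separate BINPUT for optimize to
remove, which is the objection the flag bit above tries to address.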

Although I'm sure Alexandre had good reasons to start the work from
scratch, it would be a shame to waste all this work. The features mentioned
above are working, and although the implementation may not be ideal (I don't
have the CPython experience of a regular dev), I'm sure useful bits can be
extracted from it.
Alexandre suggested that I extract bits and post patches, so I have
attached, for now, support for pickling methods and nested globals (+ tests).
I'm willing to do the same for some or all of the remaining features, should
this be requested and should I have the necessary time to do so.
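
For readers unfamiliar with what "pickling methods and nested globals" means
in practice, here is a small sketch of the behaviour. It runs against
protocol 4 as it ships in the standard library (Python 3.4 and later), not
against the attached patch, so the opcodes involved may differ from the
patch. The classes Outer and Inner are hypothetical names for illustration.

```python
import pickle


class Outer:
    class Inner:
        """A class nested inside another class (a 'nested global')."""

        def __init__(self, value):
            self.value = value

        def double(self):
            return self.value * 2


# A nested class is pickled by reference through its qualified name.
restored_cls = pickle.loads(pickle.dumps(Outer.Inner, protocol=4))

# An instance of a nested class round-trips as well.
restored_obj = pickle.loads(pickle.dumps(Outer.Inner(21), protocol=4))

# A bound method round-trips together with the instance it is bound to.
restored_method = pickle.loads(pickle.dumps(Outer.Inner(21).double, protocol=4))

print(restored_cls is Outer.Inner)
print(restored_obj.value)
print(restored_method())
```

Both the class and the method are stored by name, so the unpickling side needs
to be able to import the same definitions.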

[1] https://bitbucket.org/mstefanro/pickle4/
[2] https://pypickle4.wordpress.com/
[3] https://gist.github.com/mstefanro/3086647
Files
File name Uploaded
methods.patch mstefanro, 2013-05-11.00:09:04