Title: Unexpected error while unpickling lxml.etree.Element object
msg327054 - (view) Author: Marcin Raczyński (marc1nr) Date: 2018-10-04 15:13
If we use pickle.HIGHEST_PROTOCOL, we can pickle an lxml.etree.Element object, but unpickling it gives a misleading error:

>>> from lxml import etree
>>> import pickle
>>> import sys
>>> sys.version
'2.7.15rc1 (default, Apr 15 2018, 21:51:34) \n[GCC 7.3.0]'
>>> etree.__version__
>>> pickled = pickle.dumps(etree.Element('x'), protocol=pickle.HIGHEST_PROTOCOL)
>>> pickle.loads(pickled)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "src/lxml/etree.pyx", line 1131, in lxml.etree._Element.__repr__
  File "src/lxml/etree.pyx", line 981, in lxml.etree._Element.tag.__get__
  File "src/lxml/apihelpers.pxi", line 19, in lxml.etree._assertValidNode
AssertionError: invalid Element proxy at 140260172089392

See also:
msg327060 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-04 15:56
This is a tracker for bugs in the C implementation of Python. lxml is not part of the Python standard library. Use the corresponding bug trackers to report bugs in third-party packages.
msg327114 - (view) Author: Marcin Raczyński (marc1nr) Date: 2018-10-05 09:19
How can you be sure that the bug is in lxml rather than in the CPython implementation of the pickle module?
msg327116 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-05 09:36
lxml.etree classes don't implement any of the pickling-related methods: __reduce__, __reduce_ex__, __getstate__, __setstate__, __getnewargs__, __getnewargs_ex__. But they are extension classes whose state is invisible to Python. In this case they are pickled as if they were empty classes, which leads to an unexpected error while unpickling. Python 3 detects such cases and raises an exception while pickling. This change was not backported to 2.7 for compatibility reasons.
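
As a rough illustration of the Python 3 behaviour (using a standard-library extension type as a stand-in for lxml, an assumption made only so the snippet needs no third-party package), pickling an object whose state lives in C fails at pickling time instead of producing a broken object later. The helper name try_pickle is hypothetical:

```python
import pickle
import threading


def try_pickle(obj):
    """Return the name of the exception raised while pickling obj, or None."""
    try:
        pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    except Exception as exc:  # report whichever exception is raised
        return type(exc).__name__
    return None


# threading.Lock() is a C-level object with no pickle support; it stands in
# here for an lxml element. Python 3 rejects it when dumps() is called.
lock_result = try_pickle(threading.Lock())
print(lock_result)
```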

The only way to fix this issue in 2.7 is to implement pickle-related methods (e.g. __getstate__ or __reduce__) in the lxml.etree classes. They should either raise an exception, preventing these objects from being pickled, or implement real pickling support.
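
A minimal sketch of those two options, using plain Python classes as hypothetical stand-ins (UnpicklableNode, PicklableNode and dumps_fails are illustrative names, not lxml APIs; a real fix would live in lxml's Cython sources):

```python
import pickle


class UnpicklableNode(object):
    """Hypothetical stand-in for an extension class. Option 1: refuse pickling."""

    def __init__(self, tag):
        self.tag = tag

    def __reduce__(self):
        # Fail when dumps() is called, with a clear message.
        raise TypeError("cannot pickle %s objects" % type(self).__name__)


class PicklableNode(object):
    """Hypothetical stand-in. Option 2: support pickling explicitly."""

    def __init__(self, tag):
        self.tag = tag

    def __reduce__(self):
        # Tell pickle to rebuild the object by calling the class with its tag.
        return (PicklableNode, (self.tag,))


def dumps_fails(obj):
    """Return True if pickling obj raises TypeError."""
    try:
        pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    except TypeError:
        return True
    return False


# Round-trip the picklable variant through the highest protocol.
data = pickle.dumps(PicklableNode("x"), protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(data)
```

With either option the failure, if any, is reported by pickle.dumps() rather than surfacing later as an invalid object.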
msg327120 - (view) Author: Marcin Raczyński (marc1nr) Date: 2018-10-05 10:24
Thanks, Serhiy, for the explanation!

I have quoted your comment in the lxml issue tracker:
