
Title: Unexpected error while unpickling lxml.etree.Element object
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: marc1nr, scoder, serhiy.storchaka
Priority: normal Keywords:

Created on 2018-10-04 15:13 by marc1nr, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg327054 - Author: Marcin Raczyński (marc1nr) Date: 2018-10-04 15:13
With pickle.HIGHEST_PROTOCOL we can pickle an lxml.etree.Element object, but unpickling it gives a misleading error message:

>>> from lxml import etree
>>> import pickle
>>> import sys
>>> sys.version
'2.7.15rc1 (default, Apr 15 2018, 21:51:34) \n[GCC 7.3.0]'
>>> etree.__version__
>>> pickled = pickle.dumps(etree.Element('x'), protocol=pickle.HIGHEST_PROTOCOL)
>>> pickle.loads(pickled)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "src/lxml/etree.pyx", line 1131, in lxml.etree._Element.__repr__
  File "src/lxml/etree.pyx", line 981, in lxml.etree._Element.tag.__get__
  File "src/lxml/apihelpers.pxi", line 19, in lxml.etree._assertValidNode
AssertionError: invalid Element proxy at 140260172089392

See also:
msg327060 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-04 15:56
This is a tracker for bugs in the C implementation of Python. lxml is not part of the Python standard library. Use the corresponding bug trackers to report bugs in third-party packages.
msg327114 - Author: Marcin Raczyński (marc1nr) Date: 2018-10-05 09:19
How can you be sure that the bug is in lxml and not in the CPython implementation of the pickle module?
msg327116 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-05 09:36
lxml.etree classes don't implement any of the pickle-related methods: __reduce__, __reduce_ex__, __getstate__, __setstate__, __getnewargs__, __getnewargs_ex__. But they are extension classes that contain state invisible to Python. As a result, they are pickled as if they were empty classes, which leads to an unexpected error while unpickling. Python 3 detects such cases and raises an exception while pickling. This change was not backported to 2.7 for compatibility reasons.
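
For illustration only, here is a minimal sketch of what "raising while pickling" looks like on Python 3. It uses threading.Lock from the standard library as a stand-in for an extension type with hidden state, since lxml may not be installed; whether lxml objects fail in exactly the same way is an assumption. The helper name pickling_error is hypothetical.

```python
import pickle
import threading


def pickling_error(obj):
    """Return the exception raised while pickling obj, or None on success.

    Hypothetical helper, written for Python 3.
    """
    try:
        pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    except Exception as exc:  # report whatever the pickler raised
        return exc
    return None


# A lock is a C-level object that offers no pickle support, so Python 3
# refuses it at pickling time instead of writing an empty object.
lock_error = pickling_error(threading.Lock())

# A plain dict pickles without complaint.
dict_error = pickling_error({"tag": "x"})
```

On Python 3, lock_error should be a TypeError and dict_error should be None, so the failure surfaces at dumps() rather than later, when the restored object is first used.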

The only way to fix this issue in 2.7 is to implement pickle-related methods (e.g. __getstate__ or __reduce__) in the lxml.etree classes. They should either raise an exception, which prevents these objects from being pickled, or implement pickling support.
msg327120 - Author: Marcin Raczyński (marc1nr) Date: 2018-10-05 10:24
Thanks, Serhiy, for the explanation!

I quoted your comment in the lxml issue tracker:
History
Date                 User              Action  Args
2022-04-11 14:59:06  admin             set     github: 79075
2018-10-05 10:24:43  marc1nr           set     status: open -> closed; resolution: wont fix; messages: + msg327120
2018-10-05 09:37:24  serhiy.storchaka  set     nosy: + scoder
2018-10-05 09:36:56  serhiy.storchaka  set     messages: + msg327116
2018-10-05 09:19:29  marc1nr           set     status: closed -> open; resolution: third party -> (no value); messages: + msg327114
2018-10-04 15:56:02  serhiy.storchaka  set     status: open -> closed; nosy: + serhiy.storchaka; messages: + msg327060; resolution: third party; stage: resolved
2018-10-04 15:13:53  marc1nr           create