classification
Title: Memory corruption using pickle over pipe to subprocess
Type: behavior
Stage: test needed
Components: Library (Lib)
Versions: Python 3.4, Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: alexandre.vassalotti, nagle, pitrou, serhiy.storchaka
Priority: normal
Keywords:

Created on 2015-03-13 06:47 by nagle, last changed 2022-04-11 14:58 by admin.

Messages (8)
msg238009 - (view) Author: John Nagle (nagle) Date: 2015-03-13 06:47
I'm porting a large, working system from Python 2 to Python 3, using "six", so the same code works with both. One part of the system works a lot like the multiprocessing module, but predates it. It launches child processes with "Popen" and talks to them using "pickle" over stdin/stdout as pipes.  Works fine under Python 2, and has been working in production for years.

Under Python 3, I'm getting errors that indicate memory corruption:

Fatal Python error: GC object already tracked

Current thread 0x00001a14 (most recent call first):
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 411 in description
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 1248 in _get_descriptions
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 1182 in _read_result_packet
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 1132 in read
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 929 in _read_query_result
  File "C:\python34\lib\site-packages\pymysql\connections.py", line 768 in query
  File "C:\python34\lib\site-packages\pymysql\cursors.py", line 282 in _query
  File "C:\python34\lib\site-packages\pymysql\cursors.py", line 134 in execute
  File "C:\projects\sitetruth\domaincacheitem.py", line 128 in select
  File "C:\projects\sitetruth\domaincache.py", line 30 in search
  File "C:\projects\sitetruth\ratesite.py", line 31 in ratedomain
  File "C:\projects\sitetruth\RatingProcess.py", line 68 in call
  File "C:\projects\sitetruth\subprocesscall.py", line 140 in docall
  File "C:\projects\sitetruth\subprocesscall.py", line 158 in run
  File "C:\projects\sitetruth\RatingProcess.py", line 89 in main
  File "C:\projects\sitetruth\RatingProcess.py", line 95 in <module>

That's clear memory corruption.

Also,

  File "C:\projects\sitetruth\InfoSiteRating.py", line 200, in scansite
    if len(self.badbusinessinfo) > 0 :                  # if bad stuff
NameError: name 'len' is not defined

There are others, but those two should be impossible to cause from Python source. 

I've done the obvious stuff - deleted all .pyc files and Python cache directories.  All my code is in Python. Every library module came in via "pip", into a clean Python 3.4.3 (32 bit) installation on Win7/x86-64.

Currently installed packages (via "pip list")

beautifulsoup4 (4.3.2)
dnspython3 (1.12.0)
html5lib (0.999)
pip (6.0.8)
PyMySQL (0.6.6)
pyparsing (2.0.3)
setuptools (12.0.5)
six (1.9.0)

Nothing exotic there.  The project has zero local C code; any C code came 
from the Python installation or the above packages, most of which are pure Python.

It all works fine with Python 2.7.9.  Everything else in the program seems
to be working fine under both 2.7.9 and 3.4.3, until subprocesses are involved.

What's being pickled is very simple; no custom objects, although Exception types are sometimes pickled if the subprocess raises an exception.  

Pickler and Unpickler instances are being reused here.  A message is pickled, piped to the subprocess, unpickled, work is done, and a response comes back later via the return pipe.  A send looks like:

    self.writer.dump(args)      # send data
    self.dataout.flush()        # finish output
    self.writer.clear_memo()    # no memory from cycle to cycle

and a receive looks like:

    result = self.reader.load() # read and return from child
    self.reader.memo = {}       # no memory from cycle to cycle

Those were the recommended way to reset "pickle" for new traffic years ago.
(You have to clear the receive side as well as the send side, or the dictionary
of saved objects grows forever.) My guess is that there's something about reusing "pickle" instances that botches memory use in CPython 3's C code
for "cpickle".  That should work, though; the "multiprocessing" module works
by sending pickled data over pipes.
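
The send/receive cycle described above can be sketched as follows. This is a minimal illustration, not the reporter's code: the class name `PickleChannel`, the function `loopback`, and the `os.pipe()` loopback are assumptions made for the example. The send side reuses one Pickler and clears its memo, as in the snippet above; the receive side deliberately calls `pickle.load()` for each message instead of resetting `Unpickler.memo`, so every message starts with an empty memo.

```python
import os
import pickle


class PickleChannel:
    """Hypothetical helper: one pickled message per send()/receive() call."""

    def __init__(self, datain, dataout, protocol=2):
        self.datain = datain            # binary file object to read from
        self.dataout = dataout          # binary file object to write to
        self.writer = pickle.Pickler(dataout, protocol)  # reused across sends

    def send(self, obj):
        self.writer.dump(obj)           # send data
        self.dataout.flush()            # push it through the pipe now
        self.writer.clear_memo()        # no memory from cycle to cycle

    def receive(self):
        # A fresh unpickler per message, so no memo state carries over.
        return pickle.load(self.datain)


def loopback(messages):
    """Send each message through an OS pipe and read it back (same process)."""
    read_fd, write_fd = os.pipe()
    with os.fdopen(read_fd, "rb") as datain, os.fdopen(write_fd, "wb") as dataout:
        channel = PickleChannel(datain, dataout)
        received = []
        for message in messages:
            channel.send(message)
            received.append(channel.receive())
    return received
```

Creating an unpickler per message costs a little more than reusing one, but it removes the receive-side reset from the picture, which may help when narrowing down where a problem lies.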

The only code difference between Python 2 and 3 is that under Python 3 I have to use "sys.stdin.buffer" and "sys.stdout.buffer" as arguments to Pickler and Unpickler. Otherwise they complain that they're getting type "str".
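
A minimal sketch of that arrangement, assuming a trivial echo child rather than the real worker: the child source string `CHILD_SOURCE` and the function `echo_through_child` are hypothetical names invented for this example. The child reads and writes through `sys.stdin.buffer` and `sys.stdout.buffer` so that pickle sees bytes rather than `str`.

```python
import pickle
import subprocess
import sys

# Source of a hypothetical child process: it echoes each request back.
CHILD_SOURCE = r"""
import pickle, sys
datain, dataout = sys.stdin.buffer, sys.stdout.buffer   # binary layers
while True:
    try:
        request = pickle.load(datain)
    except EOFError:                 # parent closed its end of the pipe
        break
    pickle.dump(("echo", request), dataout, 2)
    dataout.flush()
"""


def echo_through_child(messages):
    """Send each message to the child and collect one reply per message."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD_SOURCE],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    replies = []
    try:
        for message in messages:
            pickle.dump(message, proc.stdin, 2)
            proc.stdin.flush()       # make sure the child sees the request
            replies.append(pickle.load(proc.stdout))
    finally:
        proc.stdin.close()           # EOF tells the child to exit
        proc.wait(timeout=30)
        proc.stdout.close()
    return replies
```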

Unfortunately, I don't have an easy way to reproduce this bug yet. 

Is there some way to force the use of the pure Python pickle module under Python 3? That would help isolate the problem.

				John Nagle
msg238012 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-13 07:27
import sys
sys.modules['_pickle'] = None   # block the C accelerator module
del sys.modules['pickle']       # if exists
import pickle                   # now falls back to the pure Python classes

Or just use pickle._Pickler instead of pickle.Pickler, and the like (implementation detail!).
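
The first approach can be wrapped so that it does not disturb the rest of the program. This is only a sketch under stated assumptions: the function name `import_pure_python_pickle` and the `_MISSING` sentinel are invented here, and the technique relies on `pickle.py` falling back to its Python classes when importing `_pickle` fails, which is an implementation detail of CPython.

```python
import importlib
import sys

_MISSING = object()   # sentinel: "this key was not in sys.modules"


def import_pure_python_pickle():
    """Return a private copy of pickle imported with _pickle blocked."""
    saved_c = sys.modules.get("_pickle", _MISSING)
    saved_pickle = sys.modules.pop("pickle", _MISSING)
    sys.modules["_pickle"] = None          # makes "from _pickle import ..." fail
    try:
        pure = importlib.import_module("pickle")
    finally:
        # Put sys.modules back the way it was found.
        if saved_c is _MISSING:
            del sys.modules["_pickle"]
        else:
            sys.modules["_pickle"] = saved_c
        if saved_pickle is _MISSING:
            sys.modules.pop("pickle", None)
        else:
            sys.modules["pickle"] = saved_pickle
    return pure
```

The returned module is a separate module object, so its exception classes (for example `UnpicklingError`) are not the same objects as those of the normally imported `pickle`.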
msg238049 - (view) Author: John Nagle (nagle) Date: 2015-03-13 19:48
> Or just use pickle._Pickler instead of pickle.Pickler, and the like
> (implementation detail!).

Tried that.  Changed my own code as follows:

25a26
> 
71,72c72,73
<         self.reader = pickle.Unpickler(self.proc.stdout)    # set up reader
<         self.writer = pickle.Pickler(self.proc.stdin,kpickleprotocolversion)
---
>         self.reader = pickle._Unpickler(self.proc.stdout)    # set up reader
>         self.writer = pickle._Pickler(self.proc.stdin,kpickleprotocolversion)
125,126c126,127
<         self.reader = pickle.Unpickler(self.datain)     # set up reader
<         self.writer = pickle.Pickler(self.dataout,kpickleprotocolversion)   
---
>         self.reader = pickle._Unpickler(self.datain)     # set up reader
>         self.writer = pickle._Pickler(self.dataout,kpickleprotocolversion)  

Program runs after those changes.

So it looks like CPickle has a serious memory corruption problem.
msg238053 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-13 20:42
Could you please try to minimize your data and reproduce the issue without third-party modules, if possible?
msg238055 - (view) Author: John Nagle (nagle) Date: 2015-03-13 21:30
"minimize you data" - that's a big job here. Where are the tests for "pickle"?  Is there one that talks to a subprocess over a pipe? Maybe I can adapt that.
msg238075 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-14 08:54
No, there are no subprocess-specific tests for pickle. The pickle tests are in Lib/test/pickletester.py and Lib/test/test_pickle.py.

First, try dumping the pickled data to a file and then loading it in another process. Does it still fail?
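
One way to set up such a check is sketched below. The function names `capture`, `replay`, and `round_trip_via_file` are assumptions made for this example, and the replay here runs in the same process for brevity; `replay` could equally be called from a second interpreter on the captured file.

```python
import os
import pickle
import tempfile


def capture(messages, path, protocol=2):
    """Write each message to the file as its own pickle, back to back."""
    with open(path, "wb") as f:
        for message in messages:
            pickle.dump(message, f, protocol)


def replay(path):
    """Load pickles from the file until it is exhausted."""
    loaded = []
    with open(path, "rb") as f:
        while True:
            try:
                loaded.append(pickle.load(f))
            except EOFError:        # no more pickles in the file
                break
    return loaded


def round_trip_via_file(messages):
    """Capture to a temporary file, replay it, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".pickles")
    os.close(fd)
    try:
        capture(messages, path)
        return replay(path)
    finally:
        os.remove(path)
```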
msg238158 - (view) Author: John Nagle (nagle) Date: 2015-03-15 20:17
More info: the problem is on the "unpickle" side.  If I use _Unpickler and Pickler, so the unpickle side is in Python but the pickle side is in C, there is no problem. If I use Unpickler and _Pickler, so the unpickle side is in C, it crashes.
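
The same comparison can be made on a single byte string, which may help when looking for a small reproducer. This is a sketch, not a diagnosis: the helper names `load_with` and `compare_unpicklers` are hypothetical, and `pickle._Unpickler` is an implementation detail, as noted earlier in the thread.

```python
import io
import pickle


def load_with(unpickler_class, data):
    """Unpickle one object from bytes using the given Unpickler class."""
    return unpickler_class(io.BytesIO(data)).load()


def compare_unpicklers(obj, protocol=2):
    """Pickle obj once, then load it with the default and the Python unpickler."""
    data = pickle.dumps(obj, protocol)
    default_result = load_with(pickle.Unpickler, data)    # C version when available
    python_result = load_with(pickle._Unpickler, data)    # pure Python version
    return default_result, python_result
```

If both results agree for captured production messages, the data alone is probably not enough to trigger the failure, and the way the unpickler is reused becomes the more likely place to look.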
msg288160 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-19 19:31
Without additional information we can't solve this issue. Can the problem still be reproduced?
History
Date                 User              Action  Args
2022-04-11 14:58:13  admin             set     status: pending -> open; github: 67843
2017-02-19 19:31:51  serhiy.storchaka  set     status: open -> pending
2017-02-19 19:31:43  serhiy.storchaka  set     status: pending -> open; messages: + msg288160
2017-02-19 19:29:47  serhiy.storchaka  set     status: open -> pending
2015-07-21 07:22:27  ethan.furman      set     nosy: - ethan.furman
2015-03-18 16:51:06  ethan.furman      set     nosy: + ethan.furman
2015-03-15 20:17:37  nagle             set     messages: + msg238158
2015-03-14 08:54:53  serhiy.storchaka  set     messages: + msg238075
2015-03-13 21:30:32  nagle             set     messages: + msg238055
2015-03-13 20:42:42  serhiy.storchaka  set     messages: + msg238053
2015-03-13 19:48:31  nagle             set     messages: + msg238049
2015-03-13 07:27:14  serhiy.storchaka  set     versions: + Python 3.5; nosy: + alexandre.vassalotti, serhiy.storchaka, pitrou; messages: + msg238012; type: behavior; stage: test needed
2015-03-13 06:47:14  nagle             create