Author buhtz
Recipients buhtz
Date 2021-08-12.13:33:01
Message-id <1628775182.23.0.332446987158.issue44901@roundup.psfhosted.org>
Content
I read some of the PEPs about pickling, but I would not say that I understood everything.

Of course I checked the documentation for multiprocessing.Queue. It is currently not clear to me which pickle protocol multiprocessing.Queue uses.
Maybe I missed something in the documentation, or the documentation can be improved?

 - Is there a fixed default, maybe one that differs between Python versions?
 - Or is the pickle protocol version selected dynamically, depending on the kind/type/size of the data put() into the Queue?

Is there a way to find out at runtime which protocol version is used for a specific Queue instance with a specific piece of data?
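
One way to check this at runtime is to serialize an object the same way the Queue does and read the protocol number from the pickle header. The following is a minimal sketch, not an official API for this purpose. It assumes that Queue.put() pickles through multiprocessing.reduction.ForkingPickler without passing an explicit protocol, which is how the CPython source appears to work but is an implementation detail that may change. The helper protocol_of() and the sample payload are hypothetical names used only for illustration.

```python
import pickle
from multiprocessing.reduction import ForkingPickler


def protocol_of(payload):
    """Return the protocol number recorded in a pickle's header, or None."""
    head = bytes(payload[:2])
    # Pickles written with protocol 2 or newer start with the PROTO opcode
    # (0x80) followed by one byte holding the protocol number.
    if len(head) == 2 and head[0] == 0x80:
        return head[1]
    return None


# Hypothetical stand-in for the data that would be put() into the Queue.
sample = {"rows": list(range(10)), "name": "piece-0"}

# Assumption: this mirrors what Queue.put() does internally in CPython.
payload = ForkingPickler.dumps(sample)

used_protocol = protocol_of(payload)
print("protocol in payload:    ", used_protocol)
print("pickle.DEFAULT_PROTOCOL:", pickle.DEFAULT_PROTOCOL)
```

If the assumption holds, the protocol is a fixed per-interpreter default (pickle.DEFAULT_PROTOCOL is 3 on Python 3.7 and 4 on Python 3.8 and later) and does not depend on the kind or size of the data.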

Background:
I use Python 3.7 and 3.9 with Pandas 1.3.5.
I parallelize work with huge pandas.DataFrame objects. I simply cut them into pieces (along the row axis), and the number of pieces is limited to the machine's CPU core count (minus 1). The cutting happens several times in my scripts because
for some things I need the data as one complete DataFrame.
Just as an example, here is one such piece, which is given to a worker as an argument and sent back via a Queue (7 workers!):

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 226687 entries, 0 to 226686
Data columns (total 38 columns):
 #   Column              Non-Null Count   Dtype
---  ------              --------------   -----
 0   HASH_ ....
 ....
 37  NAME_ORG            226687 non-null  object
dtypes: datetime64[ns](6), float64(1), int64(1), object(30)
memory usage: 65.7+ MB 

I am a bit "scared" that Python is wasting my CPU time by doing some compression on that data. ;) I just want to get a better idea of what is done in the background.
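
The compression concern can be checked empirically. The pickle format itself does not compress data; the pickle documentation suggests compressing pickled output separately if size matters. The sketch below pickles a highly compressible byte string (a hypothetical stand-in, not real DataFrame data) and compares sizes. It only demonstrates the pickle format and makes no claim about copying or other overhead inside multiprocessing.Queue.

```python
import pickle
import zlib

# Highly compressible stand-in data (hypothetical, not a real DataFrame).
raw = b"A" * 1_000_000

pickled = pickle.dumps(raw, protocol=pickle.DEFAULT_PROTOCOL)

# Explicit compression, applied only here for comparison.
compressed = zlib.compress(pickled)

print("raw bytes:      ", len(raw))
print("pickled bytes:  ", len(pickled))
print("zlib on pickled:", len(compressed))
```

The pickled size comes out slightly larger than the raw size, while the explicitly compressed copy is far smaller, which indicates that pickling alone spends no CPU time on compression.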