This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

classification
Title: Python takes a long time when returning big data
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: HumberMe, mark.dickinson
Priority: normal Keywords:

Created on 2022-03-10 09:40 by HumberMe, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
python_performance_issue.py HumberMe, 2022-03-10 09:43 reproducer
Messages (5)
msg414836 - (view) Author: Hu Di (HumberMe) * Date: 2022-03-10 09:43
It takes a long time when Python returns big data.
Generally, when a function returns something, it takes less than 1e-5 seconds,
but when the result is big, like np.random.rand(2048, 3, 224, 224), the time cost increases to 0.1-0.2 seconds.
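The reported timing pattern can be sketched with a stdlib-only reproducer. This is a hypothetical sketch, not the attached python_performance_issue.py: a large list stands in for the NumPy array, and the function name is invented for illustration.

```python
import time

def make_big_result(n=2_000_000):
    # Stand-in for np.random.rand(2048, 3, 224, 224): a large container
    # whose eventual deallocation is not free.
    return [0.0] * n

result = None
for i in range(3):
    t0 = time.perf_counter()
    # From the second iteration on, this rebinding also drops the
    # previous list, so its deallocation cost lands inside the timing.
    result = make_big_result()
    elapsed = time.perf_counter() - t0
    print(f"iteration {i}: {elapsed:.4f}s")
```

On a typical run the first iteration (nothing to free yet) is noticeably faster than the later ones, which is the effect discussed below.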
msg414842 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2022-03-10 11:28
This is expected. Your timing measures the time for garbage collection of the large arrays in addition to the time for the result to be returned.

In the line `result = myfunc()`, the name `result` gets rebound to the value of `myfunc()`. That means that `result` is unbound from whatever it was previously bound to, and the old value then gets garbage collected.

You can test this by adding a "del result" line as the last line inside the "for" loop block.
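That diagnostic can be sketched as follows (hypothetical names, and a large list standing in for the NumPy array): an explicit `del` at the end of the loop body moves the deallocation to a known point, so it can be timed separately from the call itself.

```python
import time

def make_big_result(n=2_000_000):
    # Stand-in for a large NumPy array.
    return [0.0] * n

for i in range(3):
    result = make_big_result()
    t0 = time.perf_counter()
    del result  # deallocation happens here, not inside the next call's timing
    dealloc = time.perf_counter() - t0
    print(f"iteration {i}: del took {dealloc:.4f}s")
```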
msg414883 - (view) Author: Hu Di (HumberMe) * Date: 2022-03-11 01:03
Thanks for your explanation. By the way, why does it cost so much time to del a large array?
msg414885 - (view) Author: Hu Di (HumberMe) * Date: 2022-03-11 02:40
I am currently processing large data, and the time spent by del is unacceptable. Is there any way to run del in parallel?
msg414923 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2022-03-11 17:26
> why does it cost so much time to del a large array?

That's probably a question for the NumPy folks, or possibly for Stack Overflow or some other question-and-answer resource. It'll depend on how NumPy arrays are de-allocated.

> Is there any way to process del in parallel?

Seems unlikely, given GIL constraints.
History
Date                 User            Action  Args
2022-04-11 14:59:57  admin           set     github: 91127
2022-03-11 17:26:56  mark.dickinson  set     messages: + msg414923
2022-03-11 02:40:00  HumberMe        set     messages: + msg414885
2022-03-11 01:03:01  HumberMe        set     messages: + msg414883
2022-03-10 11:28:55  mark.dickinson  set     status: open -> closed
                                             nosy: + mark.dickinson
                                             messages: + msg414842
                                             resolution: not a bug
                                             stage: resolved
2022-03-10 09:43:06  HumberMe        set     files: + python_performance_issue.py
                                             messages: + msg414836
2022-03-10 09:40:49  HumberMe        create