Title: Python takes a long time when returning big data
Created on 2022-03-10 09:40 by HumberMe, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg414836 - Author: Hu Di (HumberMe) * Date: 2022-03-10 09:43
It takes a long time when Python returns big data.
Generally, when a function returns something, it takes less than 1e-5 seconds,
but when the result is big, like np.random.rand(2048,3,224,224), the time cost increases to 0.1-0.2 seconds.
msg414842 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2022-03-10 11:28
This is expected. Your timing measures the time for garbage collection of the large arrays in addition to the time for the result to be returned.

In the line `result = myfunc()`, the name `result` gets rebound to the value of `myfunc()`. That means that `result` is unbound from whatever it was previously bound to, and the old value then gets garbage collected.
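
The rebinding described above can be observed directly. The following is a minimal sketch, not the reporter's script: `Big`, `events`, and the labels are hypothetical names introduced for illustration, and a `bytearray` stands in for the NumPy array. It relies on CPython's reference counting; other implementations may free the old object later.

```python
import weakref


class Big:
    """Hypothetical stand-in for a large array."""

    def __init__(self, label):
        self.label = label
        self.payload = bytearray(1_000_000)


events = []


def myfunc(label):
    obj = Big(label)
    # Record the moment this object is destroyed.
    weakref.finalize(obj, events.append, f"freed {label}")
    return obj


result = myfunc("first")
events.append("before rebinding")
result = myfunc("second")  # the "first" object loses its last reference here
events.append("after rebinding")

print(events)
# On CPython: ['before rebinding', 'freed first', 'after rebinding']
```

The old object is destroyed as part of the assignment statement itself, so any timer wrapped around `result = myfunc()` also includes that cleanup.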

You can test this by adding a "del result" line as the last line inside the "for" loop block.
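
The suggested test might look roughly like this. It is a sketch under assumptions: the original script is not shown in this thread, so `myfunc`, `time_assignments`, and the buffer size are hypothetical, and a `bytearray` stands in for the NumPy array so that only the standard library is needed.

```python
import time


def myfunc(n_bytes):
    # Hypothetical stand-in for the reporter's function returning a big object.
    return bytearray(n_bytes)


def time_assignments(repeats=5, n_bytes=20_000_000, delete_after=False):
    """Time only the `result = myfunc()` statement on each iteration."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        result = myfunc(n_bytes)  # without `del`, this also frees the old value
        timings.append(time.perf_counter() - start)
        if delete_after:
            del result  # cleanup now happens outside the timed region
    return timings


if __name__ == "__main__":
    print("rebinding only:", time_assignments(delete_after=False))
    print("with del:      ", time_assignments(delete_after=True))
```

With `delete_after=True` the deallocation is moved out of the timed statement, so the two lists can be compared to see how much of the measured time is cleanup. The actual numbers depend on the platform and memory allocator.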
msg414883 - Author: Hu Di (HumberMe) * Date: 2022-03-11 01:03
Thanks for the explanation. By the way, why does it take so much time to del a large array?
msg414885 - Author: Hu Di (HumberMe) * Date: 2022-03-11 02:40
I am currently processing large data, and the time spent by del is unacceptable. Is there any way to process del in parallel?
msg414923 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2022-03-11 17:26
> why does it take so much time to del a large array?

That's probably a question for the NumPy folks, or possibly for Stack Overflow or some other question-and-answer resource. It'll depend on how NumPy arrays are de-allocated.

> Is there any way to process del in parallel?

Seems unlikely, given GIL constraints.
