This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: weakmethod's ref is deleted before weakref's garbage-collect callback is executed
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: fdrake, kratsg, nascheme, pablogsal
Priority: normal Keywords: patch

Created on 2020-09-11 01:32 by kratsg, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 16495 kratsg, 2020-09-14 11:44
Messages (5)
msg376699 - (view) Author: Giordon Stark (kratsg) Date: 2020-09-11 01:32
Hi, this is my first issue, so I hope to try my best to explain the problem. Unfortunately, I cannot get an easy minimum-reproducible-example of this because I can only seem to invoke this behavior using pytest (6.0.1) on two tests of our code. First, let me explain the issue.

AttributeError: 'NoneType' object has no attribute '_alive'
Exception ignored in: <function WeakMethod.__new__.<locals>._cb at 0x1696c2a70>
Traceback (most recent call last):
  File "/Users/kratsg/.virtualenvs/pyhf-dev/lib/python3.7/weakref.py", line 55, in _cb
    if self._alive:

occurs sometimes. Not always (feels like a race condition) and occurs after pytest has finished, which indicates it must be hitting garbage-collect. The backtrace isn't as helpful as it seems to jump from our code straight to the callback (indicating a garbage-collect):

/Users/kratsg/pyhf/tests/test_validation.py(1082)test_optimizer_stitching()
-> pdf = pyhf.simplemodels.hepdata_like([50.0], [100.0], [10])
  /Users/kratsg/pyhf/src/pyhf/simplemodels.py(64)hepdata_like()
-> return Model(spec, batch_size=batch_size)
  /Users/kratsg/pyhf/src/pyhf/pdf.py(590)__init__()
-> config=self.config, batch_size=self.batch_size
  /Users/kratsg/pyhf/src/pyhf/pdf.py(339)__init__()
-> self.config.auxdata_order,
  /Users/kratsg/pyhf/src/pyhf/parameters/paramview.py(66)__init__()
-> self._precompute()
  /Users/kratsg/pyhf/src/pyhf/parameters/paramview.py(78)_precompute()
-> self.allpar_viewer, self.selected_viewer, self.all_indices
  /Users/kratsg/pyhf/src/pyhf/parameters/paramview.py(27)extract_index_access()
-> index_selection = baseviewer.split(indices, selection=subviewer.names)
  /Users/kratsg/pyhf/src/pyhf/tensor/common.py(59)split()
-> data = tensorlib.einsum('...j->j...', tensorlib.astensor(data))
  /Users/kratsg/pyhf/src/pyhf/tensor/numpy_backend.py(268)einsum()
-> return np.einsum(subscripts, *operands)
  <__array_function__ internals>(6)einsum()
> /Users/kratsg/.virtualenvs/pyhf-dev/lib/python3.7/weakref.py(56)_cb()
-> if self._alive:

Essentially, inside weakref.py's _cb(), I tried to figure out what "self" was:

(Pdb) self
(Pdb) !arg
<weakref at 0x16f31c5d0; dead>

and it seems like the evaluation of "self._alive" is doomed to fail as self is None. So meth comes in, we take it apart into obj and func, define an inner function _cb that closes over a weakref to a weakref to obj and registers that function to fire when the underlying object gets gc'd. However, there seems to be an assumption that "self" is not None by the time the callback is fired.

-------------------------------------------------------

Steps to reproduce:

Clone: https://github.com/scikit-hep/pyhf
Set up virtual env/install: python3 -m pip install -e .[complete]
Run pytest: pytest tests/test_validation.py -vvvv -k "test_optimizer_stitching[scipy-numpy or test_optimizer_stitching[minuit-numpy" -s
msg376700 - (view) Author: Giordon Stark (kratsg) Date: 2020-09-11 01:38
This PR seems highly related: https://github.com/python/cpython/pull/18189. Not sure if it should be linked to this issue or not.
msg376870 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-09-14 06:45
Hi Giordon and thanks for your issue.

Could you please check if you can reproduce this behaviour with Python3.8 and with the current master? We have some new
code that may be preventing this to happen:

https://github.com/python/cpython/blob/master/Modules/gcmodule.c#L780

Unfortunately, without a simpler reproducer is difficult to tell what's going on so is not possible for me to say
that this is fixed in current master.
msg376872 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-09-14 06:57
Indeed, I cannot reproduce this with Pyhon3.8:

(vev) ❯ python -m pytest tests/test_validation.py -vvvv -k "test_optimizer_stitching" -v
============================================================================================================= test session starts ==============================================================================================================
platform darwin -- Python 3.8.2, pytest-3.10.1, py-1.9.0, pluggy-0.13.1 -- /Users/pgalindo3/github/python/master/vev/bin/python                                                                                                                 cachedir: .pytest_cache
Matplotlib: 3.3.1
Freetype: 2.6.1                                                                                                                                                                                                                                 benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/pgalindo3/github/python/master/pyhf, inifile: pytest.ini
plugins: mpl-0.11, mock-3.3.1, cov-2.10.1, console-scripts-0.2.0, benchmark-3.2.3                                                                                                                                                               collected 26 items / 18 deselected

tests/test_validation.py::test_optimizer_stitching[scipy-numpy_backend] PASSED                                                                                                                                                           [ 12%] tests/test_validation.py::test_optimizer_stitching[scipy-jax_backend] PASSED                                                                                                                                                             [ 25%]
tests/test_validation.py::test_optimizer_stitching[scipy-tensorflow_backend] PASSED                                                                                                                                                      [ 37%]
tests/test_validation.py::test_optimizer_stitching[scipy-pytorch_backend] PASSED                                                                                                                                                         [ 50%]
tests/test_validation.py::test_optimizer_stitching[minuit-numpy_backend] PASSED                                                                                                                                                          [ 62%]
tests/test_validation.py::test_optimizer_stitching[minuit-jax_backend] PASSED                                                                                                                                                            [ 75%]
tests/test_validation.py::test_optimizer_stitching[minuit-tensorflow_backend] PASSED                                                                                                                                                     [ 87%]
tests/test_validation.py::test_optimizer_stitching[minuit-pytorch_backend] PASSED

============================================================================================= 8 passed, 18 deselected, 50 warnings in 3.29 seconds =============================================================================================
msg376876 - (view) Author: Giordon Stark (kratsg) Date: 2020-09-14 11:44
Hi Pablo,

Thanks for looking at this. Indeed, this seems to be fixed for me on python 3.8 (I'm using docker image python:3.8-slim) and I'm unable to force the issue to come up.

I will go ahead and close this and link the corresponding PR (https://github.com/python/cpython/pull/16495) that fixed things up.
History
Date User Action Args
2022-04-11 14:59:35adminsetgithub: 85923
2020-09-14 11:44:35kratsgsetstatus: open -> closed
messages: + msg376876
pull_requests: + pull_request21295

keywords: + patch
resolution: fixed
stage: patch review
2020-09-14 06:57:19pablogsalsetmessages: + msg376872
2020-09-14 06:45:56pablogsalsetmessages: + msg376870
2020-09-14 06:18:08pablogsalsetnosy: + pablogsal
2020-09-14 06:09:27ned.deilysetcomponents: + Library (Lib), - macOS
2020-09-14 06:08:37ned.deilysetnosy: + fdrake, nascheme, - ronaldoussoren, ned.deily
2020-09-11 01:38:14kratsgsetmessages: + msg376700
2020-09-11 01:32:52kratsgcreate