Title: pyclbr.readmodule_ex traversing "import __main__": dies with ValueError: __main__.__spec__ is None / is not set
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.10
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: andrei.avk, kxrob
Priority: normal Keywords: patch

Created on 2021-02-22 17:48 by kxrob, last changed 2021-07-12 17:42 by andrei.avk.

Pull Requests
URL Status Linked Edit
PR 24623 open kxrob, 2021-02-22 18:46
Messages (7)
msg387520 - (view) Author: Robert (kxrob) * Date: 2021-02-22 17:48
When pyclbr.readmodule_ex() is traversing "import __main__" or another 
module without __spec__, it dies completely 
with "ValueError: __main__.__spec__ is None / is not set".

=> It should at least continue with the (big) rest 
as the comment in _ModuleBrowser.visit_Import() says:
            # If we can't find or parse the imported module,
            # too bad -- don't die here.

And optionally fall back to using __file__ when present?

Traceback (most recent call last):
  File "C:\Python310\Lib\site-packages\pythonwin\pywin\framework\editor\", line 128, in OnActivateView
  File "C:\Python310\Lib\site-packages\pythonwin\pywin\framework\editor\", line 181, in CheckRefreshList
  File "C:\Python310\Lib\site-packages\pythonwin\pywin\framework\editor\", line 173, in CheckMadeList
    self.rootitem = root = self._MakeRoot()
  File "C:\Python310\Lib\site-packages\pythonwin\pywin\framework\editor\", line 153, in _MakeRoot
    data = reader(mod, path and [path])
  File "C:\Python310\lib\", line 120, in readmodule_ex
    return _readmodule(module, path or [])
  File "C:\Python310\lib\", line 159, in _readmodule
    return _readmodule(submodule, parent['__path__'], package)
  File "C:\Python310\lib\", line 184, in _readmodule
    return _create_tree(fullmodule, path, fname, source, tree, inpackage)
  File "C:\Python310\lib\", line 272, in _create_tree
  File "C:\Python310\lib\", line 410, in visit
    return visitor(node)
  File "C:\Python310\lib\", line 418, in generic_visit
  File "C:\Python310\lib\", line 410, in visit
    return visitor(node)
  File "C:\Python310\lib\", line 243, in visit_Import
    _readmodule(, [])
  File "C:\Python310\lib\", line 167, in _readmodule
    spec = importlib.util._find_spec_from_path(fullmodule, search_path)
  File "C:\Python310\lib\importlib\", line 69, in _find_spec_from_path
    raise ValueError('{}.__spec__ is None'.format(name))
ValueError: __main__.__spec__ is None
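The "fall back to using __file__" idea from the report could look roughly like the sketch below. It is only an illustration under stated assumptions: `locate_source` is a hypothetical helper, not part of pyclbr or of PR 24623, and it uses the public `importlib.util.find_spec()` rather than the private `_find_spec_from_path()` seen in the traceback. It handles top-level names only.

```python
import importlib.util
import sys
import types


def locate_source(name):
    """Return a path-like origin for `name`, or None if nothing is known.

    Hypothetical helper for illustration; not pyclbr's actual code.
    """
    try:
        spec = importlib.util.find_spec(name)
    except ValueError:
        # Raised when sys.modules[name].__spec__ is None or not set.
        # Fall back to __file__ when the module happens to carry one.
        module = sys.modules.get(name)
        return getattr(module, "__file__", None)
    if spec is None:
        # Module not found at all.
        return None
    return spec.origin


# Demo with an artificial module, so the real __main__ stays untouched.
demo = types.ModuleType("demo_no_spec")
sys.modules["demo_no_spec"] = demo
print(locate_source("demo_no_spec"))      # no __spec__, no __file__
demo.__file__ = "demo_no_spec.py"         # made-up path for the demo
print(locate_source("demo_no_spec"))      # falls back to __file__
del sys.modules["demo_no_spec"]
```

Returning None for the undecidable case lets a caller skip that import and continue with the rest, which is what the comment in visit_Import() asks for.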
msg397054 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-06 20:08
Robert: I haven't worked with importlib or pyclbr before, so these may be naive questions, but:

- Can your use case be resolved with a workaround, e.g. setting a ModuleSpec manually on the __main__ module, something like `from importlib._bootstrap import ModuleSpec; __spec__ = ModuleSpec('main', None)`?

- You mention that this issue may apply to other modules, but the Python docs say __main__ is the only module that may have __spec__=None in some cases (in my testing it indeed has it set to None). Do you have any examples where other modules have __spec__=None?

What's your use case for examining __main__ with pyclbr? Not trying to sound doubtful, just really curious.

- Re your PR: I have a bit of a concern that silently catching ValueError might mask other, unrelated ValueErrors. I haven't looked into that more closely, I just want to note it here for the future.
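For reference, the workaround asked about in the first question can be sketched as follows. This is a minimal illustration, not a recommendation: it uses the public `importlib.machinery.ModuleSpec` instead of the private `importlib._bootstrap` import, and it patches an artificial stand-in module (the name `demo_main_stand_in` is made up) instead of the real __main__.

```python
import importlib.machinery
import importlib.util
import sys
import types

# Stand-in for a module whose __spec__ is None, as __main__ can be.
stand_in = types.ModuleType("demo_main_stand_in")
sys.modules[stand_in.__name__] = stand_in

try:
    importlib.util.find_spec(stand_in.__name__)
except ValueError as exc:
    before = str(exc)          # lookup fails while __spec__ is None

# The workaround: attach a spec by hand (no loader, no origin).
stand_in.__spec__ = importlib.machinery.ModuleSpec(stand_in.__name__, None)
after = importlib.util.find_spec(stand_in.__name__)   # now returns the spec

del sys.modules[stand_in.__name__]
print(before)
print(after.name, after.loader)
```

The hand-made spec has no loader, so it only makes the lookup succeed; it gives a source browser nothing to read.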

Btw thanks for the reply on PR to my question, it makes sense.
msg397247 - (view) Author: Robert (kxrob) * Date: 2021-07-10 11:40
You can see the use case in the stack trace: PythonWin (the IDE from the pywin32 package) uses pyclbr to inspect arbitrary user code.
(Neither piece of code is mine.)

I'm not inspecting __main__ explicitly. The problem seems to arise in newer Python versions (3.10+?) because the class browser now seems to parse imports somehow recursively (_readmodule() appears several times in the stack trace), and it fails when user code somewhere contains e.g. "import __main__" ...

pyclbr should perhaps handle (not fail in) all legal cases w/o breaking: when some strange builtin/extension/artificial module has .__spec__ as None or not set, or has no Python code. (We cannot force any user code to monkey patch __main__.__spec__ or that of potential other modules.)

>>> import types
>>> mod = types.ModuleType('mymod')
>>> mod.__spec__
# (None)
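Both error variants from the issue title ("is None" / "is not set") can be reproduced without pyclbr or PythonWin. The sketch below assumes the public `importlib.util.find_spec()` behaves like the private helper in the traceback for modules already in sys.modules; `spec_error` and the module names are made up for the demonstration.

```python
import importlib.util
import sys
import types


def spec_error(module):
    """Register `module` temporarily and return find_spec()'s error text."""
    sys.modules[module.__name__] = module
    try:
        importlib.util.find_spec(module.__name__)
    except ValueError as exc:
        return str(exc)
    finally:
        del sys.modules[module.__name__]
    return None


none_spec = types.ModuleType("demo_none_spec")     # __spec__ defaults to None
unset_spec = types.ModuleType("demo_unset_spec")
del unset_spec.__spec__                            # attribute removed entirely

print(spec_error(none_spec))    # demo_none_spec.__spec__ is None
print(spec_error(unset_spec))   # demo_unset_spec.__spec__ is not set
```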

importlib.util._find_spec_from_path() has chosen to raise ValueError for those cases instead of an extra custom error (or a special return value), and to return None for the similar case of 'not found'. Those 3 cases are similar in consequence here, though. pyclbr also "cheaply abuses" ImportError / ModuleNotFoundError to translate one of those cases (the None return value) for internal signaling. (There is never a real ImportError, just a remote linguistic similarity ;-) )

Hence the simple pragmatic fix: a kind of reunification of how the "end of the game" is signaled here, since the signaling here is rather pragmatic and evolution-style anyway.

ValueError is often (ab)used to signal application-level errors "cheaply" (to avoid defining and distributing an extra Exception type), and it's a limited internal case here, where a mix-up with errors from something like "math.sqrt(-1)" is not possible w/o coding bugs (which are to be detected by tests).

But you may establish a more meticulous error / return value signaling system, which will however require changes / refactoring in several places and consideration for compatibility ...
(I think it's hardly worth it.)
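One way to keep the risk of masking unrelated ValueErrors small is to wrap only the spec lookup in the try block. The sketch below illustrates that idea under assumptions: `browse_imports` and `handle` are hypothetical names, and this is not the code from the PR.

```python
import importlib.util
import sys
import types


def browse_imports(names, handle):
    """Apply `handle` to the spec of every name that can be located."""
    results = {}
    for name in names:
        try:
            # Only the lookup is guarded, nothing else.
            spec = importlib.util.find_spec(name)
        except ValueError:
            continue            # __spec__ is None / is not set: skip it
        if spec is None:
            continue            # not found: skip it as well
        # Outside the try block: a ValueError raised here propagates.
        results[name] = handle(spec)
    return results


# Demo: an artificial module without a spec is skipped, json is handled.
ghost = types.ModuleType("demo_ghost")
sys.modules["demo_ghost"] = ghost
print(browse_imports(["demo_ghost", "json"], lambda spec: spec.name))
del sys.modules["demo_ghost"]
```

With this shape, all three "cannot continue" outcomes end the same way (the import is skipped), while a ValueError from the later processing still surfaces.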
msg397279 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-12 05:22
Robert: thanks for the response.

I looked a bit more into it and I'm starting to think it's a PyWin issue, not pyclbr.

It seems to me that PyWin is calling pyclbr readmodule_ex with module='__main__', and pyclbr is supposed to find it on the filesystem. The problem is, '__main__' is the special name for current script or interactive session. It doesn't give pyclbr any info on where the script might be, if it exists (maybe it's pywin's interactive window?).

So, unless someone more familiar with pyclbr steps in and gives us a bit more insight, I think the best course may be to report this to PyWin. They might also know more about pyclbr, perhaps they've run into similar issues with it before!
msg397280 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-12 05:23
(Also if they are able to help, please report back here and update this issue, that will be much appreciated!)
msg397330 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-12 15:25
Actually, I was mistaken about this: `readmodule_ex` can get __main__ from sys.modules. I will look more into this later today.
msg397342 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-12 17:42
Upon further pondering, I'm not sure PyWin is doing the right thing here. I believe it's best to first file it with PyWin and find out what it is trying to do and why.

Pyclbr's behavior sort of makes sense to me -- pyclbr's goal is to parse and browse the source of a module, but the __main__ module is either the current script, in which case it can be imported under another name (and then it would have a __spec__), or it's an interactive session, which you probably don't need to browse (maybe?).
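To illustrate the "imported under another name" case: a script loaded through a regular file-based spec carries a __spec__, and pyclbr can browse the same file by name and directory. This is a sketch only; the file name `demo_script.py` and its contents are made up.

```python
import importlib.util
import pyclbr
import tempfile
from pathlib import Path

# Made-up script contents for the demonstration.
SOURCE = (
    "class Greeter:\n"
    "    def hello(self):\n"
    "        return 'hi'\n"
    "\n"
    "def helper():\n"
    "    return 42\n"
)

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "demo_script.py"
    script.write_text(SOURCE)

    # Load the file under an ordinary module name instead of __main__.
    spec = importlib.util.spec_from_file_location("demo_script", str(script))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    has_spec = module.__spec__ is not None

    # pyclbr reads the source; it does not need the module to be imported.
    tree = pyclbr.readmodule_ex("demo_script", [tmp])
    names = sorted(tree)

print(has_spec)
print(names)
```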
Date User Action Args
2021-07-12 17:42:22 andrei.avk set messages: + msg397342
2021-07-12 15:25:50 andrei.avk set messages: + msg397330
2021-07-12 05:23:29 andrei.avk set messages: + msg397280
2021-07-12 05:22:27 andrei.avk set messages: + msg397279
2021-07-10 11:40:50 kxrob set messages: + msg397247
2021-07-06 20:08:44 andrei.avk set nosy: + andrei.avk
messages: + msg397054
2021-07-06 19:14:30 andrei.avk set type: crash -> behavior
2021-02-22 18:46:38 kxrob set keywords: + patch
stage: patch review
pull_requests: + pull_request23407
2021-02-22 17:48:25 kxrob create