classification
Title: Regression in pkgutil: iter_modules stopped taking Path argument in python 3.8.10 and 3.9.5
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: steve.dower Nosy List: gregory.p.smith, lukasz.langa, magul, miguendes, rikard.nordgren, shreyanavigyan, steve.dower
Priority: release blocker Keywords: 3.8regression, 3.9regression, easy, patch

Created on 2021-05-06 17:54 by rikard.nordgren, last changed 2021-05-12 10:18 by lukasz.langa. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 25964 merged miguendes, 2021-05-07 07:49
PR 26052 merged steve.dower, 2021-05-11 23:31
PR 26056 merged lukasz.langa, 2021-05-12 09:54
Messages (12)
msg393120 - (view) Author: Rikard Nordgren (rikard.nordgren) Date: 2021-05-06 17:54
The pkgutil.iter_modules crash when using Path object in the first argument. The code below works in python 3.8.9 and 3.9.4, but stopped working in python 3.8.10 and 3.9.5. Changing from Path to str works in all versions.


import pkgutil
from pathlib import Path

for _, modname, ispkg in pkgutil.iter_modules([Path("/home")], 'somepackage.somesubpackage'):
        print(modname, ispkg)


Error message from python 3.8.10 (other path was used):

Traceback (most recent call last):
  File "/home/devel/Python-3.8.10/Lib/pkgutil.py", line 415, in get_importer
    importer = sys.path_importer_cache[path_item]
KeyError: PosixPath('/home/devel/pharmpy/src/pharmpy/plugins')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pyissue.py", line 5, in <module>
    for _, modname, ispkg in pkgutil.iter_modules([Path("/home/devel/pharmpy/src/pharmpy/plugins")], 'pharmpy.plugins'):
  File "/home/devel/Python-3.8.10/Lib/pkgutil.py", line 129, in iter_modules
    for i in importers:
  File "/home/devel/Python-3.8.10/Lib/pkgutil.py", line 419, in get_importer
    importer = path_hook(path_item)
  File "<frozen importlib._bootstrap_external>", line 1594, in path_hook_for_FileFinder
  File "<frozen importlib._bootstrap_external>", line 1469, in __init__
  File "<frozen importlib._bootstrap_external>", line 177, in _path_isabs
AttributeError: 'PosixPath' object has no attribute 'startswith'
msg393128 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-05-06 18:38
https://bugs.python.org/issue43105 and https://github.com/python/cpython/pull/25121 caused this regression.
msg393130 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-05-06 18:40
(obviously we're missing any tests for use of Path objects in this situation)
msg393132 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-05-06 18:59
Looks like it was always getting lucky in the past, as sys.path requires strings, and the "path" argument here is an alternative to it. The cache was definitely not working as intended.

So while it's not clearly documented anywhere (other than the related pkgutil.extend_path() method saying it'll ignore non-str paths), we do need to only be passing str into _importlib_bootstrap.

I can't do the pkgutil updates right now, but I think it's just get_importer than needs an os.fsdecode() call around its argument. Maybe some others in the same module. And add tests :)
msg393133 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-05-06 19:00
To be clear, I'll get to this when I can, but if someone else wants to write the fix I'm happy to review and merge. Don't wait for me to do this one just because I'm assigned.
msg393135 - (view) Author: Shreyan Avigyan (shreyanavigyan) * Date: 2021-05-06 19:04
It's mentioned that Python 3.9.5 has this regression but the code works fine on my windows machine. Is this only reproducible on POSIX?
msg393147 - (view) Author: Miguel Brito (miguendes) * Date: 2021-05-06 20:37
I can reproduce it on latest master running on Linux.

steve.dower: I wrote some tests and wrapping get_importer argument with os.fsdecode() fixes it.

I'm happy to open an PR, just let me know or if OP is not willing to do so either.
msg393154 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-05-06 23:29
> I can reproduce it on latest master running on Linux.

Interesting. Perhaps my actual change was to cause it to raise an error 
earlier (outside of a handler)? Which would mean that the Path object 
was always failing before, just silently.

Arguably we should filter them out instead, then, rather than 
introducing a new supported type (as has been argued with similar 
changes elsewhere) in a security branch. Though I'm sure the OP would 
have noticed if their call was never returning any modules, so 
presumably it must have worked somehow.

> I'm happy to open an PR, just let me know or if OP is not willing to do so either.

Feel free. I assume they'd have arrived with a PR if they were keen to 
write it themselves.
msg393168 - (view) Author: Rikard Nordgren (rikard.nordgren) Date: 2021-05-07 06:48
Thanks all for looking into this! 

> Though I'm sure the OP would 
> have noticed if their call was never returning any modules, so 
> presumably it must have worked somehow.

Yes, that snippet returns modules with the previous versions of python 3.8 and 3.9.

For more background to what I tested please see:
https://github.com/actions/virtual-environments/issues/3330

> I'm happy to open an PR, just let me know or if OP is not willing to do so either.

Please go ahead!
msg393478 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-05-11 23:34
Thanks for the patch!

I don't think this meets the 3.8 bar now, but if Łukasz wants it then it should be easy.
msg393494 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-05-12 09:48
New changeset 0cb9775a85a051556461ea9c3089244601b7d4f8 by Steve Dower in branch '3.9':
bpo-44061: Fix pkgutil.iter_modules regression when passed a pathlib.Path object (GH-25964) (GH-26052)
https://github.com/python/cpython/commit/0cb9775a85a051556461ea9c3089244601b7d4f8
msg393498 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-05-12 10:18
New changeset 4844abdd700120120fc76c29d911bcb547700baf by Łukasz Langa in branch '3.8':
[3.8] bpo-44061: Fix pkgutil.iter_modules regression when passed a pathlib.Path object (GH-25964). (GH-26056)
https://github.com/python/cpython/commit/4844abdd700120120fc76c29d911bcb547700baf
History
Date User Action Args
2021-05-12 10:18:11lukasz.langasetmessages: + msg393498
2021-05-12 09:54:04lukasz.langasetpull_requests: + pull_request24696
2021-05-12 09:48:57lukasz.langasetmessages: + msg393494
2021-05-11 23:34:04steve.dowersetstatus: open -> closed
resolution: fixed
messages: + msg393478

stage: patch review -> resolved
2021-05-11 23:31:57steve.dowersetpull_requests: + pull_request24692
2021-05-11 20:33:57magulsetnosy: + magul
2021-05-07 07:49:27miguendessetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request24621
2021-05-07 06:48:17rikard.nordgrensetmessages: + msg393168
2021-05-06 23:29:05steve.dowersetmessages: + msg393154
2021-05-06 20:37:10miguendessetnosy: + miguendes
messages: + msg393147
2021-05-06 19:06:38shreyanavigyansetversions: + Python 3.11
2021-05-06 19:04:57shreyanavigyansetnosy: + shreyanavigyan

messages: + msg393135
versions: - Python 3.11
2021-05-06 19:00:09steve.dowersetmessages: + msg393133
versions: + Python 3.11
2021-05-06 18:59:14steve.dowersetkeywords: + easy

messages: + msg393132
2021-05-06 18:40:29gregory.p.smithsettype: crash -> behavior
messages: + msg393130
stage: needs patch
2021-05-06 18:38:19gregory.p.smithsetpriority: normal -> release blocker

assignee: steve.dower
versions: + Python 3.10
keywords: + 3.8regression, 3.9regression
nosy: + gregory.p.smith, lukasz.langa, steve.dower

messages: + msg393128
2021-05-06 17:54:01rikard.nordgrencreate