Issue41109
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2020-06-25 02:51 by conchylicultor, last changed 2022-04-11 14:59 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 21920 | open | Jeffrey.Kintscher, 2020-08-19 05:31 |
Messages (12) | |||
---|---|---|---|
msg372297 - (view) | Author: Etienne POT (conchylicultor) * | Date: 2020-06-25 02:51 | |
I have a subclass GithubPath of PurePosixPath. ``` class GithubPath(pathlib.PurePosixPath): def __new__(cls, *args, **kwargs): print('New') return super().__new__(cls, *args, **kwargs) def __init__(self, *args, **kwargs): print('Init') super().__init__() ``` Calling `child.parent` create a new GithubPath but without ever calling __new__ nor __init__. So my subclass is never notified it is created. ``` p = GithubPath() # Print "New", "Init" p.parent # Create a new GithubPath but bypass the constructors ``` The reason seems to be that parent calls _from_parts which create a new object through `object.__new__(cls)`: https://github.com/python/cpython/blob/cf18c9e9d4d44f6671a3fe6011bb53d8ee9bd92b/Lib/pathlib.py#L689 A hack is to subclass `_init` but it seems hacky as it relies on internal implementation detail. |
|||
msg372298 - (view) | Author: Etienne POT (conchylicultor) * | Date: 2020-06-25 03:03 | |
Note that this likely affect all methods which returns new Path by calling `_from_parts` or `_from_parsed_parts`, like `.absolute`, `.resolve`,... |
|||
msg375030 - (view) | Author: Jeffrey Kintscher (Jeffrey.Kintscher) * | Date: 2020-08-08 01:36 | |
The workaround is to override _init(), which I agree is not desirable. This is the relevant code in PurePath, which is the super class of PurePosixPath: @classmethod def _from_parsed_parts(cls, drv, root, parts, init=True): self = object.__new__(cls) self._drv = drv self._root = root self._parts = parts if init: self._init() return self ... def _init(self): # Overridden in concrete Path pass To me, the clean way to get the desired behavior seems like it would be to have _init() call self.__init__(). def _init(self): # Overridden in concrete Path self.__init__() This fixes p.parent, but GithubPath() ends up calling GithubPath.__init__() twice – the first time by PurePath.__new__() calling PurePath._init() and the second time by the GithubPath object creation. |
|||
msg375031 - (view) | Author: Jeffrey Kintscher (Jeffrey.Kintscher) * | Date: 2020-08-08 02:05 | |
Clarification: PurePath.__new__() calls PurePath._from_parts(), which then calls PurePath._init() |
|||
msg375041 - (view) | Author: Etienne POT (conchylicultor) * | Date: 2020-08-08 10:34 | |
Before solving this issue, I think it would be best to think on a more generic solution on how to make Pathlib more extensible. Related to: https://discuss.python.org/t/make-pathlib-extensible/3428 For instance, if childs created with `p.parent()`, `p / 'subdir'` need to forward some state (e.g. `RemotePath(path, password=, user=)`). Rather than __init__, maybe there should be some __post_init__ like dataclasses. |
|||
msg375047 - (view) | Author: Louis-Vincent Boudreault (louis-vincent.boudre) | Date: 2020-08-08 13:47 | |
Path.__new__ should not call _from_parts because it breaks the specialization by directly using object.__new__. This is why `_from_parts` has to be duplicated in each subclass's `__new__`. My suggestion (first draft) is to do the parsing of arguments inside an `__init__` in the Path class hierarchy and deprecate `_from_parts`. ``` class PurePath: def __new__(cls, *args): if cls is PurePath: cls = PureWindowsPath if os.name == 'nt' else PurePosixPath super().__new__(cls, *args) # Here we remove call to from_parts def __init__(self, *args): # We should have an __init__ in the hierarchy. drv, root, parts = self._parse_args(args) # this would get the proper _flavour. self._drv = drv self._root = root self._parts = parts ... class Path(PurePath): def __new__(cls, *args, **kwargs): if cls is Path: cls = WindowsPath if os.name == 'nt' else PosixPath # REMOVE THIS LINE: self = cls._from_parts(args, init=False) # if not self._flavour.is_supported: raise NotImplementedError("cannot instantiate %r on your system" % (cls.__name__,)) return super().__new__(cls, *args, **kwargs) # Use super ``` I don't know what is the purpose of `_init` and if it could be replaced by an extra keyword argument in __init__. The class method `_from_parts` would become deprecated since the argument parsing would be now handled by `__init__`. By using __init__ and avoid the direct call to `object.__new__` we would respect class specialization and custom class could be implemented intuitively. |
|||
msg375162 - (view) | Author: Jeffrey Kintscher (Jeffrey.Kintscher) * | Date: 2020-08-11 03:07 | |
The purpose of the _init() function in PurePath is to allow PurePath methods to call the Path subclass override _init() function when initializing a Path object. |
|||
msg375332 - (view) | Author: Jeffrey Kintscher (Jeffrey.Kintscher) * | Date: 2020-08-13 19:27 | |
Adding __init__() to PurePath complicates things and doesn't provide any benefit. A subclass that calls super.__init__() ends up invoking object.__init__(), which is perfectly fine. I was able to find a solution by calling type(self)() instead of object.__new__() in most cases. I am working on a PR. |
|||
msg375333 - (view) | Author: Louis-Vincent Boudreault (louis-vincent.boudre) | Date: 2020-08-13 19:31 | |
This is not true, because the classmethod use the library shortcuts the class mro order, to prevent infinite loop inthe __new__. However, it was using __init__ before hand, we would Not have this issue Le jeu. 13 août 2020 à 15:27, Jeffrey Kintscher <report@bugs.python.org> a écrit : > > Jeffrey Kintscher <websurfer@surf2c.net> added the comment: > > Adding __init__() to PurePath complicates things and doesn't provide any > benefit. A subclass that calls super.__init__() ends up invoking > object.__init__(), which is perfectly fine. > > I was able to find a solution by calling type(self)() instead of > object.__new__() in most cases. I am working on a PR. > > ---------- > > _______________________________________ > Python tracker <report@bugs.python.org> > <https://bugs.python.org/issue41109> > _______________________________________ > |
|||
msg375334 - (view) | Author: Louis-Vincent Boudreault (louis-vincent.boudre) | Date: 2020-08-13 19:33 | |
This is not true, because the classmethod used the library shortcuts the class mro order, this is to prevent infinite loop inthe __new__. However, If it was using __init__ before hand, we would Not have this issue Le jeu. 13 août 2020 à 15:31, Louis-Vincent Boudreault < lv.boudreault95@gmail.com> a écrit : > This is not true, because the classmethod use the library shortcuts the > class mro order, to prevent infinite loop inthe __new__. However, it was > using __init__ before hand, we would > Not have this issue > > > Le jeu. 13 août 2020 à 15:27, Jeffrey Kintscher <report@bugs.python.org> > a écrit : > >> >> Jeffrey Kintscher <websurfer@surf2c.net> added the comment: >> >> Adding __init__() to PurePath complicates things and doesn't provide any >> benefit. A subclass that calls super.__init__() ends up invoking >> object.__init__(), which is perfectly fine. >> >> I was able to find a solution by calling type(self)() instead of >> object.__new__() in most cases. I am working on a PR. >> >> ---------- >> >> _______________________________________ >> Python tracker <report@bugs.python.org> >> <https://bugs.python.org/issue41109> >> _______________________________________ >> > |
|||
msg375364 - (view) | Author: Jeffrey Kintscher (Jeffrey.Kintscher) * | Date: 2020-08-14 02:12 | |
The current implementation calls object.__new__(cls), where cls is the child class type, from within a class method (@classmethod). This is fine for Path.__new__() and PurePath.__new__(), which are called by the child class's __new__(), because we don't want them to recursively call the child class's __new__() when the child class is created. This all works as expected when the child class is instantiated outside of Path and PurePath, and the child's __init__() gets called as expected. I don't see any point in making changes to this behavior. When one of approximately 20 PurePath and Path functions and properties instantiate a new child class object the same way PurePath.__new__() and Path.__new__() do, the child class's __new__() and __init__() functions are not called. This is the problem we are trying to solve. My fix is to add normal functions (i.e. not decorated with @classmethod) to instantiate child class objects using obj = type(self)() This creates a child class instance, and the child's __new__() and __init__() functions get called. Once I have finished re-plumbing Path and PurePath to use the new functions and created the necessary unit tests (to make sure I didn't break anything), I will also look at adding a proper __init__() function to the two classes instead of having __new__() initialize the member variables. I didn't mean to imply that __init__() isn't useful. It is required to allow the child class to initialize its own variable. I just meant it isn't required to force calling __init__() and __new__() in the child class. |
|||
msg375638 - (view) | Author: Jeffrey Kintscher (Jeffrey.Kintscher) * | Date: 2020-08-19 06:25 | |
I created a PR that should provide the desired behavior: __init__() and __new__() get called in subclass objects that are created by Path and PurePath. Also, Path and PurePath now have __init__() functions, and the __new__() functions only return new objects and rely upon __init__() to perform the object initialization. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:32 | admin | set | github: 85281 |
2020-08-19 06:37:57 | Jeffrey.Kintscher | set | components: + Library (Lib) |
2020-08-19 06:25:13 | Jeffrey.Kintscher | set | messages: + msg375638 |
2020-08-19 05:31:42 | Jeffrey.Kintscher | set | keywords:
+ patch stage: patch review pull_requests: + pull_request21034 |
2020-08-14 02:12:51 | Jeffrey.Kintscher | set | messages: + msg375364 |
2020-08-13 19:33:20 | louis-vincent.boudre | set | messages: + msg375334 |
2020-08-13 19:31:25 | louis-vincent.boudre | set | messages: + msg375333 |
2020-08-13 19:27:23 | Jeffrey.Kintscher | set | messages: + msg375332 |
2020-08-11 03:07:33 | Jeffrey.Kintscher | set | messages: + msg375162 |
2020-08-08 13:47:37 | louis-vincent.boudre | set | nosy:
+ louis-vincent.boudre messages: + msg375047 |
2020-08-08 10:34:16 | conchylicultor | set | messages: + msg375041 |
2020-08-08 02:05:42 | Jeffrey.Kintscher | set | messages: + msg375031 |
2020-08-08 01:36:03 | Jeffrey.Kintscher | set | messages: + msg375030 |
2020-08-07 20:38:30 | Jeffrey.Kintscher | set | nosy:
+ Jeffrey.Kintscher |
2020-06-25 03:08:40 | xtreak | set | nosy:
+ pitrou |
2020-06-25 03:03:34 | conchylicultor | set | messages: + msg372298 |
2020-06-25 02:51:23 | conchylicultor | create |