classification
Title: Add __file__ attribute to frozen modules
Type: Stage:
Components: Interpreter Core, Library (Lib) Versions: Python 3.5, Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: barry, brett.cannon, eric.snow, lemburg, ncoghlan
Priority: normal Keywords: patch

Created on 2014-06-12 16:14 by lemburg, last changed 2014-06-15 09:22 by lemburg.

Files
File name Uploaded Description
__file__-for-frozen-modules.patch lemburg, 2014-06-12 16:15
Messages (7)
msg220362 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2014-06-12 16:14
The missing __file__ attribute on frozen modules causes lots of issues with the stdlib (see e.g. Issue21709 and the stdlib test suite) and other tools that expect this attribute to always be present.

The attached patch for 3.4.1 adds this attribute to all frozen modules and resolves most issues. It cannot resolve the problem of not necessarily finding the directories/files listed in those attributes, but at least makes it possible to continue using code that only uses the attribute for error reporting.
msg220365 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2014-06-12 16:35
I'm -0 on this patch.  I can understand that in some sense, frozen modules do semantically have an associated file, but OTOH, once they're frozen the connection to their file is broken.  Also, I think anything that assumes __file__ exists is simply broken and should be fixed.  There are other cases than frozen modules where a module would have no reasonable value for __file__ and thus shouldn't have one.
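
The comment above argues that code should not assume `__file__` exists. A minimal sketch of what such defensive code could look like, for error-reporting purposes only, follows. The helper name `module_location` and the `<...>` placeholder format are illustrative assumptions, not part of the patch or of the stdlib.

```python
import importlib.machinery
import types


def module_location(module):
    """Return a printable location for a module without assuming __file__."""
    # getattr with a default avoids AttributeError on modules lacking __file__.
    filename = getattr(module, "__file__", None)
    if filename is not None:
        return filename
    # Fall back to the spec's origin (e.g. "frozen" or "built-in"), if any.
    spec = getattr(module, "__spec__", None)
    origin = getattr(spec, "origin", None)
    if origin is not None:
        return "<{}>".format(origin)
    return "<unknown>"


# A bare module object has neither __file__ nor a spec.
bare = types.ModuleType("bare")

# Stand-in for a frozen module: origin is "frozen" and __file__ is absent.
frozen_like = types.ModuleType("frozen_like")
frozen_like.__spec__ = importlib.machinery.ModuleSpec(
    "frozen_like", loader=None, origin="frozen"
)

print(module_location(bare))
print(module_location(frozen_like))
```

This only covers callers that use the attribute for display. Code that opens files relative to `__file__` needs a different fix.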
msg220367 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2014-06-12 16:41
On 12.06.2014 18:35, Barry A. Warsaw wrote:
> 
> I'm -0 on this patch.  I can understand that in some sense, frozen modules do semantically have an associated file, but OTOH, once they're frozen the connection to their file is broken.  Also, I think anything that assumes __file__ exists is simply broken and should be fixed.  There are other cases than frozen modules where a module would have no reasonable value for __file__ and thus shouldn't have one.

This one falls into the practicality beats purity category. Of
course, the __file__ attribute doesn't always make sense as a
file path, but it does serve an informational purpose.

We're doing this in eGenix PyRun to get 3rd party code working
(including parts of the Python stdlib :-)). Not doing so
would simply render the whole freezing approach pretty much
useless, since so much code uses the attribute without checking
or providing a fallback solution.
msg220368 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2014-06-12 16:43
PBP might be reasonably used to justify it for the frozen case.  I just don't want to use that as a wedge to define __file__ in *all* cases, even when no reasonable file name exists.
msg220586 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-06-14 21:57
__file__ is the filename from which the module *was* loaded (the inspect doc [1] should probably reflect that [2]).  The import machinery only uses the module's __spec__ (origin, etc.).  __file__ is set strictly as informational (and for backward-compatibility).

Per the language reference [3], __file__ may be omitted when it does not have semantic meaning.  It also says "Ultimately, the loader is what makes use of __file__", but that hasn't been accurate since PEP 451 landed. [4]  Notably, though, for now the module __repr__() *does* use __file__ if it is available (and the loader doesn't implement module_repr).  The counterpart of __file__ within a module's spec is __spec__.origin.  The two should stay in sync.  In the case of frozen modules origin is set to "frozen".

Giving __file__ to frozen modules is inaccurate.  The file probably won't be there and the module certainly wasn't loaded from that location.

Stdlib modules should not rely on all modules having __file__.  Removing __file__ from frozen modules was a change in 3.4 and I'd consider it a 3.4 bug if any stdlib module relied on it.  For that matter, I'd consider it a bug if a module relied on all modules having __file__ or even __file__ being set to an actual filename.

Would it be inappropriate to set an additional informational attribute on frozen modules to indicate the original path?  Something like __frozen_filename__.  Then you wouldn't need to rely on __code__.co_filename.

p.s. Searching for __file__ in the docs [5] illustrates its prevalence.

[1] https://docs.python.org/3/library/inspect.html#types-and-members
[2] issue #21760
[3] https://docs.python.org/3/reference/import.html#__file__
[4] issue #21761
[5] https://docs.python.org/3/search.html?q=__file__
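
To illustrate the relationship between `__file__` and `__spec__.origin` described in msg220586, here is a small sketch that reports both for a module. It builds a stand-in for a frozen module by hand instead of importing a real one, because which modules are actually frozen depends on the interpreter build. The helper name `describe` and the module name `frozen_like` are assumptions made for illustration.

```python
import importlib.machinery
import types


def describe(module):
    """Summarise where a module claims to come from."""
    spec = getattr(module, "__spec__", None)
    return {
        "name": module.__name__,
        # Informational only; may legitimately be missing.
        "file": getattr(module, "__file__", None),
        # The spec-side counterpart of __file__.
        "origin": getattr(spec, "origin", None),
        # True only when origin refers to a loadable location.
        "has_location": getattr(spec, "has_location", None),
    }


# Stand-in for a frozen module: the spec's origin is the string "frozen"
# and no __file__ attribute is set on the module.
frozen_like = types.ModuleType("frozen_like")
frozen_like.__spec__ = importlib.machinery.ModuleSpec(
    "frozen_like", loader=None, origin="frozen"
)

print(describe(frozen_like))
```

Running the same helper over the entries of `sys.modules` is one way to see which loaded modules lack `__file__` on a given build.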
msg220598 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2014-06-14 23:53
Can we just drop "__file__" and set the origin for frozen modules to
something that includes the original file name?
msg220623 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2014-06-15 09:22
On 15.06.2014 01:53, Nick Coghlan wrote:
> 
> Can we just drop "__file__" and set the origin for frozen modules to
> something that includes the original file name?

This wouldn't really help, because too much code out there uses
the __file__ attribute and assumes it's always available.

Note that the filename information is already available in the
code object's co_filename attribute.
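
As a sketch of the point about `co_filename`: the filename passed to `compile()` is recorded on the code object (and on the code objects of functions defined inside it), so it remains available even when the resulting module has no `__file__`. The path and module name below are made up for illustration, and this does not reproduce the actual freezing process.

```python
import types

SOURCE = "def greet():\n    return 'hello'\n"
ORIGINAL_PATH = "/hypothetical/src/greet_mod.py"  # assumed path, illustration only

# compile() stores the filename it is given on the resulting code object.
code = compile(SOURCE, ORIGINAL_PATH, "exec")

# Execute the code in a fresh module that never gets a __file__ attribute.
module = types.ModuleType("greet_mod")
exec(code, module.__dict__)


def original_filename(func):
    """Recover the compile-time filename from a function's code object."""
    return func.__code__.co_filename


print(hasattr(module, "__file__"))
print(original_filename(module.greet))
```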
History
Date User Action Args
2014-06-15 09:22:27 lemburg set messages: + msg220623
2014-06-14 23:53:07 ncoghlan set messages: + msg220598
2014-06-14 21:57:23 eric.snow set nosy: + eric.snow, brett.cannon, ncoghlan; messages: + msg220586
2014-06-12 16:43:15 barry set messages: + msg220368
2014-06-12 16:41:28 lemburg set messages: + msg220367
2014-06-12 16:35:39 barry set messages: + msg220365
2014-06-12 16:33:04 barry set nosy: + barry
2014-06-12 16:15:21 lemburg set files: + __file__-for-frozen-modules.patch; keywords: + patch
2014-06-12 16:14:48 lemburg create