
Author paul.moore
Recipients brett.cannon, eric.snow, erik.bray, jdemeyer, ncoghlan, paul.moore, petr.viktorin, scoder, sth
Date 2018-08-05.11:54:39
Message-id <CACac1F-pAW__7P8DEpt7w4qGV4k6765kU2FGO1fDadgA_Br+Bw@mail.gmail.com>
In-reply-to <1533448269.57.0.56676864532.issue32797@psf.upfronthosting.co.za>
Content
On Sun, 5 Aug 2018 at 06:51, Stefan Behnel <report@bugs.python.org> wrote:
> This whole idea looks backwards and complicated. As Brett noted, .pyc files were moved out of the source tree, because they are build time artifacts and not sources. With that analogy, it's the .so files that would have to move, not the .pyx or .py source files, which are sources for the compiled .so files.

I disagree. In *Python* terms:

* .py files are the executable units of Python code
* .so/.pyd files are the executable units of extension modules
* .pyc/.pyo files are runtime artifacts that cache the results of the
compilation step of loading .py code. There's no equivalent for
.so/.pyd files as there's no compilation step needed when loading
them.
* .c, .h, .pyx, ... files are the source code for .so/.pyd files.
There's no equivalent for .py files, because Python code is executable
from source.

Executable units of code go on sys.path. Cached runtime artifacts go
in __pycache__. There's no defined location for source code. Whether
you agree with the above or not, please accept that it's a consistent
view of things. Sourceless distributions of Python code are an oddball
corner case, but they do *not* demonstrate that Python is a compiled
language and the .pyc/.pyo is the "executable" and the .py file is the
"source". At least not in my opinion.

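To make that classification concrete, here is a rough sketch using the suffix lists that importlib.machinery exposes. The classify() helper and its labels are my own illustration, not anything the import system defines, and the actual suffixes vary by platform and Python version.

```python
import importlib.machinery as machinery
import importlib.util


def classify(filename):
    """Label a file name using the import system's own suffix lists.

    The labels are illustrative only; they mirror the view described above.
    """
    # Extension suffixes are checked first because they are the most
    # specific (platform-tagged .so/.pyd names).
    if filename.endswith(tuple(machinery.EXTENSION_SUFFIXES)):
        return "extension module (executable unit)"
    if filename.endswith(tuple(machinery.SOURCE_SUFFIXES)):
        return "Python source (executable unit)"
    if filename.endswith(tuple(machinery.BYTECODE_SUFFIXES)):
        return "bytecode cache (runtime artifact)"
    # .c, .h, .pyx and friends: the import system has no notion of these.
    return "not importable"


# Where the cached bytecode for a source file would be written,
# normally inside a __pycache__ directory next to the source.
cache_path = importlib.util.cache_from_source("pkg/mod.py")

for name in ("mod.py", "mod.pyc", "mod.pyx", "mod.c"):
    print(name, "->", classify(name))
print("cache for pkg/mod.py ->", cache_path)
```

Note that .pyx and .c fall through to "not importable": nothing in the import machinery knows where such files live, which is the gap under discussion.
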
It's entirely reasonable that we want to ensure that an exception
object, whether raised from a Python file or a compiled extension
module, includes a reference to the source of the issue, and that the
traceback mechanism can, if at all possible, locate that source and
display it in the exception traceback. That's basically what the
loader "get_source" method is for - isn't the problem here simply that
get_source for .pyd/.so files returns None, which in turn is because
there's no obvious way to get to the source file for a general binary
(and so the current implementation takes the easy way out and gives
up[1])?
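
A small sketch of the current behaviour, for anyone who wants to see it directly. The module names and the .so path are made up; as far as I know, ExtensionFileLoader.get_source returns None without ever opening the file, so the path does not need to exist.

```python
import os
import tempfile
from importlib.machinery import ExtensionFileLoader, SourceFileLoader

# Hypothetical name and path: no such extension module needs to exist,
# because this get_source does not look at the file at all.
ext_loader = ExtensionFileLoader("demo_ext", "/hypothetical/demo_ext.so")
ext_source = ext_loader.get_source("demo_ext")

# For comparison, a source loader returns the decoded text of the .py file.
with tempfile.TemporaryDirectory() as tmp:
    py_path = os.path.join(tmp, "demo_py.py")
    with open(py_path, "w", encoding="utf-8") as f:
        f.write("answer = 42\n")
    py_source = SourceFileLoader("demo_py", py_path).get_source("demo_py")

print("extension loader:", repr(ext_source))
print("source loader:   ", repr(py_source))
```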

Note that Cython is *not* unique here. Any tool that generates
.so/.pyd files has the same problem, and we should be looking for a
general solution. C code is the obvious example, and while I doubt
anyone is going to ship C sources with a Python extension, it's still
a valid case. Doesn't CFFI have a mode where it compiles a .pyd at
wheel build time? Shouldn't that be able to locate the source? As a
separate example, consider Jython, which may want to be able to locate
the Java (or Groovy, or Scala, ...) source for a .class file on
sys.path, and would have the same problem with get_source that CPython
does with .so/.pyd files.

So basically what's needed here is a modification to
ExtensionFileLoader to return a more useful value from get_source. How
it *does* that is up for debate, and brings in questions around what
standards the packaging community want to establish for shipping
extension sources, etc. And because the resulting decisions will end
up implemented as stdlib code, they need to be properly thought
through, as the timescales to fix design mistakes are quite long.
"Bung stuff on sys.path" isn't really a well thought through solution
in that sense, at least IMO - regardless of the fact that it's in
effect what linecache supported prior to Python 3.3.
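
Purely to illustrate the shape of such a change, here is a sketch of a loader subclass with a less defeatist get_source. Everything about where it looks is an assumption on my part: the "_sources" directory name and the list of suffixes are hypothetical placeholders for whatever standard ends up being agreed, and the class name is invented for this example.

```python
import os
from importlib.machinery import ExtensionFileLoader


class SourceAwareExtensionLoader(ExtensionFileLoader):
    """Hypothetical loader that tries to find shipped extension sources."""

    # Assumed layout, NOT an agreed standard: sources live in a
    # "_sources" directory beside the binary, named after the module.
    SOURCE_DIR = "_sources"
    CANDIDATE_SUFFIXES = (".pyx", ".c")

    def get_source(self, fullname):
        directory = os.path.join(os.path.dirname(self.path), self.SOURCE_DIR)
        stem = fullname.rpartition(".")[2]  # "pkg.demo" -> "demo"
        for suffix in self.CANDIDATE_SUFFIXES:
            candidate = os.path.join(directory, stem + suffix)
            try:
                with open(candidate, encoding="utf-8") as f:
                    return f.read()
            except OSError:
                continue  # missing or unreadable: try the next suffix
        # Same answer as the stock loader when nothing is found.
        return None
```

This returns a single string for the module, so it does nothing for extensions built from several source files, and it sidesteps the real question of where the sources should be shipped.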

So as a way forward, how about this:

1. Agree that what we actually need is a better get_source method on
ExtensionFileLoader
2. Adjourn to distutils-sig to agree a standard for where tools that
generate .pyd/.so files should put their sources, if they should
choose to ship their sources in the wheel with the final built
extension.
3. Come back here with the results of that discussion to produce a PR
that implements that change to ExtensionFileLoader

The one possible fly in the ointment is if there are use cases that we
need to support where a single .so/.pyd file is built from *multiple*
source files, as get_source doesn't allow for that. Then it's back to
the drawing board...

Paul

[1] I remember working on the original design of PEP 302 and "take the
easy way out and give up" is *precisely* the thinking behind this :-)

History
Date                 User        Action  Args
2018-08-05 11:54:40  paul.moore  set     recipients: + paul.moore, brett.cannon, ncoghlan, scoder, petr.viktorin, erik.bray, eric.snow, sth, jdemeyer
2018-08-05 11:54:40  paul.moore  link    issue32797 messages
2018-08-05 11:54:39  paul.moore  create