This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PEP 235: Implement on every POSIX system, and clean up the C code for `import'
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, mfwitten, ncoghlan, tim.peters
Priority: normal Keywords: patch

Created on 2016-11-11 21:18 by mfwitten, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pep-235-on-posix.export mfwitten, 2016-11-11 21:18 3 patches for 3 commits; use `hg import /path/to/pep-235-on-posix.export'
pep-235-on-posix.patch mfwitten, 2016-11-11 22:04 This is the exact same as `pep-235-on-posix.export'
Messages (11)
msg280613 - (view) Author: Michael Witten (mfwitten) Date: 2016-11-11 21:18
The attached file, `pep-235-on-posix.export', contains 3 patches; the file includes the intended commit messages and authorship information. To apply these patches, save the file to:

  /path/to/pep-235-on-posix.export

and then run the following from within your Mercurial repository:

  hg import /path/to/pep-235-on-posix.export

I ran `make test' or `./python -m test.regrtest' multiple times, and all but one attempt succeeded: There was a transient and probably unrelated error that occured whilst running `test_thread'.

Only the initial patch alters functionality; the other patches, though structurally useful, are a matter of essentially opportunistic aesthetic reconstruction.

Here is the commit message of the initial patch (the formatting of which may be mangled here):

----8<------8<------8<------8<----

PEP 235: Extend to every POSIX system the case-sensitivity semantics for `import'

(Note: As per PEP 235, this explicit checking of case may be effectively disabled by defining the environment variable `PYTHONCASEOK'.)

From time to time, even a user of a sane system has been known [to be forced] to use an insane file system; the semantics of PEP 235 need to be implemented in this case as well.

On a sane system, mount an insane file system at "$mount", and then run this example:

  $ cd "$mount"
  $ touch Insane.py
  $ python -c 'import Insane; import insane; print(dir())'

Before this revision, the resulting output is:

  ['Insane', '__builtins__', '__doc__', '__name__', '__package__', 'insane']

After this commit, the resulting output is sane:

  Traceback (most recent call last):
  File "<string>", line 1, in <module>
  ImportError: No module named insane

Because POSIX systems may be a subset of sane systems, this is only a partial fix to the overall problem; as much as possible, *every* system should implement the same semantics.

----8<------8<------8<------8<----

The one irritation of this patch is that it arguably adds overhead to "sane" systems, which will almost never run into this corner case.

However, there are at least 4 replies to this criticism:

1) As per PEP 235, the overhead can be virtually removed by setting the `PYTHONCASEOK' environment variable.

2) It's probably most important for Python to present consistent behavior across systems; if one needs every cycle, then one can hack the source for oneself (and if imports are a major bottleneck for some program, then there is probably something seriously wrong with the design of that program, anyway).

3) Ultimately, I can say that while working on a supposedly sane system, I *did* run into this corner case: The program I was trying to run was located on an HFS+ file system; I didn't even know that this program employed Python for one of its components, but it failed with strange, nearly inscrutable errors, which turned out to be the result of the sane system's `python' knowing nothing about PEP 235. Once fixed, that program ran without any problems; the sole pain in the arse was `python'.

4) Come on! Come on, man! Come on; I did a good job here. Come on!

As an aside, I believe that Python 3 is not affected by this problem.
msg280617 - (view) Author: Michael Witten (mfwitten) Date: 2016-11-11 22:04
I've attached as `pep-235-on-posix.patch' a copy of `pep-235-on-posix.export', in order to see whether that helps the system recognize the content as a patch.

I'm loath to rename the original upload, because I'm not able to edit my original comment to reflect a new name.
msg280816 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-11-14 22:35
Because this would be a feature change in Python 2.7 it can't go in there. This could potentially go into 3.7 if it's deemed useful.

As for why the code review tool isn't picking up your uploads, it's because we work in patch files and not patch sets. See https://docs.python.org/devguide/patch.html on how to create a patch file that can be reviewed.
msg280825 - (view) Author: Michael Witten (mfwitten) Date: 2016-11-15 05:49
Thank you for the reply.

* As already stated, I believe that Python 3
  is not affected by this problem; certainly,
  version 3.5.2 does not seem to be affected,
  as per my ad hoc testing.

* Very many programs are targeted to Python 2.7,
  and probably will be for quite some time.

  Without my patch, `python' (2.7) itself breaks
  them as described; yet, with my patch, these
  programs will just work (without intervention).

  It seems to me that this patch isn't a change
  in a feature, but is rather the overdue
  implementation of promised behavior; the
  file system is what matters, not the host
  operating system, which has hitherto been
  conflated with the file system.
  
  That is, this patch is a bug fix.

* I'm not sure how a test could be written to
  handle this situation.
  
  I suppose that by default, such a test could
  be skipped with a warning; a user could be
  required to set up a relevant mount point,
  and then to signal the testing infrastructure
  to use it.

---------------

* With regard to the form of the patches, my
  intention is that they be introduced into
  the history severally, with the commit
  messages and other metadata as specified.
  
  It would probably make most sense for these
  commits to be the ancestors of a merge commit
  that provides the sole reference to this
  tracker issue.

  For instance, the most correct and useful
  application of the patch series is given
  by the following:

    $ hg pull --branch 2.7 # This must pull in something new for the rest to work.
    $ tip=$(hg log --limit 1 --template '{node}' -r tip)
    $ hg update -c -r 59260b38f7cd4a2ae66928de5227798524006e64
    $ hg import /path/to/pep-235-on-posix.patch
    $ hg merge -r "$tip"
    $ hg commit --message "Issue #28670: Implement PEP 235 on every POSIX system, and clean up the C code for \`import'"

  That produces a single commit whose message supplies
  the usual dearth of information, but whose ancestral
  history captures the rich details.

  It should be noted that `hg bundle' and `hg unbundle'
  could be used to import more directly the proper
  pre-merge history, albeit at the expense of the
  human-readability of the "patch" file.

  That is to say, I'm underwhelmed by the way you guys
  are managing this project's history. Despite all of its
  faults, Mercurial is still being underutilized.

* I could split the original patch file into 3 separate
  patch files, but what should I do with the metadata?
  If each separate patch were to contain its own metadata,
  would the code review tool still choke? I suspect it would.

  What's the point?

  The original file itself is perfectly readable, and you
  can simply review the diffs after applying them within
  your repo, anyway. Ironically, the very point of having
  multiple changesets is to make digesting the changes as
  easy as possible---not only now, but far into the future,
  when you're simply looking at the graph of the history!
msg280881 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-11-15 19:47
So there's no "promised behavior" that's missing in Python 2.7. If you read PEP 235 it's very clear what platforms it was meant for: cygwin, macOS, and Windows. There's no promise of supporting PYTHONCASEOK for POSIX in general so it isn't as if the PEP is not fully implemented. And even if it was promised, this is a potential breaking change as the semantics of Python 2.7 would shift in a rather key way on certain platforms based on the external factor of PYTHONCASEOK simply being set which someone may have carelessly done. In other words while you view this as a fix for breakage on a platform, I view it as adding support for a certain platform configuration on POSIX which is a new feature.

Since you said this doesn't affect Python 3, I'm closing this as rejected. I appreciate the attempt at a patch, but this is considered a new feature for Python 2.7 which is not open to new features.

In case you choose to submit other patches in the future I'll address your other comments you left about how to test and our development workflow.

To test this what you would basically need to do is detect when the test suite was run on a platform that was case-preserving but case-insensitive and then on such a platform make sure imports worked as expected with and without PYTHONCASEOK set (see the tests that already do this on macOS and Windows).

As for your patchset, I understand your intention, but Python's workflow simply doesn't work the way you want it to. The commit messages that go into version control are set by core developers on purpose to make sure they are formatted as expected and contain the appropriate information. For instance, while your commit messages are very detailed, we tend to askew long commit messages and go for succinct messages that explain the "why" something was done (a paragraph of explanation is itself rare). Your commit messages were also not formatted correctly, e.g. we always list the relevant issue that motivated a change first like "Issue #28670: ...". And lastly, we want commits to represent a single unit of semantic change when possible, so if your work made sense to break up into multiple patches then we would need to open multiple issues for each semantic change to be discussed in isolation and on their own merits. It also makes tracking what semantic changes broke something easier when running a bisection on commits to find the change that broke something (and thus easier to also back out instead of searching for every related commit because it spanned more than one).
msg280895 - (view) Author: Michael Witten (mfwitten) Date: 2016-11-15 21:58
Guess what?

    Linux can access HFS+ and NTFS volumes.

Firstly, how does that fit into your ideas for testing? It doesn't;
however, you'll note that my own brief analysis did attempt to wrestle
with this nuance.

Secondly, it was (and is) clearly asinine to conflate an operating
system with a particular file system; PEP 235 betrays the naive
ways of ancient thinkers---the spirit of the text of PEP 235
has never been completely implemented, and the result of this
naivete is a broken userspace *today*.

Here are the cases for my patch:

  * Non-POSIX platforms:         Nothing changes.

  * POSIX platforms:

      * PYTHONCASEOK set:        Nothing changes.

      * PYTHONCASEOK not set:    Almost nothing changes.
                                 -----------------------
                                 Accessing an insane file
                                 system now works just like
                                 on a Non-POSIX platform. In
                                 most cases, this won't change
                                 anything; yet, rare cases
                                 will *now* Just Work, rather
                                 than crapping out with some
                                 inscrutable error.

Where is your qualm?

As for the organization of patches, what I have presented (and
especially what I describe for a merge commit) not only meets
your stated goals, but *exceeds* them in every way.

Nevertheless, I would be willing to dumb down my submission if
it meant getting this bug fixed.

----------

It's rude to close abruptly an issue without even the implicit
consent of your collocutor, especially when the reasoning for
such an action is, once again, based explicitly on a startling
degree of willful ignorance and maybe even technical incompetence.
msg280941 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-16 12:52
Michael, please do not reopen feature requests that have been declared out of scope for maintenance releases.

Python 2.7 DOES NOT support filesystem semantics that differ from the "default" semantics for the host operating system. Even Python 3 requires that the filesystems attached be at least somewhat consistent (e.g. using a common filesystem encoding) if you want to use the default Python-level APIs.

If you wish to pursue this topic further, please take it to the issue tracker for your preferred Linux distribution and convince *them* that this is an operating system level bug in their system Python installation. If multiple Linux distributions agree that this is a bug that should be fixed rather than a new feature, and are willing to carry a patch to address it, then that will add weight to your argument that the change should be applied in an upstream maintenance release.

In the meantime, folks that really want the behaviour you propose may upgrade to Python 3, or potentially explore the importlib2 backport of the Python 3 import system to the Python 2 series.
msg280954 - (view) Author: Michael Witten (mfwitten) Date: 2016-11-16 14:04
* This is not a feature request;
  this is a bug fix for errant behavior.

  However, in the interest of civility, I have not re-opened this issue.

* > Python 2.7 DOES NOT support filesystem semantics that differ
  > from the "default" semantics for the host operating system.

  That statement makes no sense; the decisions being made here
  are based on nonsense.

* > folks that really want the behaviour you propose may upgrade to Python 3

  Ah. I'm getting the distinct impression that this is a political
  matter, rather than a technical matter.

  No wonder the responses have been incomprehensible.

  Here's an idea that echos your own: If you wish to pursue this outcome
  further, please take it to the issue tracker for your preferred Linux
  distribution and convince *them* that all the software they provide
  should be ported to Python 3. If multiple Linux distributions agree
  that it is no longer valuable to support Python 2, then that will add
  weight to your argument that there is little reason to fix its bugs,
  and that Python 3 is actually useful.
msg280961 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-16 15:14
We work closely with Linux distribution providers, and a number of us work *for* commercial Linux distributors (most notably Red Hat and Canonical).

Linux distributions have already agreed that Python 2.7 is in maintenance mode, and already influence the process of backporting any Python 3 changes that they consider sufficiently valuable to their users.

(For example, Red Hat has backported the SSL/TLS improvements to the system Python in RHEL 7 and to the Python 2.7 Software Collection after OpenStack developers at Rackspace convinced myself and others that the old behaviours were genuinely problematic)

Thus, is you want to convince us that this is a bug that actually needs fixing rather than a new feature request, you need to come up with a convincing explanation for why no Linux distribution (whether community run or commercial) has seen fit to report it as a bug at any point in the past 15 years.
msg280966 - (view) Author: Michael Witten (mfwitten) Date: 2016-11-16 16:54
* Bugs, by their very nature, are often obscure; some of the worst
  in history have lain dormant, unseen, for years or perhaps even
  decades.

  Unsurprisingly, then, this bug is also a corner case that would be
  unknowingly triggered in practice only rarely; consequently, it is
  unsurprising that it has not been reported previously, despite the
  fact that the mistake is OBVIOUS in a retrospective (and literal!)
  reading of PEP 235, which is clearly naive in its view of the world
  of computing systems.

* Furthermore, in any computing system that is sufficiently complex,
  there is usually a workaround for any particular bug, which thus
  diminishes the impetus to report the problem at all; this just
  compounds the obscurity of the bug.

* The more obscure a bug, the less compelling the ratio of the
  reward to the solution effort, particularly when the objections
  are mired not in technical analysis, but rather in an incomplete
  understanding of the report, as well as perhaps some kind of
  political puffery.

  So, why bother even beginning a discussion of the issue?

  Indeed, this correspondence has proven to me (once again,
  unfortunately) that it's usually an utter waste of resources to
  attempt to solve problems purely for the benefit of others. Let
  the confused gnash their teeth, and let the clever hack their
  own way out of trouble. As long as my personal itch has been
  scratched in some way, that's good enough.

* How is it that you did not perceive the irrelevance of [at least
  the rest of] your reply?
msg281021 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-17 03:31
Michael, you seem to be operating under the assumption that the fact that the CPython VM assumes by default that Linux systems use consistent filesystem semantics is an accidental oversight.

It's not - it's a pervasive simplifying assumption that runs throughout various parts of the interpreter design, mainly in the form of scoping various settings to "the machine" based on the detected operating system that are in reality specific to a particular filesystem within the machine.

That's the reason what you propose is a feature request for Python 2.7 rather than a bug fix - you've picked one particular instance of that general assumption and proposed not making it anymore.

However, you've also indicated that Python 3 (and hence presumably the `importlib2` backport of the Python 3 import system to Python 2) have already removed that assumption in this particular case.

Hence my suggestion to discuss your concerns with Linux distribution providers - if something has changed in the past few years to make them more concerned about it, and they aren't satisfied with resolving it through their current Python 3 migration efforts [1,2], the use of `importlib2`, or the use of other filesystem interface bindings (such as PyFilesystem, GObject introspection, or Qt), then that would make a difference in the disposition of the issue report.

[1] https://wiki.ubuntu.com/Python
[2] http://fedora.portingdb.xyz/
History
Date User Action Args
2022-04-11 14:58:39adminsetgithub: 72856
2016-11-17 03:31:32ncoghlansetmessages: + msg281021
2016-11-16 18:14:50brett.cannonsetnosy: - brett.cannon
2016-11-16 16:54:33mfwittensetmessages: + msg280966
2016-11-16 15:14:42ncoghlansetmessages: + msg280961
2016-11-16 14:04:40mfwittensettype: enhancement -> behavior
messages: + msg280954
2016-11-16 12:52:24ncoghlansetstatus: open -> closed
type: behavior -> enhancement
messages: + msg280941

resolution: rejected
stage: test needed -> resolved
2016-11-15 21:58:58mfwittensetstatus: closed -> open
resolution: rejected -> (no value)
messages: + msg280895
2016-11-15 19:47:20brett.cannonsetstatus: open -> closed
resolution: rejected
messages: + msg280881
2016-11-15 05:49:15mfwittensetmessages: + msg280825
2016-11-14 22:35:43brett.cannonsetstage: test needed
2016-11-14 22:35:16brett.cannonsetmessages: + msg280816
2016-11-11 22:04:05mfwittensetfiles: + pep-235-on-posix.patch
keywords: + patch
messages: + msg280617
2016-11-11 21:18:31mfwittencreate