classification
Title: os module: use keyword-only arguments for dir_fd and nofollow to reduce function count
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: larry Nosy List: Arfrever, Yury.Selivanov, benjamin.peterson, georg.brandl, gvanrossum, larry, neologix, pitrou, python-dev, r.david.murray, rosslagerwall, serhiy.storchaka
Priority: release blocker Keywords: patch

Created on 2012-04-19 23:52 by larry, last changed 2012-06-23 20:41 by georg.brandl. This issue is now closed.

Files
File name Uploaded Description Edit
larry.os.keyword.arguments.collapse.1.diff larry, 2012-05-26 04:54 review
larry.os.keyword.arguments.collapse.2.diff larry, 2012-06-04 16:15 review
larry.os.keyword.arguments.collapse.3.diff larry, 2012-06-08 01:30 My third patch implementing all this mess. review
larry.os.keyword.arguments.collapse.4.diff larry, 2012-06-21 08:23 Fourth patch. review
larry.os.keyword.arguments.collapse.5.diff larry, 2012-06-22 10:47 Fifth patch. review
Messages (31)
msg158777 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-04-19 23:52
There are some functions in the os module that do much the same thing but differ only in minor ways, like

* whether or not they follow symbolic links (stat vs lstat)
* taking an extra "dir_fd" parameter (chmod vs fchmodat(3.3))

It would be better if we consolidated these variations into one function which took keyword-only parameters that allow access to the alternate functionality.

Here is a sample of the new possible function signatures, as originally proposed by Serhiy Storchaka:

access(path, mode, *, followlinks=True, dirfd=None, eaccess=False)
chmod(path, mode, *, followlinks=True, dirfd=None)
chown(path, uid, gid, *, followlinks=True, dirfd=None)
link(srcpath, dstpath, *, followlinks=True, srcdirfd=None, dstdirfd=None)
mkdir(path, mode=0o777, *, dirfd=None)
mknod(path, mode=0o600, device=0, *, dirfd=None)
open(path, flag, mode=0o777, *, dirfd=None)
readlink(path, *, dirfd=None)
rename(oldpath, newpath, *, olddirfd=None, newdirfd=None)
stat(path, *, followlinks=True, dirfd=None)
symlink(src, dst, *, dirfd=None)
unlink(path, *, removedir=False, dirfd=None)
utimes(path[, (atime, mtime)], *, ns=False, dirfd=None)
mkfifoat(path, mode=0o666, *, followlinks=True, dirfd=None)

My opinion about naming: PEP 8 suggests underscores to separate words in non-class identifiers.  So I'd spell those "dir_fd", "src_dir_fd" (etc), "remove_dir", and "follow_symlinks".

(While thinking about this, I remembered the infrequently-cited maxim "no boolean parameters".  But that's more a guideline than a rule, and it tends to be a complaint about how the meaning of the parameter is unclear at a call site.  As these will be keyword-only parameters I think their meanings will be perfectly clear, so I shan't worry about it.)
msg158779 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-04-19 23:56
storchaka: sorry for the long delay, somehow I missed your reply in python-ideas.)

You said you envision this as a big patch.  Could I convince you to try and make a series of smaller patches?  It should be easy to break up into small pieces--do one patch adding dir_fd, the next with follow_symlinks, etc.
msg158788 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-04-20 00:40
I think it most (all?) of these cases, it's better to mirror the os level APIs, since many people already know them. Such fanciness is better left to high level wrappers (like path objects).
msg158795 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-04-20 04:45
No, Guido's boolean keyword dislike is not about things being unclear at the call site.  It's about boolean parameters instead of separate functions.  That is (I believe) he prefers lstat/stat to having stat take a boolean keyword, keyword-only or not.

Did he weigh in on the python-ideas thread?
msg158796 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-04-20 04:55
r.david.murray said:
> No, Guido's boolean keyword dislike is not about things
> being unclear at the call site.

I wasn't referring to Guido specifically, just general code smell complaints about boolean parameters.

> That is (I believe) he prefers lstat/stat to having stat take
> a boolean keyword, keyword-only or not.
>
> Did he weigh in on the python-ideas thread?

Not per se.  But I ran into him when he was leaving PyCon 2012 for good (second night of the sprints iirc), and he steered me to the original python-ideas thread and asked me to follow up on it / work with the poster.  (Probably because of my doing the nanosecond-precision os.stat / utime work.)  If there's genuine opposition to the patch being generated here I'll ping him.

Oh, btw storchaka, in the email thread you asked

> how the user code will check the availability of
> dirfd functionality. Special global flags in os or sys?

No--it'll just be part of a release, and you'll check which version of Python 3 you have before using it.

    if sys.version_info.major >= 3 or sys.version_info.minor >= 3:
       # use extra flags

p.s. http://mail.python.org/pipermail/python-ideas/2012-March/014482.html
msg158799 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-20 06:35
Thank you, Larry. I was going to do it, but got stuck with other things.

The main objective of this proposal is to get rid of litter os module by dozen rarely popular and non-portable functions (introduced by issue4761). Moreover, the descriptions are given in alphabetical order and related functions are located far from each other.

> So I'd spell those "dir_fd", "src_dir_fd" (etc), "remove_dir", and "follow_symlinks".

I follow the common style. `followlinks` is already used in other functions.

> No--it'll just be part of a release, and you'll check which version of Python 3 you have before using it.

Presence of function depends not only on the version, but also from the platform.

Benjamin, therefore I believe it is critically important to do this work before the release of Python 3.3. To people and have not started to use the new features. Otherwise, get rid of them will be more difficult. But I have nothing against to put "at"-functions in a separate submodule (os.posix?).

David, let lstat remains. fstatat include functionality both stat and lstat (but has a different interface). I suggest to unite fstatat and stat, while maintaining backward compatibility and using a more user-friendly interface.

In the C extension of the functions is impossible, that is why were introduced new functions with other names.
msg158801 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-20 06:52
Before starting to code, it is necessary to solve the problem of interface.

With the majority of the functions all is good, but the `link` and `rename` have two `dirfd` parameters (even with different names). So I suggest two more alternatives.

One is the filename and dirfd are combined in a tuple. Instead of a tuple, you can specify only the name.

  link ((srcpath, srcdirfd), (dstpath, dstdirfd), *, followlinks=True)

The other -- `dirfd`s are combined in a tuple. You can specify a number, then `dirfd` is the same for both filenames.

  link (srcpath, dstpath, *, followlinks=True, dirfd=(srcdirfd, dstdirfd))

Which of these options (or the original, with different keywords) is preferable?
msg158802 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-04-20 07:40
It's true that, for example, dir_fd parameters won't work on Windows.  The solution is to always accept the parameters and throw NotImplementedError on platforms where the functionality isn't available.

Here are my thoughts on the interface for link().  Since the two dir_fd parameters are independent--you might specify none, one, or both--I think the dir_fd=(src,dst) form would be obnoxious.  And polymorphic parameters (accept a string or a tuple of string and fd) are nearly always a bad idea; the % operator on strings is a good example of what can go wrong.  So I think you should stick with the original interface, with independent explicit keyword arguments.

I'd prefer that we did this everywhere it made sense, including adding a follow_symlinks parameter to stat().  But obviously you should prioritize places where we want to get rid of functions that are only in trunk (3.3) so far.

I suppose there's precedent for "followlinks"; it's very old, predating PEP 8.  Personally I'd overlook it if I were doing the implementation and spell the new parameters "follow_symlinks"--or at least I'd try it and see what the community thought.  Anyway, there's no established precedent for "dir_fd" and "remove_dir".  So I'd certainly prefer PEP 8 spellings for those.
msg159518 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-04-28 10:38
Here my first stab at a comprehensive proposal.  Each section represents a specific new function argument, and a list of functions that the argument be added to.

All new arguments are keyword-only and optional.

All functions mentioned are in the os module.
_______________________________________

This annotation:
		foo             [X bar]
means "if we add this argument to function foo, we can remove function bar".  This is because bar is a new function only in trunk and has no installed base.

This annotation:
		foo             [- bar]
means "if we add this argument to function foo, we could theoretically start deprecating function bar".  bar shipped in a previous version of Python and we can't simply remove it.

However!  I am *not* proposing doing *any* of these deprecations--I suspect the right thing to do is to leave all the existing functions in.

_______________________________________
follow_symlinks=bool (default True)

Allow functions to either follow symlinks (the default) or examine symlinks.

	access              [X faccessat]
	chflags             [- lchflags]
	chmod               [- lchmod]
	chown               [- lchown]
	getxattr            [X lgetxattr]
	link                [X linkat]
	listxattr           [X llistxattr]
	removexattr         [X lremovexattr]
	setxattr            [X lsetxattr]
	stat                [- lstat]
	utime               [X lutimes]
_______________________________________
effective_ids=bool (default False)

For functions that take the AT_EACCESS flag or otherwise operate on effective uid/gid.

	access              [X faccessat]
	    (this also lets us skip ever adding euidaccess!)
_______________________________________
dir_fd=int (default None)

For functions that take a "dir_fd" parameter instead of / in addition to a "path" parameter.

	access              [X faccessat]
	chmod               [X fchmodat]
	chown               [X fchownat]
	getxattr            [X fgetxattr]
	link                [X linkat] (note! need two parameters here.)
	mkdir               [X mkdirat]
	mkfifo              [X mkfifoat]
	mknod               [X mknodat]
	open                [X openat]
	readlink            [X readlinkat]
	rename              [X renameat]
	stat                [X fstatat]
	symlink             [X symlinkat]
	unlink              [X unlinkat]
	utime               [X futimesat utimensat]
_______________________________________
fd=int (default None)

For functions that take a "path" parameter and have an "f"-equivalent that take an "fd" instead.  The "path" parameter and "fd" parameters would be exclusive; you must specify exactly one, never both.  Both parameters would accept "None" as equivalent to being unspecified.  For the functions that only take one parameter (chdir, listdir, stat, statvfs) I propose we give that parameter a default of None.

	chdir               [- fchmod]
	chown               [- fchown]
	execve              [X fexecve]
	getxattr            [X fgetxattr]
	listdir             [X flistdir]
	listxattr           [X flistxattr]
	removexattr         [X fremovexattr]
	setxattr            [X fsetxattr]
	stat                [- fstat]
	statvfs             [- fstatvfs]
	utime               [X futimes futimens]

_______________________________________
remove_dir=bool (default False)

Allows unlink to behave like rmdir.

	unlink              [X unlinkat] [- rmdir]
_______________________________________


One question:

If we do as I propose, and dir_fd is always a parameter to a pre-existing function, can we drop support for AT_FDCWD?  It seems to me that
    funlink("../foo", dir_fd=AT_FDCWD)
is equivalent to
    unlink("../foo")
but I fear I'm missing something important.
msg159626 - (view) Author: Charles-Fran├žois Natali (neologix) * (Python committer) Date: 2012-04-29 16:35
Well, it always boils down to the same problem: should we offer thin wrappers around underlying syscalls or offer higher-level functions?
I personally think that offering mere wrappers around syscalls doesn't make much sense: Python is a very-high level language, and so should its library be.

Now, I'm in favor of adding optional argument to tune the behavior with regard to symlinks, however I'm skeptical about merging regular syscalls and *at() syscalls, for the following reasons:
- *at() syscalls are not portable: if it's not supported, what's remove(path='toto', dir_fd=fd) supposed to do? Throw an exception?
- it's not clear how one's supposed to check that the functions are available: right now, one can do
if hasattr(os, 'unlinkat'):
    # use unlinkat()
else:
    # fallback to unlink()
How would that work with dir_fd argument?
- some functions have actually really different prototypes:
for example:
rename(<old path>, <new path>) -> renameat(<old dir_fd>, <old path>, <new dir_fd>, <new path>)
  trying to coalesce them into a single function will lead to awkward and ugly API: some arguments will be exclusive (which is almost always a sign of a bad API), and the order sounds really wrong:

stat(path, *, followlinks=True, dirfd=None)
is backwards, it should be
stat(dirfd, path, *, followlinks=True)

since, `path` is relative to `dirfd`.

Actually, the proper way to do this would be to use overloading, but Python doesn't support it (and even if it did, I think overloading would probably be a bad idea anyway).
- taking for a second the point of view of a user that has never heard of the *at() family, if I come across such a function:
chmod(path, mode, *, followlinks=True, dir_fd=None)
I'll probably have to think some time to understand what the hell is the last dir_fd argument. Renaming, removing a file is simple and direct, so should the API be.
- finally, although useful, the *at() family is a really poor and confusing API, which will likely be used by less than 0.1% of the overall Python users. Complicating the whole posix module API just to shave off a few functions is IMHO a bad idea.
msg159772 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-02 03:00
> I personally think that offering mere wrappers around syscalls doesn't
> make much sense: Python is a very-high level language, and so should
> its library be.

I very much agree.  I suggest anyone who thinks the os module is a thin
veneer over syscalls needs to examine the Win32 implementation of os.stat.


> - *at() syscalls are not portable: if it's not supported, what's
> remove(path='toto', dir_fd=fd) supposed to do? Throw an exception?

NotImplementedError.


> - it's not clear how one's supposed to check that the functions are
> available: right now, one can do [a hasattr check]
> How would that work with dir_fd argument?

try/except.  If someone (maybe even me) championed PEP 362 (Function
Signature Object, introspection on functions) then we could LBYL,
but that's only a theory right now.


>   stat(path, *, followlinks=True, dirfd=None)
> is backwards, it should be
>   stat(dirfd, path, *, followlinks=True)
> since, `path` is relative to `dirfd`.

You could hypothetically rearrange all the arguments
at the call site by using 100% keyword arguments:
  stat(dir_fd=whatever, path=thingy, follow_symlinks=True)

I admit it's unlikely anyone will bother.


> Actually, the proper way to do this would be to use overloading, but
> Python doesn't support it (and even if it did, I think overloading
> would probably be a bad idea anyway).

Speaking *purely hypothetically*, we could actually overload it.  It'd
be at runtime, as what in Python is not.  Since stat is implemented in
C it would look like:

  if (-1 == PyArg_ParseTupleAndKeyword(arguments-with-dir_fd)) {
    PyErr_Clear();
    if (-1 == PyArg_ParseTupleAndKeyword(arguments-without-dir_fd)) {
      /* etc. */

However, Python doesn't have overloading because we shouldn't do it.
I'm against overloading here, for the same reason that I'm against
argument polymorphism (fold faccess into access by allowing the first
argument to be either an int or a string).  This sort of polymorphism
leads to lack of clarity and awkward problems.


> I'll probably have to think some time to understand what the hell is
> the last dir_fd argument. Renaming, removing a file is simple and
> direct, so should the API be.

I counter with TOOWWTDI.  There are currently four chmod functions in
os: chmod, fchmod, fchmodat, and lchmod.  If you want to change the
mode of a file, which do you use?

You may also face the problem that you don't want chmod to follow
symbolic links, yet you've never heard of lchmod.

I feel the sad truth of the situation is, it is desirable that Python
support all this functionality, but there is no simple and
obviously-right way to do so.  So which alternative is less undesirable:
a combinatoric explosion of functions, or a combinatoric explosion of
arguments to existing functions?  I suggest the latter is less
undesirable because it is marginally clearer: at least you always
know which function to call, and the documentation on how to use it
in its multiple forms will be all in one place.

(Though this would not help the hapless programmer who suspected maybe
there *was* a second function somewhere, searching forever for a
function that doesn't actually exist.)

Also, I suggest that the programmer reading the documentation for os
would see the same parameters (dir_fd, follow_symlinks) as part of many
different functions and would quickly learn their semantics.


> - finally, although useful, the *at() family is a really poor and
> confusing API, which will likely be used by less than 0.1% of the
> overall Python users. Complicating the whole posix module API just
> to shave off a few functions is IMHO a bad idea.

I understand your critique.  In defense of the proposal, I will say that
when Guido asked me to connect with the storchaka and try to shepherd
the process, he specifically cited the *at() functions as examples of
new functions that could be obviated by folding the functionality
into the existing APIs.

Not that I claim this gives me carte blanche to check an implementation
in.  Simply that I believe Guido liked the idea in the abstract.  It
could be that he wouldn't like the result, or that my proposal (which
purposely carried out the idea to its logical extreme) goes too far.

In the absence of a BDFL ruling on the proposal, I think I should
produce a patch implementing the full proposal, then bring it up in
c.l.p-d and get a community vote.  How's that sound?
msg161646 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-26 04:54
Here's my first pass at a patch.  For this patch,
I took the proposal to its logical extreme: I removed
every function in os that was both mildly redundant
with an existing function *and* has been added since
3.2, and moved that functionality to the equivalent
existing function, making it accessible with the use
of keyword-only parameters.

Specifically:

This function has been removed, and instead
|              this parameter has been added to
|              |                this function
|              |                |
v              v                v
-------------------------------------
faccessat      dir_fd           access
faccessat      effective_ids    access
faccessat      follow_symlinks  access
fchmodat       dir_fd           chmod
fchmodat       follow_symlinks  chmod
fchownat       dir_fd           chown
fchownat       follow_symlinks  chown
fexecve        fd               execve
fgetxattr      fd               getxattr
flistdir       fd               listdir
flistxattr     fd               listxattr
fremovexattr   fd               removexattr
fsetxattr      fd               setxattr
fstatat        dir_fd           stat
futimens       fd               utime
futimes        fd               utime
futimesat      dir_fd           utime
lgetxattr      follow_symlinks  getxattr
linkat         dst_dir_fd       link
linkat         src_dir_fd       link
linkat         follow_symlinks  link
llistxattr     follow_symlinks  listxattr
lremovexattr   follow_symlinks  removexattr
lsetxattr      follow_symlinks  setxattr
lutimes        follow_symlinks  utime
mkdirat        dir_fd           mkdir
mkfifoat       dir_fd           mkfifoat
mknodat        dir_fd           mknod
open           dir_fd           openat
readlinkat     dir_fd           readlink
renameat       dst_dir_fd       rename
renameat       src_dir_fd       rename
symlinkat      dir_fd           symlink
unlinkat       dir_fd           unlink
unlinkat       remove_directory unlink
utimensat      dir_fd           utime
utimensat      follow_symlinks  utime

Additionally, we *could* deprecate this function,
|              as I have added this parameter
|              |                to this function:
|              |                |
v              v                v
--------------------------------------
fchdir         fd               chdir
fchmod         fd               chmod
fstat          fd               stat
fstatvfs       fd               statvfs
lchflags       follow_symlinks  chflags
lchmod         follow_symlinks  chmod
fchown         fd               chown
lchown         follow_symlinks  chown
lstat          follow_symlinks  stat

I doubt we'll ever deprecate those functions.
This patch does not deprecate those functions.
I don't propose deprecating those functions.


Notes:

* What do you do on platforms where the functionality isn't available?
  I believe it's better to always accept parameters, but throw a
  NotImplementedError if the functionality they represent is
  unavailable on the current platform.  Adding and removing
  parameters based on the current platform... that way lies madness.
  (It's like scrollbars that pop in and out of existance depending
  on whether or not you need them.  Better that the scrollbars are
  always there, and simply disabled when the content fits in the
  current window.  Users like a stable foundation under their feet.)

* The patch is... pretty big.  But you can divide-and-conquer it
  into a series of self-contained parts.  First I add path_converter,
  then I modify existing functions / remove new functions.  Each of
  those can be reviewed in isolation.

  Also, the new implementations generally follow the same pattern:
    initialize
    call PyArg_ParseTupleAndKeywords
    error out early if user asks for functionality
      unavailable on the current platform
    ensure that combinations of parameters
      (dir_fd, fd, follow_symlinks) are permitted
    do actual work, turning on/off advanced functionality
      based on configure ifdefs (HAVE_FCHOWNAT etc)
    if error, raise exception
    compute return value
    exit:
    cleanup all path variables
    return return_value

  Here's a list of all the functions I added arguments to:
    access chdir chflags chmod chown execve getxattr link listdir
    listxattr mkdir mkfifo mknod open readlink removexattr rename
    setxattr stat statvfs symlink unlink utime

* The one new bit of technology: a PyArg_ParseTuple "converter"
  function called path_converter.  Here's its documentation:

  /*
   * A PyArg_ParseTuple "converter" function
   * that handles filesystem paths in the manner
   * preferred by the os module.
   *
   * path_converter accepts (Unicode) strings and their
   * subclasses, and bytes and their subclasses.  What
   * it does with the argument depends on the platform:
   *
   *   * On Windows, if we get a (Unicode) string we
   *     extract the wchar_t * and return it; if we get
   *     bytes we extract the char * and return that.
   *
   *   * On all other platforms, strings are encoded
   *     to bytes using PyUnicode_FSConverter, then we
   *     extract the char * from the bytes object and
   *     return that.
   *
   * Input fields:
   *   path.nullable
   *     If nonzero, the path is permitted to be None.
   *   path.function_name
   *     If non-NULL, path_converter will use that as the name
   *     of the function in error messages.
   *     (If path.argument_name is NULL it omits the function name.)
   *   path.argument_name
   *     If non-NULL, path_converter will use that as the name
   *     of the parameter in error messages.
   *     (If path.argument_name is NULL it uses "path".)
   *
   * Output fields:
   *   path.wide
   *     Points to the path if it was expressed as Unicode
   *     and was not encoded.  (Only used on Windows.)
   *   path.narrow
   *     Points to the path if it was expressed as bytes,
   *     or it was Unicode and was encoded to bytes.
   *   path.length
   *     The length of the path in characters.
   *   path.object
   *     The original object passed in.
   *   path.cleanup
   *     For internal use only.  May point to a temporary object.
   *     (Pay no attention to the man behind the curtain.)
   *
   *   At most one of path.wide or path.narrow will be non-NULL.
   *   If path was None and path.nullable was set,
   *     both path.wide and path.narrow will be NULL,
   *     and path.length will be 0.
   *   
   *   path_converter takes care to not write to the path_t
   *   unless it's successful.  However it must reset the
   *   "cleanup" field each time it's called.
   *
   * Use as follows:
   *      path_t path;
   *      memset(&path, 0, sizeof(path));
   *      PyArg_ParseTuple(args, "O&", path_converter, &path);
   *      // ... use values from path ...
   *      path_cleanup(&path);
   *
   * (Note that if PyArg_Parse fails you don't need to call
   * path_cleanup().  However it is safe to do so.)
   */
  typedef struct {
      char *function_name;
      char *argument_name;
      int nullable;
      wchar_t *wide;
      char *narrow;
      Py_ssize_t length;
      PyObject *object;
      PyObject *cleanup;
  } path_t;

  I assert path_converter is Very Useful.  It nearly always
  reduced the argument processing from three code paths
  (two for Windows, one for everyone else) to one.  Even if
  the rest of the patch isn't accepted I'm sure we'll keep
  path_converter.

* In a lot of places I combined together several functions, or
  several large blobs of #ifdef'd code, into one large
  slightly-shaggy function.  Like, Windows code is now often
  streamlined in, instead of being a big separate #ifdef.  And
  instead of three passes at decoding arguments (one for Windows
  wide, one for Windows narrow, one for everyone else) there is
  generally just one.  (Thanks, path_converter!)  Again, even
  if we don't keep the extra keyword arguments / functionality,
  I'm hoping at least some of that cleanup will survive.

* utime is the whale.  It obviates *five* other functions
  (futimens, futimes, futimesat, lutimes, and utimensat).
  I spent a fair amount of time refactoring it.  It's so complex,
  it was hard to make it anything like readable.  I like to
  think I reached a local maximum of readability given what I had
  to work with, and I like to think that the resulting dog's
  breakfast is less unreadable than the previous iteration.  Anyway
  I'm definitely open to suggestions on how to restructure it
  further to enhance readability.

* The tests used to use e.g. hasattr(os, 'futimes') and the like to
  determine what functionality was available.  In order to LBYL,
  I had to resort to e.g. sysconfig.get_config_var('HAVE_FUTIMES').
  If users need to LBYL here too, well, it's kind of a problem.
  (I hope function signature objects solve this--we're talking
  about it.)

* On a very related topic: os.openat() is one of the functions I
  removed.  And then we have this in Lib/os.py:
      if _exists('openat'):
          def fwalk(...)
  Since I removed os.openat completely, we could longer tell
  whether or not the functionality is available.  And I couldn't
  use my sysconfig.get_config_var trick because it's os.py and
  we're bootstrapping the interpreter.  So I went with a dirty
  hack--uh, I mean, practicality beats purity.  posixmodule.c
  conditionally exposes a symbol called _HAVE_OPENAT if openat()
  is available, and fwalk() is now gated on that.

* A minor semantic change: symlink() now accepts the same arguments
  everywhere.  Previously it had an extra optional argument only on
  Windows ("target_is_directory").  Non-Windows now accepts that too
  and ignores it, in much the same way that os.mkdir ignores the mode
  on Windows.  Also, os.symlink now accepts byte strings for the
  paths on Windows.  (Even though we'll remove byte string paths
  in 3.4, yes?)

* The patch is still sloppy:
    * The docstrings are incomplete.
    * I haven't touched the docs at all.
    * There are > 80 col lines.
  Clearly I'll clean up this stuff if the patch has a shot at
  going in.

I do think it's an improvement.  But I won't check any part of it
in without some consensus (or BFDL ruling).

I look forward to your feedback!
msg162271 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-06-04 16:15
Second pass at my patch.  Incorporates suggestions from Serhiy's review--thanks, Serhiy!

Still not ready for checkin.  > 80 col lines, no docs, docstrings are messy.  But code is ready for (further) review.  Code passes regression test suite without errors.
msg162274 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-06-04 16:52
Well, I'm going to ignore the long lines and documentation.  The patch is 
really big and impressive.
msg162275 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-06-04 16:54
I'm not sure that "long" and "impressive" are words that go together when describing a patch ;-)
msg162514 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-06-08 01:30
Here's a nice fresh minor update.

Notes on this third patch:

* The docstrings are now done.

* I discovered that fchmodat() doesn't actually support
  AT_SYMLINK_NOFOLLOW!  glibc documents using the flag, then
  comically notes that it doesn't actually work.  Specifying
  it results in ENOTSUP every time.  os.chmod() now accomodates
  this; it throws NotImplementedError if it detects this situation.
  However it should also Just Work once glibc supports it.

* While editing the docstrings, I noticed that several of them
  had old-style octal constants ;-)  I fixed 'em.

* I added support for the remove_dir argument to unlink/remove
  on Windows.  I also made it work on non-Windows even when
  unlinkat() is not available.
msg162548 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-06-08 19:02
BTW: If PEP 362 is accepted, and this patch makes it for 3.3 (both of which I think will happen), I'll hand-code signatures for the functions that may throw NotImplementedError so users can use "is_implemented" to LBYL.
msg162553 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2012-06-09 02:46
I haven't read the code, but from Larry's description this looks great to me. It's amazing how many extra functions were added to the os module since 3.2! I also agree that the redundant functions that existed in 3.2 should stay and I don't see it's fair to deprecate them. I do hope that not too many people have written code based on the 3.3 alphas using all those extra functions, but I suppose they will get what they paid for.

Everything else Larry wrote also sounds reasonable to me.
msg162554 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * Date: 2012-06-09 02:52
Previously existing redundant functions could be deprecated in documentation.
msg162555 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-06-09 04:23
> Previously existing redundant functions could be
> deprecated in documentation.

As in, don't start a "deprecation cycle" (warning in 3.3, deprecated in 3.4, gone in 3.5), just document "consider using this other function instead"?  That's probably worth doing.  I wouldn't use the word "deprecated" though, I'd just suggest a "see also".

Maybe we could remove the redundant functions in 4.0.  I'll put it on my wishlist :)
msg163311 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-06-21 08:23
New patch!  What's new:

* The big change: I removed the "fd=" parameters.  Now, functions that
  accept either a path or a file descriptor simply take either as their
  "path" argument.  I ran it by Guido and he thought it sounded fine,
  so I tried it--and I think it's a definite improvement.  (Thanks to
  Jim Jewett for suggesting it--tbh I'd considered it before, but
  looking at it through fresh eyes helped!)

* Also new in this patch: you can now LBYL for the fd, dir_fd, and
  follow_symlinks parameters.  Just check to see if the function is
  in os.supports_<name of parameter>.  For example:
     if os.chown in os.supports_dir_fd:
         os.chown(path, dir_fd=whatnot)

* The third big bit of news: the patch works under Windows!

* I attempted to support Mac OS X 10.3, specifically the weak linking
  to statvfs, fstatvfs, and lchown.  However I don't have a Mac (much
  less one running 10.3) so I can't test this.

I *think* the docstrings are all fixed.  The only thing I know that
needs to be done are the docs (and Misc/NEWS).

I really wanna get this in before the feature freeze.  I promise to
support it through the betas... can I puh-leez check it in?
msg163393 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-06-22 10:47
Fifth iteration of my patch.  Everything is done, and I really think it's ready to be checked in.

* The documentation is done, including Misc/NEWS.
* All the code is now < 80 columns.
* The docstrings have been double- and triple-checked.
* It passes all regression tests on Linux 64-bit; it processes the
  regression test suite identically to trunk on Windows 32-bit.
msg163464 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-06-22 19:22
> > http://bugs.python.org/review/14626/diff/5182/Doc/library/os.rst#newcode1210
> Doc/library/os.rst:1210: using it will raise a
> :exc:`NotImplementedError`.
> Same as above: would it make sense to ignore the arg if follow_symlinks
> is not supported, but the fact is irrelevant because the system doesn't
> support symlinks at all?

And what if the system supports symlink and doesn't support
at-functions, follow_symlinks=False and file is not symlink. Should it
be NotImplementedError?
msg163503 - (view) Author: Roundup Robot (python-dev) Date: 2012-06-22 23:30
New changeset 27f9c26fdd8b by Larry Hastings in branch 'default':
Issue #14626: Large refactoring of functions / parameters in the os module.
http://hg.python.org/cpython/rev/27f9c26fdd8b
msg163506 - (view) Author: Roundup Robot (python-dev) Date: 2012-06-23 00:02
New changeset 04fd8f77a58e by Larry Hastings in branch 'default':
Issue #14626: Fix buildbot issues on FreeBSD (AMD64).  (Fingers crossed.)
http://hg.python.org/cpython/rev/04fd8f77a58e
msg163507 - (view) Author: Roundup Robot (python-dev) Date: 2012-06-23 00:07
New changeset e1e0eeb07398 by Larry Hastings in branch 'default':
Issue #14626: Fix buildbot issue on x86 Tiger 3.x.
http://hg.python.org/cpython/rev/e1e0eeb07398
msg163525 - (view) Author: Roundup Robot (python-dev) Date: 2012-06-23 02:51
New changeset 66f7377547d5 by Larry Hastings in branch 'default':
Issue #14626: Fix buildbot issue on OpenIndiana 3.x machines.  (Hopefully.)
http://hg.python.org/cpython/rev/66f7377547d5
msg163576 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-06-23 10:37
27f9c26fdd8b broke test_shutil on the Windows buildbots:


======================================================================
FAIL: test_basic (test.test_shutil.TestWhich)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1146, in test_basic
    self.assertEqual(rv, self.temp_file.name)
AssertionError: None != 'c:\\users\\db3l\\appdata\\local\\temp\\tmpxqw4gu\\tmp7ugfmm.exe'

======================================================================
FAIL: test_full_path_short_circuit (test.test_shutil.TestWhich)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1152, in test_full_path_short_circuit
    self.assertEqual(self.temp_file.name, rv)
AssertionError: 'c:\\users\\db3l\\appdata\\local\\temp\\tmpmwer14\\tmpeacfbz.exe' != None

======================================================================
FAIL: test_non_matching_mode (test.test_shutil.TestWhich)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1158, in test_non_matching_mode
    self.assertIsNone(rv)
AssertionError: 'c:\\users\\db3l\\appdata\\local\\temp\\tmp7n6ojp\\tmp5tt9pa.exe' is not None

======================================================================
FAIL: test_pathext_checking (test.test_shutil.TestWhich)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1181, in test_pathext_checking
    self.assertEqual(self.temp_file.name, rv)
AssertionError: 'c:\\users\\db3l\\appdata\\local\\temp\\tmpipmbe3\\tmpx43hex.exe' != None

======================================================================
FAIL: test_relative (test.test_shutil.TestWhich)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1166, in test_relative
    self.assertEqual(rv, os.path.join(tail_dir, self.file))
AssertionError: None != 'tmpcluw7l\\tmp6sy_py.exe'
msg163643 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-06-23 16:49
The `rmdir` argument to unlink() looks completely crazy to me. Instead rmdir() should be called when one wants to rmdir() a fd.
msg163648 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-06-23 17:15
I agree with Antoine.  I've opened #15154 to track this.
msg163669 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-06-23 20:41
The Windows buildbots are now content; closing.
History
Date User Action Args
2012-06-23 20:41:18georg.brandlsetstatus: open -> closed
resolution: fixed
messages: + msg163669
2012-06-23 17:15:20georg.brandlsetnosy: + georg.brandl
messages: + msg163648
2012-06-23 16:49:17pitrousetmessages: + msg163643
2012-06-23 10:37:53pitrousetpriority: normal -> release blocker
nosy: + pitrou
messages: + msg163576

2012-06-23 02:51:05python-devsetmessages: + msg163525
2012-06-23 00:07:36python-devsetmessages: + msg163507
2012-06-23 00:02:14python-devsetmessages: + msg163506
2012-06-22 23:30:51python-devsetnosy: + python-dev
messages: + msg163503
2012-06-22 19:22:52serhiy.storchakasetmessages: + msg163464
2012-06-22 10:48:21larrysetfiles: + larry.os.keyword.arguments.collapse.5.diff

messages: + msg163393
2012-06-21 08:24:00larrysetfiles: + larry.os.keyword.arguments.collapse.4.diff

messages: + msg163311
2012-06-09 04:23:42larrysetmessages: + msg162555
2012-06-09 02:52:06Arfreversetmessages: + msg162554
2012-06-09 02:46:40gvanrossumsetnosy: + gvanrossum
messages: + msg162553
2012-06-08 19:02:41larrysetmessages: + msg162548
2012-06-08 01:30:42larrysetfiles: + larry.os.keyword.arguments.collapse.3.diff

messages: + msg162514
2012-06-06 16:24:37Yury.Selivanovsetnosy: + Yury.Selivanov
2012-06-04 16:54:24larrysetmessages: + msg162275
2012-06-04 16:52:49serhiy.storchakasetmessages: + msg162274
2012-06-04 16:15:46larrysetfiles: + larry.os.keyword.arguments.collapse.2.diff

messages: + msg162271
2012-05-26 04:54:45larrysetfiles: + larry.os.keyword.arguments.collapse.1.diff
messages: + msg161646

assignee: larry
keywords: + patch
stage: patch review
2012-05-16 20:09:31rosslagerwallsetnosy: + rosslagerwall
2012-05-02 03:00:43larrysetmessages: + msg159772
2012-04-29 16:35:17neologixsetmessages: + msg159626
2012-04-28 14:36:58pitrousetnosy: + neologix
2012-04-28 10:38:40larrysetmessages: + msg159518
2012-04-20 07:40:18larrysetmessages: + msg158802
2012-04-20 06:52:54serhiy.storchakasetmessages: + msg158801
2012-04-20 06:35:47serhiy.storchakasetmessages: + msg158799
components: + Library (Lib)
versions: + Python 3.3
2012-04-20 04:55:03larrysetmessages: + msg158796
2012-04-20 04:45:47r.david.murraysettype: enhancement

messages: + msg158795
nosy: + r.david.murray
2012-04-20 02:31:59Arfreversetnosy: + Arfrever
2012-04-20 00:40:33benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg158788
2012-04-19 23:56:58larrysetmessages: + msg158779
2012-04-19 23:54:18larrysetnosy: + serhiy.storchaka
2012-04-19 23:52:38larrycreate