classification
Title: Support errors with two filenames for errno exceptions
Type: enhancement Stage: resolved
Components: Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: larry Nosy List: Arfrever, georg.brandl, haypo, larry, python-dev, r.david.murray, richard, serhiy.storchaka, vajrasky
Priority: normal Keywords: patch

Created on 2014-02-05 03:43 by larry, last changed 2014-03-13 18:44 by larry. This issue is now closed.

Files
File name Uploaded Description Edit
larry.oserror.add.filename2.1.diff larry, 2014-02-09 09:09 review
larry.oserror.add.filename2.2.diff larry, 2014-02-09 09:45 review
larry.oserror.add.filename2.3.diff larry, 2014-02-09 15:47 review
larry.oserror.add.filename2.4.diff larry, 2014-02-09 18:27 review
larry.oserror.add.filename2.5.diff larry, 2014-02-09 19:43 review
larry.oserror.add.filename2.6.diff larry, 2014-02-10 01:45 review
larry.oserror.remove.with.filenames.etc.1.diff larry, 2014-02-10 11:38 review
Messages (34)
msg210290 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-05 03:43
There are a bunch of functions provided by Python, e.g. PyErr_SetFromErrnoWithFilenameObject(), that allow specifying a filename associated with the error.  But there are some errors that really need two filenames, like copy(), symlink(), and rename().  The error could be on only one file, but some errors could apply to either or both, and errno's error doesn't always provide enough context to tell which it would be.

I propose that we add new APIs that allow specifying a second filename.  We take all the *WithFilename* APIs and add the *WithFilenames* equivalent (e.g. PyErr_SetFromErrnoWithFilenameObjects()).  Internally, oserror_parse_args() would now parse an extra "filename2" entry in the tuple, just after the "filename" entry (but before the possible "winerror" entry).

Currently when formatting an error with a filename, the format string looks like
    [Errno {errno}] {errstring}: {filename}
I propose that for two filenames it look like
    [Errno {errno}] {errstring}: \"{filename}\" -> \"{filename2}\"
msg210294 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2014-02-05 09:48
"But there are some errors that really need two filenames, like copy(), symlink(), and rename()."

I think *need* is too strong word in this case. I agree that two filenames is better than none. But I don't see anything wrong from omitting filenames in error messages for copy(), etc. If PHP and Perl are taking "omitting filenames" way, surely there is some merit in that way.

I am in if we are taking "two filenames" way just like Ruby does. It's just isn't it too rush for Python 3.4?
msg210301 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-05 11:26
As release manager Larry has the right to add a new feature after feature freeze.
msg210365 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-06 08:00
Serhiy: I'm not sure if it's the language barrier, but that came across kind of mean.  It's not that I can do whatever I want as release manager, but that I have the say in whether something is a "bug fix" or a "new feature", which is how we decide whether or not something is allowed into Python after feature freeze.

This issue has been on my radar for a while (originally #16074).  But I wasn't paying strong attention to it.  Nobody in that issue came up with a solution I liked.  Finally when you posted your patch I said "ugh, can't we do better" and had to think about it before I realized we should just display both filenames.  If somebody had posted a patch with that two months ago I would have happily accepted it and we wouldn't be having this conversation now.

Vajrasky: My goal is that Python is nicer to use than PHP or Perl.  And it's more than a month before 3.4 final is scheduled to be released.  This patch is a pretty mechanical change--create new function, accept extra parameter, make the tuple one entry longer.  I don't expect it to be destabilizing.

However, I *was* hoping that one of the original authors of the code in question would come forth and say a) whether or not they think it's a good idea in general, and b) if they think the specific approach is fine.

The patch is a bit stalled because of higher-priority Argument Clinic changes.  I could post a partial patch if someone wanted to pick it up and finish it.
msg210366 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-06 08:32
> This issue has been on my radar for a while (originally #16074).  But I
> wasn't paying strong attention to it.  Nobody in that issue came up with a
> solution I liked.  Finally when you posted your patch I said "ugh, can't we
> do better" and had to think about it before I realized we should just
> display both filenames.  If somebody had posted a patch with that two
> months ago I would have happily accepted it and we wouldn't be having this
> conversation now.

Actually that was Vajrasky's patch, and that was extended version of your 
patch. I asked you not because you are release manager, but because you are 
the author of the code and original patch.

I agree that support errors with two filenames is better solution than remove 
ambiguous filename attribute, but there is no much time left and we still don't 
have a patch. Vajrasky's patch is alternate variant for 3.4 in case when 
better patch will not be ready and is only solution for 3.3 (if we decide to 
fix this in 3.3).
msg210730 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 09:09
Here's a first cut at a patch.  With this applied Python passes the whole test suite.

I was surprised at how ticklish the OSError object was about adding a fifth member, with this weird "exception tuples can only have two members" policy.  But test_exceptions helped me find all the problems.
msg210737 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 09:45
Added a test checking that the error messages show up properly.
msg210753 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-09 12:40
Are *WithUnicodeFilenames() functions needed? Py_UNICODE API considered as deprecated and there is no need to support compatibility with older versions.
msg210754 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 12:42
There aren't any deprecation warnings in the code.
msg210755 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-09 13:05
http://docs.python.org/3/c-api/unicode.html#deprecated-py-unicode-apis
msg210757 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 13:07
But the PyErr_ functions that accept Py_UNICODE aren't marked deprecated.

http://docs.python.org/3.4/c-api/exceptions.html#unicode-exception-objects
msg210760 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-09 13:18
Perhaps they should be. Note that all functions that accept Py_UNICODE are not 
a part of stable API.

In any case I don't think we should add *new* functions with deprecated API.
msg210767 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-02-09 14:29
I agree.
msg210768 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 15:47
Attached is patch #3.  This one has been tested on Linux, Windows 7 64-bit, and Snow Leopard 64-bit.  Windows Server 2008 32-bit and 64-bit
are running now, looking good so far.

Changes:

* The order of arguments for OSError is now:
    (errno, string, filename, winerror, filename2)
  The winerror argument is always ignored on non-Windows platforms and
  may be any value.  Preserving the order of existing calls is
  important.  I considered making filename2 a keyword-only argument
  (as Georg suggested in IRC) but this would have made pickling and
  unpickling much more elaborate.

* Removed new functions taking Py_UNICODE at Serhiy's insistence.

* Removed PyErr_SetFromWindowsErrWithFilenameObject from documentation,
  as it doesn't even exist.

* Further documentation fixes.
msg210774 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-09 17:42
I'm not sure that PyErr_SetFromWindowsErrWithFilenames() and PyErr_SetExcFromWindowsErrWithFilenames() will be useful. "const char*" filenames are not recommended on Windows. Victor can say more.
msg210778 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 18:27
New patch incorporating Serhiy's suggestions.  Thanks, Serhiy!

Did more testing with the buildbots.  Windows Server 2008 32-bit and 64-bit were both fine.  So were ARMv7, OpenIndiana 64-bit, Gentoo 32-bit, FreeBSD 10 64-bit, and PowerLinux PPC 64-bit.  (This was all run using diff #3, but diff #4 doesn't change any C code.  It just changes the test and some docs.)

Can I get a LGTM?
msg210782 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 19:43
One more tweak from Serhiy.
msg210784 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-09 19:51
Support of bytes filenames has ben deprecated on Windows, Unicode is really
the native type.
msg210796 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-10 01:45
Okay.  I have revived the Py_UNICODE functions I removed in patch #3.  The patch now works fine on my Ubuntu 13.10 64-bit box, and at least compiled on a Windows buildbot.  It's now building on nine buildbots.

Assuming the buildbots look good, can I check this in?
msg210798 - (view) Author: Roundup Robot (python-dev) Date: 2014-02-10 06:06
New changeset 081a9d8ba3c7 by Larry Hastings in branch 'default':
Issue #20517: Functions in the os module that accept two filenames
http://hg.python.org/cpython/rev/081a9d8ba3c7
msg210801 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-10 06:55
It's in!  And the buildbots look healthy.
msg210814 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-10 09:59
Why did you add so many versions of the same functions? Only PyErr_SetExcFromWindowsErrWithFilenameObjects() and PyErr_SetFromErrnoWithFilenameObjects() are used.

The Py_UNICODE type is deprecated, you should not add new functions using it in Python 3.4.

+PyObject *PyErr_SetFromWindowsErrWithUnicodeFilenames(
+    int ierr,
+    const Py_UNICODE *filename,
+    const Py_UNICODE *filename2)

And you should avoid passing raw bytes string to build an error message, you probably has the Python object version of the filename somewhere in your code.

+PyAPI_FUNC(PyObject *) PyErr_SetFromErrnoWithFilenames(
+    PyObject *exc,
+    /* decoded from the filesystem encoding */
+    const char *filename,
+    const char *filename2
+    );

+PyObject *PyErr_SetExcFromWindowsErrWithFilenames(
+    PyObject *exc,
+    int ierr,
+    const char *filename,
+    const char *filename2)

In Python 3.3, there are also too many functions to raise an OSError, I don't that you should so many new functions. Please remove:

- PyErr_SetFromWindowsErrWithUnicodeFilenames
- PyErr_SetFromErrnoWithFilenames
- PyErr_SetExcFromWindowsErrWithFilenames

Having two filenames in OSError is the best fix for functions like os.rename when we don't know which path raised the error. I remember that it was hard to guess if the source or the destination was the problem, so thanks for working on this.

Note: When I wrote "Unicode is really the native type", I mean a PyObject* object which is a str, not Py_UNICODE* which is deprecated.
msg210818 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-10 10:21
The family of "PyErr_SetExcFrom..." functions was used when there were various kind of exceptions: select.error, mmap.error, OSError, IOError, socket.error, etc. The "PEP 3151: Reworking the OS and IO exception hierarchy" has been implemented in Python 3.3. I'm not sure that we need such function anymore, a function always raising OSError is probably enough:

"PyErr_SetExcFromWindowsErrWithFilenameObjects" name should be just "PyErr_SetFromWindowsErrWithFilenames". I hate such long names :-(
msg210824 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-10 11:10
"And you should avoid passing raw bytes string to build an error message, you probably has the Python object version of the filename somewhere in your code."

Oh, I remember the reason why char* must not be used to build an OSError: on Windows, you should never try to decode a bytes filename, because it may raise a UnicodeDecodeError while you are trying to build an OSError. See issue #15478 for the rationale.

Just pass the original PyObject* you get in path_t. There is even an unit test to ensure that OSError.filename is the original name: OSErrorTests.test_oserror_filename() in test_os.
msg210826 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-10 11:38
Talked it over with Victor in IRC.  I agree it's best to only add the WithFilenameObjects functions, as best practice requires using the original PyObject * passed in when creating the OSError.

The attached patch removes all the new WithFilenames and WithUnicodeFilenames functions.  Note that I merely amend the existing NEWS entry rather than add a new one; the functions only lived in trunk for a couple of hours.
msg210827 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-10 11:40
I applied larry.oserror.remove.with.filenames.etc.1.diff on default: On Windows, the code compiles fine  and test_os pass.
msg210830 - (view) Author: Roundup Robot (python-dev) Date: 2014-02-10 12:16
New changeset 6343bdbb7085 by Larry Hastings in branch 'default':
Issue #20517: Removed unnecessary new (short-lived) functions from PyErr.
http://hg.python.org/cpython/rev/6343bdbb7085
msg210831 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-10 12:37
Buildbots are basically happy with it.  It's checked in.  This was the last checkin before 3.4.0rc1 was tagged!
msg211357 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-16 20:22
I think it is worth to mention this in Doc/whatsnew/3.4.rst, as this is a little incompatible change.

Python 3.3:

>>> x = OSError(2, 'No such file or directory', 'foo', 0, 'bar')
>>> x.args
(2, 'No such file or directory', 'foo', 0, 'bar')

Python 3.4:

>>> x = OSError(2, 'No such file or directory', 'foo', 0, 'bar')
>>> x.args
(2, 'No such file or directory')
msg213408 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-13 15:43
In 3.3:

>>> x = OSError(2, 'No such file or directory', 'foo', 0, 'bar')
>>> str(x)
"(2, 'No such file or directory', 'foo', 0, 'bar')"

So, I don't see this as a realistic backwards compatibility problem worthy of a porting note.
msg213429 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-03-13 18:14
OK then.
msg213435 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-13 18:33
I was going to wonder if the args thing was a bug, but I see that actually it continues the backward-compatibility tradition already established (python3.3):

>>> x = OSError(2, 'No such file or directory', 'abc')
>>> str(x)
"[Errno 2] No such file or directory: 'abc'"
>>> x.args
(2, 'No such file or directory')
msg213436 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-13 18:33
(Ether that, or it is a long-standing bug.)
msg213439 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-03-13 18:44
Yeah, I admit I don't understand what problem that code was solving.  But it looked Very Deliberate so I preserved the behavior.
History
Date User Action Args
2014-03-13 18:44:47larrysetmessages: + msg213439
2014-03-13 18:33:55r.david.murraysetmessages: + msg213436
2014-03-13 18:33:15r.david.murraysetmessages: + msg213435
2014-03-13 18:14:19serhiy.storchakasetmessages: + msg213429
2014-03-13 15:43:59r.david.murraysetnosy: + r.david.murray
messages: + msg213408
2014-02-16 20:22:05serhiy.storchakasetmessages: + msg211357
2014-02-10 12:37:48larrysetstatus: open -> closed
resolution: fixed
messages: + msg210831
2014-02-10 12:16:55python-devsetmessages: + msg210830
2014-02-10 11:40:32hayposetmessages: + msg210827
2014-02-10 11:38:12larrysetfiles: + larry.oserror.remove.with.filenames.etc.1.diff

messages: + msg210826
2014-02-10 11:10:26hayposetmessages: + msg210824
2014-02-10 10:21:16hayposetmessages: + msg210818
2014-02-10 09:59:15hayposetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg210814
2014-02-10 06:55:32larrysetstatus: open -> closed
resolution: fixed
messages: + msg210801

stage: patch review -> resolved
2014-02-10 06:06:06python-devsetnosy: + python-dev
messages: + msg210798
2014-02-10 01:45:40larrysetfiles: + larry.oserror.add.filename2.6.diff

messages: + msg210796
2014-02-09 19:51:15hayposetmessages: + msg210784
2014-02-09 19:43:45larrysetfiles: + larry.oserror.add.filename2.5.diff

messages: + msg210782
2014-02-09 18:27:39larrysetfiles: + larry.oserror.add.filename2.4.diff

messages: + msg210778
2014-02-09 17:42:35serhiy.storchakasetnosy: + haypo
messages: + msg210774
2014-02-09 15:47:53larrysetfiles: + larry.oserror.add.filename2.3.diff

messages: + msg210768
2014-02-09 14:29:57georg.brandlsetmessages: + msg210767
2014-02-09 13:18:47serhiy.storchakasetmessages: + msg210760
2014-02-09 13:07:56larrysetmessages: + msg210757
2014-02-09 13:05:06serhiy.storchakasetmessages: + msg210755
2014-02-09 12:42:32larrysetmessages: + msg210754
2014-02-09 12:40:13serhiy.storchakasetmessages: + msg210753
2014-02-09 09:45:52larrysetfiles: + larry.oserror.add.filename2.2.diff
assignee: larry
messages: + msg210737

stage: needs patch -> patch review
2014-02-09 09:09:49larrysetfiles: + larry.oserror.add.filename2.1.diff
keywords: + patch
messages: + msg210730
2014-02-06 08:32:13serhiy.storchakasetmessages: + msg210366
2014-02-06 08:04:47Arfreversetnosy: + Arfrever
2014-02-06 08:00:29larrysetmessages: + msg210365
2014-02-06 07:54:43larrylinkissue16074 superseder
2014-02-05 11:26:05serhiy.storchakasetmessages: + msg210301
2014-02-05 09:48:11vajraskysetnosy: + vajrasky
messages: + msg210294
2014-02-05 03:43:10larrycreate