Support errors with two filenames for errno exceptions #64716

Closed
larryhastings opened this issue Feb 5, 2014 · 34 comments

@larryhastings
Contributor

BPO 20517
Nosy @birkenfeld, @vstinner, @larryhastings, @bitdancer, @serhiy-storchaka, @vajrasky
Files
  • larry.oserror.add.filename2.1.diff
  • larry.oserror.add.filename2.2.diff
  • larry.oserror.add.filename2.3.diff
  • larry.oserror.add.filename2.4.diff
  • larry.oserror.add.filename2.5.diff
  • larry.oserror.add.filename2.6.diff
  • larry.oserror.remove.with.filenames.etc.1.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/larryhastings'
    closed_at = <Date 2014-02-10.12:37:48.369>
    created_at = <Date 2014-02-05.03:43:10.744>
    labels = ['type-feature']
    title = 'Support errors with two filenames for errno exceptions'
    updated_at = <Date 2014-03-13.18:44:47.996>
    user = 'https://github.com/larryhastings'

    bugs.python.org fields:

    activity = <Date 2014-03-13.18:44:47.996>
    actor = 'larry'
    assignee = 'larry'
    closed = True
    closed_date = <Date 2014-02-10.12:37:48.369>
    closer = 'larry'
    components = []
    creation = <Date 2014-02-05.03:43:10.744>
    creator = 'larry'
    dependencies = []
    files = ['34004', '34005', '34010', '34012', '34015', '34017', '34019']
    hgrepos = []
    issue_num = 20517
    keywords = ['patch']
    message_count = 34.0
    messages = ['210290', '210294', '210301', '210365', '210366', '210730', '210737', '210753', '210754', '210755', '210757', '210760', '210767', '210768', '210774', '210778', '210782', '210784', '210796', '210798', '210801', '210814', '210818', '210824', '210826', '210827', '210830', '210831', '211357', '213408', '213429', '213435', '213436', '213439']
    nosy_count = 9.0
    nosy_names = ['richard', 'georg.brandl', 'vstinner', 'larry', 'Arfrever', 'r.david.murray', 'python-dev', 'serhiy.storchaka', 'vajrasky']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue20517'
    versions = ['Python 3.4']

    @larryhastings
    Contributor Author

    There are a bunch of functions provided by Python, e.g. PyErr_SetFromErrnoWithFilenameObject(), that allow specifying a filename associated with the error. But there are some errors that really need two filenames, like copy(), symlink(), and rename(). The error could be on only one file, but some errors could apply to either or both, and errno's error doesn't always provide enough context to tell which it would be.

    I propose that we add new APIs that allow specifying a second filename. We take all the *WithFilename* APIs and add the *WithFilenames* equivalent (e.g. PyErr_SetFromErrnoWithFilenameObjects()). Internally, oserror_parse_args() would now parse an extra "filename2" entry in the tuple, just after the "filename" entry (but before the possible "winerror" entry).

    Currently when formatting an error with a filename, the format string looks like
    [Errno {errno}] {errstring}: {filename}
    I propose that for two filenames it look like
    [Errno {errno}] {errstring}: \"{filename}\" -> \"{filename2}\"
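A minimal sketch of what the two-filename message looks like, assuming Python 3.4 or later, where this change eventually shipped. The filenames are made up for illustration. Note two differences from the proposal above: the final argument order places filename2 after winerror, and the message shows each filename's repr rather than double-quoted text.

```python
import errno

# Positional order as shipped: (errno, strerror, filename, winerror, filename2).
# winerror is passed as None so the example behaves the same on every platform.
# "src.txt" and "dst.txt" are hypothetical names, used only for illustration.
err = OSError(errno.ENOENT, "No such file or directory", "src.txt", None, "dst.txt")

message = str(err)
print(message)  # [Errno 2] No such file or directory: 'src.txt' -> 'dst.txt'
```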

    @larryhastings larryhastings added the type-feature A feature request or enhancement label Feb 5, 2014
    @vajrasky
    Mannequin

    vajrasky mannequin commented Feb 5, 2014

    "But there are some errors that really need two filenames, like copy(), symlink(), and rename()."

    I think *need* is too strong a word in this case. I agree that two filenames are better than none. But I don't see anything wrong with omitting filenames from error messages for copy(), etc. If PHP and Perl take the "omitting filenames" route, surely there is some merit in it.

    I am in if we take the "two filenames" route, just as Ruby does. But isn't it too rushed for Python 3.4?

    @serhiy-storchaka
    Member

    As release manager Larry has the right to add a new feature after feature freeze.

    @larryhastings
    Contributor Author

    Serhiy: I'm not sure if it's the language barrier, but that came across kind of mean. It's not that I can do whatever I want as release manager, but that I have the say in whether something is a "bug fix" or a "new feature", which is how we decide whether or not something is allowed into Python after feature freeze.

    This issue has been on my radar for a while (originally bpo-16074). But I wasn't paying strong attention to it. Nobody in that issue came up with a solution I liked. Finally when you posted your patch I said "ugh, can't we do better" and had to think about it before I realized we should just display both filenames. If somebody had posted a patch with that two months ago I would have happily accepted it and we wouldn't be having this conversation now.

    Vajrasky: My goal is that Python is nicer to use than PHP or Perl. And it's more than a month before 3.4 final is scheduled to be released. This patch is a pretty mechanical change--create new function, accept extra parameter, make the tuple one entry longer. I don't expect it to be destabilizing.

    However, I *was* hoping that one of the original authors of the code in question would come forth and say a) whether or not they think it's a good idea in general, and b) if they think the specific approach is fine.

    The patch is a bit stalled because of higher-priority Argument Clinic changes. I could post a partial patch if someone wanted to pick it up and finish it.

    @serhiy-storchaka
    Member

    "This issue has been on my radar for a while (originally bpo-16074). But I
    wasn't paying strong attention to it. Nobody in that issue came up with a
    solution I liked. Finally when you posted your patch I said "ugh, can't we
    do better" and had to think about it before I realized we should just
    display both filenames. If somebody had posted a patch with that two
    months ago I would have happily accepted it and we wouldn't be having this
    conversation now."

    Actually that was Vajrasky's patch, and it was an extended version of your
    patch. I asked you not because you are release manager, but because you are
    the author of the code and of the original patch.

    I agree that supporting errors with two filenames is a better solution than
    removing the ambiguous filename attribute, but there is not much time left and
    we still don't have a patch. Vajrasky's patch is the fallback for 3.4 in case a
    better patch is not ready in time, and it is the only solution for 3.3 (if we
    decide to fix this in 3.3).

    @larryhastings
    Contributor Author

    Here's a first cut at a patch. With this applied Python passes the whole test suite.

    I was surprised at how ticklish the OSError object was about adding a fifth member, with this weird "exception tuples can only have two members" policy. But test_exceptions helped me find all the problems.

    @larryhastings
    Contributor Author

    Added a test checking that the error messages show up properly.

    @larryhastings larryhastings self-assigned this Feb 9, 2014
    @serhiy-storchaka
    Member

    Are the *WithUnicodeFilenames() functions needed? The Py_UNICODE API is considered deprecated, and there is no need to maintain compatibility with older versions.

    @larryhastings
    Contributor Author

    There aren't any deprecation warnings in the code.

    @serhiy-storchaka
    Member

    @larryhastings
    Contributor Author

    But the PyErr_ functions that accept Py_UNICODE aren't marked deprecated.

    http://docs.python.org/3.4/c-api/exceptions.html#unicode-exception-objects

    @serhiy-storchaka
    Member

    Perhaps they should be. Note that none of the functions that accept
    Py_UNICODE are part of the stable API.

    In any case I don't think we should add *new* functions that use a deprecated API.

    @birkenfeld
    Member

    I agree.

    @larryhastings
    Contributor Author

    Attached is patch #3. This one has been tested on Linux, Windows 7 64-bit, and Snow Leopard 64-bit. Windows Server 2008 32-bit and 64-bit are running now, looking good so far.

    Changes:

    • The order of arguments for OSError is now:
      (errno, string, filename, winerror, filename2)
      The winerror argument is always ignored on non-Windows platforms and
      may be any value. Preserving the order of existing calls is
      important. I considered making filename2 a keyword-only argument
      (as Georg suggested in IRC) but this would have made pickling and
      unpickling much more elaborate.

    • Removed new functions taking Py_UNICODE at Serhiy's insistence.

    • Removed PyErr_SetFromWindowsErrWithFilenameObject from documentation,
      as it doesn't even exist.

    • Further documentation fixes.
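A short sketch of why keeping filename2 positional keeps pickling simple, assuming Python 3.4 or later. The values are illustrative only; the point is that an OSError built from the five positional arguments survives a pickle round trip with both filenames intact.

```python
import pickle

# Positional order: (errno, strerror, filename, winerror, filename2).
# winerror is None here; it is ignored on non-Windows platforms anyway.
original = OSError(2, "No such file or directory", "foo", None, "bar")

# Round-trip through pickle and check that both filenames come back.
restored = pickle.loads(pickle.dumps(original))
print(type(restored).__name__, restored.filename, restored.filename2)
```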

    @serhiy-storchaka
    Member

    I'm not sure that PyErr_SetFromWindowsErrWithFilenames() and PyErr_SetExcFromWindowsErrWithFilenames() will be useful. "const char*" filenames are not recommended on Windows. Victor can say more.

    @larryhastings
    Contributor Author

    New patch incorporating Serhiy's suggestions. Thanks, Serhiy!

    Did more testing with the buildbots. Windows Server 2008 32-bit and 64-bit were both fine. So were ARMv7, OpenIndiana 64-bit, Gentoo 32-bit, FreeBSD 10 64-bit, and PowerLinux PPC 64-bit. (This was all run using diff #3, but diff #4 doesn't change any C code. It just changes the test and some docs.)

    Can I get a LGTM?

    @larryhastings
    Contributor Author

    One more tweak from Serhiy.

    @vstinner
    Member

    vstinner commented Feb 9, 2014

    Support for bytes filenames has been deprecated on Windows; Unicode is really
    the native type.

    @larryhastings
    Contributor Author

    Okay. I have revived the Py_UNICODE functions I removed in patch #3. The patch now works fine on my Ubuntu 13.10 64-bit box, and at least compiled on a Windows buildbot. It's now building on nine buildbots.

    Assuming the buildbots look good, can I check this in?

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 10, 2014

    New changeset 081a9d8ba3c7 by Larry Hastings in branch 'default':
    Issue bpo-20517: Functions in the os module that accept two filenames
    http://hg.python.org/cpython/rev/081a9d8ba3c7

    @larryhastings
    Contributor Author

    It's in! And the buildbots look healthy.

    @vstinner
    Member

    Why did you add so many versions of the same functions? Only PyErr_SetExcFromWindowsErrWithFilenameObjects() and PyErr_SetFromErrnoWithFilenameObjects() are used.

    The Py_UNICODE type is deprecated, you should not add new functions using it in Python 3.4.

    +PyObject *PyErr_SetFromWindowsErrWithUnicodeFilenames(
    + int ierr,
    + const Py_UNICODE *filename,
    + const Py_UNICODE *filename2)

    And you should avoid passing a raw bytes string to build an error message; you probably have the Python object version of the filename somewhere in your code.

    +PyAPI_FUNC(PyObject *) PyErr_SetFromErrnoWithFilenames(
    + PyObject *exc,
    + /* decoded from the filesystem encoding */
    + const char *filename,
    + const char *filename2
    + );

    +PyObject *PyErr_SetExcFromWindowsErrWithFilenames(
    + PyObject *exc,
    + int ierr,
    + const char *filename,
    + const char *filename2)

    In Python 3.3, there are already too many functions to raise an OSError; I don't think you should add so many new ones. Please remove:

    • PyErr_SetFromWindowsErrWithUnicodeFilenames
    • PyErr_SetFromErrnoWithFilenames
    • PyErr_SetExcFromWindowsErrWithFilenames

    Having two filenames in OSError is the best fix for functions like os.rename when we don't know which path raised the error. I remember that it was hard to guess if the source or the destination was the problem, so thanks for working on this.
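To illustrate the os.rename case from the Python side, here is a small sketch assuming Python 3.4 or later. The path names are hypothetical; renaming a source that does not exist inside a fresh temporary directory is just a convenient way to trigger the error.

```python
import os
import tempfile

caught = None
with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical names: the source is deliberately never created.
    src = os.path.join(tmp, "missing-source")
    dst = os.path.join(tmp, "destination")
    try:
        os.rename(src, dst)
    except OSError as exc:
        caught = exc
        # The message names both paths, so the reader can see the whole operation.
        print(exc)

# Both paths are also available as attributes on the exception.
print(caught.filename, caught.filename2)
```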

    Note: When I wrote "Unicode is really the native type", I mean a PyObject* object which is a str, not Py_UNICODE* which is deprecated.

    @vstinner vstinner reopened this Feb 10, 2014
    @vstinner
    Member

    The family of "PyErr_SetExcFrom..." functions was used when there were various kinds of exceptions: select.error, mmap.error, OSError, IOError, socket.error, etc. "PEP-3151: Reworking the OS and IO exception hierarchy" was implemented in Python 3.3. I'm not sure that we need such functions anymore; a function always raising OSError is probably enough:

    "PyErr_SetExcFromWindowsErrWithFilenameObjects" name should be just "PyErr_SetFromWindowsErrWithFilenames". I hate such long names :-(

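As a quick check of the PEP 3151 point from the Python side (this is not the C code under discussion), the following sketch assumes Python 3.3 or later and shows that the legacy exception names mentioned above are now plain aliases of OSError.

```python
import mmap
import select
import socket

# Legacy exception names listed in the comment above, mapped to their objects.
legacy_names = {
    "IOError": IOError,
    "select.error": select.error,
    "mmap.error": mmap.error,
    "socket.error": socket.error,
}

for name, exc_type in legacy_names.items():
    # Each name is expected to be the very same class object as OSError.
    print(name, exc_type is OSError)
```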
    @vstinner
    Member

    "And you should avoid passing raw bytes string to build an error message, you probably has the Python object version of the filename somewhere in your code."

    Oh, I remember the reason why char* must not be used to build an OSError: on Windows, you should never try to decode a bytes filename, because it may raise a UnicodeDecodeError while you are trying to build an OSError. See issue bpo-15478 for the rationale.

    Just pass the original PyObject* you get in path_t. There is even a unit test to ensure that OSError.filename is the original name: OSErrorTests.test_oserror_filename() in test_os.
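A rough Python-level sketch of the "original object" behaviour, assuming Python 3.4 or later. The path names are hypothetical. When bytes paths are passed in, the exception attributes hold those same bytes values and nothing is decoded while the error is being built.

```python
import os
import tempfile

caught = None
with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical bytes paths; the source is deliberately never created.
    src = os.fsencode(os.path.join(tmp, "missing-source"))
    dst = os.fsencode(os.path.join(tmp, "destination"))
    try:
        os.rename(src, dst)
    except OSError as exc:
        caught = exc

# The attributes keep the type the caller passed in (bytes here).
print(type(caught.filename).__name__, type(caught.filename2).__name__)
```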

    @larryhastings
    Contributor Author

    Talked it over with Victor in IRC. I agree it's best to only add the WithFilenameObjects functions, as best practice requires using the original PyObject * passed in when creating the OSError.

    The attached patch removes all the new WithFilenames and WithUnicodeFilenames functions. Note that I merely amend the existing NEWS entry rather than add a new one; the functions only lived in trunk for a couple of hours.

    @vstinner
    Member

    I applied larry.oserror.remove.with.filenames.etc.1.diff on default: on Windows, the code compiles fine and test_os passes.

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 10, 2014

    New changeset 6343bdbb7085 by Larry Hastings in branch 'default':
    Issue bpo-20517: Removed unnecessary new (short-lived) functions from PyErr.
    http://hg.python.org/cpython/rev/6343bdbb7085

    @larryhastings
    Contributor Author

    Buildbots are basically happy with it. It's checked in. This was the last checkin before 3.4.0rc1 was tagged!

    @serhiy-storchaka
    Member

    I think it is worth mentioning this in Doc/whatsnew/3.4.rst, as it is a slightly incompatible change.

    Python 3.3:

    >>> x = OSError(2, 'No such file or directory', 'foo', 0, 'bar')
    >>> x.args
    (2, 'No such file or directory', 'foo', 0, 'bar')

    Python 3.4:

    >>> x = OSError(2, 'No such file or directory', 'foo', 0, 'bar')
    >>> x.args
    (2, 'No such file or directory')

    @bitdancer
    Member

    In 3.3:

    >>> x = OSError(2, 'No such file or directory', 'foo', 0, 'bar')
    >>> str(x)
    "(2, 'No such file or directory', 'foo', 0, 'bar')"

    So, I don't see this as a realistic backwards compatibility problem worthy of a porting note.

    @serhiy-storchaka
    Member

    OK then.

    @bitdancer
    Member

    I was going to wonder if the args thing was a bug, but I see that actually it continues the backward-compatibility tradition already established (python3.3):

    >>> x = OSError(2, 'No such file or directory', 'abc')
    >>> str(x)
    "[Errno 2] No such file or directory: 'abc'"
    >>> x.args
    (2, 'No such file or directory')
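The same behaviour with two filenames can be sketched as follows, assuming Python 3.4 or later: args stays at the first two values, while the filenames remain reachable through their own attributes.

```python
# winerror is None so the example behaves the same on every platform.
err = OSError(2, "No such file or directory", "foo", None, "bar")

# args is truncated to (errno, strerror) once a filename is supplied.
print(err.args)

# The filenames are still available as attributes.
print(err.filename, err.filename2)
```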

    @bitdancer
    Member

    (Either that, or it is a long-standing bug.)

    @larryhastings
    Contributor Author

    Yeah, I admit I don't understand what problem that code was solving. But it looked Very Deliberate so I preserved the behavior.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022