classification
Title: zipfile does not support pathlib
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7, Python 3.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: berker.peksag, brett.cannon, ethan.furman, jtf621, ned.deily, r.david.murray, serhiy.storchaka, steve.dower
Priority: normal
Keywords: patch

Created on 2016-09-21 07:32 by ethan.furman, last changed 2017-03-24 22:41 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
open-zipfile.stoneleaf.patch ethan.furman, 2016-09-21 07:32
zipfile-pathlib.patch serhiy.storchaka, 2017-02-27 07:09
zipfile-pathlib-3.6.1.patch serhiy.storchaka, 2017-02-28 06:02
Pull Requests
URL Status Linked
PR 322 closed berker.peksag, 2017-02-26 17:00
PR 511 merged serhiy.storchaka, 2017-03-06 11:10
PR 561 merged serhiy.storchaka, 2017-03-08 12:50
PR 703 larry, 2017-03-17 21:00
Messages (19)
msg277109 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-21 08:11
Shouldn't the ZipFile.filename attribute be converted to str?

If we add support for pathlib, maybe we should add support for bytes as well?

The file name of ZipFile is only a part of the issue. There are other uses of file paths: paths for added files, path for extracted directory.

There are also internal paths within the zip archive. It may be worth introducing a new path-like type for them and providing a pathlib-like interface to ZipFile. But that is a separate, non-trivial issue.
msg277391 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-09-25 18:29
The patch LGTM, but Serhiy has a point: maybe we should add tests for other kinds of paths, such as those contained within the zipfile.
msg284754 - (view) Author: Jeremy Freeman (jtf621) Date: 2017-01-05 15:51
This also affects Python 3.5 on Windows and OS X.

Python 3.5.2 (default, Sep 21 2016, 15:07:18)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathlib import Path
>>> import zipfile
>>> f = Path('zipfile.zip')
>>> with zipfile.ZipFile(f) as zf:
... 	pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jeremy/.pyenv/versions/3.5.2/lib/python3.5/zipfile.py", line 1026, in __init__
    self._RealGetContents()
  File "/Users/jeremy/.pyenv/versions/3.5.2/lib/python3.5/zipfile.py", line 1089, in _RealGetContents
    endrec = _EndRecData(fp)
  File "/Users/jeremy/.pyenv/versions/3.5.2/lib/python3.5/zipfile.py", line 241, in _EndRecData
    fpin.seek(0, 2)
AttributeError: 'PosixPath' object has no attribute 'seek'
msg284758 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-01-05 16:07
In 3.5 the stdlib does not support pathlib. So this issue only affects 3.6 and 3.7.
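
On versions where zipfile does not accept Path objects, converting the path with str() before the call is a common workaround. A minimal sketch; the helper name open_zip and the file names are hypothetical, not part of the zipfile API:

```python
import tempfile
import zipfile
from pathlib import Path


def open_zip(path, mode="r"):
    """Open an archive given a str or a pathlib.Path (hypothetical helper)."""
    # str() on a Path yields the plain filesystem path, which zipfile
    # accepts regardless of whether it understands path-like objects.
    return zipfile.ZipFile(str(path), mode)


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "example.zip"
    # Create a small archive, then reopen it through the same helper.
    with open_zip(archive, "w") as zf:
        zf.writestr("hello.txt", "hello")
    with open_zip(archive) as zf:
        names = zf.namelist()
```

The conversion is harmless on versions that do accept Path objects, so the same call site can serve both.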
msg284791 - (view) Author: Jeremy Freeman (jtf621) Date: 2017-01-06 00:56
OK, I understand.  How can I help get this issue fixed?

1) review the patch?
2) update the docs to reflect the patch?
3) find the other uses of pathlib in the zipfile module?
4) something else ...

I am a longtime Python user but a first-time contributor, and I am happy to help.
msg284792 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-01-06 01:06
I think the next steps are:

1. Add tests for the cases Serhiy has mentioned in msg277109:

   > Shouldn't the ZipFile.filename attribute be converted to str?

   and

   > The file name of ZipFile is only a part of the issue. There are other uses of file paths: paths for added files, path for extracted directory.

2. Update the docs to reflect the pathlib support.
msg285184 - (view) Author: Jeremy Freeman (jtf621) Date: 2017-01-11 03:45
I have reviewed the code and docs for the public API that should take a pathlib.Path object:

- zipfile.is_zipfile(filename)
  - filename
- zipfile.ZipFile(file)
  - file
- ZipFile.extract(member, path=None)
  - path
- ZipFile.extractall(path=None)
  - path
- ZipFile.write(filename)
  - filename
- zipfile.PyZipFile(file)
  - file
- PyZipFile.writepy(pathname)
  - pathname
- ZipInfo.from_file(filename, arcname=None)
  - filename

Does this appear complete?

Working on tests that probe each of these API points with pathlib.Path objects.

I am not sure what "Shouldn't the ZipFile.filename attribute be converted to str?" means, can you elaborate?
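
The API points listed above could be probed roughly as follows. This is only a sketch, assuming a Python version in which zipfile accepts path-like objects; the file names are made up, and PyZipFile/writepy are left out for brevity:

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "data.txt"
    source.write_text("payload")
    archive = root / "archive.zip"
    out_dir = root / "out"

    # ZipFile(file) and ZipFile.write(filename) with Path arguments.
    with zipfile.ZipFile(archive, "w") as zf:
        zf.write(source, arcname="data.txt")

    # is_zipfile(filename) with a Path argument.
    is_zip = zipfile.is_zipfile(archive)

    # ZipInfo.from_file(filename) with a Path argument.
    info = zipfile.ZipInfo.from_file(source, arcname="data.txt")

    # extract(member, path) and extractall(path) with Path targets.
    with zipfile.ZipFile(archive) as zf:
        extracted = zf.extract("data.txt", path=out_dir / "single")
        zf.extractall(path=out_dir / "all")

    # Read the results back while the temporary directory still exists.
    extracted_text = Path(extracted).read_text()
    all_text = (out_dir / "all" / "data.txt").read_text()
```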
msg285194 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-11 07:01
> I am not sure what "Shouldn't the ZipFile.filename attribute be converted to str?" means, can you elaborate?

If you pass a Path object to ZipFile constructor, should the filename attribute be a Path object or a str?
msg285234 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2017-01-11 16:25
Any path/file attributes, etc, inside a ZipFile should be str.  ZipFile should also function properly if path/file requests are given as os.PathLike objects.
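
The behaviour described here can be checked with a short sketch, assuming a Python version that includes this change; the archive name is illustrative:

```python
import os
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "example.zip"
    # The constructor is given a Path object...
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("a.txt", "a")
        # ...and the filename attribute is inspected on the open archive.
        filename_type = type(zf.filename)
        same_path = zf.filename == os.fspath(archive)
```

In CPython releases with the fix, filename_type is str and same_path is true, so callers who want a Path back can wrap the attribute in Path() themselves.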
msg286497 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-01-30 19:09
Speaking as a "regular user" who just ran into this, my main concern is that PathLike paths get used properly. For filenames being passed back out, if I really want them to be Path objects, I'll wrap them in Path() anyway.

Please don't let a full conversion to pathlib hold up fixing passing a PathLike into the constructor. That's the main use case, and given how nicely most of the rest of the stdlib handles Path objects now, it's an annoying wart.
msg288602 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-02-26 17:05
PR 322 should make the example in msg284754 work:

>>> import pathlib, zipfile
>>> f = pathlib.Path('spam.zip')
>>> with zipfile.ZipFile(f) as zf:
...   zf.namelist()
... 
['LICENSE']

It doesn't implement full PathLike support, but it at least covers the use cases of Jeremy and Steve.
msg288620 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-27 07:09
I have a different patch. It adds support for path-like objects for all external paths.
msg288650 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-02-27 17:27
Why can't we fix this in 3.6? We were meant to support pathlike in that version, and this is an oversight, not a new feature.
msg288653 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-27 17:36
I consider this a new feature.

Some modules gained support for path-like objects for free once it was implemented in the os, os.path and io modules. Others need additional work: implementing the support, writing tests, and documenting it. In the case of zipfile, this work is significant.
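
To illustrate what "for free" means here, the sketch below passes path-like objects to open() and os.path functions, which accept them since 3.6, and shows the explicit os.fspath() conversion that a module needing its own handling can call. The Workspace class is hypothetical:

```python
import os
import tempfile
from pathlib import Path


class Workspace:
    """Hypothetical path-like class implementing the os.PathLike protocol."""

    def __init__(self, directory):
        self._directory = directory

    def __fspath__(self):
        return self._directory


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "note.txt"
    # The built-in open() accepts a path-like object directly.
    with open(target, "w") as fh:
        fh.write("ok")
    # os.path functions accept path-like objects, including custom ones.
    exists = os.path.exists(target)
    joined = os.path.join(Workspace(tmp), "note.txt")
    same_file = os.path.samefile(joined, target)
    # os.fspath() is the explicit conversion a module can apply itself.
    as_str = os.fspath(Workspace(tmp))
```

Code that only forwards its argument to these functions inherits the support; code that inspects the argument's type first, as zipfile does to tell file names from file objects, has to convert explicitly.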
msg288656 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-02-27 17:47
Note that Ned gave us permission to get this into 3.6.1.
msg288689 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-02-28 04:15
> Note that Ned gave us permission to get this into 3.6.1.

I may have, although I don't remember specifically discussing zipfile.  In any case, I'm willing to consider it.  I think you can make good arguments for and against.  Yes, it could smell like adding a feature but, on the other hand, one of the implicit goals of 3.6.0 was to make Path objects supported across the standard library as much as possible, so the lack of support in zipfile (and a few other similar modules) could be viewed as a bug.  Also, as far as I can tell, this should be a totally upwards-compatible change except in the presumably unlikely case that something is counting on getting an exception when passing a Path object to zipfile.  I say we invoke "practicality beats purity" for this as long as Serhiy is OK with having it cherry-picked to 3.6 and as long as no other core developer here has a strong objection.
msg288693 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-28 06:02
The patch has been backported to 3.6.1.
msg290261 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:40
New changeset eb65edd1029876a4a5bb70b009aeb914088ac749 by Serhiy Storchaka in branch '3.6':
[3.6]  bpo-28231: The zipfile module now accepts path-like objects for external paths. (#561)
https://github.com/python/cpython/commit/eb65edd1029876a4a5bb70b009aeb914088ac749
msg290262 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:41
New changeset 8606e9524a7a4065042f7f228dc57eb74f88e4d3 by Serhiy Storchaka in branch 'master':
bpo-28231: The zipfile module now accepts path-like objects for external paths. (#511)
https://github.com/python/cpython/commit/8606e9524a7a4065042f7f228dc57eb74f88e4d3
History
Date User Action Args
2017-03-24 22:41:06 serhiy.storchaka set messages: + msg290262
2017-03-24 22:40:59 serhiy.storchaka set messages: + msg290261
2017-03-17 21:00:33 larry set pull_requests: + pull_request587
2017-03-08 14:06:01 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2017-03-08 12:50:37 serhiy.storchaka set pull_requests: + pull_request462
2017-03-06 11:10:43 serhiy.storchaka set pull_requests: + pull_request420
2017-02-28 06:02:31 serhiy.storchaka set files: + zipfile-pathlib-3.6.1.patch; messages: + msg288693; versions: + Python 3.6
2017-02-28 04:15:18 ned.deily set messages: + msg288689
2017-02-27 19:06:55 serhiy.storchaka set nosy: + ned.deily
2017-02-27 17:47:21 berker.peksag set messages: + msg288656
2017-02-27 17:36:17 serhiy.storchaka set messages: + msg288653
2017-02-27 17:27:59 steve.dower set messages: + msg288650
2017-02-27 07:09:57 serhiy.storchaka set files: + zipfile-pathlib.patch; type: behavior -> enhancement; messages: + msg288620; versions: - Python 3.6
2017-02-26 17:05:02 berker.peksag set messages: + msg288602
2017-02-26 17:00:13 berker.peksag set pull_requests: + pull_request283
2017-01-30 19:09:29 steve.dower set nosy: + steve.dower; messages: + msg286497
2017-01-11 16:25:06 ethan.furman set messages: + msg285234
2017-01-11 07:01:41 serhiy.storchaka set messages: + msg285194
2017-01-11 03:45:25 jtf621 set messages: + msg285184
2017-01-06 01:06:24 berker.peksag set messages: + msg284792
2017-01-06 00:56:07 jtf621 set messages: + msg284791
2017-01-05 16:07:17 r.david.murray set nosy: + r.david.murray; messages: + msg284758
2017-01-05 15:51:50 jtf621 set nosy: + jtf621; messages: + msg284754
2016-10-05 09:27:19 berker.peksag set nosy: + berker.peksag; components: + Library (Lib); versions: + Python 3.7
2016-09-25 18:29:22 brett.cannon set messages: + msg277391
2016-09-21 08:11:43 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg277109
2016-09-21 07:32:16 ethan.furman create