msg274778 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-09-07 08:21 |
Bytes paths are deprecated on Windows and their support was not added in os.scandir() for purpose. But this makes harder converting filesystem walking functions to use os.scandir(). We have to add a special case for bytes paths on Windows in every such function (os.walk(), glob.iglob()). This is cumbersome inefficient and errorprone code that slows down common case due to additional checks for special case. It would be better to add a support of bytes paths directly in os.scandir().
|
msg274779 - (view) |
Author: Paul Moore (paul.moore) * |
Date: 2016-09-07 08:27 |
In the light of Steve Dower's work to "un-deprecate" bytes paths, I agree this should be added.
|
msg274782 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2016-09-07 09:07 |
This is implemented in Steve's latest patch in issue 27781. For example:
>>> sys.platform
'win32'
>>> os.mkdir('test')
>>> f = open('test/\U00010000.txt', 'w')
>>> next(os.scandir(b'test')).name
b'\xf0\x90\x80\x80.txt'
>>> next(os.scandir(b'test')).path
b'test\\\xf0\x90\x80\x80.txt'
|
msg274812 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2016-09-07 14:08 |
Heh, totally by accident as well.
|
msg274846 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2016-09-07 17:31 |
Guido just said that we should make sure that os.scandir() continues to not support bytes, even after the patch for issue27781 goes in, so I'm reopening this.
|
msg274848 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-09-07 17:40 |
But why? The code of os and glob modules could be simpler if add the support.
|
msg274850 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-09-07 17:45 |
I'm ok to support os.scandir(bytes) on Windows.
My long term goal was to drop bytes support on Windows, but it seems like the new trend is more to keep this support and even enhance it.
It's quite easy to support os.scandir(bytes) on Windows, whereas it seems tricky and annoying to have to emulate the feature in pure Python.
Note: I was the BDFL-delegate for the PEP 471 ;-)
|
msg274851 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2016-09-07 17:48 |
There are only two references to bytes in the pep:
> Like the other functions in the os module, scandir() accepts either a
> bytes or str object for the path parameter, and returns the
> DirEntry.name and DirEntry.path attributes with the same type as
> path . However, it is strongly recommended to use the str type, as
> this ensures cross-platform support for Unicode filenames. (On
> Windows, bytes filenames have been deprecated since Python 3.3).
I'm not concerned about supporting bytes here (with PEP 529, obviously), though the strong recommendation to use str should certainly stand.
|
msg277363 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-09-25 10:57 |
I don't know what is the benefit of removing support of bytes paths in os.scandir(), but supporting bytes paths allows to remove ugly workarounds from os and glob modules.
Lib/glob.py | 16 --------!!!!!!
Lib/os.py | 67 ---------------------------------------------------------!!!
2 files changed, 73 deletions(-), 10 modifications(!)
|
msg277397 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2016-09-25 20:41 |
Nice!
|
msg278071 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-10-04 17:55 |
I'm confused by the current title of this issue. Should the support of bytes paths in os.scandir() be removed or preserved? In latter case we can remove workaround code from os and glob.
|
msg278077 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2016-10-04 18:55 |
It went back and forth. The current feeling is to *keep* that support,
especially since their deprecation on Windows has been *undone* in
3.6. Feel free to update the title (though the initial comments may be
harder to understand without the context of the original title).
|
msg278085 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-10-04 20:06 |
I suggest to modify posixmodule.c to support bytes on Windows at the C
level.
Since Windows uses utf8, there is no more real reason to drop bytes support
only on Windows.
|
msg278147 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-10-05 20:19 |
New changeset 7c36e6fd0232 by Serhiy Storchaka in branch '3.6':
Issue #27998: Removed workarounds for supporting bytes paths on Windows in
https://hg.python.org/cpython/rev/7c36e6fd0232
New changeset bcee710c42fe by Serhiy Storchaka in branch 'default':
Issue #27998: Removed workarounds for supporting bytes paths on Windows in
https://hg.python.org/cpython/rev/bcee710c42fe
|
msg278150 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-10-05 21:06 |
Tests are passed on Windows.
Now we need to document that bytes paths are supported in os.scandir() on Windows since 3.6.
|
msg278285 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-10-08 07:23 |
Hmm, tests are passed on some Windows buildbots, but failed on others.
http://buildbot.python.org/all/builders/AMD64%20Windows10%203.x/builds/1631/steps/test/logs/stdio
======================================================================
ERROR: test_walk_topdown (test.test_os.BytesWalkTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\buildarea\3.x.bolen-windows10\build\lib\test\test_os.py", line 890, in test_walk_topdown
all = list(self.walk(self.walk_path))
File "D:\buildarea\3.x.bolen-windows10\build\lib\test\test_os.py", line 1055, in walk
for broot, bdirs, bfiles in os.walk(os.fsencode(top), **kwargs):
File "D:\buildarea\3.x.bolen-windows10\build\lib\os.py", line 409, in walk
yield from walk(new_path, topdown, onerror, followlinks)
File "D:\buildarea\3.x.bolen-windows10\build\lib\os.py", line 367, in walk
is_dir = entry.is_dir()
TypeError: bad argument type for built-in operation
----------------------------------------------------------------------
The error message is not very informative. Seems like some C API function takes an argument of wrong type (e.g. bytes instead of unicode).
|
msg278286 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-10-08 07:53 |
I suppose the following patch should fix the issue. Could anybody please test it on Windows?
|
msg278306 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2016-10-08 16:19 |
Here's an alternative patch, using PyUnicode_FSDecoder. It also adds path_object_error and path_object_error2 helper functions.
|
msg278308 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-10-08 16:31 |
Thank you Eryk. The patch LGTM. Did you tested that it fixes tests?
|
msg278310 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2016-10-08 16:48 |
With the patch I uploaded, test_glob and test_os BytesWalkTests both pass, in both Windows 10 and Linux. Without it those tests fail for me in Windows 10.
|
msg278311 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-10-08 17:17 |
New changeset 4ed634870a9a by Serhiy Storchaka in branch '3.6':
Issue #27998: Fixed bytes path support in os.scandir() on Windows.
https://hg.python.org/cpython/rev/4ed634870a9a
New changeset 837114dea493 by Serhiy Storchaka in branch 'default':
Issue #27998: Fixed bytes path support in os.scandir() on Windows.
https://hg.python.org/cpython/rev/837114dea493
|
msg278312 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-10-08 17:20 |
Now it is documentation issue again.
|
msg280689 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-11-13 09:04 |
Proposed patch documents bytes paths support on Windows.
|
msg281244 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2016-11-20 03:22 |
Doc change looks good to me.
|
msg281249 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-11-20 06:25 |
New changeset ac63c70635db by Serhiy Storchaka in branch '3.6':
Issue #27998: Documented bytes paths support on Windows.
https://hg.python.org/cpython/rev/ac63c70635db
New changeset 26195e07fcc5 by Serhiy Storchaka in branch 'default':
Issue #27998: Documented bytes paths support on Windows.
https://hg.python.org/cpython/rev/26195e07fcc5
|
msg281250 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-11-20 06:26 |
Thanks Steve.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:35 | admin | set | github: 72185 |
2017-03-31 16:36:24 | dstufft | set | pull_requests:
+ pull_request977 |
2016-11-20 06:26:46 | serhiy.storchaka | set | status: open -> closed resolution: fixed messages:
+ msg281250
stage: patch review -> resolved |
2016-11-20 06:25:31 | python-dev | set | messages:
+ msg281249 |
2016-11-20 03:22:19 | steve.dower | set | messages:
+ msg281244 |
2016-11-13 09:04:52 | serhiy.storchaka | set | files:
+ doc-scandir-bytes.patch
nosy:
+ Elvis.Pranskevichus, yselivanov messages:
+ msg280689
keywords:
+ patch stage: needs patch -> patch review |
2016-10-08 17:20:04 | serhiy.storchaka | set | keywords:
- patch priority: release blocker -> normal messages:
+ msg278312
stage: patch review -> needs patch |
2016-10-08 17:17:51 | python-dev | set | messages:
+ msg278311 |
2016-10-08 16:48:30 | eryksun | set | messages:
+ msg278310 |
2016-10-08 16:31:27 | serhiy.storchaka | set | messages:
+ msg278308 |
2016-10-08 16:19:26 | eryksun | set | files:
+ issue_27998_01.patch
messages:
+ msg278306 |
2016-10-08 07:53:17 | serhiy.storchaka | set | files:
+ direntry_bytes_path.patch assignee: docs@python -> serhiy.storchaka messages:
+ msg278286
stage: needs patch -> patch review |
2016-10-08 07:23:24 | serhiy.storchaka | set | priority: normal -> release blocker nosy:
+ ned.deily messages:
+ msg278285
|
2016-10-05 21:06:15 | serhiy.storchaka | set | assignee: serhiy.storchaka -> docs@python components:
+ Documentation, - Extension Modules versions:
+ Python 3.7 nosy:
+ docs@python
messages:
+ msg278150 stage: patch review -> needs patch |
2016-10-05 20:19:18 | python-dev | set | nosy:
+ python-dev messages:
+ msg278147
|
2016-10-05 20:08:24 | serhiy.storchaka | set | assignee: serhiy.storchaka title: Remove support of bytes paths in os.scandir() -> Bytes paths now are supported in os.scandir() on Windows stage: needs patch -> patch review |
2016-10-04 20:06:56 | vstinner | set | messages:
+ msg278085 |
2016-10-04 18:55:38 | gvanrossum | set | messages:
+ msg278077 |
2016-10-04 17:55:25 | serhiy.storchaka | set | messages:
+ msg278071 |
2016-09-25 20:41:36 | gvanrossum | set | messages:
+ msg277397 |
2016-09-25 10:57:12 | serhiy.storchaka | set | files:
+ cleanup_scandir_bytes.patch keywords:
+ patch messages:
+ msg277363
|
2016-09-07 17:48:01 | steve.dower | set | messages:
+ msg274851 |
2016-09-07 17:45:55 | vstinner | set | messages:
+ msg274850 |
2016-09-07 17:40:32 | serhiy.storchaka | set | messages:
+ msg274848 |
2016-09-07 17:31:25 | steve.dower | set | title: Add support of bytes paths in os.scandir() -> Remove support of bytes paths in os.scandir() |
2016-09-07 17:31:03 | steve.dower | set | status: closed -> open
dependencies:
+ Change sys.getfilesystemencoding() on Windows to UTF-8 type: enhancement -> behavior stage: resolved -> needs patch versions:
+ Python 3.6 nosy:
+ gvanrossum messages:
+ msg274846 superseder: Change sys.getfilesystemencoding() on Windows to UTF-8 -> resolution: duplicate -> (no value) |
2016-09-07 14:08:02 | steve.dower | set | messages:
+ msg274812 |
2016-09-07 09:07:41 | eryksun | set | status: open -> closed
superseder: Change sys.getfilesystemencoding() on Windows to UTF-8
nosy:
+ eryksun messages:
+ msg274782 resolution: duplicate stage: resolved |
2016-09-07 08:27:22 | paul.moore | set | messages:
+ msg274779 |
2016-09-07 08:21:47 | serhiy.storchaka | set | title: Add support of butes paths in os.scandir() -> Add support of bytes paths in os.scandir() |
2016-09-07 08:21:10 | serhiy.storchaka | create | |