tempfile.mkdtemp() doesn't work with bytes paths #68418

durin42 · 2015-05-18T17:04:54Z

BPO	24230
Nosy	@birkenfeld, @gpshead, @ncoghlan, @pitrou, @vstinner, @bitdancer, @vadmium, @serhiy-storchaka
Files	issue24230-gps01.patch issue24230-gps02.diff issue24230-gps03.diff

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/gpshead'
closed_at = <Date 2015-05-22.23:21:09.422>
created_at = <Date 2015-05-18.17:04:53.985>
labels = ['type-feature', 'library']
title = "tempfile.mkdtemp() doesn't work with bytes paths"
updated_at = <Date 2016-03-15.11:46:23.756>
user = 'https://bugs.python.org/durin42'

bugs.python.org fields:

activity = <Date 2016-03-15.11:46:23.756>
actor = 'vstinner'
assignee = 'gregory.p.smith'
closed = True
closed_date = <Date 2015-05-22.23:21:09.422>
closer = 'gregory.p.smith'
components = ['Library (Lib)']
creation = <Date 2015-05-18.17:04:53.985>
creator = 'durin42'
dependencies = []
files = ['39436', '39464', '39469']
hgrepos = []
issue_num = 24230
keywords = ['patch']
message_count = 23.0
messages = ['243498', '243500', '243502', '243505', '243507', '243508', '243509', '243552', '243643', '243648', '243649', '243654', '243672', '243674', '243675', '243809', '243821', '243847', '243865', '243866', '243876', '261810', '261811']
nosy_count = 11.0
nosy_names = ['georg.brandl', 'gregory.p.smith', 'ncoghlan', 'pitrou', 'vstinner', 'durin42', 'r.david.murray', 'python-dev', 'martin.panter', 'Matt.Mackall', 'serhiy.storchaka']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'commit review'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue24230'
versions = ['Python 3.5']

durin42 · 2015-05-18T17:04:54Z

Many things in the OS module work with bytestrings for paths in Python 3, but tempfile isn't so happy. It'd be very useful to be able to have the same bytes-in-bytes-out behavior in tempfile as for the os.path.* methods.

This is something we're working around in hg, but I figured I'd file a bug anyway in case we can improve the situation.

bitdancer · 2015-05-18T17:21:02Z

Can you explain what you are looking for in more detail? It isn't obvious to me what "bytes in bytes out" means in the context of the tempfile API.

durin42 · 2015-05-18T17:24:35Z

Today we're doing something like this:

tmpdir = tempfile.mkdtemp('', 'hgtests.', d and d.decode('utf-8')).encode('utf-8')

but would rather have

tmpdir = tempfile.mkdtemp(b'', b'hgtests.', d)
# and now tmpdir is a bytestring containing the path to the tmpdir

basically, we want to maintain our pedantry around paths being sequences of bytes.

MattMackall · 2015-05-18T17:31:08Z

Another way of putting it is:

os.listdir(<bytes>) -> [<bytes>,...]
os.listdir(<unicode>) -> [<unicode>,...]

is the usual pattern, and tempfile isn't following it.

bitdancer · 2015-05-18T17:33:39Z

I understand now. The problem is that this would be a backward incompatible change.

bitdancer · 2015-05-18T17:35:13Z

I'm sorry, hit send before I finished thinking. You are saying bytes input are rejected, so yes this could be fixed.

bitdancer · 2015-05-18T17:37:50Z

Would you care to supply a patch? The beta deadline is this coming weekend.

gpshead · 2015-05-19T01:25:16Z

it seems a little messy to code this up the way Lib/tempfile.py is written but i'll take a stab at it tonight. Probably via parallel implementations of the methods internally rather than conditional logic throughout given how they work.

gpshead · 2015-05-20T05:09:53Z

Attached is a patch with implementation, tests and docs. I didn't have to duplicate too much thankfully. Just figure out where to put the type conversions. Review would be nice, but I'll err on the side of getting this in before beta 1 (May 24 per https://www.python.org/dev/peps/pep-0478/) so any corrections can be dealt with during the beta cycle.

Augie & Matt, can you confirm that this patch does what y'all need for Mercurial?

vadmium · 2015-05-20T06:36:40Z

Doing a quick review, expect a few comments on Rietveld

serhiy-storchaka · 2015-05-20T06:57:16Z

Added comments on Rietveld. But I'm not sure that this change is worth to do. Only low-level functions in os.path and os modules (and few other) support bytes names (and only bytes, not bytearray!). High level interfaces, such as in tarfile or zipfile, work only with str paths. The question is what level the tempfile module?

In any way Mercurial people need their own wrapper if they want to support Python <3.5.

bitdancer · 2015-05-20T12:18:32Z

We have a number of other places in the stdlib where bytes-in-bytes-out is observed, at various levels of abstraction. I think this is reasonable. To answer your direct question, I think tempfile is a convenience-and-do-it-right wrapper around what is really a pretty low level operation (creating a temporary file/directory securely).

serhiy-storchaka · 2015-05-20T14:24:20Z

I was vague about this feature, but if other core developer found it useful, I have no objections. Only few nitpicks to the implementation.

durin42 · 2015-05-20T15:23:41Z

In any way Mercurial people need their own wrapper if they want to support Python <3.5.

We need format operations on bytes objects, so we have no plans to support Python3 for any version before 3.5.

durin42 · 2015-05-20T15:24:48Z

I'll build a patched Python3.5 tomorrow (I'm on the road today) and let you know if this does everything we need. I'd be shocked if it wasn't plenty.

gpshead · 2015-05-22T06:17:53Z

code review comments addressed.

vadmium · 2015-05-22T08:22:56Z

Looks good in general. I think the signatures in Doc/library/tempfile.rst need updating for the following functions (possibly also version changed notices added somewhere):

NamedTemporaryFile
TemporaryFile
SpooledTemporaryFile
TemporaryDirectory

gpshead · 2015-05-22T19:12:38Z

updated, thanks for the great reviews.

python-dev · 2015-05-22T23:18:33Z

New changeset 870899ce71f4 by Gregory P. Smith in branch 'default':
bpo-24230: The tempfile module now accepts bytes for prefix, suffix and dir
https://hg.python.org/cpython/rev/870899ce71f4

gpshead · 2015-05-22T23:21:09Z

that should take care of it. if you see any other issues with this post commit, either raise them here or fix them directly. the reviews were helpful.

durin42 · 2015-05-23T01:20:59Z

I just removed my hack that gave us unicode to hand to mkdtemp() and everything still passes with your new commit. Thanks much for the quick response!

vstinner · 2016-03-15T11:44:01Z

I don't understand the rationale to add new functions using bytes filenames.

IMHO it's a bad practice to pass bytes filenames to write portable code. On Windows, bytes lead to strange bugs: os.listdir(bytes) gives you filenames which don't exist when a filename cannot be encoded to the ANSI code page.

On UNIX, os.fsencode() and os.fsdecode() allows you to convert filenames between unicode and bytes. It provides nice properties like os.fsencode(os.fsdecode(value))==value where value is bytes. No need to add new APIs, use existing APIs an use os.fsencode() and os.fsdecode() on top of them.

Bytes filenames are deprecated on Windows since Python 3.2. The new os.scandir() function doesn't support bytes for example, it's a deliberate choice.

Maybe I missed an important technical issue?

vstinner · 2016-03-15T11:46:24Z

"""
Today we're doing something like this:

tmpdir = tempfile.mkdtemp('', 'hgtests.', d and d.decode('utf-8')).encode('utf-8')
"""

Don't do that. UTF-8 is not the right encoding.

Use os.fsencode() and os.fsdecode(). Internally, Python uses sys.getfilesystemencoding() with 'surrogateescape' error handler (but 'strict' error handler on Windows).

If you use UTF-8, you can get indirectly mojibake.

durin42 mannequin added the stdlib Python modules in the Lib dir label May 18, 2015

bitdancer added the type-feature A feature request or enhancement label May 18, 2015

gpshead self-assigned this May 19, 2015

gpshead closed this as completed May 22, 2015

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tempfile.mkdtemp() doesn't work with bytes paths #68418

tempfile.mkdtemp() doesn't work with bytes paths #68418

durin42 mannequin commented May 18, 2015

durin42 mannequin commented May 18, 2015

bitdancer commented May 18, 2015

durin42 mannequin commented May 18, 2015

MattMackall mannequin commented May 18, 2015

bitdancer commented May 18, 2015

bitdancer commented May 18, 2015

bitdancer commented May 18, 2015

gpshead commented May 19, 2015

gpshead commented May 20, 2015

vadmium commented May 20, 2015

serhiy-storchaka commented May 20, 2015

bitdancer commented May 20, 2015

serhiy-storchaka commented May 20, 2015

durin42 mannequin commented May 20, 2015

durin42 mannequin commented May 20, 2015

gpshead commented May 22, 2015

vadmium commented May 22, 2015

gpshead commented May 22, 2015

python-dev mannequin commented May 22, 2015

gpshead commented May 22, 2015

durin42 mannequin commented May 23, 2015

vstinner commented Mar 15, 2016

vstinner commented Mar 15, 2016

tempfile.mkdtemp() doesn't work with bytes paths #68418

tempfile.mkdtemp() doesn't work with bytes paths #68418

Comments

durin42 mannequin commented May 18, 2015

durin42 mannequin commented May 18, 2015

bitdancer commented May 18, 2015

durin42 mannequin commented May 18, 2015

MattMackall mannequin commented May 18, 2015

bitdancer commented May 18, 2015

bitdancer commented May 18, 2015

bitdancer commented May 18, 2015

gpshead commented May 19, 2015

gpshead commented May 20, 2015

vadmium commented May 20, 2015

serhiy-storchaka commented May 20, 2015

bitdancer commented May 20, 2015

serhiy-storchaka commented May 20, 2015

durin42 mannequin commented May 20, 2015

durin42 mannequin commented May 20, 2015

gpshead commented May 22, 2015

vadmium commented May 22, 2015

gpshead commented May 22, 2015

python-dev mannequin commented May 22, 2015

gpshead commented May 22, 2015

durin42 mannequin commented May 23, 2015

vstinner commented Mar 15, 2016

vstinner commented Mar 15, 2016