msg135261 - (view) |
Author: Steve Ward (planet36) |
Date: 2011-05-06 05:51 |
In this revision <http://hg.python.org/cpython/rev/b6da97930f63>, the possible characters that comprise a temporary file decreased. It was noted in this thread <http://mail.python.org/pipermail/python-dev/2010-November/105452.html>. The old character set had more than 24 times as many possibilities! [1]
According to POSIX, the possible characters from which temporary file names should be constructed is [A-Za-z0-9._-]. [2]
The character set should be restored to [A-Za-z0-9_] at the least. A patch is attached.
The practices of other platforms, libraries, and languages are listed below. [3-12]
[1]
Old character set: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_
(26+26+10+1) ^ 6
62523502209
Current character set: abcdefghijklmnopqrstuvwxyz0123456789_
(26+10+1) ^ 6
2565726409
62523502209 / 2565726409
24.36873315474378000994
[2] POSIX <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_276>
3.276 Portable Filename Character Set
The set of characters from which portable filenames are constructed.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 . _ -
[3] int mkstemp(char *template); <http://pubs.opengroup.org/onlinepubs/9699919799/functions/mkstemp.html>
The string in template should look like a filename with six trailing 'X' s; mkstemp() replaces each 'X' with a character from the portable filename character set.
[4] glibc <http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/posix/tempname.c;hb=HEAD>
/* These are the characters used in temporary filenames. */
static const char letters[] =
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
[5] gnulib <http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/tempname.c;hb=HEAD>
/* These are the characters used in temporary file names. */
static const char letters[] =
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
[6] freebsd libc <http://svnweb.freebsd.org/base/head/lib/libc/stdio/mktemp.c?view=markup>
static const unsigned char padchar[] =
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
[7] openbsd libc <http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/stdio/mktemp.c?rev=HEAD;content-type=text%2Fplain>
#define TEMPCHARS "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
[8] netbsd libc <http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/lib/libc/stdio/gettemp.c?rev=HEAD&content-type=text/plain>
(uses pid)
[9] perl <http://perl5.git.perl.org/perl.git/blob/HEAD:/cpan/File-Temp/Temp.pm>
my @CHARS = (qw/ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 _
/);
[10] php <http://svn.php.net/viewvc/php/php-src/trunk/main/php_open_temporary_file.c?view=markup>
(if Windows, call GetTempFileName; else if havel mkstemp, call it; else call mktemp)
[11] ruby <http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/lib/tmpdir.rb?view=markup>
(uses current date, pid, and a random number between 0 and 4294967296 represented in base 36 (lowercase))
[12] Windows <http://msdn.microsoft.com/en-us/library/aa364991.aspx>
<path>\<pre><uuuu>.TMP
Component Meaning
<path> Path specified by the lpPathName parameter
<pre> First three letters of the lpPrefixString string
<uuuu> Hexadecimal value of uUnique
Only the lower 16 bits of the uUnique parameter are used. This limits GetTempFileName to a maximum of 65,535 unique file names if the lpPathName and lpPrefixString parameters remain the same.
|
msg135264 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2011-05-06 06:31 |
- How is it an issue? is the number of combinations really important to you?
- Isn't there a greater risk of collisions on a case-insensitive filesystem?
|
msg135265 - (view) |
Author: Steve Ward (planet36) |
Date: 2011-05-06 08:00 |
It is an issue because...
1) Python falls out of line with most other popular systems and languages.
2) There was never a good reason to decrease the entropy in the first place. If there was, the reason of case-sensitive files systems was not given.
3) If case-sensitivity was an issue, why wasn't it raised before? For more than 8 years (2002-08-09 till 2010-11-09) the issue never warranted a change.
The initial commit is here.
http://svn.python.org/view?view=revision&revision=28113
|
msg135336 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2011-05-06 17:15 |
FWIW: Conformity with popular systems is not a goal.
|
msg135602 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2011-05-09 16:05 |
Python should still work reliably even if the temp directory is on a case-insensitive filesystem. Since that isn't an easy thing to determine, -1 on restoring the uppercase characters to the filenames.
As for it not being fixed previously, it's a rare bug that only affects case-insensitive filesystems. Just because other groups have decided not to care about it (typically due to a heavy bias towards case-sensitive POSIX filesystems), that is no reason for us to ignore it once we're aware of it.
If we really wanted to increase the entropy, better to increase the number of random characters included in the name to 7 or 8 (with each additional character increasing the pool size by a factor of 37). After all, why perpetuate an arbitrary restriction that is due primarily to the difficulty of constructing nice string manipulation interfaces in C?
|
msg135603 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-05-09 16:09 |
If you would like more entry, use a longer filename:
>>> (26+10+1) ** 6 > (26+26+10+1) ** 6
False
>>> (26+10+1) ** 7 > (26+26+10+1) ** 6
True
|
msg135604 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2011-05-09 16:15 |
The number of characters in the filename is hardcoded in the guts of tempfile (in _RandomNameSequence.__next__ to be exact). Increasing that to 7 (or even 8) would be a straightforward change at the library level, but it can't be done from user code.
|
msg135605 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2011-05-09 16:20 |
Could someone explain me what's the risk on a case-insensitive filesystem?
Since files are created with O_CREAT|O_EXCL, I don't see where the problem is.
|
msg135607 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2011-05-09 16:34 |
Basically, adding the uppercase letters back in would provide no gain in entropy on a case-insensitive filesystem (as the upper and lower case letters will collide). Increasing the length of the random sequence would provide a consistent gain regardless of filesystem properties.
|
msg135609 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2011-05-09 16:37 |
+1 for increasing the number of random characters to 8
-1 for restoring uppercase letters
|
msg135611 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2011-05-09 16:48 |
Minor clarification: after my first comment, I realised that the original change in the character set didn't actually fix anything, it just propagated the effectively smaller sample space on case-insensitive filesystems to all filesystems. Increasing the number of random characters to 8 will more than compensate for that reduction while retaining the cross-platform consistency in behaviour.
|
msg135612 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2011-05-09 16:50 |
@Nick
I fully agree with you, but my point was that it doesn't make it less safe on case-insensitive filesystems.
Apart from that, it's of course better to increase the length of the random sequence.
|
msg135615 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2011-05-09 17:06 |
Oh, you mean the security risk if the temporary names can be guessed? My recollection is that it is more of a problem for temporary directories than it is for temporary files, since the module can't control how files inside a temp directory get created. It's been a long time since I read anything detailed on the threat models and attack scenarios mkstemp() and friends were designed to handle though, so Google may be a better source of answers on that front.
|
msg135644 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-05-09 20:49 |
If you care about security, you should not use the Python random module because it is not cryptographic. Oh oh, ssl doesn't expose RAND_bytes().
|
msg135646 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-05-09 20:54 |
> If you care about security, you should not use the Python random
> module because it is not cryptographic. Oh oh, ssl doesn't expose
> RAND_bytes().
Feel free to open a separate issue and even provide a patch!
|
msg195103 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-08-13 23:28 |
New changeset de5077aca668 by Victor Stinner in branch 'default':
Close #12015: The tempfile module now uses a suffix of 8 random characters
http://hg.python.org/cpython/rev/de5077aca668
|
msg195104 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-08-13 23:35 |
With 8 lowercase characters, the entropy is 41.7 bits, whereas it is only 35.9 bits for 6 characters with uppercase and lowercase letters.
>>> math.log(26+26+10+1, 2) * 6 # (a-zA-Z0-9_) x 6
35.8636795409995
>>> math.log(26+10+1, 2) * 6 # (a-z0-9_) x 6
31.256720193773702
>>> math.log(26+10+1, 2) * 8 # (a-z0-9_) x 8
41.6756269250316
My changeset improves the entropy, it is now higher than with the old charset.
I don't know if it is enough or not to be safe. systemd creates a temporary directory per service. Linux 3.11 will add a new O_TMPFILE to open() which allow to create a file with no name. Using the flag should help to workaround the race condition attack. See #18673 for O_TMPFILE.
|
msg315101 - (view) |
Author: Inada Naoki (methane) * |
Date: 2018-04-09 00:42 |
New changeset 9c463ec88ba21764f6fff8e01d6045a932a89438 by INADA Naoki (Wolfgang Maier) in branch 'master':
Update docstring of tempfile._RandomNameSequence (GH-6414)
https://github.com/python/cpython/commit/9c463ec88ba21764f6fff8e01d6045a932a89438
|
msg315102 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-04-09 01:01 |
New changeset d964f51f813282171d4da831e8de0fe003253e9e by Miss Islington (bot) in branch '3.7':
Update docstring of tempfile._RandomNameSequence (GH-6414)
https://github.com/python/cpython/commit/d964f51f813282171d4da831e8de0fe003253e9e
|
msg315104 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-04-09 01:26 |
New changeset 335efd7c252799eeeb8cbf51d178b1b897a91ae2 by Miss Islington (bot) in branch '3.6':
Update docstring of tempfile._RandomNameSequence (GH-6414)
https://github.com/python/cpython/commit/335efd7c252799eeeb8cbf51d178b1b897a91ae2
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:17 | admin | set | github: 56224 |
2018-04-09 01:26:59 | miss-islington | set | messages:
+ msg315104 |
2018-04-09 01:01:56 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg315102
|
2018-04-09 00:44:49 | miss-islington | set | pull_requests:
+ pull_request6125 |
2018-04-09 00:43:50 | miss-islington | set | pull_requests:
+ pull_request6124 |
2018-04-09 00:42:49 | methane | set | nosy:
+ methane messages:
+ msg315101
|
2014-01-30 14:49:11 | jwilk | set | nosy:
+ jwilk
|
2013-08-13 23:35:48 | vstinner | set | messages:
+ msg195104 |
2013-08-13 23:28:40 | python-dev | set | status: open -> closed
nosy:
+ python-dev messages:
+ msg195103
resolution: fixed stage: patch review -> resolved |
2013-08-03 13:38:08 | neologix | set | keywords:
+ easy, needs review stage: patch review type: enhancement versions:
+ Python 3.4, - Python 3.3 |
2011-05-10 02:54:07 | planet36 | set | files:
+ tempfile.py.diff |
2011-05-09 20:54:15 | pitrou | set | messages:
+ msg135646 |
2011-05-09 20:49:22 | vstinner | set | messages:
+ msg135644 |
2011-05-09 17:06:36 | ncoghlan | set | messages:
+ msg135615 |
2011-05-09 16:50:08 | neologix | set | messages:
+ msg135612 |
2011-05-09 16:48:15 | ncoghlan | set | messages:
+ msg135611 |
2011-05-09 16:37:34 | rhettinger | set | messages:
+ msg135609 |
2011-05-09 16:34:58 | ncoghlan | set | messages:
+ msg135607 |
2011-05-09 16:20:52 | neologix | set | nosy:
+ neologix messages:
+ msg135605
|
2011-05-09 16:15:24 | ncoghlan | set | messages:
+ msg135604 |
2011-05-09 16:09:03 | vstinner | set | nosy:
+ vstinner messages:
+ msg135603
|
2011-05-09 16:05:26 | ncoghlan | set | messages:
+ msg135602 |
2011-05-06 17:15:59 | eric.araujo | set | nosy:
+ ncoghlan, pitrou
|
2011-05-06 17:15:33 | eric.araujo | set | nosy:
+ eric.araujo
messages:
+ msg135336 versions:
- Python 3.1, Python 3.2, Python 3.4 |
2011-05-06 08:13:03 | amaury.forgeotdarc | set | nosy:
+ rhettinger
|
2011-05-06 08:00:27 | planet36 | set | messages:
+ msg135265 |
2011-05-06 06:31:22 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc messages:
+ msg135264
|
2011-05-06 05:51:46 | planet36 | create | |