This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PYO file permission problem
Type: security Stage: resolved
Components: Interpreter Core Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: belopolsky, brett.cannon, christian.heimes, eric.snow, facundobatista, georg.brandl, ncoghlan, python-dev, stocker81
Priority: release blocker Keywords: patch

Created on 2008-02-08 19:28 by stocker81, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue2051.diff belopolsky, 2008-02-22 06:37
issue2051_test.eric.snow.diff eric.snow, 2012-08-24 04:33 test only review
Messages (17)
msg62203 - (view) Author: (stocker81) Date: 2008-02-08 19:28
Python's interpreter doesn't keep proper file permissions after
importing library. See the fallowing:

mk@laptop ~ $ echo "key='top secret'" > key.py
mk@laptop ~ $ chmod 600 key.py 
mk@laptop ~ $ python
Python 2.4.4 (#1, Jan  8 2008, 21:22:16) 
[GCC 4.1.2 (Gentoo 4.1.2 p1.0.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import key
>>> 
mk@laptop ~ $ ls -l key.py*
-rw------- 1 mk mk  17 II  8 20:09 key.py
-rw-r--r-- 1 mk mk 120 II  8 20:09 key.pyc
mk@laptop ~ $ 

So, interpreter creates 644 pyo file (visible for all) which contains
secret data from 600 py file.
I think it should keep the original permissions, someone can save a
important data (eg. SQL login/pwd into Django's settings.py) into
library and makes it visible for all by an accident.
msg62204 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-02-08 19:59
You have a good use case for a change of behavior. We might change it
for Python 2.6 and 3.0.

A proper place would be Python/import.c:load_source_module().
msg62207 - (view) Author: (stocker81) Date: 2008-02-08 20:58
BTW, I see you've removed this issue from bugs list printed at
http://bugs.python.org/ but everyone (including unlogged visitors) can
access it still via http://bugs.python.org/issue2051 ...
msg62212 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-02-08 21:54
stocker81 wrote:
> BTW, I see you've removed this issue from bugs list printed at
> http://bugs.python.org/ but everyone (including unlogged visitors) can
> access it still via http://bugs.python.org/issue2051 ...

I haven't removed it from the listing. Look under priority "high" a few
pages up.

Christian
msg62220 - (view) Author: (stocker81) Date: 2008-02-09 12:05
Christian Heimes <report@bugs.python.org> wrote: 
> I haven't removed it from the listing. Look under priority "high" a few
> pages up.
> 

I'm sorry, my mistake.

Regards,
Marcin
msg62786 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-02-23 17:52
Fixed in r61002

Thanks for your patch Alexander! Good work.
msg168610 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-08-20 01:25
Since there was no regression test added for this, it appears to me that it is broken again now that we're using importlib.

It may be rather hard to fix given the limitations of the set_data API :(
msg168614 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-08-20 02:47
On Aug 19, 2012 9:26 PM, "Nick Coghlan" <report@bugs.python.org> wrote:
>
>
> Nick Coghlan added the comment:
>
> Since there was no regression test added for this, it appears to me that
it is broken again now that we're using importlib.
>
> It may be rather hard to fix given the limitations of the set_data API :(

From an API perspective it's tough, but realistically it should be possible
since the loaders have the path to the file that triggered the find as the
'path' attribute.

>
> ----------
> assignee: facundobatista ->
> nosy: +brett.cannon, georg.brandl, ncoghlan
> priority: high -> release blocker
> resolution: accepted ->
> status: closed -> open
> versions: +Python 3.3 -Python 2.6
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue2051>
> _______________________________________
msg168975 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2012-08-24 04:33
here's a test.  I'll work on a patch when I get a chance (and no one's beaten me to it :).
msg168978 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2012-08-24 08:01
So this boils down to set_data() not having a mode parameter, right?   Alas, the obvious approach, adding it, breaks backward compatibility.  The alternative is to use source_from_cache() in set_data() to get the source path, and then get the mode there.  Of course, differentiating between the use cases for set_data() seems important there too.  In short, "It may be rather hard to fix given the limitations of the set_data API".

Regardless, we'll need to be careful to use the loader's cache to avoid extra stat calls.

Also, we have to factor in issue6074 (basically do "mode | 0o600").  We'll need at least one more test to cover that (as Nick noted in that issue).

I have a patch that is close, but my eyes are getting a little heavy.  Hopefully I'll have that up tomorrow.
msg168979 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-08-24 08:13
I have a patch that works, and that I think may point the way to a replacement API for set_data in 3.4. Just running the full test suite before I check it in.

The test you posted definitely saved me a lot of time.

My version bypasses the cache, though. This isn't a disaster, since the main reason the cache is there is to speed up *failing* stat calls - this new one only incurs the hit when actually writing the bytecode file to disk, which should be lost in the noise of the actual IO write operation.

However, suggestions for improvement always welcome :)
msg168981 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-08-24 08:19
Also, my patch precisely recreates 3.2 behaviour, so it will mean #6074 may also affect 3.3. That isn't a regression, hence not a release blocker, but would still be good to get fixed for rc1.
msg168985 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-08-24 08:35
(oops, transposed the digits in the checkin message)

New changeset 3a831a0a29c4 by Nick Coghlan in branch 'default':
Close #2051: Permission bits are once again correctly copied from the source file to the cached bytecode file. Test by Eric Snow.
http://hg.python.org/cpython/rev/3a831a0a29c4
msg168986 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-08-24 08:36
New changeset cfb11045fc8a by Nick Coghlan in branch 'default':
Close #2051: Oops, transposed the digits in the issue number in the previous commit
http://hg.python.org/cpython/rev/cfb11045fc8a
msg169050 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2012-08-24 17:03
My patch was very similar.  _cache_bytecode() is a good addition.  Good point about the cache, too.

I'm not convinced that source_path is the right thing to add to the API (even just for SourceFileLoader).  I would have thought mode would have been more appropriate:

  def set_data(self, path, data, *, mode=0o666):

Then the "mode = _os.stat(source_path).st_mode" bit would get moved to _cache_bytecode().

My reasoning is that set_data() is useful to write any data relative to the Loader.  The concrete use case in the stdlib is for writing the .pyc files.  Otherwise I'm not certain why the idea of "source_path" should be associated with set_data().  On the other hand, the idea of "mode" likewise may not be univeral enough to enshrine in the API, but I'd think it's more so than source_path.

I was going to ask about backward compatability, but then I realized that SourceFileLoader is new in 3.3 (SourceLoader is new in 3.2).  I'm also wondering why set_data() is not a part of the FileLoader API, but that's a question for another time (and version).  :)
msg169054 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-08-24 17:34
I agree with Eric: set_data() is meant to be generic so as to allow writing arbitrary data, not just bytecode files. So while accepting a mode argument makes sense and I'm fine with in terms of possible API change (in the future), having a source_path argument I'm not comfortable with. So I'm going to just take Eric's idea and commit the change. I'm also making the argument _mode just to make sure no one relies on it until we can discuss API changes in the context of Python 3.4 sine we will need to clean up the loader APIs to have abstract source/bytecode stuff that can play nicely with file-based APIs to allow for reading generic file assets once we have all of these little bugs worked out.

And as for why set_data() is not part of FileLoader, that's because it isn't necessary to load files. =) It's in SourceFileLoader purely to facilitate writing bytecode files, but set_data() is not used at all in terms of the actual loading of source code (or bytecode files for that matter).
msg169056 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-08-24 17:48
New changeset 0c661c5632e0 by Brett Cannon in branch 'default':
Issue #2051: Tweak last commit for this issue to pass in mode instead
http://hg.python.org/cpython/rev/0c661c5632e0
History
Date User Action Args
2022-04-11 14:56:30adminsetgithub: 46327
2012-08-24 17:48:46python-devsetmessages: + msg169056
2012-08-24 17:34:20brett.cannonsetmessages: + msg169054
2012-08-24 17:03:17eric.snowsetmessages: + msg169050
2012-08-24 08:36:46python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg168986

resolution: fixed
stage: needs patch -> resolved
2012-08-24 08:35:48ncoghlansetmessages: + msg168985
2012-08-24 08:19:29ncoghlansetmessages: + msg168981
2012-08-24 08:13:41ncoghlansetmessages: + msg168979
2012-08-24 08:02:04eric.snowsetstage: needs patch
2012-08-24 08:01:26eric.snowsetmessages: + msg168978
2012-08-24 04:33:27eric.snowsetfiles: + issue2051_test.eric.snow.diff

nosy: + eric.snow
messages: + msg168975

keywords: + patch
2012-08-24 00:51:53ncoghlansetassignee: ncoghlan
2012-08-20 02:47:12brett.cannonsetmessages: + msg168614
2012-08-20 01:25:58ncoghlansetstatus: closed -> open
priority: high -> release blocker

assignee: facundobatista -> (no value)
versions: + Python 3.3, - Python 2.6
nosy: + georg.brandl, brett.cannon, ncoghlan

messages: + msg168610
resolution: accepted -> (no value)
2008-02-23 17:52:32christian.heimessetstatus: open -> closed
resolution: accepted
messages: + msg62786
2008-02-22 06:40:04belopolskysetnosy: + belopolsky
2008-02-22 06:37:29belopolskysetfiles: + issue2051.diff
2008-02-18 03:49:25facundobatistasetassignee: facundobatista
nosy: + facundobatista
2008-02-09 12:05:12stocker81setmessages: + msg62220
2008-02-08 21:54:37christian.heimessetmessages: + msg62212
2008-02-08 20:58:29stocker81setmessages: + msg62207
2008-02-08 19:59:17christian.heimessetpriority: high
nosy: + christian.heimes
messages: + msg62204
components: + Interpreter Core, - None
versions: + Python 2.6, - Python 2.4
2008-02-08 19:28:32stocker81create