CVE-2008-5983 python: untrusted python modules search path #50003

Closed
iankko mannequin opened this issue Apr 14, 2009 · 49 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-security A security issue

Comments

@iankko
Mannequin

iankko mannequin commented Apr 14, 2009

BPO 5753
Nosy @loewis, @warsaw, @akuchling, @gpshead, @jcea, @pitrou, @benjaminp, @glyph, @bitdancer, @davidmalcolm
Files
  • python-CVE-2009-5983.patch: Patch from Ray Strode against python 2.6.
  • py_umspath_test.tar.gz: PoC
  • setargvex.patch
  • setargvex2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2010-05-21.17:40:59.705>
    created_at = <Date 2009-04-14.11:39:38.241>
    labels = ['type-security', 'interpreter-core']
    title = 'CVE-2008-5983 python: untrusted python modules search path'
    updated_at = <Date 2010-09-28.03:25:15.350>
    user = 'https://bugs.python.org/iankko'

    bugs.python.org fields:

    activity = <Date 2010-09-28.03:25:15.350>
    actor = 'jcea'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-05-21.17:40:59.705>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2009-04-14.11:39:38.241>
    creator = 'iankko'
    dependencies = []
    files = ['13685', '13686', '13860', '17418']
    hgrepos = []
    issue_num = 5753
    keywords = ['patch']
    message_count = 49.0
    messages = ['85965', '85966', '85967', '85968', '86904', '86906', '86927', '86943', '87061', '87063', '87083', '87084', '87212', '87239', '87281', '87285', '87299', '87300', '87309', '87343', '87347', '87348', '87350', '87399', '87556', '89688', '90329', '90330', '90336', '90448', '90473', '90480', '90481', '90543', '90556', '98027', '104927', '104939', '104950', '105945', '105976', '105980', '106184', '106214', '106221', '106256', '107508', '107515', '117504']
    nosy_count = 14.0
    nosy_names = ['loewis', 'barry', 'akuchling', 'gregory.p.smith', 'jcea', 'pitrou', 'benjamin.peterson', 'glyph', 'psss', 'r.david.murray', 'iankko', 'akr', 'thoger', 'dmalcolm']
    pr_nums = []
    priority = 'critical'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue5753'
    versions = ['Python 2.6', 'Python 3.1', 'Python 2.7', 'Python 3.2']

    @iankko
    Mannequin Author

    iankko mannequin commented Apr 14, 2009

    Common Vulnerabilities and Exposures assigned an identifier
    CVE-2008-5983 (and related CVE ids) to the following vulnerability:

    Untrusted search path vulnerability in the PySys_SetArgv API function in
    Python 2.6 and earlier, and possibly later versions, prepends an empty
    string to sys.path when the argv[0] argument does not contain a path
    separator, which might allow local users to execute arbitrary code via a
    Trojan horse Python file in the current working directory.
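A minimal Python sketch of the effect described above, not taken from the report. It plants a harmless stand-in for the Trojan horse file in a temporary directory, changes into that directory, prepends '' to sys.path (which is what the advisory says PySys_SetArgv does when argv[0] has no path separator), and imports the file by name. The module name `cve_5983_demo_module` and the helper `import_from_cwd_demo` are hypothetical names chosen for illustration.

```python
import importlib
import os
import sys
import tempfile


def import_from_cwd_demo():
    """Show that an empty sys.path entry makes imports search the CWD."""
    module_name = "cve_5983_demo_module"  # hypothetical, unlikely to clash
    saved_cwd, saved_path = os.getcwd(), list(sys.path)
    with tempfile.TemporaryDirectory() as untrusted_dir:
        # Stand-in for a file planted by another local user.
        with open(os.path.join(untrusted_dir, module_name + ".py"), "w") as f:
            f.write("ORIGIN = 'current working directory'\n")
        try:
            os.chdir(untrusted_dir)
            sys.path.insert(0, "")  # the entry the advisory is about
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            return module.ORIGIN
        finally:
            # Undo every global change before the directory is deleted.
            os.chdir(saved_cwd)
            sys.path[:] = saved_path
            sys.modules.pop(module_name, None)
```

The function returns the marker string written into the planted file, which shows that the import was satisfied from the working directory rather than from an installed location.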

    References:
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5983
    https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2008-5983
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5984
    https://bugzilla.redhat.com/show_bug.cgi?id=481551
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5985
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5986
    https://bugzilla.redhat.com/show_bug.cgi?id=481550
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5987
    https://bugzilla.redhat.com/show_bug.cgi?id=481553
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0314
    http://bugzilla.gnome.org/show_bug.cgi?id=569214
    https://bugzilla.redhat.com/show_bug.cgi?id=481556
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0315
    https://bugzilla.redhat.com/show_bug.cgi?id=481560
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0316
    https://bugzilla.redhat.com/show_bug.cgi?id=481565
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0317
    https://bugzilla.redhat.com/show_bug.cgi?id=481570
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0318
    https://bugzilla.redhat.com/show_bug.cgi?id=481572

    @iankko iankko mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-security A security issue labels Apr 14, 2009
    @iankko
    Mannequin Author

    iankko mannequin commented Apr 14, 2009

    To sum up the behavior, the following table shows whether modules
    are read from the current working directory for the various ways a
    Python script can be launched (unfixed/fixed version):

    unfixed   fixed   run as
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    yes       no      python test.py
    yes       no      python ./test.py
    yes       no      python /tmp/396/test.py
    yes       no      /bin/env python test.py
    
    yes       yes     test.py
    yes       yes     ./test.py
    yes       yes     /tmp/396/test.py
    yes       yes     /usr/bin/python test.py
    yes       yes     /usr/bin/python ./test.py
    yes       yes     /usr/bin/python /tmp/396/test.py
    
    no        no      test-in-different-dir.py
    no        no      ./bin/test-in-different-dir.py
    no        no      python ./bin/test-in-different-dir.py
    

    @iankko
    Mannequin Author

    iankko mannequin commented Apr 14, 2009

    Since "python ./foo.py" no longer behaves as before once the patch is
    applied, the update may not be acceptable. Could you review the
    above patch and potentially provide another one?

    @iankko
    Mannequin Author

    iankko mannequin commented Apr 14, 2009

    Just drop into /tmp and run (you will need the zenity package installed):

    python3.1 ./test.py

    or

    gedit # unfixed gedit

    in that directory.

    @pitrou
    Member

    pitrou commented May 1, 2009

    What is the problem exactly?
    A user can run arbitrary Python code from a file in his own account --
    well, sure, that's a feature. Unless I'm misunderstanding something.

    @pitrou
    Member

    pitrou commented May 1, 2009

    I wanted to read the patch at
    https://bugzilla.redhat.com/attachment.cgi?id=334888 but apparently its
    access is restricted...

    @glyph
    Mannequin

    glyph mannequin commented May 2, 2009

    Antoine,

    The problem is that apparently every program that embeds Python calls
    PySys_SetArgv and does not understand the consequences of doing so. For
    example, a user running 'gedit' to edit some files in a potentially
    insecure directory may not expect that starting the program there will
    cause it to load python files from that directory.

    The 'python' executable itself is not really "vulnerable" in quite the
    same way, because if you (i.e. a developer) start 'python' in some
    directory, you *do* typically expect that it will load code from that
    directory. For applications written *in* python, that have scripts in,
    let's say, /usr/bin, the directory added to the path is /usr/bin, not
    the application's working directory.

    @pitrou
    Member

    pitrou commented May 2, 2009

    I'm not sure we can change the behaviour of PySys_SetArgv() like that.
    At least not in a bugfix release.
    In 2.7/3.1, we could either change PySys_SetArgv(), or introduce a new
    PySys_SetArgvEx() with an additional argument indicating whether
    sys.path should be modified or not. I suggest asking on python-dev first.
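The proposed split can be pictured with a small Python model. This is only a sketch of the idea in the comment above, not the C implementation: the function name `model_set_argv_ex` and the parameter name `updatepath` are chosen here for illustration, and the directory computation is reduced to `os.path.dirname` (no path resolution at all).

```python
import os


def model_set_argv_ex(argv, updatepath, sys_path):
    """Return (new_argv, new_path) for a toy version of the proposal.

    argv       -- list of argument strings; may be empty
    updatepath -- truthy: also prepend the directory of argv[0]
    sys_path   -- the current module search path (not modified in place)
    """
    # An embedding application without arguments still gets a one-element list.
    new_argv = list(argv) if argv else [""]
    new_path = list(sys_path)
    if updatepath:
        # For a bare name such as "gedit" this is '', i.e. the CWD.
        new_path.insert(0, os.path.dirname(new_argv[0]))
    return new_argv, new_path
```

With `updatepath` set to 0 the search path comes back unchanged, which is the behaviour an embedding application would opt into. With 1, a bare program name yields a leading '' entry.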

    @gpshead
    Member

    gpshead commented May 3, 2009

    Both the behavior change and PySys_SetArgvEx() with an additional
    boolean parameter sound good to me.

    Some people may disagree about changing the default behavior. So long
    as it's documented in the whatsnew I personally think it is fine. But
    would doing that require incrementing the API version number?

    +1 on adding a PySys_SetArgvEx() in time for 3.1 (the clock is ticking
    fast).

    +0 on the existing API default change.

    @pitrou
    Member

    pitrou commented May 3, 2009

    By the way, the advantage of a new function over a behaviour change is
    that the new function could safely be backported to 2.6.3, since it is
    also a "security fix".

    @pitrou
    Member

    pitrou commented May 3, 2009

    Here is a patch for trunk.

    @pitrou
    Member

    pitrou commented May 3, 2009

    Jan, would the new API be ok to you?

    @loewis
    Mannequin

    loewis mannequin commented May 5, 2009

    I disagree that this issue is release critical. I'm still skeptical that
    this is a security bug; if it is, any solution created needs to be
    applied to all active branches - including the ones that would be
    blocked by this issue right now. IOW, it's still possible to fix it
    after the release.

    @pitrou
    Member

    pitrou commented May 5, 2009

    Ok, downgrading to critical.
    I'm awaiting the reporter's answer anyway.

    @iankko
    Mannequin Author

    iankko mannequin commented May 5, 2009

    Antoine,

    (re: #msg87083, #msg87084) -- while the API change is acceptable and
    reasonable, it doesn't solve the core of the problem. I understand
    that the change needs to be 'backward compatible' and shouldn't break
    existing Python behavior, but the currently proposed patch:

    1, doesn't avoid the need to fix the issue (by calling
    "PySys_SetArgvEx(argc, argv, 0);") in all current applications embedding
    Python,

    2, doesn't eliminate the risk of future applications embedding the
    Python interpreter and using it in a vulnerable way (in fact, all it
    does is add a recommendation / alternative to use the safer
    PySys_SetArgvEx(*, *, 0) for such cases). I don't think we can just
    rely on developers using it in a safe way in the future -- or did I
    overlook something?

    Wouldn't it be possible to fix it 'only in Python' and prevent such
    potential future malicious (mis)uses?

    To Martin (re: #msg87212):

    As for the 'security nature' of the issue, Glyph in
    message #msg86927 already uncovered the potential implications --
    whether the application was written that way 'by accident' or 'by
    intention', it shouldn't allow anything to be executed with the
    privileges of the superuser and, even worse, do so silently (otherwise
    the only assurance for the unprivileged user would be to rely on the
    function having been called 'in a safe way' in the application, and I
    suppose such an assumption would completely discourage him from
    running it).

    I recommend that the final fix be applied to all active Python
    branches (just a comment on the second part of Martin's comment).

    Regards, Jan.

    @loewis
    Mannequin

    loewis mannequin commented May 5, 2009

    As for the 'security nature' of the issue, Glyph in
    message #msg86927 already uncovered the potential implications --

    The question is whether these are theoretical or real problems.
    I ran gedit (as proposed by Glyph) under strace(1), and it didn't
    try to open any files in the current directory.

    @pitrou
    Member

    pitrou commented May 5, 2009

    Hello Jan,

    1, doesn't avoid the need to fix the issue (by calling
    "PySys_SetArgvEx(argc, argv, 0);") in all current applications embedding
    Python,

    As you said yourself, we don't want to break backwards compatibility for
    C API users -- especially between two minor versions such as 2.6.2 and
    2.6.3. The current behaviour is certainly by design, otherwise it
    wouldn't be so complicated.

    Besides, the patch you proposed is fragile as it relies on a hard coded
    value for the executable name, and it also complexifies the behaviour
    even more. I don't think we should apply it in core Python. On the other
    hand, adding an /explicit/ option in the API minimizes the risk for
    confusion and signals clearly that an alternative is available.

    I don't think we can just
    rely on developers using it in a safe
    way in the future

    Well, you can always shoot yourself in the foot in C, even without using
    the Python API. The patch just provides a practical way for
    Python-embedding applications to be safer. Then, it's up to application
    developers to do their job.

    Wouldn't it be possible to fix it 'only in Python' and prevent such
    potential future malicious (mis)uses?

    AFAICT, not without risking breaking compatibility for perfectly
    well-behaved apps which would rely on the current behaviour.

    @pitrou
    Member

    pitrou commented May 5, 2009

    The question is whether these are theoretical or real problems.
    I ran gedit (as proposed by Glyph) under strace(1), and it didn't
    try to open any files in the current directory.

    You have to use a Python-written gedit plugin for that to happen. For
    example, if I enable the "Python console" plugin, I get the following
    lines in strace:

    17569:open("gconf.so", O_RDONLY) = -1 ENOENT (No such file or directory)
    17570:open("gconfmodule.so", O_RDONLY) = -1 ENOENT (No such file or directory)
    17571:open("gconf.py", O_RDONLY) = -1 ENOENT (No such file or directory)
    17572:open("gconf.pyc", O_RDONLY) = -1 ENOENT (No such file or directory)

    @loewis
    Mannequin

    loewis mannequin commented May 6, 2009

    I wonder why all these applications call PySys_SetArgv at all if they
    don't have any arguments to set. In the gedit case, I just removed the
    call from gedit, and it seems to work fine (sys.argv will be an empty list).

    @glyph
    Mannequin

    glyph mannequin commented May 6, 2009

    It suggests to me that somewhere there's some documentation, or an
    example, that says "this is the right way to embed python, call this
    function".

    If the right thing to do is to just not call the function at all, we
    need to get that knowledge out there into the embedding community and
    publicize this issue. Perhaps a doc bug? PySys_SetArgvEx seems like it
    might be a good idea for applications which do still want to set the
    argument list without the sys.path implications, but a quick perusal of
    the sources of plugins for the affected applications suggests that none
    of them need it.

    @loewis
    Mannequin

    loewis mannequin commented May 6, 2009

    It suggests to me that somewhere there's some documentation, or an
    example, that says "this is the right way to embed python, call this
    function".

    That may be an explanation. However, it would be immensely useful
    to know for sure, from the original authors of one or two such
    applications. Perhaps there is some issue that I'm missing (e.g.
    too much stuff crashes if sys.argv is empty - but what stuff
    could that be?)

    IOW, I *really* want to understand what's happening before fixing
    it. This is a security issue, after all.

    @glyph
    Mannequin

    glyph mannequin commented May 6, 2009

    IOW, I *really* want to understand what's happening before fixing
    it. This is a security issue, after all.

    Agreed. Does anyone currently subscribed to this ticket know the author
    of such an application? It would be very helpful to have them involved
    in the discussion.

    @gpshead
    Member

    gpshead commented May 6, 2009

    gedit does it here:

    http://git.gnome.org/cgit/gedit/tree/plugin-loaders/python/gedit-plugin-loader-python.c#n542

    I've emailed the file's author (Jesse) out of the blue to see if he knows
    why PySys_SetArgv() was called.

    @gpshead
    Member

    gpshead commented May 7, 2009

    re: gedit

    """I'm by no means an expert (I did not design the original python module
    extension), we simply copied from vim at the beginning. That said, it
    seems there are issues if you embed the python interpreter and do not
    explicitly set sys.argv to something.""" - jesse

    @pitrou
    Member

    pitrou commented May 10, 2009

    It seems other projects are already fighting with the path-changing
    behaviour of PySys_SetArgv(), e.g.:

    @akr
    Mannequin

    akr mannequin commented Jun 24, 2009

    src/if_python.c in vim-7.2 has a comment:
    /* Set sys.argv[] to avoid a crash in warn(). */

    I think the crash is as follows.

    % python
    Python 2.5.2 (r252:60911, Jan  4 2009, 17:40:26) 
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import warnings
    >>> warnings.warn("foo")
    __main__:1: UserWarning: foo
    >>> import sys
    >>> sys.argv
    ['']
    >>> sys.argv = []
    >>> sys.argv
    []
    >>> warnings.warn("foo")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/warnings.py", line 54, in warn
        filename = sys.argv[0]
    IndexError: list index out of range
    >>>
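The session above fails because `sys.argv[0]` is read unconditionally. Code that may run inside an embedding application can guard that access. The following is a small defensive sketch, not something proposed in this thread: the helper name `script_name` and the placeholder string `"<embedded>"` are hypothetical.

```python
import sys


def script_name(default="<embedded>"):
    """Return sys.argv[0], or a placeholder if sys.argv is missing or empty."""
    argv = getattr(sys, "argv", None)  # an embedder may never have set it
    if not argv:
        return default
    return argv[0]
```

This only protects the calling code itself. Library code that indexes `sys.argv` directly, like the warnings module in the traceback, still depends on the embedding application having set it.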

    @iankko
    Mannequin Author

    iankko mannequin commented Jul 9, 2009

    Hello guys,

    what's the current state of this issue? The proposed patch still
    hasn't been merged into the upstream Python code, so I'm wondering:
    1, when and if it will be?
    2, whether you have found another solution / patch?

    Thanks && Regards, Jan.

    Jan iankko Lieskovsky

    @pitrou
    Member

    pitrou commented Jul 9, 2009

    Hello,

    what's the current state of this issue? The proposed patch still
    hasn't been merged into the upstream Python code, so I'm wondering:
    1, when and if it will be?

    I was hoping for more feedback before committing it. While it has been
    labeled a security issue, not many people seem to actually care. Distro
    maintainers doing their own patching without communicating with us
    doesn't help either.

    2, whether you have found another solution / patch?

    If it were so this bug would have been closed.

    @thoger
    Mannequin

    thoger mannequin commented Jul 9, 2009

    Have you considered something like this? (patch against 3.1)

    --- Python/sysmodule.c.orig
    +++ Python/sysmodule.c
    @@ -1643,6 +1643,7 @@ PySys_SetArgv(int argc, wchar_t **argv)
     #endif /* Unix */
     		}
     #endif /* All others */
    +		if (n > 0 || argv0 == NULL || wcscmp(argv0, L"-c") == 0) {
     		a = PyUnicode_FromWideChar(argv0, n);
     		if (a == NULL)
     			Py_FatalError("no mem for sys.path insertion");
    @@ -1650,6 +1651,7 @@ PySys_SetArgv(int argc, wchar_t **argv)
     			Py_FatalError("sys.path.insert(0) failed");
     		Py_DECREF(a);
    +		}
     	}
     	Py_DECREF(av);
     }

    I presume the main problem here is that '' may end up as the first item
    in sys.path in certain cases.

    That is desired in some cases, namely:

    • python run in interactive mode
    • python -c '...'

    It does not happen and is not desired in other cases:

    • ./foo.py
    • python foo.py
    • env python foo.py

    Here foo.py can be just a filename or a filename with a relative or
    absolute path. In all these cases python seems to set argv0 to something
    realpath can resolve.

    The problematic case is embedded use, where a bogus argv0 can cause '' to
    be added to sys.path, which is usually not desired / expected (is anyone
    aware of a case where that is expected?). It can be argued whether
    apps should use garbage as argv0, but the example in Demo/embed/demo.c
    does it as well...

    The patch above attempts to skip modification of sys.path when realpath
    failed (n == 0). There are two special cases, which are already treated
    as special in a couple of other places in PySys_SetArgv:

    • argv0 == NULL (interactive python)
    • argv0 == "-c" (python -c)

    This should fix the problem for apps embedding python and providing
    a garbage argv0. It would not make a difference for apps that provide
    some valid path as argv0. I'm not aware of any non-embedded python use
    that would end up with a different sys.path after this patch.

    Ideas? Is anyone aware of a valid use case that can break with this?

    The advantage over the Ex approach is that it does not require changes
    on the embedding apps' side, and should really impact only those setting
    a garbage argv0.
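The decision the patch makes can be approximated in Python. This is a simplified model under stated assumptions, not the code in Python/sysmodule.c: the helper name `sys_path_entry` is hypothetical, "realpath succeeded" is approximated by "the file exists", and the patch's `n == 0` test is modelled as "the directory part is empty".

```python
import os


def sys_path_entry(argv0, patched=False):
    """Return the entry that would be prepended to sys.path, or None if skipped."""
    if argv0 is None or argv0 == "-c":
        # Interactive python and "python -c": '' is kept in both variants.
        return ""
    # Resolve only when the file exists; otherwise argv0 is used as given.
    resolved = os.path.realpath(argv0) if os.path.exists(argv0) else argv0
    directory = os.path.dirname(resolved)
    if patched and directory == "":
        # n == 0: nothing resolvable and no separator, so leave sys.path alone.
        return None
    return directory
```

For a bogus name such as "no-such-app" the unpatched model yields '' and the patched one yields None. A name that happens to exist in the working directory still resolves to the full path of that directory in both variants.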

    @pitrou
    Member

    pitrou commented Jul 12, 2009

    Tomas, your patch is breaking an existing API, which may break existing
    uses (I'm not sure which ones, but people are doing lots of things with
    Python). That's why I proposed a separate API, which has the additional
    benefit of making things clearer rather than muddier.

    Besides, parsing of command line flags is already done in
    Modules/main.c, we shouldn't repeat it in sysmodule.c.

    @thoger
    Mannequin

    thoger mannequin commented Jul 13, 2009

    An additional API has one disadvantage - it requires modifying all
    affected applications embedding python, which is not likely to happen
    soon after the API is introduced.

    Therefore, it may still be worth reviewing current behaviour (that
    seemed to have had no documentation until recently, see issue bpo-5144, and
    can probably still benefit from more warnings related to the embedded
    use) in this corner case (argv0 is bogus and contains no '/') to see if
    it may be worth changing in future python versions.

    As for command line flags, I presume you're referring to the
    'wcscmp(argv0, L"-c")' part of the patch. It's no more than a re-use
    of the pattern already used a couple of times in PySys_SetArgv, which
    got added via:

    http://svn.python.org/view?view=rev&revision=39544

    Again, it's an attempt to make sure this only changes behaviour in
    a rather specific case.

    @pitrou
    Member

    pitrou commented Jul 13, 2009

    Indeed, it would certainly be useful to review current behaviour and
    document it precisely; and then, perhaps change it in order to fix the
    current bug. The problem is that the current behaviour seems to have
    evolved quite organically, and it's not obvious who relies on what (as I
    said, Python has many users). I'm not motivated to do such research
    myself. Perhaps other developers can chime in.

    @pitrou
    Member

    pitrou commented Jul 13, 2009

    Besides, the new API makes the behaviour more explicit and puts the
    decision in the hands of the embedding developer (who certainly knows
    better than we do what he wants to do).
    As the Python Zen says:

    In the face of ambiguity, refuse the temptation to guess.

    @iankko
    Mannequin Author

    iankko mannequin commented Jul 15, 2009

    Link to older Python tracker issue discussing the same problem and
    closed with "won't fix":

    http://bugs.python.org/issue946373
    

    Strangely enough, reading the above issue gave me an
    idea (don't shoot :)). Wouldn't it be possible to recognize
    whether the name of the module that the script | embedded application
    is trying to load conflicts with one of the 'standard' Python module
    names listed in:

    http://docs.python.org/modindex.html

    and in that case:
    a, issue a warning when loading it?
    b, refuse to import it if it doesn't come from the usual
    standard Python modules location?

    Probably off-topic, but is there some mechanism in Python to
    determine whether a module / module name:
    a, belongs to the 'standard Python module set', or
    b, is a custom module written by a Python user?
    (via the Python interpreter's __main__ module namespace
    dictionary? -- based on [1])

    [1] http://www.linuxjournal.com/article/8497

    @thoger
    Mannequin

    thoger mannequin commented Jul 16, 2009

    This is not really the same thing as bpo-946373. That one seems to be
    about adding script's directory as the first thing in sys.path.
    Comments there seem to mix both interactive ('' in sys.path) and
    non-interactive (os.path.dirname(os.path.abspath(sys.argv[0])) in
    sys.path) python uses, while CVE-2008-5983 is only about '' in sys.path,
    mostly related to embedded use, rather than for python interpreter itself.

    @thoger
    Mannequin

    thoger mannequin commented Jan 18, 2010

    Has anyone else had an opportunity to have a look at the change proposed in #msg90336?

    @thoger
    Mannequin

    thoger mannequin commented May 4, 2010

    Can anyone move this to Stage: patch review (for the fix approach proposed in msg90336)? Or does anyone have better idea on how to move this closer to final fix or wontfix / reject? Thank you!

    @pitrou
    Member

    pitrou commented May 4, 2010

    Can anyone move this to Stage: patch review (for the fix approach
    proposed in msg90336)? Or does anyone have better idea on how to move
    this closer to final fix or wontfix / reject? Thank you!

    I stand by my opinion that adding another hack in the initialization
    path will not do us a lot of good, while a separate API would solve the
    problem neatly. Perhaps Dave Malcolm can chime in?

    @bitdancer
    Member

    FWIW I agree with Antoine.

    @davidmalcolm
    Member

    Attempting to summarize IRC discussion about this.

    PySys_SetArgv is used to set up sys.argv. There is plenty of code which assumes that this is a list containing at least a zeroth string element; for example warnings.warn (see msg89688).

    It seems reasonable for a program that embeds Python to have no arguments, and for this case, it makes sense for sys.argv to be [""] (i.e. a list containing a single empty string).

    However, in this case, it doesn't necessarily make sense to prepend the empty string to the front of sys.path.

    Looking through Python/sysmodule.c: if argc is 0 in the call to PySys_SetArgv, it looks like makeargvobject makes sys.argv be [""] (which is good), but it looks like it uses argv[0] (as "argv0") to prepend to sys.path.

    My reading of PySys_SetArgv is that if argv is NULL, then "char *argv0 = argv[0];" will read through NULL and thus will segfault on a typical platform.

    So one possible way to handle this might be to support PySys_SetArgv(0, NULL) as signifying that sys.argv should be set to [""] with no modification of sys.path.

    This Google code search for "pysys_setargv(0" shows 25 hits:
    http://www.google.com/codesearch?hl=en&lr=&q=pysys_setargv\\(0&sbtn=Search

    However, the function is complicated, and adding more special cases seems error-prone.

    I favor Antoine's approach in http://bugs.python.org/file13860/setargvex.patch of adding a new API entry point, whilst maximizing compatibility for all of the code out there using the existing entry point.

    I think that both the old and the new entry point need to have better documentation, in particular, spelling out the meaning of the args, what the effect of argc==0 is, and that argv must be non-NULL in the old entry point, but may be NULL for argc==0 in the new entry point (assuming that I'm reading that correctly).

    @pitrou
    Member

    pitrou commented May 18, 2010

    Ok, I will try to write better documentation.

    @thoger
    Mannequin

    thoger mannequin commented May 18, 2010

    My reading of PySys_SetArgv is that if argv is NULL, then
    "char *argv0 = argv[0];" will read through NULL and thus will
    segfault on a typical platform.

    Right.

    I favor Antoine's approach in
    http://bugs.python.org/file13860/setargvex.patch of adding a new API
    entry point, whilst maximizing compatibility for all of the code out
    there using the existing entry point.

    Sadly, this won't help existing applications affected by this problem, without all of them needing to be changed.

    My change proposed in msg90336 won't help either, at least not in all cases. Apps that call PySys_SetArgv with 1, { "myappname", NULL } can still be tricked into adding the full CWD path at the beginning of sys.path on platforms with realpath().

    @pitrou
    Member

    pitrou commented May 20, 2010

    Here is a new patch giving more details in the doc, and explicitly mentioning the CVE entry.

    @thoger
    Mannequin

    thoger mannequin commented May 21, 2010

    + - If the name of an existing script is passed in ``argv[0]``, its absolute
    + path is prepended to :data:`sys.path`

    Absolute path to the directory where the script is located. And I believe there's no absolute path guarantee for platforms without realpath / GetFullPathName.

    Should the documentation also give some guidance to those that embed python and don't want to start using SetArgvEx right away and break compatibility with older python versions? Something like:

    If you're embedding python in your application, using SetArgv and don't want a modified sys.path, call PyRun_SimpleString("sys.path.pop(0)\n"); after SetArgv to unconditionally drop the first sys.path entry added by SetArgv.

    @pitrou
    Member

    pitrou commented May 21, 2010

    Absolute path to the directory where script is located. And I believe
    there's no absolute path guarantee for platforms without realpath /
    GetFullPathName.

    Yes, this is more precise indeed. As for realpath(), I would expect it
    to be present on modern Unices (man page says "4.4BSD, POSIX.1-2001").

    If you're embedding python in your application, using SetArgv and
    don't want a modified sys.path, call
    PyRun_SimpleString("sys.path.pop(0)\n"); after SetArgv to
    unconditionally drop the first sys.path entry added by SetArgv.

    I suppose
    PyRun_SimpleString("import sys; sys.path.pop(0)\n");
    would be better.
    Thanks for the comments, I'll update the patch.
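At the Python level, the string passed to PyRun_SimpleString simply removes element 0 of sys.path. The sketch below shows that operation on an ordinary list, together with a more cautious variant that removes the entry only when it is ''. The helper name `drop_prepended_entry` and the `only_if_empty` switch are assumptions made for this illustration. The thread itself proposes the unconditional form.

```python
def drop_prepended_entry(path, only_if_empty=False):
    """Remove and return the first entry of a sys.path-like list.

    With only_if_empty=True the entry is removed only when it is '',
    that is, when it stands for the current working directory.
    Returns None when nothing was removed.
    """
    if not path:
        return None
    if only_if_empty and path[0] != "":
        return None
    return path.pop(0)
```

The unconditional form matches the suggested workaround, and it is only correct when called right after SetArgv has actually prepended something. Otherwise it would drop an unrelated entry.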

    @pitrou
    Member

    pitrou commented May 21, 2010

    Committed in r81398 (trunk), r81399 (2.6), r81400 (py3k), r81401 (3.1). Thank you!

    @pitrou pitrou closed this as completed May 21, 2010
    @akuchling
    Member

    Demo/embed/demo.c calls PySys_SetArgv(), which may be where
    some people are copying their code from. I've updated it to
    use PySys_SetArgvEx() and added an explanatory comment in rev. 81881.

    @akuchling
    Member

    Since the function was also added to 2.6, the 2.6 What's New should mention it; added in rev81887.

    @jcea
    Member

    jcea commented Sep 28, 2010

    This issue is equivalent to MS Windows DLL hijacking (the MS situation is worse, because the DLL can be on network shares or even on remote WebDAV servers):

    http://blog.metasploit.com/2010/08/exploiting-dll-hijacking-flaws.html
    http://news.cnet.com/8301-27080_3-20014625-245.html

    When I learned about this attack, my first thought was "what if sys.path.index('')>=0?". Arg!.
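A note on the check itself: `sys.path.index('')` raises ValueError when the entry is absent rather than returning a negative number, so a membership test is the simpler way to ask the question. The sketch below lists every entry of a search path that resolves to the current working directory, including spellings such as '.'. The helper name `cwd_entries` is hypothetical.

```python
import os


def cwd_entries(path):
    """Return the entries of a sys.path-like list that point at the CWD."""
    cwd = os.path.realpath(os.getcwd())
    # '' is the implicit spelling; '.' or an absolute path can mean the same.
    return [entry for entry in path
            if entry == "" or os.path.realpath(entry) == cwd]
```

Calling `cwd_entries(sys.path)` (after `import sys`) inside an embedded interpreter gives a quick view of whether the working directory is being searched for modules.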
