Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve is_cgi() in http.server #83044

Closed
kkangshawn mannequin opened this issue Nov 20, 2019 · 12 comments
Closed

Improve is_cgi() in http.server #83044

kkangshawn mannequin opened this issue Nov 20, 2019 · 12 comments
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@kkangshawn
Copy link
Mannequin

kkangshawn mannequin commented Nov 20, 2019

BPO 38863
Nosy @vstinner, @asvetlov, @ethanfurman, @vadmium, @miss-islington, @kkangshawn
PRs
  • bpo-38863: Improve is_cgi() in http.server #17312
  • Files
  • sample.tar.xz: Sample script including html and python cgi
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-11-22.09:14:33.017>
    created_at = <Date 2019-11-20.14:01:19.636>
    labels = ['type-feature', 'library', '3.9']
    title = 'Improve is_cgi() in http.server'
    updated_at = <Date 2020-07-20.20:51:46.154>
    user = 'https://github.com/kkangshawn'

    bugs.python.org fields:

    activity = <Date 2020-07-20.20:51:46.154>
    actor = 'Rhodri James'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-11-22.09:14:33.017>
    closer = 'asvetlov'
    components = ['Library (Lib)']
    creation = <Date 2019-11-20.14:01:19.636>
    creator = 'kkangshawn'
    dependencies = []
    files = ['48725']
    hgrepos = []
    issue_num = 38863
    keywords = ['patch']
    message_count = 12.0
    messages = ['357074', '357081', '357082', '357084', '357089', '357135', '357136', '357137', '357256', '357785', '357787', '357790']
    nosy_count = 6.0
    nosy_names = ['vstinner', 'asvetlov', 'ethan.furman', 'martin.panter', 'miss-islington', 'kkangshawn']
    pr_nums = ['17312']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue38863'
    versions = ['Python 3.9']

    @kkangshawn
    Copy link
    Mannequin Author

    kkangshawn mannequin commented Nov 20, 2019

    is_cgi() in CGIHTTPRequestHandler class separates given path into (dir, rest) then checks if dir is in cgi_directories. However, it divides based on the first seen '/', multi-level directories like /sub/dir/cgi-bin/hello.py is divided into head=/sub, rest=dir/cgi-bin/hello.py then check whether '/sub' exists in cgi_directories = [..., '/sub/dir/cgi-bin'].
    If the function divides by last seen '/', it works correctly as head=/sub/dir/cgi-bin, rest=hello.py

    @kkangshawn kkangshawn mannequin added type-bug An unexpected behavior, bug, or error 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir labels Nov 20, 2019
    @corona10
    Copy link
    Member

    CGI programs are stored in a directory which must be configured in the web server. The path is typically SERVER_ROOT/cgi-bin, so the URL looks like
    http://www.domain/cgi-bin/script

    So IMHO, is_cgi's assumption is correct.
    IMHO, this is not the wrong code.

    @corona10
    Copy link
    Member

    In addition, the code is not about the path on the file system,
    but about the web request path.

    https://bugs.python.org/msg216960 will help you to understand.

    @kkangshawn
    Copy link
    Mannequin Author

    kkangshawn mannequin commented Nov 20, 2019

    Thank you for your message and the info about 21323. I agree with the idea that cgi files are conventionally placed at the cgi-bin of the root but there is no explicit regulation so other servers, apache for example, handle this kind of sub directories successfully. In short, there is no violation in the form of /subdir/cgi-bin so this is nice to have if the http.server processes this case correctly. By the way, I didn't consider the case of continuous slashes so PR must be modified.

    @corona10
    Copy link
    Member

    Yes, IMHO, but this code is related to the http.server.CGIHTTPRequestHandler.
    This code looks like to be executed on the http.server.CGIHTTPRequestHandler not the apache server.

    According to docs,
    This defaults to ['/cgi-bin', '/htbin'] and describes directories to treat as containing CGI scripts.

    reference: https://docs.python.org/3.9/library/http.server.html?highlight=cgihttprequesthandler#http.server.CGIHTTPRequestHandler.cgi_directories

    @kkangshawn
    Copy link
    Mannequin Author

    kkangshawn mannequin commented Nov 21, 2019

    Hi Donghee,
    Since you said this is not a bug, I changed the title describing this is a matter of improvement.

    For your comment, I would say sorry first that I have made you confused. My mention about apache is just to give you an example for the other module that does similar thing(HTTP request handler with cgi support). I would never say the Python server shall support the sub-directory parsing for cgi scripts because apache does but I am saying good is good.

    > Yes, IMHO, but this code is related to the http.server.CGIHTTPRequestHandler.
    This code looks like to be executed on the http.server.CGIHTTPRequestHandler not the apache server.

    So you may see the PR changes the CGIHTTPRequestHandler class. Please refer to the code.

    > According to docs,
    This defaults to ['/cgi-bin', '/htbin'] and describes directories to treat as containing CGI scripts.

    This is the one I was looking at. As described, I added '/sub/dir/cgi-bin' into the cgi_directories expecting it to be treated the directory is for CGI scripts but the CGIHTTPRequestHandler does not process it. That is the reason for this issue report. Again, there is no note that restrict the path in cgi_directories shall be a single depth, so I think this will definitely give a benefit to handle the multi-level directories just like https://bugs.python.org/issue21323 enables processing the sub directories of /cgi-bin.
    Does this make sense to you now?

    @kkangshawn kkangshawn mannequin changed the title http.server is_cgi() does not correctly separate dir Improve is_cgi() in http.server Nov 21, 2019
    @kkangshawn kkangshawn mannequin added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Nov 21, 2019
    @corona10
    Copy link
    Member

    Thanks, Siwon Kang, I now understood what you want to say.
    I left some comments on your PR 17312.

    @corona10
    Copy link
    Member

    @martin.panter, @asvetlov

    I add the core developers who looks like to be experts on this module.
    Can you follow up on Siwon's PR?

    Thanks for your understanding.

    @miss-islington
    Copy link
    Contributor

    New changeset 91daa9d by Miss Islington (bot) (Siwon Kang) in branch 'master':
    bpo-38863: Improve is_cgi() in http.server (GH-17312)
    91daa9d

    @asvetlov asvetlov removed 3.7 (EOL) end of life 3.8 only security fixes labels Nov 22, 2019
    @vstinner
    Copy link
    Member

    vstinner commented Dec 4, 2019

    The change introduced a reference leak, see bpo-38962. Pablo fixed it:

    commit 24f5cac (HEAD -> master, upstream/master)
    Author: Pablo Galindo <Pablogsal@gmail.com>
    Date: Wed Dec 4 09:29:10 2019 +0000

    bpo-38962: Fix reference leak in test_httpservers (GH-17454)
    

    @asvetlov
    Copy link
    Contributor

    asvetlov commented Dec 4, 2019

    Thanks, Pablo!

    @kkangshawn
    Copy link
    Mannequin Author

    kkangshawn mannequin commented Dec 4, 2019

    Obviously, I should have taken back what I added to cgi_directories. Thank you Pablo and Victor!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants