This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Improve is_cgi() in http.server
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, ethan.furman, kkangshawn, martin.panter, miss-islington, vstinner
Priority: normal Keywords: patch

Created on 2019-11-20 14:01 by kkangshawn, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
sample.tar.xz kkangshawn, 2019-11-20 14:01 Sample script including html and python cgi
Pull Requests
URL Status Linked Edit
PR 17312 merged kkangshawn, 2019-11-21 06:38
Messages (12)
msg357074 - (view) Author: Siwon Kang (kkangshawn) * Date: 2019-11-20 14:01
is_cgi() in CGIHTTPRequestHandler class separates given path into (dir, rest) then checks if dir is in cgi_directories. However, it divides based on the first seen '/', multi-level directories like /sub/dir/cgi-bin/hello.py is divided into head=/sub, rest=dir/cgi-bin/hello.py then check whether '/sub' exists in cgi_directories = [..., '/sub/dir/cgi-bin'].
If the function divides by last seen '/', it works correctly as head=/sub/dir/cgi-bin, rest=hello.py
msg357081 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-11-20 15:44
CGI programs are stored in a directory which must be configured in the web server. The path is typically SERVER_ROOT/cgi-bin, so the URL looks like
http://www.domain/cgi-bin/script

So IMHO, is_cgi's assumption is correct.
IMHO, this is not the wrong code.
msg357082 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-11-20 15:47
In addition, the code is not about the path on the file system,
but about the web request path.

https://bugs.python.org/msg216960 will help you to understand.
msg357084 - (view) Author: Siwon Kang (kkangshawn) * Date: 2019-11-20 16:56
Thank you for your message and the info about 21323. I agree with the idea that cgi files are conventionally placed at the cgi-bin of the root but there is no explicit regulation so other servers, apache for example, handle this kind of sub directories successfully. In short, there is no violation in the form of /subdir/cgi-bin so this is nice to have if the http.server processes this case correctly. By the way, I didn't consider the case of continuous slashes so PR must be modified.
msg357089 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-11-20 17:29
Yes, IMHO, but this code is related to the http.server.CGIHTTPRequestHandler.
This code looks like to be executed on the http.server.CGIHTTPRequestHandler not the apache server.

According to docs,
This defaults to ['/cgi-bin', '/htbin'] and describes directories to treat as containing CGI scripts.

reference: https://docs.python.org/3.9/library/http.server.html?highlight=cgihttprequesthandler#http.server.CGIHTTPRequestHandler.cgi_directories
msg357135 - (view) Author: Siwon Kang (kkangshawn) * Date: 2019-11-21 07:02
Hi Donghee,
Since you said this is not a bug, I changed the title describing this is a matter of improvement.

For your comment, I would say sorry first that I have made you confused. My mention about apache is just to give you an example for the other module that does similar thing(HTTP request handler with cgi support). I would never say the Python server shall support the sub-directory parsing for cgi scripts because apache does but I am saying good is good.

>> Yes, IMHO, but this code is related to the http.server.CGIHTTPRequestHandler.
This code looks like to be executed on the http.server.CGIHTTPRequestHandler not the apache server.

So you may see the PR changes the CGIHTTPRequestHandler class. Please refer to the code.

>> According to docs,
This defaults to ['/cgi-bin', '/htbin'] and describes directories to treat as containing CGI scripts.

This is the one I was looking at. As described, I added '/sub/dir/cgi-bin' into the cgi_directories expecting it to be treated the directory is for CGI scripts but the CGIHTTPRequestHandler does not process it. That is the reason for this issue report. Again, there is no note that restrict the path in cgi_directories shall be a single depth, so I think this will definitely give a benefit to handle the multi-level directories just like https://bugs.python.org/issue21323 enables processing the sub directories of /cgi-bin.
Does this make sense to you now?
msg357136 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-11-21 07:13
Thanks, Siwon Kang, I now understood what you want to say.
I left some comments on your PR 17312.
msg357137 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-11-21 07:20
@martin.panter, @asvetlov

I add the core developers who looks like to be experts on this module.
Can you follow up on Siwon's PR?

Thanks for your understanding.
msg357256 - (view) Author: miss-islington (miss-islington) Date: 2019-11-22 09:13
New changeset 91daa9d7224626dad4bb820924c01b3438ca6e3f by Miss Islington (bot) (Siwon Kang) in branch 'master':
bpo-38863: Improve is_cgi() in http.server (GH-17312)
https://github.com/python/cpython/commit/91daa9d7224626dad4bb820924c01b3438ca6e3f
msg357785 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-04 09:31
The change introduced a reference leak, see bpo-38962. Pablo fixed it:

commit 24f5cac7254177a4c9956d680c0a9b6dadd85c6f (HEAD -> master, upstream/master)
Author: Pablo Galindo <Pablogsal@gmail.com>
Date:   Wed Dec 4 09:29:10 2019 +0000

    bpo-38962: Fix reference leak in test_httpservers (GH-17454)
msg357787 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-12-04 09:47
Thanks, Pablo!
msg357790 - (view) Author: Siwon Kang (kkangshawn) * Date: 2019-12-04 09:58
Obviously, I should have taken back what I added to cgi_directories. Thank you Pablo and Victor!
History
Date User Action Args
2022-04-11 14:59:23adminsetgithub: 83044
2020-07-20 20:51:46Rhodri Jamessetnosy: - Rhodri James
2019-12-04 09:58:41kkangshawnsetmessages: + msg357790
2019-12-04 09:47:51asvetlovsetmessages: + msg357787
2019-12-04 09:31:41vstinnersetnosy: + vstinner
messages: + msg357785
2019-11-22 09:14:33asvetlovsetstatus: open -> closed
stage: patch review -> resolved
resolution: fixed
versions: - Python 3.7, Python 3.8
2019-11-22 09:13:10miss-islingtonsetnosy: + miss-islington
messages: + msg357256
2019-11-22 07:19:30corona10setnosy: + ethan.furman, Rhodri James, - corona10
2019-11-21 07:20:56corona10setnosy: + asvetlov, martin.panter

messages: + msg357137
versions: - Python 3.6
2019-11-21 07:13:41corona10setmessages: + msg357136
2019-11-21 07:02:34kkangshawnsettype: behavior -> enhancement
messages: + msg357135
title: http.server is_cgi() does not correctly separate dir -> Improve is_cgi() in http.server
2019-11-21 06:40:44kkangshawnsetpull_requests: - pull_request16785
2019-11-21 06:38:27kkangshawnsetpull_requests: + pull_request16800
2019-11-20 17:29:58corona10setmessages: + msg357089
2019-11-20 16:56:50kkangshawnsetmessages: + msg357084
2019-11-20 15:47:45corona10setmessages: + msg357082
2019-11-20 15:44:52corona10setnosy: + corona10
messages: + msg357081
2019-11-20 14:53:59kkangshawnsetkeywords: + patch
stage: patch review
pull_requests: + pull_request16785
2019-11-20 14:01:19kkangshawncreate