Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http.server._url_collapse_path should live elsewhere #49964

Open
gpshead opened this issue Apr 7, 2009 · 8 comments
Open

http.server._url_collapse_path should live elsewhere #49964

gpshead opened this issue Apr 7, 2009 · 8 comments
Labels
easy stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@gpshead
Copy link
Member

gpshead commented Apr 7, 2009

BPO 5714
Nosy @gpshead, @orsenthil, @bitdancer, @vadmium, @tahia-khan
Files
  • issue5714_withdoc.diff: Moved CGIHTTPServer._url_collapse_path_split -> urlparse.url_collapse_path_split. Changed docs for urlparse.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2009-04-07.00:24:03.677>
    labels = ['easy', 'type-feature', 'library']
    title = 'http.server._url_collapse_path should live elsewhere'
    updated_at = <Date 2019-09-10.21:01:24.992>
    user = 'https://github.com/gpshead'

    bugs.python.org fields:

    activity = <Date 2019-09-10.21:01:24.992>
    actor = 'ta1hia'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2009-04-07.00:24:03.677>
    creator = 'gregory.p.smith'
    dependencies = []
    files = ['13644']
    hgrepos = []
    issue_num = 5714
    keywords = ['patch', 'easy']
    message_count = 8.0
    messages = ['85679', '85699', '85701', '85715', '85729', '86872', '221756', '252079']
    nosy_count = 6.0
    nosy_names = ['gregory.p.smith', 'orsenthil', 'r.david.murray', 'chin', 'martin.panter', 'ta1hia']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue5714'
    versions = ['Python 3.5']

    @gpshead
    Copy link
    Member Author

    gpshead commented Apr 7, 2009

    CGIHTTPServer._url_collapse_path_split should probably live in a more
    general library and become a documented API. Perhaps in urlparse?

    @gpshead gpshead added easy type-feature A feature request or enhancement labels Apr 7, 2009
    @chin
    Copy link
    Mannequin

    chin mannequin commented Apr 7, 2009

    I'd Wrote a patch for this.

    Perhaps some unittests are needed.

    @bitdancer
    Copy link
    Member

    Yes, unit tests are always needed for any change.

    Gregory, why is a function that does two things (collapse and split) a
    good candidate for a public API? It results in a pretty awkward name :)

    I'm not saying it's necessarily a bad idea, just looking for motivation.

    @bitdancer bitdancer added the stdlib Python modules in the Lib dir label Apr 7, 2009
    @chin
    Copy link
    Mannequin

    chin mannequin commented Apr 7, 2009

    Actually urlparse.urljoin implements RFC 2396

    Is it true that 'CGIHTTPServer._url_collapse_path_split' is just a
    inverted 'urlparse.urljoin' ?
    """
    >>> urlparse.urljoin('http://a/b/c/','g')
    'http://a/b/c/g'
    >>> urlparse.url_collapse_path_split('http://a/b/c/g')
    ('/http:/a/b/c', 'g')
    >>> urlparse.urlsplit('http://a/b/c/g')
    SplitResult(scheme='http', netloc='a', path='/b/c/g', query='', fragment='')
    """

    And there is existing function 'urlparse.urlsplit'.
    I think "CGIHTTPServer._url_collapse_path_split" is just a customized
    version of latter, and should be rewritten using 'urlparse.urlsplit'.

    @gpshead
    Copy link
    Member Author

    gpshead commented Apr 7, 2009

    urlparse.urljoin and urlparse.urlsplit do not do what is required for
    this function. urljoin does not collapse paths. urlsplit has nothing
    to do with paths.

    I agree r.david.murray that it is odd that it does two functions at once
    (the collapse and the split). I wrote it specifically for its current
    use case when checking paths to cgi scripts.

    The unittests for it describe the exact behavior it needs to implement.
    Trying to implement a separate collapse function is approximately the
    same amount of code because the edge cases such as
    '/a/b/' and '/a/b/c/..' which both need to result in it returning
    ('/a/b', '') instead of ('/a', 'b') are why it made sense to keep as a
    single function for its current use.

    Unittests for the function already exist in Lib/test/test_httpservers.py
    to describe its full behavior.

    @orsenthil
    Copy link
    Member

    Yes, the parsing best be done in urlparse.

    As this function claims to do step 6 of RFC2396, I am surprised why it
    cannot be done by using existing urlparse functions itself.(Because, the
    tests for RFC2396 compliance covers those cases too).

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jun 28, 2014

    I've tried investigating this but I've got lost so I'll have to pass the buck. I've got confused because of the code moving around in Python 2 and the library changes going to Python 3 :-(

    @berkerpeksag berkerpeksag changed the title CGIHTTPServer._url_collapse_path_split should live elsewhere http.server._url_collapse_path should live elsewhere Jun 30, 2014
    @vadmium
    Copy link
    Member

    vadmium commented Oct 2, 2015

    Some ideas for this sort of function (or perhaps a few related functions):

    • Allow OS-independent handling of the path. E.g. the CGI server still has to check if each segment is a CGI script, the rest becomes PATH_INFO. Perhaps return a sequence of path segments rather than a string.
    • Allow easy conversion to a valid OS path (or perhaps a pathlib object)
    • Ensure that invalid and special OS filenames trigger an error (e.g. "NUL" or unsupported characters, including backslashes \, on Windows). Like a stricter version of pathlib’s is_reserved().
    • Handle percent decoding

    Also, see the SimpleHTTPRequestHandler.translate_path() method, and bpo-14567. There seems to be a lot of redundancy.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    easy stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants