
Author steven.daprano
Recipients SilentGhost, devkral, orsenthil, steven.daprano
Date 2018-12-03.22:51:57
Message-id <1543877517.42.0.788709270274.issue35377@psf.upfronthosting.co.za>
In-reply-to
Content
I'm changing the name to better describe the problem, and to suggest a better solution.

The urlparse.urlsplit and .urlunsplit functions (urllib.parse in Python 3) currently don't validate the scheme argument, if given. According to RFC 1738:

   Scheme names consist of a sequence of characters. The lower case
   letters "a"--"z", digits, and the characters plus ("+"), period
   ("."), and hyphen ("-") are allowed. For resiliency, programs
   interpreting URLs should treat upper case letters as equivalent to
   lower case in scheme names (e.g., allow "HTTP" as well as "http").

https://www.ietf.org/rfc/rfc1738.txt

If the scheme is specified, I suggest it should be normalised to lowercase and validated, something like this:

    # untested
    if scheme:
        # scheme_chars already defined in module
        badchars = set(scheme) - set(scheme_chars)
        if badchars:
            raise ValueError('"%c" is invalid in URL schemes' % badchars.pop())
        scheme = scheme.lower()


This will help avoid errors such as passing 'http://' as the scheme.
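
As a standalone illustration of the check suggested above, here is a minimal sketch that does not touch the module itself. The names normalise_scheme and ALLOWED_SCHEME_CHARS are hypothetical, and the allowed set is an assumption built from the RFC 1738 text quoted earlier (with upper case letters accepted and then folded to lower case), not the module's own scheme_chars.

```python
import string

# Assumption: allowed characters follow the RFC 1738 text quoted above
# (letters, digits, "+", ".", "-"); upper case is accepted and lowered.
ALLOWED_SCHEME_CHARS = frozenset(string.ascii_letters + string.digits + "+.-")


def normalise_scheme(scheme):
    """Return the scheme in lower case, or raise ValueError on a bad character.

    Hypothetical helper for illustration; not part of urllib.parse.
    """
    if not scheme:
        # An empty scheme means "not given", so there is nothing to validate.
        return scheme
    badchars = set(scheme) - ALLOWED_SCHEME_CHARS
    if badchars:
        # Sort the offending characters so the message is deterministic.
        raise ValueError(
            "invalid character(s) in URL scheme %r: %s"
            % (scheme, ", ".join(repr(c) for c in sorted(badchars)))
        )
    return scheme.lower()
```

With this helper, normalise_scheme("HTTP") gives "http", while normalise_scheme("http://") raises ValueError because ":" and "/" are not allowed scheme characters.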
History
Date                 User            Action  Args
2018-12-03 22:51:57  steven.daprano  set     recipients: + steven.daprano, orsenthil, SilentGhost, devkral
2018-12-03 22:51:57  steven.daprano  set     messageid: <1543877517.42.0.788709270274.issue35377@psf.upfronthosting.co.za>
2018-12-03 22:51:57  steven.daprano  link    issue35377 messages
2018-12-03 22:51:57  steven.daprano  create