classification
Title: Faster urllib.urlparse utility functions
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2015-03-02 15:05 by serhiy.storchaka, last changed 2015-03-03 18:29 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded
urlparse_split_faster.patch serhiy.storchaka, 2015-03-02 15:05
Messages (2)
msg237048 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2015-03-02 15:05
The proposed patch optimizes the split* utility functions in the urllib.parse module. Timings before and after the patch:

$ ./python -m timeit -s "from urllib.parse import splittype" -- "splittype('type:'+'x'*1000)"
Unpatched: 100000 loops, best of 3: 17 usec per loop
Patched:   100000 loops, best of 3: 15 usec per loop

$ ./python -m timeit -s "from urllib.parse import splithost" -- "splithost('//www.example.org:80/foo/bar/baz.html')"
Unpatched: 100000 loops, best of 3: 12.7 usec per loop
Patched:   100000 loops, best of 3: 10.6 usec per loop

$ ./python -m timeit -s "from urllib.parse import splithost" -- "splithost('//www.example.org:80')"
Unpatched: 100000 loops, best of 3: 9.34 usec per loop
Patched:   100000 loops, best of 3: 9.09 usec per loop

$ ./python -m timeit -s "from urllib.parse import splituser" -- "splituser('username:password@example.org:80')"
Unpatched: 100000 loops, best of 3: 8.76 usec per loop
Patched:   100000 loops, best of 3: 3.1 usec per loop

$ ./python -m timeit -s "from urllib.parse import splituser" -- "splituser('example.org:80')"
Unpatched: 100000 loops, best of 3: 5.89 usec per loop
Patched:   100000 loops, best of 3: 1.98 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitpasswd" -- "splitpasswd('username:password')"
Unpatched: 100000 loops, best of 3: 7.38 usec per loop
Patched:   100000 loops, best of 3: 3.08 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitpasswd" -- "splitpasswd('username')"
Unpatched: 100000 loops, best of 3: 5.35 usec per loop
Patched:   100000 loops, best of 3: 1.92 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitnport" -- "splitnport('example.org:80')"
Unpatched: 100000 loops, best of 3: 13.2 usec per loop
Patched:   100000 loops, best of 3: 6.58 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitnport" -- "splitnport('example.org')"
Unpatched: 100000 loops, best of 3: 6.03 usec per loop
Patched:   100000 loops, best of 3: 2.37 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitquery" -- "splitquery('/path?query')"
Unpatched: 100000 loops, best of 3: 8.03 usec per loop
Patched:   100000 loops, best of 3: 3.01 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitquery" -- "splitquery('/path')"
Unpatched: 100000 loops, best of 3: 5.21 usec per loop
Patched:   1000000 loops, best of 3: 1.91 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitvalue" -- "splitvalue('attr=value')"
Unpatched: 100000 loops, best of 3: 7.37 usec per loop
Patched:   100000 loops, best of 3: 2.97 usec per loop

$ ./python -m timeit -s "from urllib.parse import splitvalue" -- "splitvalue('attr')"
Unpatched: 100000 loops, best of 3: 5.13 usec per loop
Patched:   1000000 loops, best of 3: 1.9 usec per loop

These functions are not documented, but they are used in the stdlib and in third-party code.
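
For readers who want a feel for the kind of change that can produce speedups of this size, here is a minimal sketch, assuming the gain comes from replacing a compiled regular expression with str.rpartition. It is not copied from the patch: the names splituser_regex, splituser_partition and compare are hypothetical, and the regular expression is only illustrative.

```python
import re
import timeit

# Illustrative pattern: everything before the last '@', then the rest.
_userprog = re.compile(r'(.*)@(.*)', re.DOTALL)


def splituser_regex(host):
    """Regex-based variant: 'user[:passwd]@host[:port]' -> (user, host[:port])."""
    match = _userprog.fullmatch(host)
    if match:
        return match.group(1, 2)
    return None, host


def splituser_partition(host):
    """Same result using str.rpartition, which splits on the last '@'."""
    user, delim, host = host.rpartition('@')
    return (user if delim else None), host


def compare(number=100_000):
    """Return the best-of-3 time per call, in microseconds, for each variant."""
    results = {}
    for func in (splituser_regex, splituser_partition):
        timer = timeit.Timer(lambda: func('username:password@example.org:80'))
        best = min(timer.repeat(repeat=3, number=number))
        results[func.__name__] = best / number * 1e6
    return results


if __name__ == "__main__":
    for name, usec in compare().items():
        print(f"{name}: {usec:.2f} usec per call")
```

Both variants return the same tuples for the inputs used in the timings above; absolute numbers from compare() depend on the machine and the Python build, so they are not expected to match the figures reported here.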
msg237156 - Author: Roundup Robot (python-dev) Date: 2015-03-03 18:22
New changeset 461afc24fabc by Serhiy Storchaka in branch 'default':
Issue #23563: Optimized utility functions in urllib.parse.
https://hg.python.org/cpython/rev/461afc24fabc
History
Date User Action Args
2015-03-03 18:29:17 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2015-03-03 18:22:33 python-dev set nosy: + python-dev; messages: + msg237156
2015-03-02 15:05:03 serhiy.storchaka create