Title: mimetypes.guess_type("//") misinterprets host name as file name
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8, Python 3.7
Status: closed Resolution: fixed
Dependencies: 35939 Superseder:
Assigned To: Nosy List: corona10, martin.panter, maxking, miss-islington, vstinner
Priority: normal Keywords: patch

Created on 2014-09-06 02:52 by martin.panter, last changed 2019-09-05 01:44 by corona10. This issue is now closed.

File name Uploaded Description
mimetypes-host.patch martin.panter, 2015-02-24 05:53 review
Pull Requests
URL Status Linked
PR 15522 merged corona10, 2019-08-26 16:04
PR 15685 merged miss-islington, 2019-09-05 00:34
PR 15687 merged corona10, 2019-09-05 00:49
Messages (9)
msg226467 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-09-06 02:52
The documentation says that guess_type() takes a URL, but:

>>> mimetypes.guess_type("")
('application/x-msdownload', None)

I suspect the MS download type refers to *.com files (like those used by DOS). My current workaround is to strip the host name out of the URL, since I cannot imagine it being useful for determining the content type. I am also stripping the fragment part. An argument could probably be made for stripping the “;parameters” and “?query” parts as well.

>>> # Workaround for mimetypes.guess_type("//")
... # interpreting host name as file name
... url = urlparse("")
>>> url = net.url_replace(url, netloc="", fragment="")
>>> url
>>> mimetypes.guess_type(url, strict=False)
(None, None)
msg236479 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-24 05:53
Posting a patch to fix this. It passes the URL through a urlsplit() → urlunsplit() stage, while removing the scheme://netloc parts.
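
For readers without the patch at hand, the urlsplit() → urlunsplit() idea can be sketched roughly as below. This is not the patch itself: the helper name `guess_type_ignoring_host` and the `example.com` URLs are made up for illustration, and dropping the query and fragment as well is an extra assumption of the sketch, following the suggestion in the first message.

```python
import mimetypes
from urllib.parse import urlsplit, urlunsplit


def guess_type_ignoring_host(url, strict=True):
    """Guess a MIME type from the path of `url` only (hypothetical helper)."""
    parts = urlsplit(url)
    # Rebuild the URL with scheme and netloc blanked out, so the host name
    # can never be mistaken for a file name. Dropping the query and the
    # fragment too is an assumption of this sketch, not part of the patch.
    path_only = urlunsplit(("", "", parts.path, "", ""))
    return mimetypes.guess_type(path_only, strict=strict)


# A host-only URL has an empty path, so there is no extension to look up.
print(guess_type_ignoring_host("http://example.com"))  # (None, None)
# With a real path, the guess is based on the path's extension.
print(guess_type_ignoring_host("http://example.com/index.html#top"))
```

Because only `parts.path` survives the round trip, a ".com" at the end of the host name never reaches the extension lookup.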
msg335123 - (view) Author: Dong-hee Na (corona10) * (Python triager) Date: 2019-02-09 02:15
The proposed patch I mentioned on bpo-35939 also solves the above situation.

Python 3.8.0a1+ (heads/bpo-12317:96d37dbcd2, Feb  8 2019, 12:03:40)
[Clang 9.1.0 (clang-902.0.39.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> mimetypes.guess_type("")
(None, None)
>>> mimetypes.guess_type("")
('application/x-msdownload', None)

I've also added the unit tests from mimetypes-host.patch. It works well.
I think we can close this issue as well once bpo-35939 is closed.

Thanks a lot!
msg351156 - (view) Author: miss-islington (miss-islington) Date: 2019-09-05 00:34
New changeset 87bd2071c756188b6cd577889fb1682831142ceb by Miss Islington (bot) (Dong-hee Na) in branch 'master':
bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522)
msg351157 - (view) Author: miss-islington (miss-islington) Date: 2019-09-05 00:55
New changeset 6d7a786d2e4b48a6b50614e042ace9ff996f0238 by Miss Islington (bot) in branch '3.8':
bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522)
msg351158 - (view) Author: miss-islington (miss-islington) Date: 2019-09-05 01:16
New changeset 8873bff2871078e9f23e6c7d942d3a8edbd0921f by Miss Islington (bot) (Dong-hee Na) in branch '3.7':
[3.7] bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522) (GH-15687)
msg351162 - (view) Author: Dong-hee Na (corona10) * (Python triager) Date: 2019-09-05 01:26
@vstinner (my mentor) @maxking
This issue is now solved.
I'd like to close this issue. Is that okay?
msg351164 - (view) Author: Abhilash Raj (maxking) * (Python committer) Date: 2019-09-05 01:29
I think so, yes.

Also, while you are at it, can you close bpo-35939 with a comment that points to this issue and the right PR for the fix?
msg351167 - (view) Author: Dong-hee Na (corona10) * (Python triager) Date: 2019-09-05 01:34
Great! I will close bpo-35939 also.
Date User Action Args
2019-09-05 12:19:43 corona10 link issue35939 superseder
2019-09-05 01:44:00 corona10 set status: open -> closed
resolution: fixed
2019-09-05 01:34:56 corona10 set stage: patch review -> resolved
2019-09-05 01:34:34 corona10 set messages: + msg351167
2019-09-05 01:29:06 maxking set messages: + msg351164
2019-09-05 01:26:52 corona10 set nosy: + vstinner, maxking
messages: + msg351162
2019-09-05 01:16:41 miss-islington set messages: + msg351158
2019-09-05 00:55:01 miss-islington set messages: + msg351157
2019-09-05 00:49:06 corona10 set pull_requests: + pull_request15345
2019-09-05 00:34:48 miss-islington set pull_requests: + pull_request15343
2019-09-05 00:34:39 miss-islington set nosy: + miss-islington
messages: + msg351156
2019-08-26 16:04:37 corona10 set stage: patch review
pull_requests: + pull_request15206
2019-02-09 02:19:48 corona10 set versions: + Python 3.7, Python 3.8, - Python 3.4
2019-02-09 02:15:26 corona10 set nosy: + corona10
messages: + msg335123
2019-02-08 23:25:29 martin.panter set dependencies: + Remove urllib.parse._splittype from mimetypes.guess_type
2015-02-24 05:53:55 martin.panter set files: + mimetypes-host.patch
keywords: + patch
messages: + msg236479
2014-09-06 02:52:37 martin.panter create