Messages (3)
msg82992 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2009-03-01 23:47
Sorry for the harsh words, but when I found that code I nearly freaked
out.  For all those years I was using "from mimetypes import guess_type"
until today I found out that this has horrendous performance problems
due to the fact that the mimetype database is re-parsed on each call.

The reason for this is that mimetypes.guess_type is implemented like this:

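def guess_type(url, strict=True):
    # On the first call, rebind the module-level name to the real
    # implementation (new_guess_type stands for it here), then delegate.
    global guess_type
    guess_type = new_guess_type
    return guess_type(url, strict)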

Obviously if the function was imported from the module and not looked up
via standard attribute lookup before each call (by calling it like
mimetypes.guess_type(...)) init() would be called over and over again.
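
To make the difference between the two call styles concrete, here is a minimal sketch of the pattern. It assumes a hypothetical stand-in module (fake_mimetypes, with an init_calls counter and a _real_guess_type helper); none of these names come from the real mimetypes source.

```python
import types

# Hypothetical stand-in for a module that replaces its own function on
# first use. This is NOT the real mimetypes source.
SOURCE = '''
init_calls = 0

def _real_guess_type(url):
    return "text/plain"

def init():
    global init_calls, guess_type
    init_calls += 1                  # stands in for re-parsing the database
    guess_type = _real_guess_type    # rebind the module-level name

def guess_type(url):
    init()
    return guess_type(url)
'''

fake = types.ModuleType("fake_mimetypes")
exec(SOURCE, fake.__dict__)

# Same effect as "from fake_mimetypes import guess_type": the local name
# keeps pointing at the original stub even after the module rebinds it.
stale = fake.guess_type
for _ in range(3):
    stale("a.txt")
calls_via_import = fake.init_calls

# Attribute lookup on the module now finds the rebound function,
# so init() is not triggered again.
for _ in range(3):
    fake.guess_type("a.txt")
calls_via_attribute = fake.init_calls - calls_via_import
```

In this sketch the imported reference runs init() on every call, while lookups through the module attribute never run it again once the name has been rebound.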

What's the performance impact?  In a small WSGI middleware that serves
static files the *total* performance impact (including HTTP header
parsing, file serving etc.) was 1000%.  Just for guess_type() versus
mimetypes.guess_type() which was called just once per request.

I attached a workaround for that problem that tries to avoid init()
calls after the thing was initialized.

If this is intended behaviour it should be documented but I doubt that
this is a good idea as people don't read documentation if stuff seems to
work.

And Google tells me I'm not the first one who invoked guess_type that
way.
msg82993 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-03-01 23:53
Wah, that's really a horrible way to implement this caching.
msg82997 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-03-02 03:35
Well, that was embarrassing! Fixed in r70086.
