Message82992
Sorry for the harsh words, but when I found that code I nearly freaked
out. For years I have been using "from mimetypes import guess_type",
and only today did I find out that this has horrendous performance problems
because the mimetype database is re-parsed on each call.
The reason for this is that mimetypes.guess_type is implemented like this:
    def guess_type(...):
        global guess_type
        init()
        guess_type = new_guess_type
        return guess_type(...)
Obviously, if the function was imported from the module and not looked up
via standard attribute lookup before each call (by calling it like
mimetypes.guess_type(...)), the imported name stays bound to the original
wrapper, so init() is called over and over again.
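To make the difference concrete, here is a minimal sketch that imitates the self-replacing pattern quoted above. It does not use the real mimetypes module: make_fake_module, init_calls and real_guess_type are hypothetical names introduced only for this illustration, and a SimpleNamespace stands in for the module.

```python
from types import SimpleNamespace


def make_fake_module():
    """Build a stand-in 'module' whose guess_type replaces itself on first call."""
    mod = SimpleNamespace(init_calls=0)

    def init():
        # Stands in for the expensive re-parse of the mimetype database.
        mod.init_calls += 1

    def real_guess_type(name):
        # Placeholder result; the real lookup is irrelevant to the demo.
        return "text/plain"

    def guess_type(name):
        init()
        # Rebind the module attribute, like "global guess_type" in the original.
        mod.guess_type = real_guess_type
        return mod.guess_type(name)

    mod.guess_type = guess_type
    return mod


# Style 1: attribute lookup before each call, like mimetypes.guess_type(...).
mod_a = make_fake_module()
for _ in range(5):
    mod_a.guess_type("a.txt")
attr_lookup_inits = mod_a.init_calls

# Style 2: reference captured once, like "from mimetypes import guess_type".
mod_b = make_fake_module()
stale = mod_b.guess_type
for _ in range(5):
    stale("a.txt")
stale_ref_inits = mod_b.init_calls

print(attr_lookup_inits, stale_ref_inits)
```

With attribute lookup, init() runs once because the rebinding takes effect on the second call; with the captured reference, the caller never sees the rebinding and init() runs on every call.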
What's the performance impact? In a small WSGI middleware that serves
static files the *total* performance impact (including HTTP header
parsing, file serving etc.) was 1000%, just from calling guess_type()
instead of mimetypes.guess_type(), and it was called only once per request.
I attached a workaround for that problem that tries to avoid init()
calls after the thing was initialized.
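The attachment itself is not reproduced here, so the following is only a sketch of one way such a guard could look, not the attached patch. It assumes a module-level flag (the hypothetical _inited) checked inside the function, so that it no longer matters which reference the caller holds. The tiny _types_map, the init_calls counter and the string return value are simplifications for the example.

```python
import os.path

_types_map = {}   # filled by init(); stands in for the parsed database
_inited = False   # hypothetical guard flag
init_calls = 0    # counter used only to show how often init() runs


def init():
    """Pretend to parse the mimetype database (expensive in real life)."""
    global _inited, init_calls
    init_calls += 1
    _types_map.update({".txt": "text/plain", ".html": "text/html"})
    _inited = True


def guess_type(name):
    # The check lives inside the function, so a reference captured via
    # "from module import guess_type" also benefits from it.
    if not _inited:
        init()
    ext = os.path.splitext(name)[1].lower()
    return _types_map.get(ext)


stale = guess_type  # simulates the "from ... import guess_type" style
results = [stale("index.html") for _ in range(3)]
print(results, init_calls)
```

The cost per call after initialization is a single flag check, which is small compared to re-parsing the database.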
If this is intended behaviour it should be documented, but I doubt that
this is a good idea, as people don't read documentation if stuff seems to
work.
And google tells me I'm not the first one who invoked guess_type that
way: http://google.com/codesearch?q="from+mimetypes+import+guess_type"

Date | User | Action | Args
2009-03-01 23:48:02 | aronacher | set | recipients: + aronacher
2009-03-01 23:48:02 | aronacher | set | messageid: <1235951282.21.0.0432661498186.issue5401@psf.upfronthosting.co.za>
2009-03-01 23:48:00 | aronacher | link | issue5401 messages
2009-03-01 23:47:59 | aronacher | create |