Author vstinner
Recipients amaury.forgeotdarc, andyharrington, barry, eric.araujo, erob, flox, georg.brandl, ggenellina, oopos, pebbe, pitrou, quentel, r.david.murray, tcourbon, tobias, v+python, vstinner
Date 2011-01-13.23:54:53
SpamBayes Score 0.0
Marked as misclassified No
Message-id <1294962895.05.0.719397761716.issue4953@psf.upfronthosting.co.za>
In-reply-to
Content
I tested cgi_32.patch on Windows with Apache:
 - a test with a binary file works: I get a binary file instead of a text file
 - a test with a non-ASCII character (a\xe9b) works: the text is correctly decoded

I used the test script from full_source_and_error.zip.

Comments on cgi_32.patch:

 - I don't understand why FieldStorage changes sys.stdout and sys.stderr (see remarks about IOMix above): please remove the charset argument (it is also confusing to have two encoding arguments). it should be done somewhere else
 - please remove the O_BINARY hack: the patch is for Python 3.2 and I closed issue #10841. If you would like a backport, another patch should be written later
 - "encoding = 'latin-1' # ?": write a real comment or remove it
 - 'self.fp.read(...) # bytes': you should add a test on the type if you are not sure that fp.read() gives bytes
 - "file: the file(-like) object from which you can read the data *as bytes*": you should mention that TextIOWrapper is also tolerated (accepted?)
 - you may set fp directly to sys.stdin.buffer (instead of sys.stdin) if fp is None (it will be easier after removing the O_BINARY thing)
 - the patch adds a tab in an empty line, please don't do that :-)
 - you should add a (private?) attribute to FieldStorage to decide if it works on bytes or unicode, instead of using "self.filename is not None" test (eg. self._use_bytes = (self.filename is not None)
 - i don't like the idea of having a generic self.__write() method supporting bytes and unicode. it would prefer two methods, eg. self.__write_text() and self.__write_binary() (they can share a third private method)
 - i don't like "stream_encoding" name: what is the "stream" here? do you process a "file", a "string" or a "stream"? why not just "self.encoding"?
 - "import email.parser,email.feedparser" one import is useless here. I prefer "from email.feedparser import FeedParser" because you get directly a ImportError if the symbol is missing. And it's already faster to get FeedParser instead of email.feedparser.FeedParser in a loop (dummy micro-optimization)
 - even I like the following change, please do it in a separated patch:
-            if type(value) is type([]):
+            if isinstance(value,list):


I really don't like the IOMix thing:

 - sys.stdout.write() should not accept bytes
 - FieldStorage should not replace sys.stdout and sys.stderr: if you want to set the encoding of these files, set PYTHONIOENCODING environment variable before running your program (it changes also the encoding of sys.stdio)
 - IOMix should not accept bytes *and* unicode. It's better to have an explicit API like stdout.write('unicode') and stdout.buffer.write(b'bytes)


Most parts of the patch are correct and fix real bugs. Since cgi is broken currently (eg. it doesn't handle binary files correctly), anything making the situation better would be nice.

I vote +0 to commit the patch now (if the release manager agrees), and +1 if all of my remarks are fixed.
History
Date User Action Args
2011-01-13 23:54:55vstinnersetrecipients: + vstinner, barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 23:54:55vstinnersetmessageid: <1294962895.05.0.719397761716.issue4953@psf.upfronthosting.co.za>
2011-01-13 23:54:53vstinnerlinkissue4953 messages
2011-01-13 23:54:53vstinnercreate