Author quentel
Recipients amaury.forgeotdarc, barry, eric.araujo, erob, flox, ggenellina, gvanrossum, oopos, pebbe, pitrou, quentel, r.david.murray, tcourbon, tercero12, tobias, v+python
Date 2011-01-05.21:47:15
Message-id <1294264043.01.0.580903663876.issue4953@psf.upfronthosting.co.za>
Content
I agree that the only consistent solution is to require that the attribute self.fp reads bytes in all cases. All required conversions should then occur inside FieldStorage, using "some" encoding (not sure how to define it...).

If no fp argument is passed to __init__(), the instance uses the binary version of sys.stdin. In my patch I use sys.stdin.buffer, but it also works if I set it to sys.stdin.detach().
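A minimal sketch of that fallback, not the code from the patch: binary_input() is a hypothetical helper name, and rejecting text streams with TypeError is an assumption made here for illustration.

```python
import io
import sys

def binary_input(fp=None):
    """Return a stream whose read() yields bytes (hypothetical helper)."""
    if fp is None:
        # sys.stdin is a text wrapper; its .buffer attribute is the binary layer.
        return sys.stdin.buffer
    if isinstance(fp, io.TextIOBase):
        # Assumption for this sketch: refuse text streams instead of guessing an encoding.
        raise TypeError("fp must be a binary stream, got a text stream")
    return fp

# Usage with an in-memory binary stream standing in for the request body.
stream = binary_input(io.BytesIO(b"a=1&b=2"))
print(stream.read())
```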

In all cases the interpreter must be launched with the -u option. As stated in the documentation, the effect of this option is to "force the binary layer of the stdin, stdout and stderr streams (which is available as their buffer attribute) to be unbuffered. The text I/O layer will still be line-buffered." On my PC (Windows XP) this is required to be able to read the whole data stream; otherwise, only the beginning is read. I tried Glenn's suggestion with msvcrt, with no effect.
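As an illustration only, here is one way to check that a child interpreter started with -u can read a complete binary stream from stdin. The payload and the one-line child script are made up for this sketch; it does not reproduce the Windows XP behaviour described above.

```python
import subprocess
import sys

# Hypothetical payload containing bytes that text-mode reading could alter or truncate.
payload = b"line one\r\nline two\x1a\r\nline three\n"

# The child copies its binary stdin to its binary stdout.
child_code = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"

result = subprocess.run(
    [sys.executable, "-u", "-c", child_code],  # -u: unbuffered binary layer
    input=payload,
    capture_output=True,
    check=True,
)
print(result.stdout == payload)
```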

I am working on the cgi.py module so that all tests (test_cgi and cgi_test) pass with binary streams. It's almost finished; I had to adapt the tests, and sometimes fix bugs in them.

Problems in test_cgi.py:
- in testQSAndFormData(), the string "data" should not begin with a line feed
- in testQSAndFormDataFile(), same thing as above; in addition, the argument used to update result should be {'upload': b'this is the content of the fake file\n'}, i.e. bytes, ending with a line feed as in the string "data"
- in do_test(), for the POST method, fp must be a BytesIO
- in test_fieldstorage_multipart(), the expected value should be b'Testing 123.\n' for the third case (filename is not None, so bytes are expected, and there is a line feed in the string "data")
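The BytesIO point for POST can be sketched as follows. make_post_input() is a hypothetical helper, not part of test_cgi.py, and the latin-1 default is an assumption; the idea is only that the body is encoded once and Content-Length is computed from the bytes.

```python
import io

def make_post_input(data, content_type, encoding="latin-1"):
    """Build a binary fp and a CGI-style environ for a POST test (hypothetical helper)."""
    body = data.encode(encoding)
    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_TYPE": content_type,
        # Length of the encoded bytes, not of the str.
        "CONTENT_LENGTH": str(len(body)),
    }
    return io.BytesIO(body), environ

fp, environ = make_post_input("a=1&b=2", "application/x-www-form-urlencoded")
print(environ["CONTENT_LENGTH"])
```

The resulting pair would then be passed as the fp and environ arguments of FieldStorage.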

Problems in cgi_test.py:
- data files mix headers (which should be strings) and POST data (which should be read as bytes). In setup(), the file is now opened in binary mode, the first two lines are read to initialize Content-Length and Content-Type, and an attribute encoding = 'latin-1' is set
- the tests showed warnings "ResourceWarning: unclosed file <_io.BufferedReader name='zenASCII.txt'>"; I changed the code to avoid these warnings
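A rough sketch of both points, under assumptions: the file layout used here (two "Name: value" header lines followed by the body) and the helper name read_test_file() are invented for illustration and may not match the real data files. The with block closes the file, which is one way to avoid the ResourceWarning.

```python
import os
import tempfile

def read_test_file(path, encoding="latin-1"):
    """Read two header lines as str and the remaining body as bytes (hypothetical helper)."""
    headers = {}
    with open(path, "rb") as f:  # closed on exit, so no unclosed-file warning
        for _ in range(2):
            line = f.readline().decode(encoding).rstrip("\r\n")
            name, _sep, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        body = f.read()  # the POST data stays as bytes
    return headers, body

# Usage with a temporary file in the assumed layout.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Content-Length: 7\r\n"
              b"Content-Type: application/x-www-form-urlencoded\r\n"
              b"a=1&b=2")
headers, body = read_test_file(tmp.name)
os.remove(tmp.name)
print(headers, body)
```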

I will send the results (diff for the new version of cgi + tests), hopefully tomorrow.