classification
Title: cgi module cannot handle POST with multipart/form-data in 3.x
Type: behavior Stage: needs patch
Components: Library (Lib), Unicode Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, andyharrington, barry, eric.araujo, erob, flox, georg.brandl, ggenellina, grahamd, oopos, pebbe, quentel, r.david.murray, tcourbon, tobias, v+python, vstinner
Priority: normal Keywords: patch

Created on 2009-01-15 09:02 by oopos, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
full_source_and_error.zip oopos, 2009-01-15 14:34 full source and error show
opsuper.pl oopos, 2009-01-16 22:08 Perl code for process post files from form
unittest.zip tercero12, 2009-06-08 20:39 cgi.FieldStorage unittest for Multipart form-data and associated files.
http.zip quentel, 2011-01-02 08:51 Module cgi_new.py and tests
cgitest-python3.py r.david.murray, 2011-01-05 03:38
cgi_tests.zip quentel, 2011-01-13 08:41 cgi_test.py and associated files
adder.html andyharrington, 2011-01-13 20:52 uses get, works
adderpost.html andyharrington, 2011-01-13 20:52 just changed get to post, makes adder.cgi hang
localCGIServer.py andyharrington, 2011-01-13 20:55 my wrapper around http.server to handle localhost
adder.cgi andyharrington, 2011-01-13 20:56 action script used by adder.html and adderpost.html; hangs with adderpost.html
Messages (130)
msg79892 - (view) Author: oopos (oopos) Date: 2009-01-15 09:02
Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit
(Intel)] on win32
---------------------------------------------
Hi, I use Python!
 
 
UnicodeDecodeError	Python 3.0: G:\cgi\python\python.exe
Thu Jan 15 16:46:34 2009

A problem occurred in a Python script. Here is the sequence of function
calls leading up to the error, in the order they occurred.
 G:\webserver\xampp\cgi-bin\testupload.py in ()
  107 
  108 # get form
  109 opsform = cgi.FieldStorage()
  110 
  111 print ("<br>","form-data:","<br>",opsform,"<br>")
opsform undefined, cgi = <module 'cgi' from 'G:\cgi\python\lib\cgi.py'>,
cgi.FieldStorage = <class 'cgi.FieldStorage'>
 G:\cgi\python\lib\cgi.py in __init__(self=FieldStorage(None, None, []),
fp=None, headers={'content-length': '671631', 'content-type':
'multipart/form-data;
boundary=---------------------------9699301019407'}, outerboundary='',
environ=<os._Environ object at 0x00C90C50>, keep_blank_values=0,
strict_parsing=0)
  477             self.read_urlencoded()
  478         elif ctype[:10] == 'multipart/':
  479             self.read_multi(environ, keep_blank_values,
strict_parsing)
  480         else:
  481             self.read_single()
self = FieldStorage(None, None, []), self.read_multi = <bound method
FieldStorage.read_multi of FieldStorage(None, None, [])>, environ =
<os._Environ object at 0x00C90C50>, keep_blank_values = 0,
strict_parsing = 0
 G:\cgi\python\lib\cgi.py in read_multi(self=FieldStorage(None, None,
[]), environ=<os._Environ object at 0x00C90C50>, keep_blank_values=0,
strict_parsing=0)
  597         # Create bogus content-type header for proper multipart
parsing
  598         parser.feed('Content-Type: %s; boundary=%s\r\n\r\n' %
(self.type, ib))
  599         parser.feed(self.fp.read())
  600         full_msg = parser.close()
  601         # Get subparts
parser = <email.feedparser.FeedParser object at 0x00DD5650>, parser.feed
= <bound method FeedParser.feed of <email.feedparser.FeedParser object
at 0x00DD5650>>, self = FieldStorage(None, None, []), self.fp =
<io.TextIOWrapper object at 0x00BE3FB0>, self.fp.read = <bound method
TextIOWrapper.read of <io.TextIOWrapper object at 0x00BE3FB0>>
 G:\cgi\python\lib\io.py in read(self=<io.TextIOWrapper object at
0x00BE3FB0>, n=-1)
 1722             # Read everything.
 1723             result = (self._get_decoded_chars() +
 1724                       decoder.decode(self.buffer.read(), final=True))
 1725             self._set_decoded_chars('')
 1726             self._snapshot = None
decoder = <encodings.gbk.IncrementalDecoder object at 0x00DB7AB0>,
decoder.decode = <built-in method decode of IncrementalDecoder object at
0x00DB7AB0>, self = <io.TextIOWrapper object at 0x00BE3FB0>, self.buffer
= <io.BufferedReader object at 0x00BE3F90>, self.buffer.read = <bound
method BufferedReader.read of <io.BufferedReader object at 0x00BE3F90>>,
final undefined

UnicodeDecodeError: 'gbk' codec can't decode bytes in position 157-158:
illegal multibyte sequence
      args = ('gbk',
b'-----------------------------9699301019407\r\n...-----------------------------9699301019407--\r\n',
157, 159, 'illegal multibyte sequence')
      encoding = 'gbk'
      end = 159
      object =
b'-----------------------------9699301019407\r\n...-----------------------------9699301019407--\r\n'
      reason = 'illegal multibyte sequence'
      start = 157
      with_traceback = <built-in method with_traceback of
UnicodeDecodeError object at 0x00DB7BF0>

l:\tmp_dir\tmpxyeojf.html contains the description of this error. 


-------------------------------------------
Hi,
I am a newbie to Python on Windows.
I find that the cgi module always fails when uploading binary files.
It cannot detect the file's mode automatically and always uses the
default mode, 'TEXT'.
So I want to change sys.stdin's mode to BINARY to support binary
files.
I tried this:
 import msvcrt,os
 msvcrt.setmode(0,os.O_BINARY) # for stdin 
 msvcrt.setmode(1,os.O_BINARY) # for stdout
but it doesn't work either.
I know that in C one can use this function:
  freopen("somefilename","mode","stdin or stdout") to redirect the file
stream.
Can anyone help me?

Best Regards
   oopos

msg79893 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-15 10:59
Does it work if you change your script like this:
   opsform = cgi.FieldStorage(open(sys.stdin.fileno(), 'rb'))
msg79894 - (view) Author: oopos (oopos) Date: 2009-01-15 11:58
To Amaury Forgeot d'Arc :

Thank you.

That error is solved by your suggestion:
[quote]Does it work if you change your script like this:
   opsform = cgi.FieldStorage(open(sys.stdin.fileno(), 'rb'))[/quote]

Now a new problem has come up:
[code]   97         """Push some new data into this object."""
   98         # Handle any previous leftovers
   99         data, self._partial = self._partial + data, ''
  100         # Crack into lines, but preserve the newlines on the end of each
  101         parts = NLCRE_crack.split(data)

data = b'-----------------------------7d91f41a302f4\nCo...\x0e\x0f\x0c\x10\x17\x14\x18\x18\x17\x14\x16\x16',
self = <email.feedparser.BufferedSubFile object at 0x00DD5270>,
self._partial = ''

TypeError: Can't convert 'bytes' object to str implicitly
[/code]

I find that the cgi library does not work with a byte stream; it always
uses a string stream.

More info in the attached file:
msg79896 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-15 12:43
OK, another try. Please replace the previous version with these 3 lines:

  encoding = os.environ.get('HTTP_TRANSFER_ENCODING')
  stdin = open(sys.stdin.fileno(), 'r', encoding=encoding)
  opsform = cgi.FieldStorage(stdin)
msg79898 - (view) Author: oopos (oopos) Date: 2009-01-15 14:34
Thank you for your time.
I have now tried what you suggested, but it fails as before.

See the files:
msg79901 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-15 16:45
Thanks for the test case. I reproduced it easily.
There is indeed a real problem in CGI streams.

The first thing to do is to start python with the -u option (add it to
the end of the first #! line), so that stdin yields bytes instead of
unicode chars, and \r\n are not translated on Windows.

Even then, I noticed that in the multipart/form-data section, text
fields are utf-8 encoded, but the file content is raw binary.
(FWIW, I use Firefox and Apache on Windows)
No encoding seems to be specified, neither in the content, nor in the
environment (no HTTP_TRANSFER_ENCODING)

And of course, the email.parser.FeedParser object used to parse it
accepts only unicode, not bytes.
Help needed.
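
A minimal sketch of the mismatch described above, for illustration only. It assumes a UTF-8 encoded text field and a few made-up bytes standing in for the start of an uploaded binary file; `try_decode` is a hypothetical helper, not part of the cgi module.

```python
def try_decode(data, encoding="utf-8"):
    """Return data decoded as text, or None if it is not valid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Made-up sample values, not taken from the attached test case.
text_field = "résumé".encode("utf-8")         # a text input as sent by the browser
file_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # stand-in for raw binary file content

decoded_text = try_decode(text_field)   # the text field decodes cleanly
decoded_file = try_decode(file_bytes)   # the binary content does not
```

No single text decoder can be applied to the whole request body, which is why a stream that has already been decoded to str cannot carry a file upload safely.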
msg79939 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-01-16 08:13
An attempt to more accurately describe the issue, to attract more 
knowledgeable people, I hope...
msg79981 - (view) Author: oopos (oopos) Date: 2009-01-16 22:08
Hehe.
I only wanted to use Python instead of PHP to handle file uploads, but
it is hard for me and I do not have much time to learn it.
For now, I learned Perl quickly and use it to upload files; it works OK.

Thank you. I will take more time to learn the Python language well.
Best Regards!

This is my code: (Perl Language)
msg80000 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-01-17 03:40
You should stick to Python 2.6 (or even 2.5) for web programming - 3.0 
is not mature enough. I thought this was a feasibility study on porting 
an existing application to Python 3.0 -- not your first steps in the 
language.
msg88960 - (view) Author: Timothy Farrell (tercero12) Date: 2009-06-05 18:29
I'm working on a web framework for Python 3.  Naturally this is a
blocker for me.  I was kinda expecting this to be addressed in 3.1 but
now that rc1 is out and I don't see anything about it, I'm wondering
about the status of this bug.  Can we get a status update?
msg88962 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-06-05 18:46
Can you provide a test case that clearly demonstrates the problem
(preferably a unit test, but anything easily reproducible will do)?  I'm
not sure what to do with the code attached to the case.
msg89112 - (view) Author: Timothy Farrell (tercero12) Date: 2009-06-08 20:39
I've attached unittest.zip.  Simply unzip it to a directory and run it.
 I've included a Python2.x version of the unittest for the sake of
clarity.  The 2.x version was developed on 2.4.  The 3.x version was
developed on 3.0.1 and 3.1rc1 (with identical results).

It seems that there are several issues with cgi.FieldStorage and
multi-part form data.

- Does FieldStorage read in Bytes or a String?
-- It seems to expect a String but this yields invalid results for
uploading files.
-- A stream of Bytes would make more sense but loses its Pythonic
"Batteries included" nature if the user has to decode the encoding
manually for each form field.
msg91444 - (view) Author: (tobias) Date: 2009-08-10 13:50
Actually, I think this whole issue is more complex. For example,
consider a (fictitious) CGI script where users can upload an image and a
description and the script sends a success/error message in return.

In this case, one has to:
- read the HTTP request header from stdin as US-ASCII
- read the image from stdin as raw binary data
- read the description from stdin as a string in some encoding
- write the HTTP response header to stdout as US-ASCII
- write the response message to stdout in some (other) encoding
- write error messages to server log via stderr as US-ASCII
Also, there are cases when a cgi script should return binary data
instead (e.g., images or archive files) or apply a transfer encoding
(e.g., gzip).

Although FieldStorage only cares about reading, it still has to cope
with intermixed textual and binary data. So the only practical way in my
opinion is to use raw binary data and have FieldStorage decode strings
on demand, since only the programmer knows whether a field should
contain text or binary data. FieldStorage should offer two methods for
this purpose: one for reading binary data and another for reading and
decoding strings on-the-fly (possibly using a default encoding passed to
its constructor).
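
The two-accessor idea above could look roughly like the following sketch. `LazyField`, `as_bytes` and `as_text` are hypothetical names invented for illustration; nothing like this exists in cgi.FieldStorage, and UTF-8 as the default encoding is an assumption.

```python
class LazyField:
    """Hypothetical form field: keeps the raw bytes and decodes only on request."""

    def __init__(self, name, raw, default_encoding="utf-8"):
        self.name = name
        self._raw = raw
        self._default_encoding = default_encoding

    def as_bytes(self):
        """Return the value exactly as received (suitable for file content)."""
        return self._raw

    def as_text(self, encoding=None, errors="strict"):
        """Decode the value on demand (suitable for text inputs)."""
        return self._raw.decode(encoding or self._default_encoding, errors)


description = LazyField("description", "café au lait".encode("utf-8"))
image = LazyField("image", bytes([0xFF, 0xD8, 0xFF, 0xE0]))

caption = description.as_text()   # the caller knows this field is text
payload = image.as_bytes()        # the caller knows this field is binary
```

The point of the design is that the decision to decode is left to the caller, who is the only one who knows which fields are text.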
msg91449 - (view) Author: Timothy Farrell (tercero12) Date: 2009-08-10 14:42
I think you hit the nail on the head.  Now we just need (someone) to
code it.
msg91711 - (view) Author: Timothy Farrell (tercero12) Date: 2009-08-18 18:58
I thought I'd take a crack at this today.  I soon figured out the real
issue.  It is the email.parser module that handles the decoding of
Multipart/form-data things...and it is also still quite broken w.r.t.
handling Bytes.  So this issue is dependent on
http://bugs.python.org/issue4661 before it can be fixed.
msg91741 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2009-08-19 20:19
Please, please, please contact the email-sig and help pitch in.  For
many reasons I simply haven't had the cycles to work on this and I don't
see that happening any time soon.  There are folks willing to work on
the package in the email-sig and I will add my $0.02 with design
suggestions, but we really need manpower and motivation to get the email
package into shape.
msg95288 - (view) Author: Thomas Courbon (tcourbon) Date: 2009-11-15 13:09
*bump*

Hi there Pythoners !

Like Timothy Farrell, I'm currently working (or rather toying, since it's
just for fun) on a Python 3 web framework. I just started, but when it
comes to file upload I experience issues which I believe are connected to
this issue.

I just want to know if there has been any progress since the last
messages, or if I should jump in and try writing my own multipart parser
(as I doubt I have the required skill to contribute to the email parser).
msg95292 - (view) Author: Timothy Farrell (tercero12) Date: 2009-11-15 14:18
Perhaps this update should go in the linked email bug.  The email team
has a goal of fixing the email module in time for the 3.2 release.  I
also feel as though I lack the skill to fix the email module, but it
goes beyond that since they're potentially having to change some
interfaces.  See this document:
http://wiki.python.org/moin/Email%20SIG/DesignThoughts

Once the email module is fixed, the cgi module will be trivial to fix. 
I'm confident enough to handle it.  I don't think anyone can give you a
date so you'll have to make the custom solution decision based on your
timeframe and patience.
msg95330 - (view) Author: Thomas Courbon (tcourbon) Date: 2009-11-16 08:47
It seems that there hasn't been any work on that issue (which looks
complicated, by the way). I'll wait; there are so many other aspects of
a web framework to play with :)
Thanks anyway for the pointer.
msg107959 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2010-06-16 22:24
Could this be related to issue 8077?
msg108221 - (view) Author: Timothy Farrell (tercero12) Date: 2010-06-20 01:35
Yes, they are related but not quite the same.  The solution to this problem will likely include a fix to the problem described in issue 8077.
msg121864 - (view) Author: Glenn Linderman (v+python) * Date: 2010-11-21 05:24
Regarding http://bugs.python.org/issue4953#msg91444 POST with multipart/form-data encoding can use UTF-8, other stuff is restricted to ASCII!

From http://www.w3.org/TR/html401/interact/forms.html:
Note. The "get" method restricts form data set values to ASCII characters. Only the "post" method (with enctype="multipart/form-data") is specified to cover the entire [ISO10646] character set.

Hence cgi formdata can safely decode text fields using UTF-8 decoding (experimentally, that is the encoding used by Firefox to support the entire ISO10646 character set).
msg125035 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-02 08:51
Hi,

I have started working on the port of a simplified version of Karrigell (a web framework) to Python3. I experienced the same problem as the other posters : in the current version, file upload doesn't work. So I've been working on the cgi module for a few days and now have a version which correctly manages file uploads in the tests I made

The problem in the current version (3.2b2) is that all data is read from sys.stdin, which reads strings, not bytes. This obviously can't work properly to upload binary files. In the proposed version, for multipart/form-data type, all data is read as bytes from sys.stdin.buffer ; in the CGI script, the Python interpreter must be launched with the -u option, as suggested by Amaury, otherwise sys.stdin.buffer.read() only returns the beginning of the data stream

The headers inside the multipart/form-data are decoded to a string using sys.stdin.encoding and passed to a FeedParser (which requires strings) ; then the data is read from sys.stdin.buffer (bytes) until a boundary is found

If the field is a file, the file object in self.file stores bytes, and the attribute "value" is a byte string. If it is not a file, the value is decoded to a string, always using sys.stdin.encoding, as for all other fields for other types of forms
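
To make the approach above concrete, here is a rough sketch of reading one part of a multipart/form-data body: headers are decoded and handed to FeedParser, while the body stays as bytes until a boundary line is seen. It is not the code from the attached cgi_new.py; `read_part` and its parameters are hypothetical, and the stream is assumed to be positioned just after a boundary line.

```python
import io
from email.feedparser import FeedParser


def read_part(fp, boundary, encoding="utf-8"):
    """Read one part from binary stream fp; return (headers, body, is_last)."""
    # Part headers are text: decode each line and let FeedParser parse them.
    parser = FeedParser()
    while True:
        line = fp.readline()
        parser.feed(line.decode(encoding))
        if line in (b"\r\n", b"\n", b""):
            break
    headers = parser.close()

    # The body is kept as bytes and read until the next boundary line.
    next_boundary = b"--" + boundary
    last_boundary = next_boundary + b"--"
    chunks = []
    is_last = True  # a premature end of stream is treated as the last part
    while True:
        line = fp.readline()
        if not line:
            break
        stripped = line.rstrip(b"\r\n")
        if stripped == next_boundary:
            is_last = False
            break
        if stripped == last_boundary:
            break
        chunks.append(line)

    body = b"".join(chunks)
    # The line break just before the boundary belongs to the delimiter.
    if body.endswith(b"\r\n"):
        body = body[:-2]
    elif body.endswith(b"\n"):
        body = body[:-1]
    return headers, body, is_last


sample = io.BytesIO(
    b'Content-Disposition: form-data; name="upload"; filename="a.bin"\r\n'
    b"Content-Type: application/octet-stream\r\n"
    b"\r\n"
    b"\xff\xd8\n\x00\r\n"
    b"--XYZ--\r\n"
)
part_headers, part_body, part_is_last = read_part(sample, b"XYZ")
```

Here `part_headers.get_filename()` gives the uploaded file name and `part_body` holds the untouched bytes, including the bare newline inside the binary data.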

Other cosmetic changes :
- replaced "while 1" by "while True"
- replaced "if type(value) == type([])" by "if isinstance(value,list)"

Attached file : zip with cgi_new.py and tests in a folder called "http"
Tested with Python 3.2b2 (r32b2:87398, Dec 19 2010, 22:51:00) [MSC v.1500 32 bit (Intel)] on win32 ; files and CGI scripts served by Apache 2.2
msg125065 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-02 16:19
Thank you very much for working on this!  I'll try to take a look at the patch soon.  A couple quick comments based on your posting: first, the email module now has a BytesFeedParser that will accept a byte stream, which I hope might simplify your patch.  Second, it would be very helpful if you could upload your patch as an 'svn diff' against the current py3k trunk (see python.org/dev for details on how to do that).  That will make review and application of the patch much much simpler.  (This would be true even if more of the code in cgi.py has changed than not.)  If you don't want to set up an svn checkout, then a context diff against the copy of cgi.py you started with would be second best.  Please post any files individually as .patch or .diff or .txt files...these are preferred in the tracker over .zip files because they can be viewed without downloading.
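
For readers unfamiliar with BytesFeedParser, a small sketch of feeding it a multipart/form-data body as bytes. The boundary, field names and payloads are made up, and the synthetic Content-Type header mirrors the trick cgi.py already uses with FeedParser; whether this is a good fit for FieldStorage is exactly what is under discussion here.

```python
from email.parser import BytesFeedParser

# Made-up request body with one text field and one binary file field.
body = (
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="title"\r\n'
    b"\r\n"
    b"hello\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="upload"; filename="a.bin"\r\n'
    b"Content-Type: application/octet-stream\r\n"
    b"\r\n"
    b"\xff\xfe\x00\x01\r\n"
    b"--XYZ--\r\n"
)

parser = BytesFeedParser()
# Synthetic header so the parser knows the boundary.
parser.feed(b"Content-Type: multipart/form-data; boundary=XYZ\r\n\r\n")
parser.feed(body)
message = parser.close()

# Map each field name to its payload as bytes.
fields = {
    part.get_param("name", header="content-disposition"): part.get_payload(decode=True)
    for part in message.get_payload()
}
```

Note that this reads the whole body into memory, so it says nothing about how large uploads should be spooled to disk.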
msg125086 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-02 19:59
I attach the svn diff file against the present version (generated by Tortoise SVN), hope it's what you expect
msg125088 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-02 20:08
Please ignore previous post. I worked on the version of cgi.py included in version 3.2b2, and I just realized there were changes committed to the svn repository since this version. I will post the diff file later, but you can always test the files in the zip file
msg125100 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-02 21:16
Pierre, thanks for your work on this.  I hope a fix can make it in to 3.2.

However, while starting Python with -u can help a bit, that should not, in my opinion, be a requirement to use CGI.  Rather, the stdin should be set into binary mode by the CGI processing... it would be helpful if the CGI module either did it automatically, verified it has been done, or at least provided a helper function that could do it, and that appropriate documentation be provided, if it is not automatic.  I've seen code like:

    try: # Windows needs stdio set for binary mode.
        import msvcrt
        msvcrt.setmode (0, os.O_BINARY) # stdin  = 0
        msvcrt.setmode (1, os.O_BINARY) # stdout = 1
        msvcrt.setmode (2, os.O_BINARY) # stderr = 2
    except ImportError:
        pass

and

        if hasattr( sys.stdin, 'buffer'):
            sys.stdin = sys.stdin.buffer

which together, seem to do the job.  For output, I use a little class that accepts either binary or text, encoding the latter:

    class IOMix():
        def __init__( self, fh, encoding="UTF-8"):
            if hasattr( fh, 'buffer'):
                self._bio = fh.buffer
                fh.flush()
                self._last = 'b'
                import io
                self._txt = io.TextIOWrapper( self._bio, encoding, None, '\r\n')
                self._encoding = encoding
            else:
                raise ValueError("not a buffered stream")
        def write( self, param ):
            if isinstance( param, str ):
                self._last = 't'
                self._txt.write( param )
            else:
                if self._last == 't':
                    self._txt.flush()
                self._last = 'b'
                self._bio.write( param )
        def flush( self ):
            self._txt.flush()
        def close( self ):
            self.flush()
            self._txt.close()
            self._bio.close()


        sys.stdout = IOMix( sys.stdout, encoding )
        sys.stderr = IOMix( sys.stderr, encoding )


IOMix may need a few more methods for general use, "print" comes to mind, for example.
msg125105 - (view) Author: Peter Kleiweg (pebbe) Date: 2011-01-02 21:24
Why not simply:

fp = sys.stdin.detach()
msg125106 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-02 21:27
Here is the correct diff file

I also introduced a test to exit from the loop in read_multi() if the total number of bytes read reaches "content-length". It was necessary for my framework, which uses cgi.FieldStorage to read from the attribute rfile defined in socketserver. Without this patch, the program hangs after receiving the number of bytes specified in content length. I work on a Windows XP PC so it might be related to the bug #427345 handled by server.CGIHTTPRequestHandler.run_cgi()
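
The hang described above comes from reading until end of file on a socket that the client keeps open. A minimal sketch of bounding the read by Content-Length instead; `read_exactly` is a hypothetical helper, not the change made in the patch.

```python
import io


def read_exactly(fp, content_length, chunk_size=8192):
    """Read at most content_length bytes from binary stream fp, then stop."""
    remaining = content_length
    chunks = []
    while remaining > 0:
        chunk = fp.read(min(chunk_size, remaining))
        if not chunk:  # stream ended early
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# In-memory stand-in for a socket file such as rfile.
stream = io.BytesIO(b"abcdef" + b"NEXT REQUEST")
request_body = read_exactly(stream, 6, chunk_size=4)
leftover = stream.read()
```

Because the loop never asks for more than the announced length, it returns as soon as the body is complete and leaves any following data untouched.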
msg125108 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-02 21:36
Regarding the use of detach(), I don't know if it works.  Maybe it would.  I know my code works, because I have it working.  But if there are simpler solutions that are shown to work, that would be great.
msg125114 - (view) Author: Peter Kleiweg (pebbe) Date: 2011-01-02 22:43
Using platform-dependent code seems iffy to me. The detach function on sys.stdin, sys.stdout and sys.stderr is there specifically to switch these streams from text mode to binary mode. See: http://docs.python.org/py3k/library/sys.html#sys.stdin
msg125152 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-03 03:44
Peter, it seems that detach is relatively new (3.1); likely the code samples and suggestions that I had found to cure the problem predate that.  While I haven't yet tried detach, your code doesn't seem to modify stdin, so are you suggesting, really...

   sys.stdin = sys.stdin.detach()

or maybe

   if hasattr( sys.stdin, 'detach'):
        sys.stdin = sys.stdin.detach()

On the other hand, if detach, coded as above, is equivalent to 

   if hasattr( sys.stdin, 'buffer'):
        sys.stdin = sys.stdin.buffer

then I wonder why it was added.  So maybe I'm missing something in reading the documentation you pointed at, and also that at http://docs.python.org/py3k/library/io.html#io.TextIOBase.detach
both of which seem to be well-documented if you already have a clear understanding of the layers in the IO subsystem, but perhaps not so well-documented if you don't yet (and I don't).

But then you referred to the platform-dependent stuff... I don't see anything in the documentation for detach() that implies that it also makes the adjustments needed on Windows to the C-runtime, which is what the platform-dependent stuff I suggested does... if it does, great, but a bit more documentation would help in understanding that.  And if it does, maybe that is the difference between the two code fragments in this comment?  I would have to experiment to find out, and am not in a position to do that this moment.
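
On the narrower question of how `.buffer` and `.detach()` differ at the Python level, a small sketch on an in-memory stream may help. It uses io.BytesIO as a stand-in for the real stdin and does not settle the Windows C-runtime question raised above.

```python
import io

raw = io.BytesIO(b"payload")
wrapper = io.TextIOWrapper(raw, encoding="ascii")

# .buffer only exposes the underlying binary stream; the wrapper stays usable.
same_object = wrapper.buffer is raw

# .detach() returns that binary stream and leaves the wrapper unusable.
binary = wrapper.detach()

try:
    wrapper.read()
    wrapper_still_usable = True
except ValueError:
    wrapper_still_usable = False

data = binary.read()
```

So both give access to the same binary object; the difference is that after detach() nothing can accidentally keep reading or writing through the text layer.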
msg125153 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-03 03:50
Rereading the doc link I pointed at, I guess detach() is part of the new API since 3.1, so doesn't need to be checked for in 3.1+ code... but instead, may need to be coded as:

    import io
    import sys

    try:
        sys.stdin = sys.stdin.detach()
    except io.UnsupportedOperation:
        pass
msg125158 - (view) Author: Etienne Robillard (erob) Date: 2011-01-03 10:19
Hi!

using "detach" would be great but I'm missing that method here in 2.7! :-)

erob@localhost:~$ python2.7
Python 2.7.1 (r271:86832, Jan  2 2011, 10:38:30)
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> sys.stdin.detach
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sys' is not defined
>>> import sys
>>> sys.stdin.detach
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'file' object has no attribute 'detach'
msg125159 - (view) Author: Etienne Robillard (erob) Date: 2011-01-03 10:44
I think this issue is also closely connected to:

http://bugs.python.org/issue1573931

so a backport of whatever solution comes to 3.2 would be a great
addition to Python 2.6 at the very minimum, in order to satisfy
minimal backward compatibility!

Thanks,

msg125178 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-03 14:45
Etienne: since this is about solving a 3.x specific problem, it will not get backported.  Issue 1573931 looks unrelated to me at a quick glance.  FYI, you will find that you *do* have detach in 2.7 if you open a file using the io subsystem (import io).  Of course, that isn't used for the std files in 2.7.

Glenn: the new IO subsystem is a complete C layer on top of only the most basic of the C runtime stuff.  It does handle cross platform issues.  Given that, and given that the input to CGI *should* be bytes, I think letting an error raise if the stream is text and detach isn't available is fine, though we might find we want to catch it to improve the error message with extra context.

Pierre: yes, that diff is what I was looking for.  I hope to have time to look it over later today.
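
The idea of detaching a text stream and adding context to the error when that is not possible could be sketched as below. `as_binary_stream` is a hypothetical helper name and the choice of TypeError is an assumption; neither comes from the patch.

```python
import io


def as_binary_stream(stream):
    """Return a binary stream for `stream`, detaching a text wrapper if needed."""
    if not isinstance(stream, io.TextIOBase):
        return stream  # assumed to be a binary stream already
    try:
        return stream.detach()
    except io.UnsupportedOperation as exc:
        # Re-raise with extra context instead of the bare "detach" message.
        raise TypeError(
            "CGI input must be a binary stream; got a text stream "
            "that cannot be detached: %r" % (stream,)
        ) from exc


wrapped = io.TextIOWrapper(io.BytesIO(b"x=1"), encoding="ascii")
binary_input = as_binary_stream(wrapped)
first_bytes = binary_input.read()
```

A text stream with no underlying buffer, such as io.StringIO, takes the error path, which is the case where the extra context is useful.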
msg125181 - (view) Author: Etienne Robillard (erob) Date: 2011-01-03 15:33
Thanks for these clarifications, David.

So will cgi.FieldStorage still be usable in 3.x using 2.5 semantics?
Implementing the size argument
in the FieldStorage class would surely be a good fix for WSGI middlewares.

Either way, using the new io subsystem or monkey-patching
cgi.FieldStorage so it accepts the size argument could probably help to
resolve memory-usage issues with things like file uploads!

Regards
msg125201 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-03 17:33
So then David, is your suggestion to use

sys.stdin = sys.stdin.detach()

and you claim that the Windows-specific hacks are not needed in 3.x land?  They are needed in 2.x land, as I have proven empirically, but I haven't been able to test CGI forms very well in 3.x because of this bug.  I will test 3.x download without the Windows-specific hack, and report how it goes.  My testing started with 2.x and has proceeded to 3.x, and it is not always obvious what hacks are no longer needed in 3.x.  Thanks for the info.
msg125217 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-03 19:09
Yes, that is my suggestion.  Keep in mind that I haven't looked at the patch or run any tests yet :)

If windows-specific hacks are needed to get the binary stream in 3.x, then IMO that's a bug in IO.  As far as I know at the moment there's no such bug :)
msg125235 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-03 20:45
David, Starting from a working (but hacked to work) version of http.server and using 3.2a1 (I should upgrade to the Beta, but I doubt it makes a difference at the moment), I modified

        # if hasattr( sys.stdin, 'buffer'):
        #     sys.stdin = sys.stdin.buffer
        sys.stdin = sys.stdin.detach()

and it all kept working.

Then I took out the

    try: # Windows needs stdio set for binary mode.
        import msvcrt
        msvcrt.setmode (0, os.O_BINARY) # stdin  = 0
        msvcrt.setmode (1, os.O_BINARY) # stdout = 1
        msvcrt.setmode (2, os.O_BINARY) # stderr = 2
    except ImportError:
        pass

and it quit working.  Seems that \r\r\n\r\r\n is not recognized by Firefox as the "end of the headers" delimiter.
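
The doubled carriage return can be reproduced in isolation. This sketch is only an analogy: it uses the newline translation of io.TextIOWrapper on an in-memory stream to stand in for whichever layer is translating \n to \r\n on Windows, and the header text is made up.

```python
import io


def write_headers(newline):
    """Write CGI-style headers through a text layer and return the raw bytes."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    text.write("Content-Type: text/html\r\n\r\n")
    text.flush()
    return raw.getvalue()


translated = write_headers("\r\n")  # every "\n" written becomes "\r\n"
untranslated = write_headers("")    # no translation on output
```

With translation enabled, the explicit "\r\n" written by the script turns into "\r\r\n", so the blank line that ends the headers is no longer a plain "\r\n\r\n" sequence.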

Whether this is a bug in IO or not, I can't say for sure.  It does seem, though, that

1) If Python is fully replacing the IO layers, which in 3.x it seems to claim to, then it should fully replace them, building on a binary byte stream, not a "binary byte stream with replacement of \n by \r\n".  The Windows hack above replaces, for stdin, stdout, and stderr, a "binary byte stream with replacement of \n by \r\n" with a binary byte stream.  Seems like Python should do that, on Windows, so that it has a chance of actually knowing/controlling what gets generated.  Perhaps it does, if started with "-u", but starting with "-u" should not be a requirement for a properly functioning program. Alternately, the IO streams could understand, and toggle the os.O_BINARY flag, but that seems like it would require more platform-specific code than simply opening all Windows files (and adjusting preopened Windows files) during initialization.

2) The weird CGI processing that exists in the released version of http.server seems to cover up this problem, partly because it isn't very functional, claims "alternate semantics" (read: non-standard semantics), and invokes Python with -u when it does do so.  It is so non-standard that it isn't clear what should or should not be happening.  But the CGI scripts I am running, that pass or fail as above, also run on Windows 2.6, and particularly, Unix 2.6, in an Apache environment.  So I have been trying to minimize the differences to startup code, rather than add platform-specific tweaks throughout the CGI scripts.

That said, it clearly could be my environment, but I've debugged enough different versions of things to think that the Windows hack above is required on both 2.x and 3.x to ensure proper bytestreams.... and others must think so too, because I found the code by searching on Google, not because I learned enough Python internals to figure it out on my own.  The question I'm attempting to address here, is only that 3.x still needs the same hack that 2.x needs, on Windows, to create bytestreams.
msg125237 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-03 20:50
(and I should mention that all the "hacked to work" issues in my copy of http.server have been reported as bugs, on 2010-11-21. The ones of most interest related to this binary bytestream stuff are issue 10479 and issue 10480)
msg125241 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-03 21:12
Another version of the diff file. Nothing else changed, but I'm afraid I had left duplicate definitions of some methods in the FieldStorage class.
I am following the discussion on this thread, but would like to know if the patch has been tested and works.
msg125402 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-05 03:35
A day late, but I've looked at the patch.

Now, I'm not all that knowledgeable about CGI, so other people will probably want to chime in here....

First, I'm uploading a new version of the patch as an svn diff (can be applied to a checkout using 'patch -p0 <patchfile' from the top level directory of the checkout).  This includes Pierre's patch unchanged, and includes changes to test_cgi so that Pierre's patch is tested.  Some of the tests fail.

A couple of the failures have to do with file bodies being returned as binary when previously they were returned as strings.  This raises the issue of backward compatibility: if cgi/fieldstorage using applications exist for 3.1, changing this will break them.  There may not be a good solution to that problem.  But it also may not be possible to fix this in 3.2 at this point (which I seem to have already decided earlier, but I can't now remember why...).

From looking over the cgi code it is not clear to me whether Pierre's approach is simpler or more complex than the alternative approach of starting with binary input and decoding as appropriate.  From a consistency perspective I would prefer the latter, but I don't know if I'll have time to try it out before rc1.

I also wonder if it would be possible to rewrite FieldStorage to take even better advantage of FeedParser, but if so that would *certainly* not happen before rc1.
msg125403 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-05 03:38
Here is a modified version of the unittest file from unittest.zip that can be run against Pierre's code (it feeds FieldStorage a text stream with a buffer).  Running the tests requires the data files from the zip.

They do not pass, in a very different way from the test_cgi failures.
msg125410 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-05 04:33
R. David said:
>From looking over the cgi code it is not clear to me whether Pierre's approach is simpler or more complex than the alternative approach of starting with binary input and decoding as appropriate.  From a consistency perspective I would prefer the latter, but I don't know if I'll have time to try it out before rc1.

I say:
I agree with R. David that an approach using the binary input seems more appropriate, as the HTTP byte stream is defined as binary.  Do the 3.2 beta email docs now include documentation for the binary input interfaces required to code that solution?  Or could you provide appropriate guidance and review, should someone endeavor to attempt such a solution?

The remaining concerns below are only concerns; they may be totally irrelevant, and I'm too ignorant of how the code works to realize their irrelevance.  Hopefully someone that understands the code can comment and explain.

I believe that the proper solution is to make cgi work if sys.stdin has already been converted to be a binary stream, or if it hasn't, to dive down to the underlying binary stream, using detach().  Then the data should be processed as binary, and decoded once, when the proper decoding parameters are known.  The default encoding seems to be different on different platforms, but the binary stream is standardized.  It looks like new code was added to attempt to preprocess the MIME data into chunks to be fed to the email parser, and while I can believe code could be written to do such correctly (but I can't speak for whether this patch code is correct or not), it seems redundant/inefficient and error-prone to do it once outside the email parser, and again inside it.

I also doubt that self.fp.encoding is consistent from platform to platform.  But the HTTP bytestream is binary, and self-describing or declared by HTTP or HTML standards for the parts that are not self-describing.  The default platform encoding used for the preopened sys.stdin is not particularly relevant and may introduce mojibake type bugs, decoding errors in the presence of some inputs, and/or platform inconsistencies, and it seems that that is generally where self.fp.encoding, used in various places in this patch, comes from.

Regarding the binary vs. text issue; when using both binary and text interfaces on output streams, there is the need to do flushing between text and binary writes to preserve the proper sequencing of data in the output.  For input, is it possible that mixing text and binary input could result in the binary input missing data that has already been preloaded into the text buffer?  Although, for CGI programs, no one should have done any text inputs before calling the CGI functions, so perhaps this is also not a concern... and there probably isn't any buffering on socket streams (the usual CGI use case) but I see the use of both binary and text input functions in this patch, so this may be another issue that someone could explain why such a mix is or isn't a problem.
msg125419 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-05 12:38
Yeah, the documentation for the email stuff is in the dev docs.  There's a short summary in the changes section of the email intro with links to the classes and methods that are affected.  But basically you call BytesFeedParser and feed it binary data, and everything else works just like it did before, including the fact that get_payload() with no arguments returns a string.  If there is non-ASCII data in that string and no charset was specified, the binary data will get trashed, though.  To get the binary data out you call it with decode=True.
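
A minimal sketch of the behaviour described above, assuming the bytes-accepting parser in question is email.feedparser.BytesFeedParser. The sample header and body are made up for illustration; the body contains one non-ASCII byte and no charset is declared.

```python
from email.feedparser import BytesFeedParser

# Feed raw bytes, as they would arrive from a binary stream.
parser = BytesFeedParser()
parser.feed(b"Content-Type: text/plain\r\n")
parser.feed(b"\r\n")
parser.feed(b"caf\xe9\r\n")  # 0xE9 is not ASCII and no charset is given
message = parser.close()

# Without arguments the payload comes back as str; the undecodable
# byte is replaced, so the original data is lost in this view.
as_text = message.get_payload()

# With decode=True the payload comes back as the original bytes.
as_bytes = message.get_payload(decode=True)
```

Here as_text is a str in which the 0xE9 byte has been replaced, while as_bytes still carries it unchanged.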

I believe you are right that the io module does not support intermixing calls to the main object and its buffer attribute; that's why detach was introduced, I believe.  Antoine is nosy on this issue now; he can correct me if I'm wrong.

So unfortunately I think we do need to come at this starting from binary at the beginning and *decoding* as needed (I believe http uses latin-1 when no charset is specified, but I need to double check that).  That still leaves the problem of what if anything to do about existing programs that expect every value in a FieldStorage to be a string.  Introduce a new method or parameter for getting the binary version of the value, possibly with some flag indicating that parsing detected non-ascii data?
msg125426 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-05 14:32
> I believe you are right that the io module does not support intermixing calls to the main object and its buffer attribute

I’ve learned in a recent discussion on web-sig that you can mix them, provided that you call stream.flush() before using stream.buffer.
msg125428 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-05 15:10
> I believe you are right that the io module does not support intermixing 
> calls to the main object and its buffer attribute; that's why detach
> was introduced, I believe.

Writing is ok as long as you call flush() on the text layer when necessary. Reading is not since there's no official way to flush the input buffer on the text layer (assuming some input has been consumed, that is). detach() doesn't do anything special AFAIR.
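
The writing half of this can be sketched as follows, with an in-memory BytesIO standing in for stdout's binary layer (an assumption made so the result can be inspected). The point is only the flush() between the text write and the binary write.

```python
import io

# BytesIO stands in for the binary layer of an output stream.
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

text.write("Content-Type: image/png\n\n")
text.flush()                      # push pending text down to the binary layer
text.buffer.write(b"\x89PNG")     # now binary data lands after the headers
text.buffer.flush()

result = raw.getvalue()
```

Without the flush() call, the header text could still be sitting in the text layer when the binary bytes are written, and the two would come out in the wrong order.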

(this is all funny in the light of the web-sig discussion where people explain that CGI is such a natural model)
msg125474 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-05 21:47
I agree that the only consistent solution is to impose that the attribute self.fp must read bytes in all cases, all required conversions should occur inside FieldStorage, using "some" encoding (not sure how to define it...)

If no argument fp is passed to __init__(), the instance uses the binary version of sys.stdin. In my patch I use sys.stdin.buffer, but it also works if I set it to sys.stdin.detach()
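
A small sketch of the difference between the two, using a TextIOWrapper around a BytesIO as a stand-in for sys.stdin (an assumption, so that the real stdin is not disturbed). Both give access to the same kind of binary object, but detach() leaves the text wrapper unusable afterwards, which matters if other code in the process still expects to use it.

```python
import io

def make_fake_stdin():
    # Stand-in for sys.stdin: a text wrapper over a binary stream.
    return io.TextIOWrapper(io.BytesIO(b"a=1&b=2"), encoding="latin-1")

# Variant 1: .buffer borrows the binary layer; the wrapper stays intact.
stdin1 = make_fake_stdin()
data_via_buffer = stdin1.buffer.read()

# Variant 2: detach() hands over the binary layer for good.
stdin2 = make_fake_stdin()
binary = stdin2.detach()
data_via_detach = binary.read()

try:
    stdin2.read()
    wrapper_still_usable = True
except ValueError:
    # The text wrapper refuses to work once its buffer is detached.
    wrapper_still_usable = False
```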

In all cases the interpreter must be launched with the -u option. As stated in the documentation, the effect of this option is to "force the binary layer of the stdin, stdout and stderr streams (which is available as their buffer attribute) to be unbuffered. The text I/O layer will still be line-buffered.". On my PC (Windows XP) this is required to be able to read all the data stream; otherwise, only the beginning is read. I tried Glenn's suggestion with msvcrt, with no effect

I am working on the cgi.py module so that all tests (test_cgi and cgi_test) pass with binary streams. It's almost finished ; I had to adapt the tests, and sometimes fix bugs in them

Problems in test_cgi.py :
- in testQSAndFormData() string "data" should not begin with a line feed
- in testQSAndFormDataFile() : same thing as above + the argument to update result should be {'upload': b'this is the content of the fake file\n'} : bytes, ending with a line feed as in the string "data"
- in do_test(), for POST method, fp must be a BytesIO
- in test_fieldstorage_multipart(), expected value should be b'Testing 123.\n' for the third case (filename is not None, bytes expected, there is a line feed in string "data")

Problems in cgi_test.py
- data files mix headers (which should be strings) and POST data which should be read as bytes. In setup(), the file is opened in binary mode, the first two lines are read to initialize Content-Length and Content-Type, and an attribute encoding = 'latin-1' is set
- the tests showed warnings "ResourceWarning: unclosed file <_io.BufferedReader name='zenASCII.txt'>", I changed the code to avoid these warnings

I will send the results (diff for new version of cgi + tests) hopefully tomorrow
msg125481 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-05 23:41
R. David said:
(I believe http uses latin-1 when no charset is specified, but I need to double check that)

See http://bugs.python.org/issue4953#msg121864 ASCII and UTF-8 are what HTTP defines. Some implementations may, in fact, use latin-1 instead of ASCII in some places.  Not sure if we want Python CGI to do that or not.

Thanks for getting the email APIs in the docs... shouldn't have to bug you as much that way :)

Antoine said:
(this is all funny in the light of the web-sig discussion where people explain that CGI is such a natural model)

Thanks for clarifying the stdin buffering vs. binary issue... it is as I suspected.  Maybe you can also explain the circumstances in which "my" Windows code is needed, and whether Python's "-u" does it automatically, but I still believe that "-u" shouldn't be necessary for a properly functioning program, not even a CGI program... it seems like a hack to allow some programs to work without other changes, so might be a useful feature, but hopefully not a required part of invoking a CGI program.

The CGI interface is "self describing", when you follow the standards, and use the proper decoding for the proper pieces.  In that way, it is similar to email.  It is certainly not as simple as using UTF-8 everywhere, but compatibility with things invented before UTF-8 even existed somewhat prevents the simplest solution, and then not everything is text, either.  At least it is documented, and permits full UNICODE data to be passed around where needed, and permits binary to be passed around where that is needed, when the specs are adhered to.
msg125483 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-05 23:52
> Antoine said:
> (this is all funny in the light of the web-sig discussion where people
> explain that CGI is such a natural model)
> 
> Thanks for clarifying the stdin buffering vs. binary issue... it is as
> I suspected.  Maybe you can also explain the circumstances in which
> "my" Windows code is needed, and whether Python's "-u" does it
> automatically, but I still believe that "-u" shouldn't be necessary
> for a properly functioning program, not even a CGI program...

Could you open a separate bug with a simple piece of code to reproduce
the issue (preferably without launching an HTTP server :))?
msg125501 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-06 02:12
Pierre said:
In all cases the interpreter must be launched with the -u option. As stated in the documentation, the effect of this option is to "force the binary layer of the stdin, stdout and stderr streams (which is available as their buffer attribute) to be unbuffered. The text I/O layer will still be line-buffered.". On my PC (Windows XP) this is required to be able to read all the data stream; otherwise, only the beginning is read. I tried Glenn's suggestion with msvcrt, with no effect

I say:
If you start the interpreter with -u, then my msvcrt code has no effect.  Without it, there is an effect.  Read on...

Antoine said:
Could you open a separate bug with a simple piece of code to reproduce
the issue (preferably without launching an HTTP server :))?

I say:
issue 10841
msg125524 - (view) Author: Etienne Robillard (erob) Date: 2011-01-06 08:52
On 05/01/11 09:12 PM, Glenn Linderman wrote:
> Glenn Linderman <v+python@g.nevcal.com> added the comment:
>
> Pierre said:
> In all cases the interpreter must be launched with the -u option. As stated in the documentation, the effect of this option is to "force the binary layer of the stdin, stdout and stderr streams (which is available as their buffer attribute) to be unbuffered. The text I/O layer will still be line-buffered.". On my PC (Windows XP) this is required to be able to read all the data stream; otherwise, only the beginning is read. I tried Glenn's suggestion with msvcrt, with no effect
>
> I say:
> If you start the interpreter with -u, then my msvcrt code has no effect.  Without it, there is an effect.  Read on...
>
> Antoine said:
> Could you open a separate bug with a simple piece of code to reproduce
> the issue (preferably without launching an HTTP server :))?
>
> I say:
> issue 10841
>
> ----------
>
>   

That's quite an annoying response. What's the purpose of an "option" switch
if it becomes
mandatory? Are you referring to Windows-only users?

I would prefer a way to programmatically allow FieldStorage to use
HTTP_TRANSFER_ENCODING
if available, to select a matching encoding...

Thanks
msg125533 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-06 09:28
Etienne, I'm not sure what you are _really_ referring to by HTTP_TRANSFER_ENCODING.  There is a TRANSFER_ENCODING defined by HTTP but it is completely orthogonal to character encoding issues.  There is a CONTENT_ENCODING defined which is a character encoding, but that is either explicit in the MIME data, or assumed to be either ASCII or UTF-8, in certain form data contexts.

Because the HTTP protocol is binary, only selected data, either explicitly or implicitly (by standard definition) should be decoded, using the appropriate encoding.  FieldStorage should be able to (1) read a binary stream (2) do the appropriate decoding operations (3) return the data as bytes or str as appropriate.

Right now, I'm mostly interested in the fact that it doesn't do (1), so it is hard to know what it does for (2) or (3) because it gets an error first.
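
The three steps can be sketched roughly as below. This is not how cgi.FieldStorage is implemented: it swaps in email.parser.BytesParser to do the MIME splitting, and the function name parse_form_data, the UTF-8 fallback for text fields and the sample body are all assumptions made for illustration.

```python
from email.parser import BytesParser

def parse_form_data(body, content_type):
    """Hypothetical helper: bytes in, str for text fields, bytes for files."""
    # (1) Work on the raw bytes; prepend the Content-Type header so the
    # email parser knows the multipart boundary.
    header = ("Content-Type: %s\r\n\r\n" % content_type).encode("ascii")
    message = BytesParser().parsebytes(header + body)

    fields = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        data = part.get_payload(decode=True)  # raw bytes of this part
        if part.get_filename() is None:
            # (2) Decode only text fields, with the part's own charset
            # if it declares one (UTF-8 fallback is an assumption).
            charset = part.get_content_charset("utf-8")
            fields[name] = data.decode(charset)
        else:
            # (3) Leave uploaded file content as bytes.
            fields[name] = data
    return fields

sample_body = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="title"\r\n'
    b"\r\n"
    b"caf\xc3\xa9\r\n"
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="upload"; filename="a.bin"\r\n'
    b"Content-Type: application/octet-stream\r\n"
    b"\r\n"
    b"\x00\xff\x10\r\n"
    b"--XyZ--\r\n"
)
fields = parse_form_data(sample_body, "multipart/form-data; boundary=XyZ")
```

No encoding is ever assigned to the stream as a whole; each part is decoded, or not, on its own terms.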
msg125543 - (view) Author: Etienne Robillard (erob) Date: 2011-01-06 10:02
yes, lets not complexify anymore please...
> Because the HTTP protocol is binary, only selected data, either explicitly or implicitly (by standard definition) should be decoded, using the appropriate encoding.  FieldStorage should be able to (1) read a binary stream (2) do the appropriate decoding operations (3) return the data as bytes or str as appropriate.
>
> Right now, I'm mostly interested in the fact that it doesn't do (1), so it is hard to know what it does for (2) or (3) because it gets an error first.
>
> ----------
>   
according to rfc2616...

"Transfer-codings are analogous to the Content-Transfer-Encoding values
of MIME [7], which were designed to enable safe transport of binary data
over a 7-bit transport service. However, safe transport has a different
focus for an 8bit-clean transfer protocol. In HTTP, the only unsafe
characteristic of message-bodies is the difficulty in determining the
exact body length (section 7.2.2
<http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.2.2>), or
the desire to encrypt data over a shared transport."

I may not have fully understood that part. Is "chunked" encoding what's
being used in MIME to allow
large file uploads and properly handle multipart POST requests?

Thanks,
msg125546 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-06 10:11
Etienne said:
yes, lets not complexify anymore please...

Albert Einstein said:
Things should be as simple as possible, but no simpler.

I say:
My "learning" of HTTP predates "chunked".  I've mostly heard of it being used in downloads rather than uploads, but I'm not sure if it pertains to uploads or not.  Since all the data transfer is effectively chunked by TCP/IP into packets, I'm not clear on what the benefit is, but I am pretty sure it is off-topic for this bug, at least until FieldStorage works at all on 3.x, like for small pieces of data.

I meant to say in my preceding response, that the multiple encodings that may be found in an HTTP stream, make it inappropriate to assign an encoding to the file through which the HTTP data streams... that explicit decode calls by FieldStorage should take place on appropriate chunks only.  I almost got there, so maybe you picked it up.  But I didn't quite say it.
msg125556 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-06 14:46
I tried full_source_and_error.zip on Windows and it failed. With stdio_binary.patch (attached to #10841), it works but I get an unicode file instead of a binary file. With stdio_binary.patch+cgi_plus_tests.diff it works as expected: I get a binary file (bytes).

But I don't understand why I have to pass a text stream (sys.stdin) instead of a binary stream (sys.stdin.buffer) to the FieldStorage constructor.
msg125557 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-06 15:39
Haypo: I believe that the consensus we've come to is that you shouldn't have to.  FieldStorage should take a binary stream.  So should cgi.parse.  If defaulting to sys.stdin, then if stdin is text, they should turn it into a binary stream right at the start.

None of which is true right now, and this presents some backward compatibility problems.
msg125558 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-06 15:40
> Haypo: I believe that the consensus we've come to is that you
> shouldn't have to.  FieldStorage should take a binary stream.  So
> should cgi.parse.  If defaulting to sys.stdin, then if stdin is text,
> they should turn it in to a binary stream right at the start.

Is mutating sys.stdin really a good idea? Or am I misunderstanding your
proposal?
msg125563 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-06 16:26
Ah, you are right.  That makes the backward compatibility issue a lot worse :(
msg125570 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-06 17:33
About the backward compatibility: does anyone use CGI with Python3? It looks like the module is broken (at least to upload files). If not, it's maybe better to fix it today than having to maintain a "broken" API.

Python 3.2 has many incompatible changes with Python 3.1 to fix bugs. Some examples:
 - No more sys.setfilesystemencoding() function
 - ctypes and struct don't implicitly convert str to bytes
 - mbcs encoding raises an error if the error handler is not supported
 - etc.
msg125583 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-06 19:46
We have several, myself included, that can't use CGI under 3.x because it doesn't take a binary stream.

I believe there are several alternatives:
1) Document that CGI needs a binary stream, and expect the user to provide it, either an explicit handle, or by tweaking sys.stdin before calling with a default file stream.
2) Provide a CGI function for tweaking sys.stdin (along with #1)
3) Document that CGI will attempt to convert passed in streams, default or explicit, to binary, if they aren't already, and implement the code to do so.

My choice is #3.  I see CGI as being used only in HTTP environments, where the data stream should be binary anyway.
msg125629 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-07 08:14
Option 1 is impossible, because the CGI script sometimes has no control on the stream : for instance on a shared web host, it will receive sys.stdin as a text stream

I also vote for option 3 ; explaining that if no argument is passed, the program will use sys.stdin.buffer (or the result of sys.stdin.detach() : I guess it's the same ?), and that if an argument is passed, it must provide an attribute "buffer" (or a method detach() ?) as the binary layer of the stream

BTW, I didn't have time to finish the versions of cgi.py and tests, my next slot is this week-end
msg125633 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-07 09:15
We may also accept TextIOWrapper (eg. sys.stdin) *and*
BufferedReader/FileIO (eg. sys.stdin.buffer). It is possible to test the
type of the stream. With a TextIOWrapper, the raw buffer can be read
using stream.buffer.

But for StringIO/BytesIO: only BytesIO should be accepted.
msg125637 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-07 09:50
Pierre said:
Option 1 is impossible, because the CGI script sometimes has no control on the stream : for instance on a shared web host, it will receive sys.stdin as a text stream

I say:
It is the user code of the CGI script that calls CGI.FieldStorage.  So the user could be required (option 1) to first tweak the stdin to be bytes, one way or another.  I don't understand any circumstance where a Python CGI script doesn't have control over the settings of the Python IO Stack that it is using to obtain the data... and the CGI spec is defined as a bytestream, so it must be able to read the bytes.

Victor said:
It is possible to test the type of the stream.

I say:
Yes, why just assume (as I have been) that the initial precondition is the defaults that Python imposes.  Other code could have interposed something else.  The user should be allowed to pass in anything that is a TextIOWrapper, or a BytesIO, and CGI should be able to deal with it.  If the user passes some other type, it should be assumed to produce bytes from its read() API, and if it doesn't the user gets what he deserves (an error).  Since the default Python sys.stdin is a TextIOWrapper, having CGI detect that, and extract its .buffer to use for obtaining bytes, should work fine.  If the user already tweaked sys.stdin to be a BytesIO (.buffer or detach()), CGI should detect and use that.  If the user substitutes a different class, it should be bytes, and that should be documented, the three cases that could work.
msg125698 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-07 19:50
I fixed #10841 (r87824): stdin (and all other files) is now set to binary (instead of text) mode on Windows.
msg125839 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-09 12:26
Here is the diff file for the revised version of cgi.py

FieldStorage tests if the stream is an instance of (a subclass of) io.TextIOBase. If true, data is read from its attribute buffer; if it doesn't have one (e.g. for StringIO instances), an AttributeError is raised. Should we have a more specific exception?
If false, the stream's method read() is supposed to return bytes ; an exception will be raised if it's not the case
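
The stream check described above could look roughly like this. The helper names binary_reader and field_encoding are hypothetical and are not taken from the patch; only the io module behaviour they rely on is standard.

```python
import io

def binary_reader(fp):
    """Return the object to read bytes from (hypothetical helper)."""
    if isinstance(fp, io.TextIOBase):
        # Text stream: use its binary layer. io.StringIO has no
        # .buffer attribute, so this raises AttributeError for it.
        return fp.buffer
    # Anything else is trusted to return bytes from read().
    return fp

def field_encoding(fp, default="latin-1"):
    """Encoding used to decode keys and values (hypothetical helper)."""
    return getattr(fp, "encoding", None) or default

text_stream = io.TextIOWrapper(io.BytesIO(b"x=1"), encoding="utf-8")
byte_stream = io.BytesIO(b"y=2")
```

With these, binary_reader(text_stream) yields the underlying BytesIO, binary_reader(byte_stream) returns the stream itself, and binary_reader(io.StringIO("z")) fails with AttributeError.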

The encoding used to decode keys and values to strings is the attribute "encoding" of the stream, or "latin-1" if this attribute doesn't exist

Besides FieldStorage, I modified the  parse() function at module level, but not parse_multipart (should it be kept at all ?)

I leave the code to set sys.stdin to binary on Windows for the moment, but it can be removed in the final version thanks to Victor's fix of issue 10841

I modified cgi_test.py and test_cgi.py (sent in a next post), all the tests pass with the revised version of cgi.py on my PC

While testing the patch I found other related things that I suppose should be changed (but need to check again - perhaps there are already tracker issues about them) :
- in http.server.CGIHTTPRequestHandler, the -u option should be removed (line 1123)
- on Windows, http.server.SimpleHTTPRequestHandler.list_directory() fails with Arabic characters (mbcs encoding fails, utf-8 works)
- in urllib.parse.unquote(), default encoding should be latin-1, not utf-8 (submitting a simple form with French accented characters raises a UnicodeEncodeError when trying to print the submitted value)
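
To illustrate the last point, here is how urllib.parse.unquote() treats the same percent-encoded byte under the two encodings. The sample strings are made up; whether the default should change is the open question, and this only shows the current behaviour.

```python
from urllib.parse import unquote

# "é" sent by a form encoded as latin-1: a single byte, 0xE9.
latin1_quoted = "caf%E9"
# The same character sent as UTF-8: two bytes, 0xC3 0xA9.
utf8_quoted = "caf%C3%A9"

default_result = unquote(latin1_quoted)                     # decoded as UTF-8
latin1_result = unquote(latin1_quoted, encoding="latin-1")  # explicit latin-1
utf8_result = unquote(utf8_quoted)                          # valid UTF-8 input
```

With the UTF-8 default, the latin-1 byte cannot be decoded and is replaced, because the default error handler is 'replace'.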
msg125840 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-09 12:28
cgi tests
msg125885 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-10 08:52
This looks much simpler than the previous patch.  However, I think it can be further simplified. This is my first reading of this code, however, so I might be totally missing something(s).

Pierre said:
Besides FieldStorage, I modified the  parse() function at module level, but not parse_multipart (should it be kept at all ?)

I say:
Since none of this stuff works correctly in 3.x, and since there are comments in the code about "folding" the parse* functions into FieldStorage, then I think they could be deprecated, and not fixed.  If people are still using them, by writing code to work around their deficiencies, that code would continue to work for 3.2, but then not in 3.3 when that code is removed?  That seems reasonable to me.  In this scenario, the few parse* functions that are used by FieldStorage should be copied into FieldStorage as methods (possibly private methods), and fixed there, instead of being fixed in place.  That way all the parse* functions could be deprecated, and the use of them would be unchanged for 3.2.

Since RFC 2616 says that the HTTP protocol uses ISO-8859-1 (latin-1), I think that should be required here, instead of deferring to fp.encoding, which would eliminate 3 lines.

Also, the use of FeedParser could be replaced by BytesFeedParser, thus eliminating the need to decode header lines in that loop.

And, since this patch will be applied only to Python 3.2+, the msvcrt code can be removed (you might want a personal copy with it for earlier versions of Python 3.x, of course).

I wonder if the 'ascii' reference should also be 'latin-1'?

In truly reading and trying to understand this code to do a review, I notice a deficiency in _parseparam and parse_header: should I file new issues for them? (perhaps these are unimportant in practice; I haven't seen \ escapes used in HTTP headers).  RFC 2616 allows for "" which are handled in _parseparam.  And for \c inside "", which is handled in parse_header.  But: _parseparam counts " without concern for \", and parse_header allows for \\ and \" but not \f or \j or \ followed by other characters, even though they are permitted (but probably not needed for much).

In make_file, shouldn't the encoding and newline parameters be preserved when opening text files?  On the other hand, it seems like perhaps we should leverage the power of IO to do our encoding/decoding... open the file with the TextIOBase layer set to the encoding for the MIME part, but then just read binary without decoding it, write it to the .buffer of the TextIOBase, and when the end is reached, flush it, and seek(0).  Then the data can be read back from the TextIOBase layer, and it will be appropriately decoded.  Decoding errors might be deferred, but will still occur.  This technique would save two data operations: the explicit decode in the cgi code, and the implicit encode in the IO layers, so resources would be saved.  Additionally, if there is a CONTENT-LENGTH specified for non-binary data, the read_binary method should be used for it also, because it is much more efficient than readlines... less scanning of the data, and fewer outer iterations.  This goes well with the technique of leaving that data in binary until read from the file.
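
A rough sketch of the temporary-file idea, assuming the text layer has not been used before the binary write (mixing the two layers in other orders is not covered here). The function name spool_and_decode is hypothetical.

```python
import tempfile

def spool_and_decode(raw, encoding):
    """Write undecoded bytes through .buffer, read them back as text."""
    # newline="" keeps line endings exactly as they were uploaded.
    with tempfile.TemporaryFile("w+", encoding=encoding, newline="") as tmp:
        tmp.buffer.write(raw)   # no decode in our code, no encode in io
        tmp.flush()             # make sure everything reached the file
        tmp.seek(0)             # rewind through the text layer
        return tmp.read()       # decoding happens here, on read

decoded = spool_and_decode("na\u00efve caf\u00e9\r\n".encode("utf-8"), "utf-8")
```

Any decoding error would surface at the final read() rather than while the upload is being received, which is the deferral mentioned above.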

It seems that in addition to fixing this bug, you are also trying to limit the bytes read by FieldStorage to some maximum (CONTENT_LENGTH).  This is good, I guess.  But skip_lines() has a readline potentially as long as 32KB, that isn't limited by the maximum.  Similar in read_lines_to_outer_boundary, and read_lines_to_eof (although that may not get called in the cases that need to be limited).  If a limit is to be checked for, I think it should be a true, exact limit, not an approximate limit.
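
An exact limit could be enforced along these lines: each readline() is capped by the bytes still allowed, so the total never exceeds the limit. The generator name read_lines_limited and the chunk size are assumptions, not code from the patch.

```python
import io

def read_lines_limited(fp, limit, chunk=1 << 15):
    """Yield lines from a binary stream, never reading past `limit` bytes."""
    remaining = limit
    while remaining > 0:
        # Never ask for more than what is left of the budget.
        line = fp.readline(min(chunk, remaining))
        if not line:
            break  # end of stream before the limit was reached
        remaining -= len(line)
        yield line

stream = io.BytesIO(b"a\nbb\nccc\nextra")
lines = list(read_lines_limited(stream, limit=9))
leftover = stream.read()
```

In this example the limit of 9 bytes covers exactly the first three lines, and the trailing b"extra" is left unread in the stream.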

See also issue 10879.
msg125886 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-10 09:31
Also, the required behavior of make_file changes, to need the right encoding, or binary, so that needs to be documented as a change for people porting from 2.x. It would be possible, even for files, which will be uploaded as binary, for a user to know the appropriate encoding and, if the file is to be processed rather than saved, supply that encoding for the temporary file.  So the temporary file may not want to be assumed to be binary, even though we want to write binary to it.  So similarly to the input stream, if it is TextIOBase, we want to write to the .buffer.
msg125892 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-10 10:13
I wrote:
Additionally, if there is a CONTENT-LENGTH specified for non-binary data, the read_binary method should be used for it also, because it is much more efficient than readlines... less scanning of the data, and fewer outer iterations.  This goes well with the technique of leaving that data in binary until read from the file.

I further elucidate:
Sadly, while the browser (Firefox) seems to calculate an overall CONTENT-LENGTH for the HTTP headers, it does not seem to calculate CONTENT-LENGTH for individual parts, not even file parts where it would be extremely helpful.
msg125893 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-10 10:23
It seems the choice of whether to make_file or StringIO is based on the existence of self.length... per my previous comment, content-length doesn't seem to appear in any of the multipart/ item headers, so it is unlikely that real files will be created by this code.

Sadly that seems to be the case for 2.x also, so I wonder now if CGI has ever properly saved files, instead of buffering in memory...

I'm basing this off the use of Firefox Live HTTP headers tool.
msg125900 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-10 13:03
Some comments on cgi_diff_20110109.txt, especially on FieldStorage 
constructor.

On Sunday, 9 January 2011 at 13:26:24, you wrote:
> Here is the diff file for the revised version of cgi.py

+            import msvcrt
+            msvcrt.setmode (0, os.O_BINARY) # stdin  = 0
+            msvcrt.setmode (1, os.O_BINARY) # stdout = 1
+            msvcrt.setmode (2, os.O_BINARY) # stderr = 2

Why do you change stdout and stderr mode? Is it needed? Instead of 0, you 
should use sys.stdin.fileno() with a try/except on .fileno() because stdin can 
be a StringIO object:

   >>> o=io.StringIO()    
   >>> o.fileno()
   io.UnsupportedOperation: fileno

I suppose that it's better to do nothing if sys.stdin has no .fileno() method.
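
A sketch of that suggestion, assuming the goal is simply to skip streams that have no file descriptor. The function name set_binary_mode is hypothetical; msvcrt.setmode() only exists on Windows, so the call is guarded by a platform check.

```python
import io
import os
import sys

def set_binary_mode(stream):
    """Switch the stream's descriptor to binary mode on Windows.

    Returns True if the mode was changed, False if nothing was done.
    """
    try:
        fd = stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        # e.g. io.StringIO: no real file descriptor, so do nothing.
        return False
    if sys.platform != "win32":
        return False  # no text/binary descriptor modes elsewhere
    import msvcrt
    msvcrt.setmode(fd, os.O_BINARY)
    return True

changed = set_binary_mode(io.StringIO())
```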

More generally, I don't think that the cgi module should touch sys.stdin mode: 
it impacts the whole process, not only the cgi module. E.g. changing the sys.stdin 
mode in Python 3.1 will break the interpreter because the Python parser in 
Python 3.1 doesn't know how to handle \r\n line endings. If you need binary 
stdin, I should backport my patch for #10841 (for std*, FileIO and the 
parser).

----
def __init__(self, fp=None, headers=None, outerboundary="",
             environ=os.environ, keep_blank_values=0, strict_parsing=0,
             limit=None):
...
if 'QUERY_STRING' in environ:
   qs = environ['QUERY_STRING']
elif sys.argv[1:]:
   qs = sys.argv[1]
else:
   qs = ""
fp = BytesIO(qs.encode('ascii')) # bytes
----

With Python 3.2, you should use environ=os.environb by default to 
avoid unnecessary conversion (os.environb --decode--> os.environ --encode--> 
qs). To encode sys.argv back to bytes, ASCII is not the right encoding: you 
should use qs.encode(locale.getpreferredencoding(), 'surrogateescape') because 
Python decodes the environment and the command line arguments from 
locale.getpreferredencoding()+'surrogateescape', so it is the exact reverse 
operation and you get the original raw bytes.

For Python 3.1, use also qs.encode(locale.getpreferredencoding(), 
'surrogateescape') to encode the environment variable.
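
To illustrate why 'surrogateescape' gives back the original bytes, here is a small sketch. It hard-codes "utf-8" in place of locale.getpreferredencoding() so the result does not depend on the machine; the sample query string is made up.

```python
# Bytes that are not valid UTF-8 (a lone 0xE9), as might arrive in a query string.
raw_query = b"name=caf\xe9&x=1"

# Decoding with surrogateescape never fails: undecodable bytes become
# lone surrogates (here U+DCE9) instead of raising UnicodeDecodeError.
text_query = raw_query.decode("utf-8", "surrogateescape")

# Encoding with the same codec and error handler is the exact reverse.
round_tripped = text_query.encode("utf-8", "surrogateescape")

# A strict decode of the same bytes fails.
try:
    raw_query.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```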

So for Python 3.2, it becomes something like:
----
def __init__(self, fp=None, headers=None, outerboundary="",
             environ=os.environb, keep_blank_values=0, strict_parsing=0,
             limit=None):
...
if b'QUERY_STRING' in environ:
   qs = environ[b'QUERY_STRING']
elif sys.argv[1:]:
   qs = sys.argv[1]
else:
   qs = b""
if isinstance(qs, str):
   encoding = locale.getpreferredencoding()
   qs = qs.encode(encoding, 'surrogateescape')
fp = BytesIO(qs)
----
If you would like to support byte *and* Unicode environment (eg. 
environ=os.environ and environ=os.environb), you should do something a little 
bit more complex: see os.get_exec_path(). I can work on a patch if you would 
like to. A generic function should maybe be added to the os module, function 
with an optional environ argument (as os.get_exec_path()).

---
if fp is None:
   fp = sys.stdin
if fp is sys.stdin:
   ...
---
you should use sys.stdin.buffer if fp is None, and accept sys.stdin.buffer in 
the second test. Something like:
---
stdin = sys.stdin
if isinstance(stdin, TextIOBase):
   stdin_buffer = stdin.buffer
else:
   stdin_buffer = stdin
if fp is None:
   fp = stdin_buffer
if fp is stdin or fp is stdin_buffer:
   ...
---

Don't you think that a warning would be appropriate if sys.stdin is passed 
here?
---
        # self.fp.read() must return bytes
        if isinstance(fp,TextIOBase):
            self.fp = fp.buffer
        else:
            self.fp = fp
---
Maybe a DeprecationWarning if we would like to drop support of TextIOWrapper 
later :-)

For the else case: you should maybe add a strict test on the type, eg. check 
for RawIOBase or BufferedIOBase subclass, isinstance(fp, (io.RawIOBase, 
io.BufferedIOBase)). It would avoid to check that fp.read() returns a bytes 
object (or get an ugly error later).
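
A sketch of the "use the binary layer" idea, under the assumption that any text stream passed in exposes a .buffer attribute the way io.TextIOWrapper does. The function name binary_layer is hypothetical.

```python
import io


def binary_layer(fp):
    """Return a stream whose read() yields bytes.

    Hypothetical helper: text wrappers are unwrapped to their underlying
    buffer, anything else is returned unchanged and trusted to be binary.
    """
    if isinstance(fp, io.TextIOBase):
        # Assumes the text stream has a .buffer, as TextIOWrapper does.
        return fp.buffer
    return fp


text_stream = io.TextIOWrapper(io.BytesIO(b"a=1"), encoding="utf-8")
raw_stream = io.BytesIO(b"b=2")
unwrapped = binary_layer(text_stream).read()
```

Leaving the else branch unchecked keeps duck-typed objects working; adding the strict isinstance test suggested above would trade that flexibility for an earlier error.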

Set sys.stdin.buffer.encoding attribute is not a good idea. Why do you modify 
fp, instead of using a separated attribute on FieldStorage (eg. 
self.fp_encoding)?
---
        # field keys and values (except for files) are returned as strings
        # an encoding is required to decode the bytes read from self.fp
        if hasattr(fp,'encoding'):
            self.fp.encoding = fp.encoding
        else:
            self.fp.encoding = 'latin-1' # ?
---

I only read the constructor code.
msg125901 - (view) Author: Etienne Robillard (erob) Date: 2011-01-10 13:07
On 10/01/11 05:23 AM, Glenn Linderman wrote:
> I'm basing this off the use of Firefox Live HTTP headers tool.

is sendfile() available on Windows ? i thought the Apache server could
use that to upload files without having to buffer files in memory..

HTH,

msg125921 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-10 19:56
Victor said:
Don't you think that a warning would be appropriate if sys.stdin is passed 
here?
---
        # self.fp.read() must return bytes
        if isinstance(fp,TextIOBase):
            self.fp = fp.buffer
        else:
            self.fp = fp
---
Maybe a DeprecationWarning if we would like to drop support of TextIOWrapper 
later :-)

I say:
I doubt we ever want to Deprecate the use of "plain stdin" as the default (or as an explicit) parameter for FieldStorage's fp parameter.  Most usage of FieldStorage will want to use stdin; if FieldStorage detects that stdin is TextIOBase (generally it is) and uses its buffer to get binary data, that is very convenient for the typical CGI application.  I think I agree with the rest of your comments.

Etienne said:
is sendfile() available on Windows ? i thought the Apache server could
use that to upload files without having to buffer files in memory..

I say:
I don't think it is called that, but similar functionality may be available on Windows under another name.  I don't know if Apache uses it or not.  But I have no idea how FieldStorage could interact with Apache via the CGI interface, to access such features.  I'm unaware of any APIs Apache provides for that purpose, but if there are some, let me know.  On the other hand, there are other HTTP servers besides Apache to think about. 

I'm also not sure if sendfile() or equivalent, is possible to use from within FieldStorage, because it seems in practice we don't know the size of the uploaded file without parsing it (which requires buffering it in memory to look at it).
msg125926 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-10 20:30
@Glenn
"Also, the use of FeedParser could be replaced by BytesFeedParser, thus eliminating the need to decode header lines in that loop."

BytesFeedParser only uses the ascii codec ; if the header has non ASCII characters (filename in a multipart/form-data), they are replaced by ? : the original file name is lost. So for the moment I leave the text version of FeedParser

@Victor :
"you should use qs.encode(locale.getpreferredencoding(), 'surrogateescape')"
Ok, I changed the code to that

"Maybe a DeprecationWarning if we would like to drop support of TextIOWrapper later :-)"
Maybe I'm missing something here, but sys.stdin is always a TextIOWrapper instance, even if set to binary mode

"For the else case: you should maybe add a strict test on the type, eg. check for RawIOBase or BufferedIOBase subclass, isinstance(fp, (io.RawIOBase, io.BufferedIOBase)). It would avoid to check that fp.read() returns a bytes object (or get an ugly error later)."

Rejecting non-instances of RawIOBase or BufferedIOBase is too much, I think. Any class whose instances have a read() method that returns bytes should be accepted, like the TestReadLine class in test_cgi.py

"Set sys.stdin.buffer.encoding attribute is not a good idea. Why do you modify fp, instead of using a separated attribute on FieldStorage (eg. self.fp_encoding)?"

I set an attribute encoding to self.fp because, for each part of a multipart/form-data, a new instance of FieldStorage is created, and this instance needs to know how to decode bytes. So, either an attribute must be set to one of the arguments of the FieldStorage constructor, and fp comes to mind, or an extra argument has to be passed to this constructor, i.e. the encoding of the original stream
msg125928 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-10 20:41
Victor said:
"Set sys.stdin.buffer.encoding attribute is not a good idea. Why do you modify fp, instead of using a separated attribute on FieldStorage (eg. self.fp_encoding)?"

Pierre said:
I set an attribute encoding to self.fp because, for each part of a multipart/form-data, a new instance of FieldStorage is created, and this instance needs to know how to decode bytes. So, either an attribute must be set to one of the arguments of the FieldStorage constructor, and fp comes to mind, or an extra argument has to be passed to this constructor, i.e. the encoding of the original stream

I say:
Ah, now I understand why you did it that way, but:

RFC 2616 says the CGI stream is ISO-8859-1 (or latin-1).  The _defined_ encoding of the original stream is irrelevant, in the same manner that if it is a text stream, that is irrelevant.  The stream is binary, and latin-1, or it is non-standard.  Hence, there is not any reason to need a parameter, just use latin-1. If non-standard streams are to be supported, I suppose that would require a parameter, but I see no need to support non-standard streams: it is hard enough to support standard streams without complicating things.  The encoding provided with stdin is reasonably unlikely to be latin-1: Linux defaults to UTF-8 (at least on many distributions), and Windows to CP437, and in either case is configurable by the sysadmin.  But even the sysadmin should not be expected to configure the system locale to have latin-1 as the default encoding for the system, just because one of the applications that might run is a CGI program.  So I posit that the encoding on fp is irrelevant and should be ignored, and using it as a parameter between FieldStorage instances is neither appropriate nor necessary, as the standard defines latin-1 as the encoding for the stream.
msg125930 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-10 21:11
I don't have time to review the patch or even respond in detail to the comments right now, but I do want to respond about BytesFeedParser.  It is true that there is currently no interface to get the raw-bytes version of the header back out of the Message object, even though it still has it when constructed via BytesFeedParser.  This is an API oversight that needs to be rectified.
msg125931 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-10 21:28
> I set an attribute encoding to self.fp because, for each part 
> of a multipart/form-data, a new instance of FieldStorage is created,
> and this instance needs to know how to decode bytes.

Setting fp.encoding may raise an error (eg. for a read-only object, or an object implemented in C). You should add a new argument to the constructor.

> Maybe I'm missing something here, but sys.stdin is always
> a TextIOWrapper instance, even if set to binary mode

I mean: you should pass sys.stdin.buffer instead of sys.stdin.
msg125935 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-10 21:55
@Glenn
" The _defined_ encoding of the original stream is irrelevant, in the same manner that if it is a text stream, that is irrelevant.  The stream is binary, and latin-1, or it is non-standard"

I wish it could be as simple, but I'm afraid it's not. On my PC, sys.stdin.encoding is cp-1252. I tested a multipart/form-data with an INPUT field, and I entered the euro character, which is encoded  \x80 in cp-1252

If I use the encoding defined for sys.stdin (cp-1252) to decode the bytes received on sys.stdin.buffer, I get the correct value in the cgi script ; if I set the encoding to latin-1 in FieldStorage, since \x80 maps to undefined in latin-1, I get a UnicodeEncodeError if I try to print the value ("character maps to <undefined>")
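
The behaviour described above can be reproduced without a browser. This is only a sketch of the codec mechanics, using the single byte from the test:

```python
raw = b"\x80"  # the byte received for the euro sign in the cp1252 test

as_cp1252 = raw.decode("cp1252")   # the euro sign, U+20AC
as_latin1 = raw.decode("latin-1")  # the control character U+0080

# Printing on a cp1252 console means encoding back to cp1252.
# U+0080 has no cp1252 representation, hence the UnicodeEncodeError.
try:
    as_latin1.encode("cp1252")
    printable = True
except UnicodeEncodeError:
    printable = False
```

So the latin-1 decode itself succeeds; the failure only shows up later, when the wrongly decoded character is written to a cp1252 output stream.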
msg125952 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-10 23:07
Victor said:
I mean: you should pass sys.stdin.buffer instead of sys.stdin.

I say:
That would be possible, but it is hard to leave it at default, in that case, because sys.stdin will, by default, not be a binary stream.  It is a convenience for FieldStorage to have a useful default for its input, since RFC 3875 declares that the message body is obtained from "standard input".

Pierre said:
I wish it could be as simple, but I'm afraid it's not. On my PC, sys.stdin.encoding is cp-1252. I tested a multipart/form-data with an INPUT field, and I entered the euro character, which is encoded  \x80 in cp-1252

If I use the encoding defined for sys.stdin (cp-1252) to decode the bytes received on sys.stdin.buffer, I get the correct value in the cgi script ; if I set the encoding to latin-1 in FieldStorage, since \x80 maps to undefined in latin-1, I get a UnicodeEncodeError if I try to print the value ("character maps to <undefined>")

I say:
Interesting. I'm curious what your system (probably Windows since you mention cp-) and browser, and HTTP server is, that you used for that test.  Is it possible to capture the data stream for that test?  Describe how, and at what stage the data stream was captured, if you can capture it.  Most interesting would be on the interface between browser and HTTP server.

RFC 3875 states (section 4.1.3) what the default encodings should be, but I see that the first possibility is "system defined".  On the other hand, it seems to imply that it should be a system definition specifically defined for particular media types, not just a general system definition such as might be used as a default encoding for file handles... after all, most Web communication crosses system boundaries.  So lacking a system defined definition for text/ types, it then indicates that the default for text/ types is Latin-1.

I wonder what result you get with the same browser, at the web page http://rishida.net/tools/conversion/ by entering the euro symbol into the Characters entry field, and choosing convert.
msg125992 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-11 10:06
I said:
I wonder what result you get with the same browser, at the web page http://rishida.net/tools/conversion/ by entering the euro symbol into the Characters entry field, and choosing convert.

But I couldn't wait, so I ran a test with € in one of my input boxes, using Firefox, a FORM as:
<form enctype="multipart/form-data" method="post" action="...

and below is the Live Headers report.  I note several things that seem relevant to this issue.

1) The character encoding isn't specified anywhere.  In fact, the only content-type specification is the multipart/form-data in the environment.

2) Except for the Euro, everything in the data stream is ASCII (but could be ISO-8859-1, or latin-1).  

3) Looking separately at the byte stream read by my experimental version of cgi.py which prints the bytes as they are read, I see that the encoding of the Euro is UTF-8:  '\xe2\x82\xac'

4) Because of 1), it is clear that default encoding types must be applied, and it is clear that Firefox provides UTF-8.

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Content-Type: multipart/form-data; boundary=---------------------------1650566221634
Content-Length: 527
-----------------------------1650566221634
Content-Disposition: form-data; name="type"

summary
-----------------------------1650566221634
Content-Disposition: form-data; name="submit"

Search
-----------------------------1650566221634
Content-Disposition: form-data; name="pre"

€
-----------------------------1650566221634
Content-Disposition: form-data; name="part"


-----------------------------1650566221634
Content-Disposition: form-data; name="key"


-----------------------------1650566221634--
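
For reference, a short sketch of what the three bytes noted in point 3) look like under the two encodings discussed in this thread (pure codec behaviour, no cgi code involved):

```python
received = b"\xe2\x82\xac"  # bytes seen for the euro sign in this test

# Decoded as UTF-8, the three bytes are a single character.
as_utf8 = received.decode("utf-8")

# Decoded as cp1252, the same bytes become three unrelated characters.
as_cp1252 = received.decode("cp1252")
```

A decode with the wrong codec does not necessarily raise; it can silently produce mojibake, which is why the encoding has to be known rather than guessed.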
msg125993 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-11 10:27
R. David:

Pierre said:
BytesFeedParser only uses the ascii codec ; if the header has non ASCII characters (filename in a multipart/form-data), they are replaced by ? : the original file name is lost. So for the moment I leave the text version of FeedParser

I say:
Does this mean BytesFeedParser, to be useful for cgi.py, needs to accept an input parameter encoding, defaulting to ASCII for the email case?  Should that be a new issue?  Or should cgi.py, since it can't use email to do all its work (no support for file storage, no support for encoding) simply not try, and use its own code for header decoding also?  The only cost would be support for Encoded-Word -- but it is not clear that HTTP uses them?  Can anyone give an example of such?  Read the next message here for an example of filename containing non-ASCII.
msg125994 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-11 10:40
In my previous message I quoted Pierre rightly cautioning about headers containing non-ASCII... and that BytesFeedParser doesn't handle them, so using it to parse headers may be questionable.

So I decided to try one... I show the Live HTTP headers below, from a simple upload form.  What is not so simple is the filename of the file to be uploaded... it contains a couple non-ASCII characters... in fact, one of them is non-latin-1 also: "foöţ.html".  It rather seems that Firefox provides the filename in UTF-8, although Live HTTP headers seems to have displayed it using Latin-1 on the screen!  But in saving it to a file, it didn't write a BOM, and the byte sequence for the filename is definitely UTF-8, and pasted here to be viewed correctly.

So my question: where does Firefox get its authority to encode the filename using UTF-8 ???

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://rkivs.com.gl:8032/row/test.html
Content-Type: multipart/form-data; boundary=---------------------------207991835220448
Content-Length: 304
-----------------------------207991835220448
Content-Disposition: form-data; name="submit"

upload
-----------------------------207991835220448
Content-Disposition: form-data; name="pre"; filename="foöţ.html"
Content-Type: text/html

aoheutns

-----------------------------207991835220448--
msg126035 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-11 21:03
@Glenn
"I'm curious what your system (probably Windows since you mention cp-) and browser, and HTTP server is, that you used for that test.  Is it possible to capture the data stream for that test?  Describe how, and at what stage the data stream was captured, if you can capture it.  Most interesting would be on the interface between browser and HTTP server."

I tested it on Windows XP Family Edition 2002, Service Pack 3, with Python 3.2b2
Browsers : Mozilla Firefox 3.6.13 and Internet Explorer 7.0
Servers : Apache 2.2, and the built-in server started by :

import http.server
http.server.test(HandlerClass=http.server.CGIHTTPRequestHandler)

I print the bytes received in the multipart/form-data part by "print(odelim+line)" at the end of method  read_lines_to_outerboundary() of FieldStorage. The bytes sent when I enter the string 
    "a"+"n tilde" + the euro sign 
are : b'a\xf1\x80' - that is, the cp-1252 encoding of the string

Since it works the same with 2 browsers and 2 web servers, I'm almost sure it's not dependent on the configuration - but if others can test on different configurations I'd like to know the result

Basically, this behaviour is not surprising : if sys.stdin.encoding is set to a certain value, it's natural that the bytes sent on the binary layer are encoded with this encoding, not with latin-1

I attach the diff file for an updated version of cgi.py :
- new argument stream_encoding instead of setting an attribute "encoding" to fp
- use locale.getpreferredencoding() to decode the query string
msg126060 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-12 00:36
Pierre said:
Since it works the same with 2 browsers and 2 web servers, I'm almost sure it's not dependent on the configuration - but if others can test on different configurations I'd like to know the result

So I showed in my just previous messages (after the one you are responding to) my output from Live HTTP Headers, where it seems that Firefox is using UTF-8 transmission, both for header values (filename) and data values (euro character).  Without specifying Content-Type (for the data) or doing RFC 2047 encoding as would be expected from reading the various standard documents (RFC 2045, W3 HTML 4.01, RFC 2388).  I wonder now if Live HTTP Headers is reporting the logical data, prior to encoding for transmission.  But I was getting UTF-8 data inside my CGI script... 

So now I tweaked the server to save the bytes it transfers from its rfile to the cgi process (had already tweaked that to be binary instead of having encodings), and it is clearly UTF-8 at that point also.  Looks just like the Live HTTP headers.  Now that I have data-capture on the server side, I can run the same tests with other browsers... so I ran it with Opera 11, IE 8, Chrome 8, and the only differences were the specific value of the boundaries... all the data was in UTF-8, both filename, and form data value.

I can't now find a setting for Firefox to allow the user to control the encoding it sends to the server, but I can't rule out that I once might have, and set it to UTF-8.  But I'm quite certain I don't know enough about the other browsers to adjust their settings.  I don't have Apache installed on this box, so I cannot test to see if it changes something.

Is there a newer standard these browsers are following, that permits UTF-8?  Or even requires it?

Why is Pierre seeing cp-1252, and I'm seeing UTF-8?  I'm running Windows 6.1 (Build 7600), 64-bit, the so-called Windows 7 Professional edition.
msg126062 - (view) Author: Etienne Robillard (erob) Date: 2011-01-12 00:53
On 11/01/11 07:36 PM, Glenn Linderman wrote:
> Is there a newer standard these browsers are following, that permits UTF-8?  Or even requires it?
>
> Why is Pierre seeing cp-1252, and I'm seeing UTF-8?  I'm running Windows 6.1 (Build 7600), 64-bit, the so-called Windows 7 Professional edition.

Maybe your browser has different assumptions about which charset is valid
for encoding multipart form data... For instance, all modern browsers allow
customizing charsets based on the user's locale.

Lastly this behavior is well-defined in RFC 2616, as the
"Accept-Charset" HTTP header:

   "The Accept-Charset request-header field can be used to indicate what
   character sets are acceptable for the response. This field allows
   clients capable of understanding more comprehensive or special-
   purpose character sets to signal that capability to a server which is
   capable of representing documents in those character sets."

just my 2 cents while watching a boring hockey game... :-)
msg126065 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-12 02:07
Aha!

Found a page <http://htmlpurifier.org/docs/enduser-utf8.html#whyutf8-support> which links to another page <http://web.archive.org/web/20060427015200/ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html> that explains the behavior.

The synopsis is that browsers (all modern browsers) generally return form data in the same character encoding as the Form page itself was sent to the client.

I suspect this explains the differences between what Pierre and I are reporting.  I suspect (but would appreciate confirmation from Pierre), that his web pages use 
<meta http-equiv="Content-Type" content="text/html; charset=CP-1252" />
or else do not use such a meta tag, and his server is configured (or defaults) to send HTTP headers:
Content-Type: text/html; charset=CP-1252

Whereas, I do know that all my web pages are coded in UTF-8, have no meta tags, and my CGI scripts are sending 
Content-Type: text/html; charset=UTF-8
for all served form pages... and thus getting back UTF-8 also, per the above explanation.

What does this mean for Python support for http.server and cgi?
Well, http.server, by default, sends Content-Type without charset, except for directory listings, where it supplies charset= the result of sys.getfilesystemencoding().  So it is up to META tags to define the coding, or for the browser to guess.  That's probably OK: for a single machine environment, it is likely that the data files are coded in the default file system encoding, and it is likely the browser will guess that.  But it quickly breaks when going to a multiple machine or internet environment with different default encodings on different machines.  So if using http.server in such an environment, it is necessary to inform the client of the page encoding using META tags, or generating the Content-Type: HTTP header in the CGI script (which latter is what I'm doing for the forms and data of interest).
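
A minimal sketch of generating that header from a CGI script; the function name content_type_header and the utf-8 default are assumptions for illustration. In a real script the bytes would be written to sys.stdout.buffer before the page body.

```python
def content_type_header(charset="utf-8"):
    """Build the CGI response header block announcing the page encoding.

    Hypothetical helper: returns bytes (header line plus the blank line
    that ends the headers) ready to be written to the binary stdout layer.
    """
    header = "Content-Type: text/html; charset={}\r\n\r\n".format(charset)
    return header.encode("ascii")


header_bytes = content_type_header("utf-8")
```

Since the browser returns form data in the encoding of the page, the same charset value is the natural one to use when decoding the submitted fields.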

What does it mean for cgi.py's FieldStorage?

Well, use of the default encoding can work in the single machine environment... so I guess there would be worse things than doing so, as Pierre has been doing.  But clearly, that isn't the complete solution.  The new parameter he proposes to FieldStorage can be used, if the application can properly determine the likeliest encoding for the form data, before calling it.

On a single machine system, that could be the default, as mentioned above.  On a single application web server, it could be some constant encoding used for all pages (like I use UTF-8 for all my pages).  For a multiple application web server, as long as each application uses a consistent encoding, that application could properly guess the encoding to pass to FieldStorage.  Or, if the application wishes to allow multiple encodings, as long as it can keep track of them, and use the right ones at the right time, it is welcome to.

How does this affect email?  Not at all, directly.

How does this affect cgi.py's use of email?
It means that cgi.py cannot use BytesFeedParser, in spite of what the standards say, so Pierre's approach of predecoding the headers is the correct one, since email doesn't offer an encoding parameter.  Since email doesn't support disk storage for file uploads, but buffers everything in memory, it means that cgi.py can only pass headers to FeedParser, so has to detect end-of-headers itself, since email provides no feedback to indicate that end-of-headers was reached, and that means that cgi.py must parse the MIME parts itself, so it can put the large parts on disk. It means that the email package provides extremely little value to cgi.py, and since web browsers and multipart/form-data use simple subsets of the full power of RFC822 headers, email could be replaced with the use of its existing parse_header function, but that should be deprecated.  A copy could be moved inside FieldStorage class and fixed a bit.
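
To make the "headers only" use of email concrete, here is a sketch that feeds one already-decoded part header block to the text parser and reads the parameters back. The helper name and the sample header (with an ASCII filename) are made up for the example; it does not show how cgi.py itself does it.

```python
import email


def disposition_params(header_block):
    """Parse a decoded part header block and return (name, filename).

    Hypothetical helper: only the headers are given to the email
    package, the part body is expected to be handled elsewhere.
    """
    msg = email.message_from_string(header_block)
    name = msg.get_param("name", header="content-disposition")
    return name, msg.get_filename()


headers = (
    'Content-Disposition: form-data; name="pre"; filename="notes.html"\n'
    "Content-Type: text/html\n"
    "\n"
)
field_name, file_name = disposition_params(headers)
```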
msg126066 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-12 02:14
I notice the version on this issue is Python 3.3, but it affects 3.2 and 3.1 as well.  While I would like to see it fixed for 3.2, perhaps it is too late for that, with rc1 coming up this weekend?

Could at least the non-deprecated parse functions be deprecated in 3.2, so that they could be removed in 3.3?  Or should we continue to support them?
msg126075 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-12 06:07
Pierre,
I applied your patch to my local copy of cgi.py for my installation of 3.2, and have been testing.  Lots of things work great!

My earlier comment regarding make_file seems to be relevant.  Files that are not binary should have an encoding.  Likely you removed the encoding because it was a hard-coded UTF-8 and that didn't work for you, with your default encoding of cp-1252.  However, now that I am passing in UTF-8 via the stream_encoding parameter, because that is what matches my form-data, I get an error that cp-1252 (apparently also my default encoding, except for console stuff which is 437) cannot encode \u0163.  So I think the encoding parameter should be added back in, but the value used should be the stream_encoding parameter.  You might also turn around the test on self.filename:

        import tempfile
        if self.filename:
            return tempfile.TemporaryFile("wb+")
        else:
            return tempfile.TemporaryFile("w+",
                                          encoding=self.stream_encoding,
                                          newline="\n")

One of my tests used a large textarea and a short file.  I was surprised to see that the file was not stored as a file, but the textarea was.  I guess that is due to the code in read_single that checks length rather than filename to decide whether it should be stored in a file from the get-go.  It seems that this behaviour, while probably more efficient than actually creating a file, might be surprising to folks overriding make_file so that they could directly store the data in the final destination file, instead of copying it later.  The documented semantics for make_file do not state that it is only called if there is more than 1000 bytes of data, or that the form_data item headers contain a CONTENT-LENGTH header (which never seems to happen).  Indeed, I found a comment on StackOverflow where someone had been surprised that small files did not have make_file called on them.
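
A standalone sketch of the make_file variant suggested above, written as a free function so it can be tried outside FieldStorage. The name open_part_file and its arguments are hypothetical.

```python
import tempfile


def open_part_file(is_upload, stream_encoding="utf-8"):
    """Open a temporary file for one form part.

    Hypothetical stand-in for make_file: uploads are stored as bytes,
    other fields as text encoded with the form's stream encoding.
    """
    if is_upload:
        return tempfile.TemporaryFile("wb+")
    return tempfile.TemporaryFile("w+", encoding=stream_encoding, newline="\n")


# A text field containing U+0163, the character cp-1252 could not encode.
text_part = open_part_file(False, "utf-8")
text_part.write("fo\u00f6\u0163")
text_part.seek(0)
text_value = text_part.read()
text_part.close()
```

With an explicit encoding the temporary file no longer depends on the platform default, which is what caused the \u0163 failure described above.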
msg126117 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-12 17:42
Yes, the imminence of RC1 makes it really doubtful that this can be fixed in 3.2.  Whether or not it can be fixed in 3.2.1 will depend on the nature of the fix.  If it changes behavior such that currently working uses of FieldStorage (that don't deal with binary files) break, then the fix can't be backported.  Likewise if the API changes, the change can't be backported.

Doing the deprecation sounds like a good idea.  Would you be willing to propose a patch with tests?  I'm pretty busy this week and I doubt I can do anything myself about it before the weekend.

If this cannot be fixed in a way that is backward compatible (and even if it can), in 3.3 we also have the option of adding features to the email package to better support the use cases in HTTP if that makes sense.  Certainly the external file support is something that email needs for itself, so it would be nice to add that in 3.3.
msg126124 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-12 18:49
I'd be willing to propose such a patch and tests, but I haven't a clue how, other than starting by reading the contributor document... I was putting off learning the process until hg conversion, not wanting to learn an old process for a few months :(  And I've never written an official Python test, or learned how to use the test modules, etc.  So that's a pretty steep curve for the 2 days remaining.

Due to the way that browsers actually work, vs. how the standards are written, it seems necessary to add the optional  stream_encoding parameter.  The limit parameter Pierre is proposing is also a good check against improperly formed inputs.  So there are new, optional parameters to the FieldStorage constructor.

Without these fixes, though, cgi.py continues to be totally useless for file uploads, so not releasing this in 3.2 makes 3.2 continue to be useless as a basis for web applications.  I have no idea if there is a timeframe for 3.3, nor what it is.  I'm not sure if, or how many, web frameworks use cgi.py vs. replacing the functionality.  Seems at least some replace it, so they may not suffer in porting to 3.x (except internally, grappling with the same issues).

Happily, Pierre's latest patch needs only one more fix, per my (non-Python-standard) testing.  Between his testing in one environment using default code pages, and mine using UTF-8, the bases seem to be pretty well covered for testing... certainly more than the previous default tests.  I think you contributed some tests, I haven't tried them, but it seems Pierre has, as he has a patch for that also (which I haven't tried).
msg126140 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-12 21:15
Many thoughts and tests after...

Glenn, both of us were wrong : the encoding to use in FieldStorage is neither latin-1 nor sys.stdin.encoding. I tested form fields with characters whose utf-8 encoding has bytes that map to undefined in cp1252, and the calls to the decode() method with sys.stdin.encoding failed
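
A minimal sketch of that failure, for illustration only. The sample character is an assumption chosen for the example, not one taken from the tests above; any character whose utf-8 bytes include 0x81, 0x8D, 0x8F, 0x90 or 0x9D behaves the same way with Python's cp1252 codec.

```python
# U+00DD (capital Y with acute) encodes in utf-8 as the two bytes C3 9D.
data = "\u00dd".encode("utf-8")

# Byte 0x9D has no mapping in Python's cp1252 codec, so decoding fails.
try:
    data.decode("cp1252")
    cp1252_ok = True
except UnicodeDecodeError:
    cp1252_ok = False

# latin-1 maps every byte, so it never fails, but the result is mojibake.
as_latin1 = data.decode("latin-1")

# Decoding with the encoding the browser actually used gives the right text.
as_utf8 = data.decode("utf-8")

# ascii() keeps the output printable whatever the console encoding is.
print(cp1252_ok, ascii(as_latin1), ascii(as_utf8))
```

So neither a platform code page nor latin-1 is a safe guess: the first can raise, the second silently produces the wrong text.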

The encoding used by the browser is defined in the Content-Type meta tag, or the content-type header ; if not, the default seems to vary for different browsers. So it's definitely better to define it

The argument stream_encoding used in FieldStorage *must* be this encoding ; in this version, it is set to utf-8 by default

But this raises another problem, when the CGI script has to print the data received. The built-in print() function encodes the string with sys.stdout.encoding, and this will fail if the string can't be encoded with it. It is the case on my PC, where sys.stdout.encoding is cp1252 : it can't handle Arabic or Chinese characters
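
The output-side problem can be reproduced without a real console. This is only a sketch: an in-memory stream encoded as cp1252 stands in for sys.stdout, and the Arabic sample text is an assumption made for the example.

```python
import io

text = "\u0645\u0631\u062d\u0628\u0627"  # Arabic sample text (illustrative)

# A text stream encoded as cp1252, standing in for sys.stdout on such a PC.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="cp1252")

try:
    print(text, file=fake_stdout)
    fake_stdout.flush()
    printed = True
except UnicodeEncodeError:
    # cp1252 has no code points for Arabic letters.
    printed = False

# Encoding explicitly and writing bytes does not depend on the stream encoding.
raw2 = io.BytesIO()
raw2.write(text.encode("utf-8"))
```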

The solution I have tried is to pass another argument, charset, to the FieldStorage constructor, defaulting to utf-8. It must be the same as the charset defined in the CGI script in the Content-Type header

FieldStorage uses this argument to override the built-in print() function :
- flush the text layer of sys.stdout, in case calls to print() have been made before calling FieldStorage
- get the binary layer of stdout : out = sys.stdout.detach()
- define a function _print this way:
    def _print(*strings):
        for item in strings:
            out.write(str(item).encode(charset))
        out.write(b'\r\n')
- override print() :
    import builtins
    builtins.print = _print

The function print() in the CGI script now sends the strings encoded with "charset" to the binary layer of sys.stdout. All the tests I made with Arabic or Chinese input fields, or file names, succeed when using this patch ; so do test_cgi and cgi_test (slightly modified)
msg126145 - (view) Author: Peter Kleiweg (pebbe) Date: 2011-01-12 22:19
Pierre Quentel wrote:
- get the binary layer of stdout : out = sys.stdout.detach()

You can't do that! That makes sys.stdout unavailable to the program that is importing the cgi module.

Cgi should access and process sys.stdin only, as binary by means of sys.stdin.detach()

The cgi module is used to handle form data and uploaded files. But the resulting page is usually written by the main program or another module, using sys.stdout
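
The effect of detach() on the remaining text wrapper can be seen with an in-memory stream. This is a sketch only, with a TextIOWrapper over BytesIO standing in for sys.stdout.

```python
import io

raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="utf-8")  # stands in for sys.stdout

# detach() returns the binary layer and leaves the wrapper unusable.
binary = stream.detach()
binary.write(b"written by cgi\n")

# Any later text write by the main program now fails.
try:
    stream.write("written by the main program\n")
    still_usable = True
except ValueError:
    still_usable = False
```

Reading the binary layer through the buffer attribute instead of detach() leaves the text wrapper intact for other code.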
msg126152 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-13 00:11
Pierre said:
The encoding used by the browser is defined in the Content-Type meta tag, or the content-type header ; if not, the default seems to vary for different browsers. So it's definitely better to define it

The argument stream_encoding used in FieldStorage *must* be this encoding

I say:
I agree it is better to define it.  I think you just said the same thing that the page I linked to said, I might not have conveyed that correctly in my paraphrasing.  I assume you are talking about the charset of the Content-Type of the form page itself, as served to the browser, as the browser, sadly, doesn't send that charset back with the form data.

Pierre says:
But this raises another problem, when the CGI script has to print the data received. The built-in print() function encodes the string with sys.stdout.encoding, and this will fail if the string can't be encoded with it. It is the case on my PC, where sys.stdout.encoding is cp1252 : it can't handle Arabic or Chinese characters

I say:
I don't think there is any need to override print, especially not builtins.print.  It is still true that the HTTP data stream is and should be treated as a binary stream.  So the script author is responsible for creating such a binary stream.

The FieldStorage class does not use the print method, so it seems inappropriate to add a parameter to its constructor to create a print method that it doesn't use.

For the convenience of CGI script authors, it would be nice if CGI provided access to the output stream in a useful way... and I agree that because the generation of an output page comes complete with its own encoding, that the output stream encoding parameter should be separate from the stream_encoding parameter required for FieldStorage.

A separate, new function or class for doing that seems appropriate, possibly included in cgi.py, but not in FieldStorage.  Message 125100 in this issue describes a class IOMix that I wrote and use for such; codifying it by including it in cgi.py would be fine by me... I've been using it quite successfully for some months now.

The last line of Message 125100 may be true, perhaps a few more methods should be added.  However, print is not one of them.  I think you'll be pleasantly surprised to discover (as I was, after writing that line) that the builtins.print converts its parameters to str, and writes to stdout, assuming that stdout will do the appropriate encoding.  The class IOMix will, in fact, do that appropriate encoding (given an appropriate parameter to its initialization).  Perhaps for CGI, a convenience function could be added to IOMix to include the last two code lines after IOMix in the prior message:

        @staticmethod
        def setup( encoding="UTF-8"):
            sys.stdout = IOMix( sys.stdout, encoding )
            sys.stderr = IOMix( sys.stderr, encoding )

Note that IOMix allows the users choice of output stream encoding, applies it to both stdout and stderr, which both need it, and also allows the user to generate binary directly (if sending back a file, for example), as both bytes and str are accepted.
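
For readers without Message 125100 at hand, the idea can be sketched roughly as follows. This is not the IOMix class from that message: the class name IOMixSketch, its attributes and its methods are assumptions written for illustration, and the real class may differ.

```python
import io

class IOMixSketch:
    """Hypothetical stand-in for IOMix: accepts str and bytes on one stream."""

    def __init__(self, text_stream, encoding="UTF-8"):
        text_stream.flush()               # push out anything already printed
        self._text = text_stream          # keep the wrapper alive
        self.buffer = text_stream.buffer  # binary layer that receives all output
        self.encoding = encoding

    def write(self, data):
        # str is encoded with the chosen output encoding, bytes pass through.
        if isinstance(data, str):
            data = data.encode(self.encoding)
        return self.buffer.write(data)

    def flush(self):
        self.buffer.flush()

# An in-memory cp1252 stream stands in for sys.stdout here.
raw = io.BytesIO()
out = IOMixSketch(io.TextIOWrapper(raw, encoding="cp1252"), "UTF-8")
print("caf\u00e9", file=out)  # print() hands str objects to write()
out.write(b"\x00\x01")        # binary data goes through unchanged
```

Because print() only calls write() with str arguments, it works unmodified on such an object, which is why no print override is needed.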

print can be used with a file= parameter in 3.x which your implementation doesn't permit, and which could be used to write to other files by a CGI script, so I really, really don't think we want to override builtins.print without the file= parameter, and specifically tying it to stdout.

My message 126075 still needs to be included in your next patch.
msg126160 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-13 08:39
I knew the builtins hack was terrible, thanks for the replies...

I changed cgi.py with Glenn's IOMix class, and included the changes in make_file(). The patch is attached to this message

Is it really too late to include it in 3.2 ? Missing a working cgi module is really a problem for a wider 3.x adoption
msg126161 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-13 08:40
diff for the updated version of test_cgi.py, compatible with cgi.py
msg126162 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-13 08:41
zip file with the updated cgi_test.py and associated files
msg126164 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-13 09:43
Pierre,
Looking better.
I see you've retained the charset parameter, but do not pass it through to nested calls of FieldStorage.  This is good, because it wouldn't work if you did.  However, purists might still complain that FieldStorage should only ever use and affect stdin... however, since I'm a pragmatist, I'll note that the default charset value is None, which means it does nothing to stdout or stderr by default, and be content with that.

I've run a couple of basic tests and it works, and the rest of the code hasn't changed since your last iteration, but I'll test it again after I get some sleep.

I'll try setting the Version here back to 3.2 -- it is a bug in 3.2 -- and see if some committer will take pity on web developers that use CGI, and are hoping to be able to use Python 3.2 someday.
msg126165 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-13 09:47
Small tip: To ease review, I recommend you work from a checkout of the
Subversion py3k branch, using svn add if you have new files and then
producing one svn diff of the whole checkout.  It’s easier than looking
at multiple files, even more so if they’re hidden in a zip.  We
programmers like text :)

You can also remove outdated patches from this page to make it clearer.
msg126167 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-13 13:23
Ok Eric, thanks for the tips

I attach the diff for the 2 modified modules (cgi.py and test_cgi.py). For the other tests, they are not in the branch and there are many test files so I leave the zip file

I removed outdated diffs
msg126173 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-13 14:31
Getting it into 3.2 would be a release manager call, so I've set it to release blocker so Georg can make the call.  My opinion is that while I would *really* like to see this fixed in 3.2, the changes really should have a thorough *design* review as well as a code review.

The argument for putting it in would be that it is broken as is (at least for binary file upload, possibly in other ways as well), and if we can get agreement on the API changes, we can fix any remaining bugs in 3.2.1.  However, making API changes at this point (post-beta) requires a significant exception to our normal development rules, and I don't like doing things this rushed and last minute.  But I also don't like the thought of having FieldStorage be broken in 3.2.

Georg, I'm really busy this week, and don't have time to do a review, unfortunately.  If you think it worth considering putting it in, I can try to take a look at the API changes tomorrow, but unfortunately can make no promise to do so.  Hopefully others can, if needed.
msg126175 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-13 15:38
Ok, thanks. Here is a summary of the API changes :

- the argument fp passed to FieldStorage is either an instance of (a subclass of) io.TextIOBase with a "buffer" attribute for the underlying binary layer (thus, it can't be a StringIO instance) ; or an object with read() and readline() methods that return bytes
Defaults to sys.stdin

- 2 additional arguments can be passed to the FieldStorage constructor :
. stream_encoding : the encoding used by the user agent to encode submitted data. It must be the same as the content-type of the HTML page where the form stands. Defaults to utf-8
. charset : the encoding used by the CGI script (the one used by the print() function to encode and send to sys.stdout). It must be the same as the charset in the content-type header sent by this script. Defaults to None, in which case the default encoding of sys.stdout is used

- the only change in the object returned by FieldStorage() is that, if a field represents a file (its argument filename is not None), the read() method on this field returns bytes, and its attribute "value" is a bytestring, not a string
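
The rule for the fp argument can be sketched as below. The helper name binary_layer is hypothetical and does not appear in the patch; the sketch only illustrates which objects qualify under the description above.

```python
import io

def binary_layer(fp):
    """Hypothetical helper: return an object whose read() yields bytes."""
    if isinstance(fp, io.TextIOBase):
        # A text stream is acceptable only if it exposes its binary layer.
        # io.StringIO has no "buffer" attribute, so it is rejected here.
        if not hasattr(fp, "buffer"):
            raise TypeError(
                "text stream without a binary layer: %s" % type(fp).__name__)
        return fp.buffer
    # Otherwise assume read() and readline() already return bytes.
    return fp

body = b"name=value"
wrapped = io.TextIOWrapper(io.BytesIO(body), encoding="utf-8")
from_text = binary_layer(wrapped).read()          # via the "buffer" attribute
from_bytes = binary_layer(io.BytesIO(body)).read()  # already a bytes reader
```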
msg126199 - (view) Author: Andy Harrington (andyharrington) Date: 2011-01-13 20:47
I found a similar issue.  If you want more simple files demonstrating the issue, I have attached some.  If I start my localCGIServer.py, then I can use adder.html fine (uses get), but with adderpost.html (uses post) the cgi action file, adder.cgi (that worked fine with the get version) hangs.
msg126205 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-13 22:03
Ok, there are 10+ files attached, 20+ comments, no up-to-date patch. It's really too late for 3.2 IMO.
msg126207 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-13 22:09
What do you mean, no up-to-date patch ? cgi_32.patch is up to date, the API changes are documented, the unittests work, what else do you want ?
msg126210 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-13 22:14
> What do you mean, no up-to-date patch ? cgi_32.patch is up to date, the API
> changes are documented, the unittests work, what else do you want ?

The O_BINARY stuff looks obsolete to me.
msg126212 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-13 22:27
The O_BINARY stuff was probably necessary because issue 10841 is not yet in the build Pierre was using?  I agree it is not necessary with the fix for that issue, but neither does it hurt.

It could be stripped out, if you think that is best, Antoine.

But there is a working patch.
msg126214 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-13 22:39
Can one person please

1) Sum up the discussion and its outcome briefly

2) Remove all patches and replace them with one diff with docs, tests and code (even if you have new files, you don’t have to put them in a zip, use svn add and they will show up in the svn diff, which is really easier to review)
msg126215 - (view) Author: Etienne Robillard (erob) Date: 2011-01-13 22:43
+1
msg126219 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-13 23:54
I tested cgi_32.patch on Windows with Apache:
 - a test with a binary file works: I get a binary file instead of a text file
 - a test with a non-ASCII character (a\xe9b) works: the text is correctly decoded

I used the test script from full_source_and_error.zip.

Comments on cgi_32.patch:

 - I don't understand why FieldStorage changes sys.stdout and sys.stderr (see remarks about IOMix above): please remove the charset argument (it is also confusing to have two encoding arguments). it should be done somewhere else
 - please remove the O_BINARY hack: the patch is for Python 3.2 and I closed issue #10841. If you would like a backport, another patch should be written later
 - "encoding = 'latin-1' # ?": write a real comment or remove it
 - 'self.fp.read(...) # bytes': you should add a test on the type if you are not sure that fp.read() gives bytes
 - "file: the file(-like) object from which you can read the data *as bytes*": you should mention that TextIOWrapper is also tolerated (accepted?)
 - you may set fp directly to sys.stdin.buffer (instead of sys.stdin) if fp is None (it will be easier after removing the O_BINARY thing)
 - the patch adds a tab in an empty line, please don't do that :-)
 - you should add a (private?) attribute to FieldStorage to decide if it works on bytes or unicode, instead of using "self.filename is not None" test (eg. self._use_bytes = (self.filename is not None))
 - i don't like the idea of having a generic self.__write() method supporting bytes and unicode. I would prefer two methods, eg. self.__write_text() and self.__write_binary() (they can share a third private method)
 - i don't like "stream_encoding" name: what is the "stream" here? do you process a "file", a "string" or a "stream"? why not just "self.encoding"?
 - "import email.parser,email.feedparser" one import is useless here. I prefer "from email.feedparser import FeedParser" because you get directly a ImportError if the symbol is missing. And it's already faster to get FeedParser instead of email.feedparser.FeedParser in a loop (dummy micro-optimization)
 - even I like the following change, please do it in a separated patch:
-            if type(value) is type([]):
+            if isinstance(value,list):


I really don't like the IOMix thing:

 - sys.stdout.write() should not accept bytes
 - FieldStorage should not replace sys.stdout and sys.stderr: if you want to set the encoding of these files, set PYTHONIOENCODING environment variable before running your program (it changes also the encoding of sys.stdin)
 - IOMix should not accept bytes *and* unicode. It's better to have an explicit API like stdout.write('unicode') and stdout.buffer.write(b'bytes')


Most parts of the patch are correct and fix real bugs. Since cgi is broken currently (eg. it doesn't handle binary files correctly), anything making the situation better would be nice.

I vote +0 to commit the patch now (if the release manager agrees), and +1 if all of my remarks are fixed.
msg126222 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-14 01:52
Victor, thanks for your comments, and interest in this bug.  Other than the existence of the charset parameter, and whether or not to include IOMix, I think all of the others could be fixed later, and do not hurt at present.  So I will just comment on those two comments.

I would prefer to see FieldStorage not have the charset attribute also, but I don't have the practice to produce an alternate patch, and I can see that it would be a convenience for some CGI scripts to specify that parameter, and have one API call do all the work necessary to adjust the IO streams, and read all the parameters, and then the rest of the logic of the web app can follow.  Personally, I adjust the stdout/stderr streams earlier in my scripts, and only optionally call FieldStorage, if I determine the request needs such.

I've been using IOMix for some months (I have a version for both Python 2 and 3), and it solves a real problem in generating web page data streams... the data stream should be bytes, but a lot of the data is manipulated using str, which would then need to be encoded.  The default encoding of stdout is usually wrong, so must somehow be changed.  And when you have chunks of bytes (in my experience usually from a database or file) to copy to the output stream, if your prior write was str, and then you write bytes to sys.stdout.buffer, you have to also remember to flush the TextIOWrapper first.  IOMix provides a convenient solution to all these problems, doing the flushing for you automatically, and just taking what comes and doing the right thing.  If I hadn't already invented IOMix to help write web pages, I would want to :)
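
The flushing pitfall can be shown with an in-memory stream standing in for sys.stdout. This is a sketch under that assumption; the function name render and the sample header are made up for the example.

```python
import io

def render(flush_before_bytes):
    """Write a text header, then a binary body, and return the raw output."""
    raw = io.BytesIO()
    out = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
    out.write("Content-Type: image/png\n\n")  # str goes to the text layer
    if flush_before_bytes:
        out.flush()                            # push pending text down first
    out.buffer.write(b"\x89PNG")               # bytes go to the binary layer
    out.flush()
    return raw.getvalue()

unflushed = render(False)  # body lands before the header
flushed = render(True)     # header first, as intended
```

Without the flush, the short text write is still pending in the text layer when the bytes are written, so the body overtakes the header.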
msg126232 - (view) Author: Graham Dumpleton (grahamd) Date: 2011-01-14 06:08
FWIW, keep in mind that cgi.FieldStorage is also quite often used in WSGI scripts in arbitrary WSGI servers which have got nothing to do with CGI. Having cgi.FieldStorage muck around with stdout/stderr under WSGI, even where using a CGI/WSGI bridge, would potentially be a bad thing to do, especially in embedded systems like mod_wsgi where sys.stdout and sys.stderr are replaced with file like objects that map onto Apache error logging. Even in non embedded systems, you could very well screw up any application logging done via stdout/stderr and break the application.

So, the default or common code paths should never play with sys.stdout or sys.stderr. It is already a PITA that the implementation falls back to using sys.argv when QUERY_STRING isn't defined which also could produce strange results under a WSGI server. In other words, please don't go adding any more code which makes the wrong assumption that this is only used in CGI scripts.
msg126233 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-14 06:17
Graham, Thanks for your comments.  Fortunately, if the new charset parameter is not supplied, no mucking with stdout or stderr is done, which is the only reason I cannot argue strongly against the feature, which I would have implemented as a separate API... it doesn't get in the way if you don't use it.

I would be happy to see the argv code removed, but it has been there longer than I have been a Python user, so I just live with it ... and don't pass arguments to my CGI scripts anyway.  I've assumed that is some sort of a debug feature, but I also saw some code in the HTTPCGIServer and http.server that apparently, on some platforms, actually do pass parameters to CGI on the command lines.  I would be happy to see that code removed too, but it also predates my Python experience.  And no signs of "if debug:" by either of them!
msg126242 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-14 08:23
@Victor

Thanks for the comments

"- I don't understand why FieldStorage changes sys.stdout and sys.stderr (see remarks about IOMix above): please remove the charset argument (it is also confusing to have two encoding arguments). it should be done somewhere else"

done

"please remove the O_BINARY hack: the patch is for Python 3.2 and I closed issue #10841. If you would like a backport, another patch should be written later"

done

""encoding = 'latin-1' # ?": write a real comment or remove it"

I removed this part

"'self.fp.read(...) # bytes': you should add a test on the type if you are not sure that fp.read() gives bytes"

added tests in read_urlencoded(), read_multi() and read_binary()

 "file: the file(-like) object from which you can read the data *as bytes*": you should mention that TextIOWrapper is also tolerated (accepted?)"

not done : here "file" is not the argument passed to the FieldStorage constructor, but the attribute of values returned from calls to FieldStorage. In the new implementation, its read() method always returns bytes

"you may set fp directly to sys.stdin.buffer (instead of sys.stdin) if fp is None (it will be easier after removing the O_BINARY thing)"
 
done
 
 "the patch adds a tab in an empty line, please don't do that :-)"
 
done (hopefully :-)
 
"you should add a (private?) attribute to FieldStorage to decide if it works on bytes or unicode, instead of using "self.filename is not None" test (eg. self._use_bytes = (self.filename is not None)"

done

 "i don't like the idea of having a generic self.__write() method supporting bytes and unicode. it would prefer two methods, eg. self.__write_text() and self.__write_binary() (they can share a third private method)"
 
not done, the argument of __write is always bytes
 
"i don't like "stream_encoding" name: what is the "stream" here? do you process a "file", a "string" or a "stream"? why not just "self.encoding"?"

done

 - "import email.parser,email.feedparser" one import is useless here. I prefer "from email.feedparser import FeedParser" because you get directly a ImportError if the symbol is missing. And it's already faster to get FeedParser instead of email.feedparser.FeedParser in a loop (dummy micro-optimization)

done

" even I like the following change, please do it in a separated patch:
-            if type(value) is type([]):
+            if isinstance(value,list):"

not done

"I really don't like the IOMix thing:"

removed

"I vote +0 to commit the patch now (if the release manager agrees), and +1 if all of my remarks are fixed."

should be close to +0.8 now ;-)
msg126249 - (view) Author: Etienne Robillard (erob) Date: 2011-01-14 09:25
+1

thanks for this input. I agree for the most part. However if the io
semantics in python 3 is radically different than on python 2, I could
have expected that WSGI scripts would similarly depend on a newer
type of file descriptor access using the ``sys`` module.

cheers!
msg126251 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-14 10:04
Pierre, Thank you for the new patch, with the philosophy of "it's broke, so let's produce something the committers like to get it fixed".

I see you overlooked removing the second use of O_BINARY.  Locally, I removed that also, and tested your newest patch, and it still functions great for me.
msg126253 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-14 10:48
Glenn, you read my mind ;-)

Thanks for mentioning the O_BINARY thing. New (last !) patch attached
msg126256 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 13:12
r87996+r87997 add encoding and errors arguments to parse_qs() and parse_qsl() of urllib.parse. They are needed to correctly decode the %XX syntax in cgi.
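
A short illustration of the two new arguments. The query string is an assumption made for the example: a form served as latin-1, where the browser percent-encodes "é" as %E9.

```python
from urllib.parse import parse_qs, parse_qsl

query = "name=caf%E9"  # "é" percent-encoded from latin-1 (illustrative input)

# Default: utf-8 with errors='replace', so the lone byte E9 becomes U+FFFD.
default = parse_qs(query)

# With the right encoding the value is decoded correctly.
latin1 = parse_qs(query, encoding="latin-1")

# errors='strict' turns the bad guess into an exception instead.
try:
    parse_qsl(query, encoding="utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```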

r87998 is the patch on the cgi module.

Changes with cgi_32.patch:

 * Use TextIOWrapper instead of TextIOBase, because TextIOBase has no buffer
   attribute
 * typo in a docstring: "it must must" => "must"
 * (docstring) default: sys.stdin => default: sys.stdin.buffer
 * PEP 8: hasattr(a,b) => hasattr(a, b) (same for isinstance) and
   "encoding = 'utf-8'" => "encoding='utf-8'" (in the argument list)
 * "xxx.decode(...) # str": remove useless # str comment. decode() always give
   unicode in Python 3 (same change for ".encode() # bytes")
 * Rename "next" variables to "next_boundary" because next is a builtin
   function in Python 3 (unrelated change).
 * FieldStorage.innerboundary and FieldStorage.outerboundary are bytes objects:
   encode innerboundary in the constructor, and raise an error if outerboundary
   is not a bytes object
 * Rename _use_bytes to _binary_file
 * isinstance(bytes) test: write the type, not the value, in the error message
 * Replace line[:2] == b'--' by line.startswith(b'--'), and then replace
   line.strip() by line.rstrip()
 * test_fieldstorage_multipart() uses ASCII (and specifies the encoding to FieldStorage)
 * add FieldStorage.errors attribute: pass it to parse_qsl()
 * add errors attribute to FieldStorage: same default value as urllib.parse.unquote(): 'replace'
 * parse(): pass encoding argument to parse_qs()
 * FieldStorage: pass encoding and errors arguments to parse_qsl()
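
The bytes-based boundary handling in the list above can be sketched like this. The function name classify and its return strings are hypothetical; only the startswith()/rstrip() comparison and the bytes type check follow the changes described.

```python
def classify(line, outerboundary):
    """Hypothetical helper: tell boundary lines from ordinary body lines."""
    if not isinstance(outerboundary, bytes):
        # Report the type, not the value, in the error message.
        raise TypeError(
            "outerboundary must be bytes, not %s" % type(outerboundary).__name__)
    next_boundary = b"--" + outerboundary
    last_boundary = next_boundary + b"--"
    if line.startswith(b"--"):
        # Only trailing whitespace (the CRLF) needs to be removed.
        stripped = line.rstrip()
        if stripped == next_boundary:
            return "next part"
        if stripped == last_boundary:
            return "end of form"
    return "data"

kinds = [
    classify(b"--XYZ\r\n", b"XYZ"),
    classify(b"--XYZ--\r\n", b"XYZ"),
    classify(b"--not a boundary\r\n", b"XYZ"),
    classify(b"\x89PNG binary payload", b"XYZ"),
]
```

Comparing bytes with bytes means file contents never have to be decoded just to find the part separators.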

Because of the change from TextIOBase to TextIOWrapper, I patched the docstring:
---
        fp              : file pointer; default: sys.stdin.buffer
            (not used when the request method is GET)
            Can be :
            1. an instance of (a subclass of) TextIOWrapper, in this case it
            must provide an attribute "buffer" = the binary layer that returns
            bytes from its read() method, and preferably an attribute
            "encoding" (defaults to latin-1)
            2. an object whose read() and readline() methods return bytes
---
becomes
---
        fp              : file pointer; default: sys.stdin.buffer
            (not used when the request method is GET)
            Can be :
            1. a TextIOWrapper object
            2. an object whose read() and readline() methods return bytes
---

Replace "type(value) is type([])" is done in another commit: r87999.

I consider that the work on this issue is done, and so I close it. If I am wrong, explain why and reopen the issue.

Please test the cgi as much as possible before Python 3.2 final: reopen the issue if it doesn't work.
msg126257 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 13:14
Oh, I forgot to credit the author(s): who wrote the patch?
msg126262 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-14 14:07
Thanks a lot Victor !

I wrote the patch : Pierre Quentel (pierre.quentel@gmail.com) with many
inputs by Glenn Linderman

2011/1/14 STINNER Victor <report@bugs.python.org>

>
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
>
> Oh, I forgot to credit the author(s): who wrote the patch?
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue4953>
> _______________________________________
>
msg126264 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 14:15
TODO: Add more tests to test_cgi. What is the latest patch for test_cgi?
msg126266 - (view) Author: Pierre Quentel (quentel) * Date: 2011-01-14 14:41
My latest patch for test_cgi is in cgi_32.patch

I will try to add more tests later
msg126267 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 14:57
haypo>> What is the latest patch for test_cgi?
quentel> My latest patch for test_cgi is in cgi_32.patch

Ok, but cgi_32.patch doesn't add any test. It only adapts existing tests for your other changes.

I remove cgi_32.patch because it was committed.
msg126268 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 15:00
Remove cgi_plus_tests.diff: it looks to be an old version of cgi_32.patch.

@r.david.murray: Did you write cgi_plus_tests.diff, or is it based on the work of Pierre Quentel?
msg126269 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 15:03
Remove tmpy44zj7.html and tmpav1vve.html: a similar file is included in full_source_and_error.zip.
msg126289 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-14 19:11
Victor: we normally leave the patch file that was committed attached to the issue for future reference.

The _plus_tests file was just the original patch plus the existing cgi tests adjusted to pass in bytes instead of strings to cgi, if I recall correctly.  No original code of mine in either part :)
msg126293 - (view) Author: Glenn Linderman (v+python) * Date: 2011-01-14 19:41
Thanks to Pierre for producing patch after patch and testing testing testing, and to Victor for committing it, as well as others that contributed in smaller ways, as I tried to.  I look forward to 3.2 rc1 so I can discard all my temporary patched copies of cgi.py
msg126301 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 21:36
Because I'm unable to read the whole history and analyze each file attached to this issue, I opened #10911 to ask to write more tests for the cgi module.
msg126307 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-15 01:19
On Friday 14 January 2011 at 19:11 +0000, R. David Murray wrote:
> Victor: we normally leave the patch file that was committed attached
> to the issue for future reference.

Sorry, but there were too many files. I was trying to figure out if
there is something useful in the files.

> The _plus_tests file was just the original patch plus the existing cgi
> tests adjusted to pass in bytes instead of strings to cgi, if I recall
> correctly.  Nor original code of mine in either part :)

Ok.
History
Date User Action Args
2022-04-11 14:56:44adminsetgithub: 49203
2011-01-15 01:19:29vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126307
2011-01-14 21:36:09vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126301
2011-01-14 19:41:02v+pythonsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126293
2011-01-14 19:39:10v+pythonsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
versions: + Python 3.2, - Python 3.3
2011-01-14 19:11:10r.david.murraysetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126289
2011-01-14 15:11:51pitrousetnosy: - pitrou
2011-01-14 15:03:00vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126269
2011-01-14 15:01:53vstinnersetfiles: - tmpy44zj7.html
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 15:01:46vstinnersetfiles: - tmpav1vve.html
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 15:01:34vstinnersetfiles: - cgi_plus_tests.diff
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 15:00:03vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126268
2011-01-14 14:58:10vstinnersetfiles: - cgi_32.patch
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 14:57:56vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126267
2011-01-14 14:41:19quentelsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126266
2011-01-14 14:39:23quentelsetfiles: - unnamed
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 14:15:22vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126264
2011-01-14 14:07:53quentelsetfiles: + unnamed
title: cgi module cannot handle POST with multipart/form-data in 3.x -> cgi module cannot handle POST with multipart/form-data in 3.x
messages: + msg126262
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 13:14:10vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126257
2011-01-14 13:12:27vstinnersetstatus: open -> closed
messages: + msg126256
resolution: fixed
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 10:48:28quentelsetfiles: - cgi_32.patch
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 10:48:19quentelsetfiles: + cgi_32.patch
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126253
2011-01-14 10:04:19v+pythonsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126251
2011-01-14 09:25:55erobsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126249
2011-01-14 08:25:21quentelsetfiles: - cgi_32.patch
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-14 08:23:43quentelsetfiles: + cgi_32.patch
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126242
2011-01-14 06:17:38v+pythonsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, grahamd, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126233
2011-01-14 06:08:08grahamdsetnosy: + grahamd
messages: + msg126232
2011-01-14 01:52:02v+pythonsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126222
2011-01-13 23:54:53vstinnersetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126219
2011-01-13 22:43:36erobsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126215
title: cgi module cannot handle POST with multipart/form-data in 3.x -> cgi module cannot handle POST with multipart/form-data in 3.x
2011-01-13 22:42:32eric.araujosetfiles: - unnamed
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 22:39:57eric.araujosetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126214
2011-01-13 22:27:14v+pythonsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126212
2011-01-13 22:14:55pitrousetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126210
2011-01-13 22:09:04quentelsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126207
2011-01-13 22:03:24pitrousetpriority: release blocker -> normal
versions: - Python 3.2
nosy: + pitrou
messages: + msg126205
stage: patch review -> needs patch
2011-01-13 20:56:31andyharringtonsetfiles: + adder.cgi
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 20:55:27andyharringtonsetfiles: + localCGIServer.py
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 20:54:02pitrousetnosy: - pitrou
2011-01-13 20:52:53andyharringtonsetfiles: + adderpost.html
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 20:52:07andyharringtonsetfiles: + adder.html
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 20:49:45andyharringtonsetfiles: - localCGIServer.py
nosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, andyharrington, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 20:47:02andyharringtonsetfiles: + localCGIServer.py
nosy: + andyharrington
messages: + msg126199
2011-01-13 15:38:33quentelsetnosy: barry, georg.brandl, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126175
2011-01-13 14:31:47r.david.murraysetpriority: high -> release blocker
nosy: + georg.brandl
messages: + msg126173
2011-01-13 13:23:08quentelsetfiles: + cgi_32.patch
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126167
2011-01-13 13:19:46quentelsetfiles: - cgi_diff.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:19:39quentelsetfiles: - test_cgi_20111013.diff
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:19:31quentelsetfiles: - cgi_20110113.diff
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:19:26quentelsetfiles: - cgi_diff_20110112.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:19:19quentelsetfiles: - cgi_diff_20110111.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:19:12quentelsetfiles: - cgi_tests.zip
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:18:58quentelsetfiles: - cgi_diff_20110109.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:18:50quentelsetfiles: - cgi_diff.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 13:18:43quentelsetfiles: - cgi_diff.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-13 09:47:16eric.araujosetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126165
title: cgi module cannot handle POST with multipart/form-data in 3.0 -> cgi module cannot handle POST with multipart/form-data in 3.x
2011-01-13 09:43:02v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126164
versions: + Python 3.2
2011-01-13 08:41:19quentelsetfiles: + cgi_tests.zip
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126162
2011-01-13 08:40:17quentelsetfiles: + test_cgi_20111013.diff
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126161
2011-01-13 08:39:39quentelsetfiles: + cgi_20110113.diff
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126160
2011-01-13 00:11:02v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126152
2011-01-12 22:19:50pebbesetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126145
2011-01-12 21:15:46quentelsetfiles: + cgi_diff_20110112.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126140
2011-01-12 18:49:58v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126124
2011-01-12 17:42:54r.david.murraysetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126117
2011-01-12 06:07:52v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126075
2011-01-12 02:14:46v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126066
2011-01-12 02:07:07v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126065
2011-01-12 00:53:58erobsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126062
2011-01-12 00:36:14v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126060
2011-01-11 21:03:56quentelsetfiles: + cgi_diff_20110111.txt
nosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg126035
2011-01-11 14:04:00tercero12setnosy: - tercero12
2011-01-11 10:40:03v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125994
2011-01-11 10:27:28v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125993
2011-01-11 10:06:51v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125992
2011-01-10 23:07:23v+pythonsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125952
2011-01-10 21:55:55quentelsetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125935
2011-01-10 21:28:55vstinnersetnosy: barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125931
2011-01-10 21:13:55gvanrossumsetnosy: - gvanrossum
2011-01-10 21:11:50r.david.murraysetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125930
2011-01-10 20:41:08v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125928
2011-01-10 20:30:07quentelsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125926
2011-01-10 19:56:25v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125921
2011-01-10 13:07:32erobsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125901
title: cgi module cannot handle POST with multipart/form-datain 3.0 -> cgi module cannot handle POST with multipart/form-data in 3.0
2011-01-10 13:03:49vstinnersetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125900
title: cgi module cannot handle POST with multipart/form-data in 3.0 -> cgi module cannot handle POST with multipart/form-datain 3.0
2011-01-10 10:23:42v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125893
2011-01-10 10:13:07v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125892
2011-01-10 09:31:52v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125886
2011-01-10 08:52:12v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125885
2011-01-09 12:28:01quentelsetfiles: + cgi_tests.zip
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125840
2011-01-09 12:26:18quentelsetfiles: + cgi_diff_20110109.txt
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125839
2011-01-07 19:50:05vstinnersetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125698
2011-01-07 09:50:09v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125637
2011-01-07 09:15:46vstinnersetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125633
2011-01-07 08:14:25quentelsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125629
2011-01-06 19:46:30v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125583
2011-01-06 17:33:12vstinnersetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125570
2011-01-06 16:26:24r.david.murraysetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125563
2011-01-06 15:40:33pitrousetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125558
2011-01-06 15:39:23r.david.murraysetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, vstinner, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125557
2011-01-06 14:46:03vstinnersetnosy: + vstinner
messages: + msg125556
2011-01-06 10:11:23v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125546
2011-01-06 10:02:23erobsetfiles: + unnamed
messages: + msg125543
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-06 09:28:28v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125533
2011-01-06 08:52:16erobsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125524
2011-01-06 02:12:46v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125501
2011-01-05 23:52:56pitrousetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125483
2011-01-05 23:41:25v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125481
2011-01-05 21:47:15quentelsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125474
2011-01-05 15:10:52pitrousetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125428
2011-01-05 14:32:42eric.araujosetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125426
2011-01-05 12:38:13r.david.murraysetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125419
2011-01-05 04:33:32v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125410
2011-01-05 03:38:37r.david.murraysetfiles: + cgitest-python3.py
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125403
2011-01-05 03:35:08r.david.murraysetfiles: + cgi_plus_tests.diff
messages: + msg125402
keywords: + patch
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-05 02:18:56vstinnersetnosy: + pitrou
2011-01-03 21:12:53quentelsetfiles: + cgi_diff.txt
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125241
2011-01-03 20:51:00v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125237
2011-01-03 20:45:27v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125235
2011-01-03 19:09:34r.david.murraysetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125217
2011-01-03 17:33:08v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125201
2011-01-03 15:52:18giampaolo.rodolasetnosy: - giampaolo.rodola
2011-01-03 15:33:07erobsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125181
2011-01-03 14:45:13r.david.murraysetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125178
2011-01-03 10:44:46erobsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125159
2011-01-03 10:19:50erobsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125158
title: cgi module cannot handle POST with multipart/form-data in 3.0 -> cgi module cannot handle POST with multipart/form-data in 3.0
2011-01-03 03:50:19v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125153
2011-01-03 03:44:56v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
messages: + msg125152
2011-01-03 03:19:15l0nwlfsetnosy: - l0nwlf
2011-01-02 22:43:27pebbesetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
messages: + msg125114
2011-01-02 21:36:56v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
messages: + msg125108
2011-01-02 21:27:55quentelsetfiles: + cgi_diff.txt
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
messages: + msg125106
2011-01-02 21:24:05pebbesetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
messages: + msg125105
2011-01-02 21:16:40v+pythonsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
messages: + msg125100
2011-01-02 20:08:02quentelsetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
messages: + msg125088
2011-01-02 19:59:56quentelsetfiles: + cgi_diff.txt
nosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
messages: + msg125086
2011-01-02 16:19:48r.david.murraysetnosy: gvanrossum, barry, amaury.forgeotdarc, ggenellina, giampaolo.rodola, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, l0nwlf, pebbe, quentel, erob
versions: + Python 3.3, - Python 3.0, Python 3.1, Python 3.2
messages: + msg125065
stage: test needed -> patch review
2011-01-02 16:12:19erobsetnosy: + erob
2011-01-02 08:51:37quentelsetfiles: + http.zip
nosy: + quentel
messages: + msg125035
2010-11-21 05:24:38v+pythonsetnosy: + v+python
messages: + msg121864
2010-08-27 03:16:25floxsetnosy: + flox
2010-07-17 10:03:47eric.araujosetnosy: + eric.araujo, l0nwlf
2010-06-20 01:35:08tercero12setmessages: + msg108221
2010-06-16 22:46:46giampaolo.rodolasetnosy: + giampaolo.rodola
2010-06-16 22:24:21gvanrossumsetnosy: + gvanrossum
messages: + msg107959
2010-03-26 23:55:52pebbesetnosy: + pebbe
2010-01-10 17:16:40r.david.murraylinkissue6854 superseder
2009-11-16 08:47:31tcourbonsetmessages: + msg95330
2009-11-15 14:18:11tercero12setmessages: + msg95292
2009-11-15 13:09:03tcourbonsetnosy: + tcourbon
messages: + msg95288
2009-08-19 20:19:25barrysetmessages: + msg91741
2009-08-18 18:58:21tercero12setmessages: + msg91711
2009-08-10 14:42:24tercero12setmessages: + msg91449
2009-08-10 13:50:35tobiassetnosy: + tobias
messages: + msg91444
2009-06-08 20:39:29tercero12setfiles: + unittest.zip
messages: + msg89112
2009-06-05 18:46:19r.david.murraysetpriority: high
nosy: + r.david.murray
messages: + msg88962
stage: test needed
2009-06-05 18:29:41tercero12setmessages: + msg88960
versions: + Python 3.2
2009-04-15 21:00:39tercero12setnosy: + tercero12
2009-01-17 03:40:52ggenellinasetmessages: + msg80000
2009-01-16 22:08:58oopossetfiles: + opsuper.pl
messages: + msg79981
2009-01-16 08:13:34ggenellinasetversions: + Python 3.1
nosy: + ggenellina
title: Cannot upload binary file from form ? -> cgi module cannot handle POST with multipart/form-data in 3.0
messages: + msg79939
components: + Library (Lib)
type: performance -> behavior
2009-01-15 16:45:24amaury.forgeotdarcsetnosy: + barry
messages: + msg79901
2009-01-15 14:34:46oopossetfiles: + full_source_and_error.zip
messages: + msg79898
2009-01-15 12:43:51amaury.forgeotdarcsetmessages: + msg79896
2009-01-15 11:58:36oopossetfiles: + tmpy44zj7.html
messages: + msg79894
2009-01-15 10:59:25amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg79893
2009-01-15 09:02:46ooposcreate