Author vstinner
Recipients amaury.forgeotdarc, barry, eric.araujo, erob, flox, ggenellina, gvanrossum, oopos, pebbe, pitrou, quentel, r.david.murray, tcourbon, tercero12, tobias, v+python, vstinner
Date 2011-01-10.13:03:49
SpamBayes Score 0.0
Marked as misclassified No
Message-id <>
In-reply-to <>
Some comments on cgi_diff_20110109.txt, especially on FieldStorage 

Le dimanche 09 janvier 2011 13:26:24, vous avez écrit :
> Here is the diff file for the revised version of

+            import msvcrt
+            msvcrt.setmode (0, os.O_BINARY) # stdin  = 0
+            msvcrt.setmode (1, os.O_BINARY) # stdout = 1
+            msvcrt.setmode (2, os.O_BINARY) # stderr = 2

Why do you change stdout and stderr mode? Is it needed? Instead of 0, you 
should use sys.stdin.fileno() with a try/except on .fileno() because stdin can 
be a StringIO object:

   >>> o=io.StringIO()    
   >>> o.fileno()
   io.UnsupportedOperation: fileno

I suppose that it's better to do nothing if sys.stdin has no .fileno() method.

More generally, I don't think that the cgi module should touch sys.stdin mode: 
it impacts the whole process, not only the cgi module. Eg. change sys.stdin 
mode in Python 3.1 will break the interperter because the Python parser in 
Pytohn 3.1 doesn't know how to handle \r\n end of line. If you need binary 
stdin, I should backport my patch for #10841 (for std*, FileIO and the 

def __init__(self, fp=None, headers=None, outerboundary="",
             environ=os.environ, keep_blank_values=0, strict_parsing=0,
if 'QUERY_STRING' in environ:
   qs = environ['QUERY_STRING']
elif sys.argv[1:]:
   qs = sys.argv[1]
   qs = ""
fp = BytesIO(qs.encode('ascii')) # bytes

With Python 3.2, you should use environ=environ.os.environb by default to 
avoid unnecessary conversion (os.environb --decode--> os.environ --encode--> 
qs). To decode sys.argv, ASCII is not the right encoding: you should use 
qs.encode(locale.getpreferredencoding(), 'surrogateescape') because Python 
decodes the environment and the command line arguments from 
locale.getpreferredencoding()+'surrogateescape', so it is the exact reverse 
operation and you get the original raw bytes.

For Python 3.1, use also qs.encode(locale.getpreferredencoding(), 
'surrogateescape') to encode the environment variable.

So for Python 3.2, it becomes something like:
def __init__(self, fp=None, headers=None, outerboundary="",
             environ=os.environb, keep_blank_values=0, strict_parsing=0,
if 'QUERY_STRING' in environ:
   qs = environ[b'QUERY_STRING']
elif sys.argv[1:]:
   qs = sys.argv[1]
   qs = b""
if isinstance(qs, str):
   encoding = locale.getpreferredencoding()
   qs = qs.encode(encoding, 'surrogateescape'))
fp = BytesIO(qs)
If you would like to support byte *and* Unicode environment (eg. 
environ=os.environ and environ=os.environb), you should do something a little 
bit more complex: see os.get_exec_path(). I can work on a patch if you would 
like to. A generic function should maybe be added to the os module, function 
with an optional environ argument (as os.get_exec_path()).

if fp is None:
   fp = sys.stdin
if fp is sys.stdin:
you should use sys.stdin.buffer if fp is None, and accept sys.stdin.buffer in 
the second test. Something like:
stdin = sys.stdin
if isinstance(fp,TextIOBase):
   stdin_buffer = stdin.buffer
   stdin_buffer = stdin
if fp is None:
   fp = stdin_buffer
if fp is stdin or fp is stdin_buffer:

Don't you think that a warning would be appropriate if sys.stdin is passed 
        # must return bytes
        if isinstance(fp,TextIOBase):
            self.fp = fp.buffer
            self.fp = fp
Maybe a DeprecationWarning if we would like to drop support of TextIOWrapper 
later :-)

For the else case: you should maybe add a strict test on the type, eg. check 
for RawIOBase or BufferedIOBase subclass, isinstance(fp, (io.RawIOBase, 
io.BufferedIOBase)). It would avoid to check that returns a bytes 
object (or get an ugly error later).

Set sys.stdin.buffer.encoding attribute is not a good idea. Why do you modify 
fp, instead of using a separated attribute on FieldStorage (eg. 
        # field keys and values (except for files) are returned as strings
        # an encoding is required to decode the bytes read from self.fp
        if hasattr(fp,'encoding'):
            self.fp.encoding = fp.encoding
            self.fp.encoding = 'latin-1' # ?

I only read the constructor code.
Date User Action Args
2011-01-10 13:03:51vstinnersetrecipients: + vstinner, gvanrossum, barry, amaury.forgeotdarc, ggenellina, pitrou, eric.araujo, v+python, r.david.murray, oopos, tercero12, tcourbon, tobias, flox, pebbe, quentel, erob
2011-01-10 13:03:49vstinnerlinkissue4953 messages
2011-01-10 13:03:49vstinnercreate