classification
Title: wsgiref.handlers.BaseHandler and subclasses of str
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: andreypopp, eric.araujo, exarkun, ods, pitrou, pje, r.david.murray, riffm
Priority: normal
Keywords: patch

Created on 2011-01-18 17:29 by riffm, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
pep-3333-no-subclasses.diff eric.araujo, 2011-06-04 14:50
Messages (24)
msg126471 - (view) Author: Tim Perevezentsev (riffm) Date: 2011-01-18 17:29
This code:

    assert type(val) is StringType,"Header values must be strings"

(from here http://svn.python.org/view/python/tags/r271/Lib/wsgiref/handlers.py?revision=86833&view=markup)

from the "start_response" method, does not allow str subclass objects to be used as header values.

Use case:

I wrote a class URL that subclasses str and has additional methods for manipulating the query string. It is very handy. But when I need to set the "Location" header with a URL object as the value, I get an assertion error.

Can't we do this instead:

    assert isinstance(val, str),"Header values must be strings"
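
For illustration, the difference between the two checks can be sketched in Python 3 terms (the original report concerns Python 2.7 and `StringType`). The `URL` class and its `with_query` method below are hypothetical stand-ins for the reporter's class, not code from the report.

```python
from urllib.parse import urlencode


class URL(str):
    """Hypothetical str subclass with a helper for query strings."""

    def with_query(self, **params):
        # Append the parameters, using "&" if a query string already exists.
        sep = "&" if "?" in self else "?"
        return URL(self + sep + urlencode(params))


location = URL("/search").with_query(q="wsgi")

# Exact-type check, as in the assertion quoted above: rejects the subclass.
exact_ok = type(location) is str
# Subclass-aware check, as in the proposed replacement: accepts it.
subclass_ok = isinstance(location, str)

print(exact_ok, subclass_ok)
```

The two checks disagree only for subclass instances; for a plain str both pass.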
msg126478 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-18 19:16
This is by design.  PEP 333 and PEP 3333 contain more information about that.  You’ll need to convert your objects to str before passing them to start_response.  Sorry!
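
A minimal sketch of the suggested workaround, assuming Python 3 and a str subclass that does not override `__str__`. The `URL` class, the `app` function and `fake_start_response` are hypothetical names used only for this illustration; a real server supplies its own `start_response`.

```python
class URL(str):
    """Hypothetical str subclass standing in for the reporter's class."""


def app(environ, start_response):
    target = URL("/login?next=%2Fhome")
    # Convert at the boundary: str() of a subclass instance that does not
    # override __str__ yields an object whose type is exactly str.
    headers = [("Location", str(target))]
    start_response("302 Found", headers)
    return [b""]


captured = {}


def fake_start_response(status, headers, exc_info=None):
    # Stand-in for a server's start_response; it only records its arguments.
    captured["status"] = status
    captured["headers"] = headers


body = app({}, fake_start_response)
print(type(captured["headers"][0][1]) is str)
```

The application keeps using the richer type internally and converts only when handing values to the server.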
msg126480 - (view) Author: Tim Perevezentsev (riffm) Date: 2011-01-18 19:25
str is immutable, so every str subclass object is a normal string. I don't see any design violation here.
msg126483 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-18 19:34
Eric, could you point out the part of the specification that requires exactly a string and makes a string subclass invalid?  I did a quick scan and couldn't find it, and unfortunately don't have the time to re-read the whole spec right now.
msg126487 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-18 19:46
See http://bugs.python.org/issue5800#msg121958
msg126488 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-18 19:53
OK.  So he is saying that when the spec says "an object of type str" he means 'type(x) is str' as opposed to 'isinstance(x, str)'.  I would naively have expected the latter, as other people clearly do as well.  I didn't participate in any of the discussions that led to this decision, so I won't pursue it further, but it does break the expectation that many people have about how Python programs work, so I expect we'll be seeing this bug report again sometime :)
msg126522 - (view) Author: Denis S. Otkidach (ods) * Date: 2011-01-19 11:00
The current behavior is unpythonic: the documentation explicitly mentions isinstance as the preferred way to check types (see http://docs.python.org/library/types.html ).

Also, 2.7 is the last minor version with str as the "main" string type. So I believe it should use isinstance(val, basestring) to help the transition to Python 3.
msg126537 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2011-01-19 16:43
Doesn't matter how unpythonic it is: the spec calls for exact types and has done so for six years already, so it's a bit late to do anything about it.  (And any version of Python that allowed string subclasses was in violation of the spec and therefore buggy.)

In principle, this class could allow non-str objects if and ONLY if they were converted to actual str objects upon receipt -- but they would have to be the *exact* type after this conversion.

If somebody wants to implement that, I have no objection.  But it MUST reject non-basestring input values and values that don't convert to an exact type str.  (IOW, "type(str(x)) is str" must hold.)

To put it another way, the WSGI protocol requires output headers to be of type 'list', where every element is of type 'tuple' and contains two 'str' entries.  The Headers class cannot fulfill this contract if it allows non-conforming input.  So non-conforming input must either be rejected or made to conform.
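
The contract described above can be sketched as an exact-type check, written here for Python 3. The function name `is_conforming` is hypothetical and is not part of wsgiref; it only illustrates what "exact types" means for the header list.

```python
def is_conforming(headers):
    """Return True if headers is an exact list of exact (str, str) tuples."""
    if type(headers) is not list:
        return False
    for item in headers:
        # Each element must be exactly a tuple with two entries.
        if type(item) is not tuple or len(item) != 2:
            return False
        # Both entries must be exactly str, not a subclass.
        if type(item[0]) is not str or type(item[1]) is not str:
            return False
    return True


class URL(str):
    """Hypothetical str subclass used to show a non-conforming value."""


print(is_conforming([("Location", "/next")]))
print(is_conforming([("Location", URL("/next"))]))
```

Under this reading, a value that is merely an instance of str through subclassing fails the check until it is converted.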
msg126538 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-19 16:52
> OK.  So he is saying that when the spec says "an object of type str" he 
> means 'type(x) is str' as opposed to 'isinstance(x, str)'.  I would
> naively have expected the latter, as other people clearly do as well.

+1 with RDM here.

> Doesn't matter how unpythonic it is: the spec calls for exact types

Can you clarify why it does?
msg126542 - (view) Author: Andrey Popp (andreypopp) Date: 2011-01-19 17:12
> the spec says "an object of type str" he means 'type(x) is str' as opposed to 'isinstance(x, str)'

-1. The Liskov substitution principle states that every subtype S of type T can be used wherever type T is used.
msg126555 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2011-01-19 18:33
One of the original reasons was to make it easier for server authors writing C code to interface with WSGI.  C APIs that operate on lists and dicts often do not do what you would expect, when called on a subclass.  Essentially, this could lead to an app that appears to work correctly on one server, but breaks strangely when run on another.

(IOW, Python's C API and built-in types often break the Liskov principle: there are C-level operations that don't call back into Python subclass methods, so overriding just a few methods usually doesn't work as expected.)
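
A small sketch of the kind of behavior meant here, as observed on CPython 3. The classes `Shout` and `Defaulting` are hypothetical; each overrides a single method, and the built-in operations shown do not consult that override.

```python
class Shout(str):
    """Hypothetical subclass that overrides only __str__."""

    def __str__(self):
        return self.upper()


class Defaulting(dict):
    """Hypothetical subclass that overrides only __getitem__."""

    def __getitem__(self, key):
        return "overridden"


s = Shout("ok")
via_override = str(s)          # goes through the Python-level __str__
via_join = "-".join([s])       # join reads the stored characters directly

d = Defaulting(a=1)
via_getitem = d["a"]           # goes through the Python-level __getitem__
via_get = d.get("a")           # dict.get does not call __getitem__

print(via_override, via_join, via_getitem, via_get)
```

A consumer that uses `str.join` or `dict.get` therefore sees different data than one that uses `str()` or indexing, which is the cross-server inconsistency described above.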

Another reason was to avoid having to document precisely which methods of a str, list, etc. are required to be implemented.  (This is somewhat easier now that we have ABCs, but really, it's still a royal PITA.)

In any event, it's entirely moot now, six years later.  Any change requests should be sent to the Web-SIG for WSGI 2.0 discussion, as changing the existing PEPs is not an option.  (Guido has pronounced that I cannot change PEP 333 in any way, so even if I agreed with the requests in this thread, there is simply no way that wsgiref is changing in 2.x.  PEP 3333 has just been approved as well, so the odds of even a 3.x change are low.  But as I said, I won't object to a Headers patch that *converts* its non-conforming inputs to objects of type str, as long as they were stringlike objects to start with.)
msg126595 - (view) Author: Denis S. Otkidach (ods) * Date: 2011-01-20 09:50
Phillip, your argument about interfacing with code written in C doesn't work for built-in immutable types like str. Any subclass of str must call str.__new__ thus keeping proper internal state.
msg126604 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2011-01-20 13:14
> Phillip, your argument about interfacing with code written in C doesn't work for built-in immutable types like str.

Sure it does.  Definitely-str is easier to handle in C than maybe-str-subclass.  It doesn't matter that str.__new__ gets called.  Other things might get called too, with who-knows-what side-effects.

wsgi is right to demand str and only str and exactly str.
msg126605 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-20 13:16
> Jean-Paul Calderone <invalid@example.invalid> added the comment:
> 
> > Phillip, your argument about interfacing with code written in C
> doesn't work for built-in immutable types like str.
> 
> Sure it does.  Definitely-str is easier to handle in C than
> maybe-str-subclass.

Well, PyString_AsString() works on subclasses as well as on str itself.
I'm not sure what operations you're thinking about here.
msg126607 - (view) Author: Andrey Popp (andreypopp) Date: 2011-01-20 13:24
I've also sent a message [1] to web-sig about this issue.

[1]: http://mail.python.org/pipermail/web-sig/2011-January/004986.html
msg126636 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2011-01-20 18:39
PyString_AsString() only "works on subclasses" if their internal representation is the same as type str.  So we can't say "subclass of str" without *also* specifying that the subclass store its contents in exactly the same way as an object of type str...  which means all we've really done is to make the specification longer and more complicated.
msg126637 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-20 19:30
> PyString_AsString() only "works on subclasses" if their internal
> representation is the same as type str.  So we can't say "subclass of
> str" without *also* specifying that the subclass store its contents in
> exactly the same way as an object of type str...

There's no point in subclassing str if you're using a different
representation. You're not only wasting space, but some things will
behave badly (precisely because a lot of C functions will call
PyString_Check() and then PyString_AsString()).
So, what you call a limitation isn't really one.

> which means all we've really done is to make the specification longer
> and more complicated

That doesn't follow from the above.
msg126749 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2011-01-21 15:40
Implicit knowledge in your own head about what might or might not be a good idea to program is not the same thing as a specification.  "type(x) is str" is a good specification in this context, while "string subclasses, but only if they're really str" is not.

And the reason why that is, is because the first specification allows server implementers to say, "your type is not str, so you are not conformant; go fix your code."   The second "specification" is just an invitation to (number of server implementations)*(number of string implementations) arguments about what conformance is.

That makes the latter an objectively worse specification than the first, even if EVERYONE would prefer to be able to use their own string type.  Practicality beats purity, and explicit is better than implicit.  These principles are doubly true where interop protocol definitions are concerned.

To put it another way, the greater the social separation between parties, the more explicit and specific a contract between those two parties has to be in order to co-ordinate their actions, because they can't rely on having the same operating assumptions as the other party.  This situation is a terrific example of that principle in action.  ;-)
msg126751 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-21 15:53
> Implicit knowledge in your own head about what might or might not be a
> good idea to program is not the same thing as a specification.
> "type(x) is str" is a good specification in this context, while
> "string subclasses, but only if they're really str" is not.
> 
> And the reason why that is, is because the first specification allows
> server implementers to say, "your type is not str, so you are not
> conformant; go fix your code."   The second "specification" is just an
> invitation to (number of server implementations)*(number of string
> implementations) arguments about what conformance is.

You can argue about this all you want, but let me repeat it:
the interpreter already, implicitly, uses the "second specification" in
many of its internal routines (e.g. C implementations of stdlib
functions and types). Why do you think what is fine for the interpreter
and its stdlib is not fine for WSGI?

> Practicality beats purity, and explicit is better than implicit. 

And, ironically, you are arguing for a pure specification at the expense
of practicality. As for explicit/implicit, it isn't involved here: an
isinstance() test is as explicit as a type() equality test.
msg126761 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2011-01-21 18:19
1. WSGI is a *Python* spec, not a *CPython* spec, so CPython implementation details have little bearing on how the spec should work.

Most non-CPython implementations have a native string type optimized for their runtime or VM (e.g. Jython and IronPython), and thus have *different* implementation-dependent details of what constitutes a "native" string, other than that it has type 'str' as seen by Python code.  (See my previous point about people bringing these sorts of implicit assumptions to the table!)

(Also, as you may or may not be aware, WSGI supports Python 2.1, mainly for Jython's sake at the time, and in Python 2.1 there was *no such thing* as a subclass of 'str' to begin with.)

2. The specific practical goal of WSGI is interoperability, *not* programmer convenience.  (Says so right in the PEP.)

3. You are free to patch wsgiref's Headers class to *accept* non-str values (as long as they are converted to conforming strings in the process) and to make any proposals you like for future versions of WSGI or another interop protocol via the Web-SIG, so this entire discussion has been moot for some time now.

4. The explicit-vs-implicit is about the contract defined in the spec (making explicit what, precisely, is required of both parties), not the type test.

5. Since this discussion is moot (see point 3), we'll have to agree to disagree here.  Whether you (or I!) like what the spec says is not what matters: wsgiref is the *reference library* for the specification, and therefore must conform to the actual spec, not what any of us would like the spec to be.

Heck, I tried to add some *much* more minor amendments to PEP 333 than this about three or four months ago, and Guido said he'd reject the whole PEP if I tried!  That's why we have PEP 3333 instead of a slightly-amended PEP 333.  This particular bikeshed was painted six years ago, so let's get on with the actual bicycling already.
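
As a rough sketch of what point 3 could look like, assuming Python 3: a subclass of `wsgiref.headers.Headers` that converts string-like input to exact str before storing it. The class name `CoercingHeaders` and the `URL` class are hypothetical, and this is not the patch discussed in the thread. Only `__setitem__` is covered; `add_header`, `setdefault` and the constructor argument would need the same treatment.

```python
from wsgiref.headers import Headers


class CoercingHeaders(Headers):
    """Hypothetical Headers subclass that converts str subclasses on input."""

    def __setitem__(self, name, val):
        # Reject anything that was not string-like to begin with.
        if not isinstance(name, str) or not isinstance(val, str):
            raise TypeError("header names and values must be string-like")
        name, val = str(name), str(val)
        # The converted values must be exactly str ("type(str(x)) is str").
        if type(name) is not str or type(val) is not str:
            raise TypeError("conversion did not yield an exact str")
        super().__setitem__(name, val)


class URL(str):
    """Hypothetical str subclass standing in for the reporter's class."""


h = CoercingHeaders([])
h["Location"] = URL("/next")
print(type(h["Location"]) is str)
```

Values that are not string-like, such as integers or bytes, are rejected rather than converted, in line with the conditions stated earlier in the thread.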
msg126776 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-21 19:34
> 4. The explicit-vs-implicit is about the contract defined in the spec (making explicit what, precisely, is required of both parties), not the type test.

Perhaps a clarification in the (3333) spec that 'type str' means "type(s) is str" would be in order, then, since this discussion makes it clear that "type str" in English can also mean "isinstance(s, str)".
msg126842 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-22 16:48
FYI, #10977 has been opened to tackle the general subclasses problem.
msg137643 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-06-04 14:50
Here’s my try at making the spec more explicit about str subclasses.
msg137661 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2011-06-04 19:33
That change to the spec is fine, though you might also want to add something like, "Like all other WSGI specification types", since *all* types specified in WSGI are 'type()' not 'isinstance()'.
History
Date User Action Args
2022-04-11 14:57:11 admin set github: 55144
2011-06-04 19:33:07 pje set messages: + msg137661
2011-06-04 14:50:06 eric.araujo set files: + pep-3333-no-subclasses.diff
keywords: + patch
messages: + msg137643
2011-01-22 16:48:17 eric.araujo set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126842
2011-01-21 19:34:07 r.david.murray set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126776
2011-01-21 18:19:52 pje set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126761
2011-01-21 15:53:34 pitrou set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126751
2011-01-21 15:40:05 pje set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126749
2011-01-20 19:30:56 pitrou set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126637
2011-01-20 18:39:44 pje set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126636
2011-01-20 13:24:21 andreypopp set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126607
2011-01-20 13:16:34 pitrou set nosy: pje, exarkun, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126605
2011-01-20 13:14:21 exarkun set nosy: + exarkun
messages: + msg126604
2011-01-20 09:50:38 ods set nosy: pje, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126595
2011-01-19 18:33:45 pje set nosy: pje, ods, pitrou, eric.araujo, r.david.murray, riffm, andreypopp
messages: + msg126555
2011-01-19 17:12:13 andreypopp set nosy: + andreypopp
messages: + msg126542
2011-01-19 16:52:33 pitrou set nosy: + pitrou
messages: + msg126538
2011-01-19 16:43:27 pje set nosy: pje, ods, eric.araujo, r.david.murray, riffm
messages: + msg126537
2011-01-19 11:00:20 ods set nosy: + ods
messages: + msg126522
2011-01-18 19:53:12 r.david.murray set nosy: pje, eric.araujo, r.david.murray, riffm
messages: + msg126488
2011-01-18 19:46:32 eric.araujo set nosy: pje, eric.araujo, r.david.murray, riffm
messages: + msg126487
2011-01-18 19:34:32 r.david.murray set nosy: + r.david.murray
messages: + msg126483
2011-01-18 19:25:45 riffm set nosy: pje, eric.araujo, riffm
messages: + msg126480
2011-01-18 19:16:01 eric.araujo set status: open -> closed
versions: - Python 2.6, Python 2.5
nosy: + eric.araujo, pje
messages: + msg126478
resolution: not a bug
stage: resolved
2011-01-18 17:29:19 riffm create