Message 196262 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	eli.bendersky
Recipients	eli.bendersky, ethan.furman, flox, jcea, jkloth, ncoghlan, python-dev, scoder
Date	2013-08-27.04:01:45
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<CAF-Rda_L_x+scUedofJnEwz0iQBOR3C4kNtcUbGE47XHxB8g2g@mail.gmail.com>
In-reply-to	<1377545706.34.0.416759054466.issue17741@psf.upfronthosting.co.za>

Content
On Mon, Aug 26, 2013 at 12:35 PM, Stefan Behnel <report@bugs.python.org>wrote: > > Stefan Behnel added the comment: > > Here is a proof-of-concept patch that integrates the functionality of the > IncrementalParser into the XMLParser. I ended up reusing most of Antoines > implementation and test suite. In case he'll look back into this ticket at > some point, I'll put a "thank you" here. > > The patch certainly needs more cleanup, but it shows where things are > going and passes all tests. > OK, I looked at the patch. So I think I understood your verbal proposal correctly earlier. And as I said then, I don't like it. I believe Nick has summarized nicely why heaping everything into XMLParser is a bad idea, but since you went the extra mile and made a patch, I will try to provide a more expanded rationale. Putting _setevents aside for the moment, XMLParser is a clean and simple API. Its output is only "push" (by calling callbacks on the target). It doesn't deal with Elements at all. It decomposes the input XML and produces push events. That's it. It doesn't even have to interact with the target to a larger extent than pushing callback calls to it. You propose to add a completely different mode of operation to it, controlled by a constructor flag. In "collect_events mode", XMLParser undergoes a transformation. It becomes a pull API. It starts interacting with the target more deeply by using it to produce elements for events. Coupled with "weird" targets, its semantics become totally unclear. So it's a single class that acts completely differently based on an argument. This is very unpythonic. But things get worse. XMLParser is not supposed to be an isolated entity. Users should be able to implement custom parsers that provide its API. When the API is simple - feed/close and some callbacks, replacing/ehnancing/faking it is easy. When the API is complex -- two different modes of operation -- it's considerably more difficult. That's the gist of it. Sticking to the current design of separation of concerns between the parser and the target/treebuilder is IMO the better way looking forward. In the mean-time, a helper class for asynchronous parsing is useful and timely. By restricting its API, changing its underlying implementation will be easy in the future.

On Mon, Aug 26, 2013 at 12:35 PM, Stefan Behnel <report@bugs.python.org>wrote:

>
> Stefan Behnel added the comment:
>
> Here is a proof-of-concept patch that integrates the functionality of the
> IncrementalParser into the XMLParser. I ended up reusing most of Antoines
> implementation and test suite. In case he'll look back into this ticket at
> some point, I'll put a "thank you" here.
>
> The patch certainly needs more cleanup, but it shows where things are
> going and passes all tests.
>

OK, I looked at the patch. So I think I understood your verbal proposal
correctly earlier. And as I said then, I don't like it. I believe Nick has
summarized nicely why heaping everything into XMLParser is a bad idea, but
since you went the extra mile and made a patch, I will try to provide a
more expanded rationale.

Putting _setevents aside for the moment, XMLParser is a clean and simple
API. Its output is only "push" (by calling callbacks on the target). It
doesn't deal with Elements at all. It decomposes the input XML and produces
push events. That's it. It doesn't even have to interact with the target to
a larger extent than pushing callback calls to it.

You propose to add a completely different mode of operation to it,
controlled by a constructor flag. In "collect_events mode", XMLParser
undergoes a transformation. It becomes a pull API. It starts interacting
with the target more deeply by using it to produce elements for events.
Coupled with "weird" targets, its semantics become totally unclear.

So it's a single class that acts completely differently based on an
argument. This is very unpythonic. But things get worse. XMLParser is not
supposed to be an isolated entity. Users should be able to implement custom
parsers that provide its API. When the API is simple - feed/close and some
callbacks, replacing/ehnancing/faking it is easy. When the API is complex
-- two different modes of operation -- it's considerably more difficult.

That's the gist of it. Sticking to the current design of separation of
concerns between the parser and the target/treebuilder is IMO the better
way looking forward. In the mean-time, a helper class for asynchronous
parsing is useful and timely. By restricting its API, changing its
underlying implementation will be easy in the future.

History
Date	User	Action	Args
2013-08-27 04:01:47	eli.bendersky	set	recipients: + eli.bendersky, jcea, ncoghlan, scoder, jkloth, flox, ethan.furman, python-dev
2013-08-27 04:01:47	eli.bendersky	link	issue17741 messages
2013-08-27 04:01:45	eli.bendersky	create