This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author scoder
Recipients eli.bendersky, ethan.furman, flox, jcea, jkloth, ncoghlan, python-dev, scoder
Date 2013-08-27.07:15:50
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1377587752.16.0.513918110729.issue17741@psf.upfronthosting.co.za>
In-reply-to
Content
> Putting _setevents aside for the moment,

Agreed, obviously.


> XMLParser is a clean and simple API. Its output is only "push" (by calling callbacks on the target). It doesn't deal with Elements at all.

We already agreed on that, too. Keep things simple.


> It decomposes the input XML and produces push events. That's it. It doesn't even have to interact with the target to a larger extent than pushing callback calls to it.

It does, though. That's essentially how iterparse works. I.e. the API between parser and target that I'm using here is already there. And has been for years. And I think that the feature in question shows that this was a particularly good design. Thank you Fredrik!


> You propose to add a completely different mode of operation to it, controlled by a constructor flag. In "collect_events mode", XMLParser undergoes a transformation.

We clearly disagree here.


> Coupled with "weird" targets, its semantics become totally unclear.

Try to come up with a patch for your own proposal that avoids that. You will notice that you will be using the existing protocol, thus the target will continue to be able to do what it wants. If it's a weird target, then it's a weird target because the user said so. But it's the target's very own concern to do expected things (build a tree) or do weird things (whatever an example for this is).

Except when you disallow using user defined targets, but then, that's an arbitrary restriction that ignores what is there already.

As long as you want to keep up the existing separation of concern between parser and target (and we agreed on that being a good thing), and as long as you don't impose arbitrary (and arbitrarily complex) restrictions on user code, you will not be able to prevent the target from doing "weird" things that impact the events being collected by whatever interface you choose for this. And why should you?


> XMLParser is not supposed to be an isolated entity. Users should be able to implement custom parsers that provide its API. When the API is simple - feed/close and some callbacks, replacing/ehnancing/faking it is easy. When the API is complex -- two different modes of operation -- it's considerably more difficult.

That's only true when you use it for incremental event parsing. If you don't, that part of the API doesn't have to bother you. If you want to implement a custom parser that doesn't support event collection, then simply don't support the "collect_events" keyword argument and you're done. Implement read_events() by returning an empty iterator, if you like, that's up to you. If you want to write a target that prevents parsers from keeping tree fragments alive in memory, simply return None from your callback methods and you're done. If you want to write a target that changes the Elements that the parser collects, then you can do that. Yes, those are actually features. And you could already do most of this before.

Remember that it's the users using the parser and providing (or implementing) the target. We should let them.


I'm sorry if I sound offensive, but most of what I read as counter arguments in this ticket so far was either incomplete or (sometimes) even plain wrong. Some seemed pretty close to FUD, but maybe that's just me (and I'm guilty here too, I guess - sorry for that). Now, how could this discussion possibly give the me impression that everyone has the same level of understanding? Even your comment about "weird" targets, right above, after going through all of this discussion, makes me wonder if you really know how all of this works.

Please take a closer look at the original code before Antoine changed it. Play with the parser targets a bit. That will help you understand this issue better.


Also, if you want to come up with a patch that implements approach 3), i.e. a class between parser and target, then please do. I tried it (the incomplete and partially fake patch is attached to this ticket), and my take on it is that it's too complex and requires some rather non-trivial changes, including user visible ones. So, based on that experience, I decided that it would be better to keep things simple and integrate the feature more closely with what is there anyway. I think it might help if you went through the same experience.
History
Date User Action Args
2013-08-27 07:15:52scodersetrecipients: + scoder, jcea, ncoghlan, jkloth, eli.bendersky, flox, ethan.furman, python-dev
2013-08-27 07:15:52scodersetmessageid: <1377587752.16.0.513918110729.issue17741@psf.upfronthosting.co.za>
2013-08-27 07:15:52scoderlinkissue17741 messages
2013-08-27 07:15:50scodercreate