Issue 12581: Increased test coverage of test_urlparse

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/56790

classification

Title:	Increased test coverage of test_urlparse
Type:		Stage:	resolved
Components:	Tests	Versions:	Python 3.2, Python 3.3, Python 2.7

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	orsenthil	Nosy List:	eric.araujo, ezio.melotti, haggholm, orsenthil, python-dev, r.david.murray
Priority:	normal	Keywords:	patch

Created on 2011-07-18 05:46 by haggholm, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
urlparsetest.patch	haggholm, 2011-07-18 05:46	Patch	review
urlparsetest.patch	haggholm, 2011-07-19 06:21	Updated patch	review
urlparsetest.patch	haggholm, 2011-07-22 04:48	Updated patch	review

Messages (10)
msg140560 - (view)	Author: Petter Haggholm (haggholm)	Date: 2011-07-18 05:46
Some very trivial tests to increase the test coverage on test_urlparse a bit; also changed a single line to use an ABC instead of attempting to use len() to verify that an object is "sequence-like" (as the comment put it). Mostly I’m trying to get my feet wet in submitting a patch in the manner suggested by the guide. (Curiously, the full test suite coverage report cites 99%, even though the path in question does run: coverage.py fails somehow to mark an else branch consisting of a lone continue statement.) Looking at this, I have some questions, but figured I might as well ask them along with a patch: First, whether this is appropriate use of ABCs in the library; second, whether it’s appropriate to submit related stuff like modified tests and (lightly) modified code in one patch, or whether I should rather open two issues. Third, and more generally, I’m wondering whether the tests are appropriate. These are trivial in scope and nature, but I would be interested to know for future reference (and perhaps it would be useful to mention in the docs?) what the policy is on this. Essentially, I encoded expectations of current behaviour as “correct” and covered the 30-odd statements in urllib/parse.py that were not already covered by a full test run (explicit coverage by test_urlparse is still much lower!) on the assumption that, as the coverage guide suggests, more coverage is always better – and even without domain expertise assurance that current behaviour is correct, it at least provides some assurance that changes in behaviour will be discovered. Still, it’s perhaps too easy to get hung up on those coverage percentages: Is it always better to cover more (keeping in mind the limitations of once-over coverage), or would contributors be better advised not to cover code unless they are very, very confident that current behaviour is correct?
msg140575 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2011-07-18 13:28
I haven't reviewed your tests, but a couple quick comments: we generally prefer duck typing to the use of isintance or ABCs, but sometimes the latter is better (it's a judgement call). I haven't done a deep dive in the code you modified, but from the looks of the code quoted in your patch I'd say that doing 'iter(v)' inside the try/except would be the way to find out if one can loop over the object, which appears to be the only aspect of sequenceness the code cares about. As for coverage, you are right that it is quite possible to get caught up in the statistics. That said, if you don't have domain knowledge, giving us a set of tests to look at and evaluate is better than not having such a set of tests. Pointing out any tests that you aren't sure about the validity of is helpful in any case. The overall goal of the test suite is to test the API of the library functions. This is so that alternate implementations can use it as a validation test suite (Sometimes we have CPython specific tests, in which case we mark them as such). So testing internal implementation details is not as helpful as testing behavior. If you find you have to use a "white box" test (one that pokes at the internals as opposed to making an appropriate call to the API), then the code you can't otherwise test becomes suspect and an appropriate subject for another issue ("what is this code for? I can't get it to trigger.") Finally, your point about comprehensive tests at least showing up behavior changes is valid. If you write tests that you aren't sure are "correct behavior", put in an XXX comment to that effect. If you just have no idea, you can mark a whole block of tests as "this improves coverage, I have no idea if the behavior is valid or not", and we'll either sort it out when we review or commit the tests or just leave the comment in. Thanks for working on this.
msg140651 - (view)	Author: Petter Haggholm (haggholm)	Date: 2011-07-19 06:21
It’s my pleasure — it’s very trivial, but hopefully it’ll get my feet wet and get me in a place where I am familiar enough with procedures and things to contribute something relevant. :) Attaching a modified patch with (1) reversion to duck typing in parse.py, and (2) a few comments added to the tests. At this point, it may even be worth reviewing when someone has the time. (While the tests are pretty trivial, and admittedly coverage driven, they do hit a few edge cases with blank values &c. that were previously not run.)
msg140652 - (view)	Author: Senthil Kumaran (orsenthil) *	Date: 2011-07-19 07:21
Hi Petter, writing tests are ofcourse a good way to start. As long as the tests increase the coverage, those are most welcome. Thanks!
msg140860 - (view)	Author: Petter Haggholm (haggholm)	Date: 2011-07-22 04:48
Added suggested changes from review, removed (rather useless) repr test; left parse.py changes alone (see review comments for rationale)
msg140950 - (view)	Author: Éric Araujo (eric.araujo) *	Date: 2011-07-23 09:37
Peter’s patch now uses iter(thing) instead of len(thing) to assess sequenciness. I made this comment: > You can iterate over an iterator (which is not a sequence). Here I > don’t know if the code talks about sequence because it pre-dates > iterators or because it really wants sequences. Peter’s reply: > Fair point. The code, though, doesn't say either, and len() doesn't > express it any better (IMO) than iter(). It seems to me that the > check basically just verifies that the presumed sequence can be > safely iterated over, before trying to enter the for loop. If so, my > iter() check seems to encapsulate it cleanly in code rather than > needing an additional comment. I think iter may exhaust a non-repeatable iterable, whereas len would give a TypeError and not consume the iterable. iter(thing) being successful does not guarantee that a second iteration will succeed. I’ve had a look at the docstring and the reST docs, and they clearly say that sequences are supported, not arbitrary iterables. > If not -- if the code should enforce "true sequence-ness" -- maybe it > should use the ABC (as my first patch did)? Which is correct, > though, I'm too much an outsider to judge! Note that ABCs are not strict, “true” type checks like there is in other languages :) > (I tried to check the revlog for clues, but the code entered in > 5a416a6417d3, which was the new urllib package combining several old > packages. It's been there for a while.) Log viewing tools usually let you follow the history across renames, or otherwise let you see the history for a file that’s not anymore in the tip revision. Here it was Lib/urlparse.py. (Just saying this for your information, it does not matter anymore to check the history to find if the code pre-dates iterators.)
msg140961 - (view)	Author: Senthil Kumaran (orsenthil) *	Date: 2011-07-23 10:19
On Sat, Jul 23, 2011 at 09:37:27AM +0000, Éric Araujo wrote: > I’ve had a look at the docstring and the reST docs, and they clearly > say that sequences are supported, not arbitrary iterables. Yeah. At the first cut, when I saw the suggestion of iter(), I thought it was better, but looking at it again, we just need to test the sequence-ness. I would leave it as such with len() unless we come up with a case that it does not test the sequenceness completely. As Eric points out, changing it to iter may cause some side-effects (of exhaustion of the container)
msg140963 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-07-23 10:43
New changeset e171db785c37 by Senthil Kumaran in branch '3.2': Fix closes issue12581 - Increase the urllib.parse test coverage. Patch by Petter Haggholm. http://hg.python.org/cpython/rev/e171db785c37 New changeset fcccda3c546f by Senthil Kumaran in branch 'default': merge from 3.2 - Fix closes issue12581 - Increase the urllib.parse test coverage. Patch by Petter Haggholm. http://hg.python.org/cpython/rev/fcccda3c546f New changeset ed79da800b4a by Senthil Kumaran in branch '2.7': merge from 3.2 - Fix closes issue12581 - Increase the urllib.parse test coverage (cases applicable to 2.7). Patch by Petter Haggholm. http://hg.python.org/cpython/rev/ed79da800b4a
msg140967 - (view)	Author: Senthil Kumaran (orsenthil) *	Date: 2011-07-23 10:46
Thanks a lot for the patch, Petter Haggholm. I was initially hesitant to have separate tests for functions/methods which are helper functions and not necessarily have documented api. My thought was that those should be tested as part of the public api. But once I tried coverage report on the module and saw that unexercised parts, I thought your current patch does indeed a lot of value. Aiming for a 100% test coverage of module is always a joy!. :)
msg140988 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2011-07-23 14:53
Exhaustion of the iterator is easily solved by simply retaining a reference to it and iterating that (which is what I had in mind). However, I had not thought about the problem of an inexhaustable iterator, and to cover that case len is indeed better.

History
Date	User	Action	Args
2022-04-11 14:57:19	admin	set	github: 56790
2011-07-23 14:53:40	r.david.murray	set	messages: + msg140988
2011-07-23 10:46:55	orsenthil	set	messages: + msg140967
2011-07-23 10:43:11	python-dev	set	status: open -> closed nosy: + python-dev messages: + msg140963 resolution: fixed stage: resolved
2011-07-23 10:19:54	orsenthil	set	messages: + msg140961
2011-07-23 09:37:27	eric.araujo	set	messages: + msg140950
2011-07-22 04:48:26	haggholm	set	files: + urlparsetest.patch messages: + msg140860
2011-07-21 19:39:45	ezio.melotti	set	nosy: + ezio.melotti
2011-07-20 15:44:32	eric.araujo	set	nosy: + eric.araujo versions: + Python 2.7, Python 3.2
2011-07-19 07:21:42	orsenthil	set	assignee: orsenthil messages: + msg140652
2011-07-19 06:21:09	haggholm	set	files: + urlparsetest.patch messages: + msg140651
2011-07-18 13:28:58	r.david.murray	set	nosy: + orsenthil
2011-07-18 13:28:39	r.david.murray	set	nosy: + r.david.murray messages: + msg140575
2011-07-18 05:46:28	haggholm	create