This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Increased test coverage of test_urlparse
Type: Stage: resolved
Components: Tests Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: eric.araujo, ezio.melotti, haggholm, orsenthil, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2011-07-18 05:46 by haggholm, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
urlparsetest.patch haggholm, 2011-07-18 05:46 Patch review
urlparsetest.patch haggholm, 2011-07-19 06:21 Updated patch review
urlparsetest.patch haggholm, 2011-07-22 04:48 Updated patch review
Messages (10)
msg140560 - (view) Author: Petter Haggholm (haggholm) Date: 2011-07-18 05:46
Some very trivial tests to increase the test coverage on test_urlparse a bit; also changed a single line to use an ABC instead of attempting to use len() to verify that an object is "sequence-like" (as the comment put it). Mostly I’m trying to get my feet wet in submitting a patch in the manner suggested by the guide.

(Curiously, the full test suite coverage report cites 99%, even though the path in question does run: coverage.py fails somehow to mark an else branch consisting of a lone continue statement.)

Looking at this, I have some questions, but figured I might as well ask them along with a patch: First, whether this is appropriate use of ABCs in the library; second, whether it’s appropriate to submit related stuff like modified tests and (lightly) modified code in one patch, or whether I should rather open two issues.

Third, and more generally, I’m wondering whether the tests are appropriate. These are trivial in scope and nature, but I would be interested to know for future reference (and perhaps it would be useful to mention in the docs?) what the policy is on this. Essentially, I encoded expectations of current behaviour as “correct” and covered the 30-odd statements in urllib/parse.py that were not already covered by a full test run (explicit coverage by test_urlparse is still much lower!) on the assumption that, as the coverage guide suggests, more coverage is always better – and even without domain expertise assurance that current behaviour is correct, it at least provides some assurance that changes in behaviour will be discovered. Still, it’s perhaps *too* easy to get hung up on those coverage percentages: Is it *always* better to cover more (keeping in mind the limitations of once-over coverage), or would contributors be better advised not to cover code unless they are very, very confident that current behaviour is correct?
msg140575 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-18 13:28
I haven't reviewed your tests, but a couple quick comments: we generally prefer duck typing to the use of isintance or ABCs, but sometimes the latter is better (it's a judgement call).  I haven't done a deep dive in the code you modified, but from the looks of the code quoted in your patch I'd say that doing 'iter(v)' inside the try/except would be the way to find out if one can loop over the object, which appears to be the only aspect of sequenceness the code cares about.

As for coverage, you are right that it is quite possible to get caught up in the statistics.  That said, if you *don't* have domain knowledge, giving us a set of tests to look at and evaluate is better than not having such a set of tests.  Pointing out any tests that you aren't sure about the validity of is helpful in any case.

The overall goal of the test suite is to test the *API* of the library functions.  This is so that alternate implementations can use it as a validation test suite  (Sometimes we have CPython specific tests, in which case we mark them as such).  So testing internal implementation details is not as helpful as testing behavior.  If you find you have to use a "white box" test (one that pokes at the internals as opposed to making an appropriate call to the API), then the code you can't otherwise test becomes suspect and an appropriate subject for another issue ("what is this code for?  I can't get it to trigger.")

Finally, your point about comprehensive tests at least showing up behavior changes is valid.  If you write tests that you aren't sure are "correct behavior", put in an XXX comment to that effect.  If you just have no idea, you can mark a whole block of tests as "this improves coverage, I have no idea if the behavior is valid or not", and we'll either sort it out when we review or commit the tests or just leave the comment in.

Thanks for working on this.
msg140651 - (view) Author: Petter Haggholm (haggholm) Date: 2011-07-19 06:21
It’s my pleasure — it’s very trivial, but hopefully it’ll get my feet wet and get me in a place where I am familiar enough with procedures and things to contribute something relevant. :)

Attaching a modified patch with (1) reversion to duck typing in parse.py, and (2) a few comments added to the tests. At this point, it may even be worth reviewing when someone has the time. (While the tests are pretty trivial, and admittedly coverage driven, they do hit a few edge cases with blank values &c. that were previously not run.)
msg140652 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-07-19 07:21
Hi Petter, writing tests are ofcourse a good way to start. As long as the tests increase the coverage, those are most welcome. Thanks!
msg140860 - (view) Author: Petter Haggholm (haggholm) Date: 2011-07-22 04:48
Added suggested changes from review, removed (rather useless) repr test; left parse.py changes alone (see review comments for rationale)
msg140950 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-23 09:37
Peter’s patch now uses iter(thing) instead of len(thing) to assess sequenciness.  I made this comment:
> You can iterate over an iterator (which is not a sequence). Here I
> don’t know if the code talks about sequence because it pre-dates
> iterators or because it really wants sequences.

Peter’s reply:
> Fair point. The *code*, though, doesn't say either, and len() doesn't
> express it any better (IMO) than iter(). It seems to me that the
> check basically just verifies that the presumed sequence can be 
> safely iterated over, before trying to enter the for loop. If so, my
> iter() check seems to encapsulate it cleanly in code rather than
> needing an additional comment.

I think iter may exhaust a non-repeatable iterable, whereas len would give a TypeError and not consume the iterable.  iter(thing) being successful does not guarantee that a second iteration will succeed.

I’ve had a look at the docstring and the reST docs, and they clearly say that sequences are supported, not arbitrary iterables.

> If not -- if the code should enforce "true sequence-ness" -- maybe it
> should use the ABC (as my first patch did)? Which is *correct*,
> though, I'm too much an outsider to judge!

Note that ABCs are not strict, “true” type checks like there is in other languages :)

> (I tried to check the revlog for clues, but the code entered in
> 5a416a6417d3, which was the new urllib package combining several old
> packages. It's been there for a while.)

Log viewing tools usually let you follow the history across renames, or otherwise let you see the history for a file that’s not anymore in the tip revision.  Here it was Lib/urlparse.py.  (Just saying this for your information, it does not matter anymore to check the history to find if the code pre-dates iterators.)
msg140961 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-07-23 10:19
On Sat, Jul 23, 2011 at 09:37:27AM +0000, Éric Araujo wrote:
> I’ve had a look at the docstring and the reST docs, and they clearly
> say that sequences are supported, not arbitrary iterables.

Yeah. At the first cut, when I saw the suggestion of iter(), I thought
it was better, but looking at it again, we just need to test the
sequence-ness. I would leave it as such with len() unless we come up
with a case that it does not test the sequenceness completely.

As Eric points out, changing it to iter may cause some side-effects
(of exhaustion of the container)
msg140963 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-07-23 10:43
New changeset e171db785c37 by Senthil Kumaran in branch '3.2':
Fix closes issue12581 - Increase the urllib.parse test coverage. Patch by Petter Haggholm.
http://hg.python.org/cpython/rev/e171db785c37

New changeset fcccda3c546f by Senthil Kumaran in branch 'default':
merge from 3.2 - Fix closes issue12581 - Increase the urllib.parse test coverage. Patch by Petter Haggholm.
http://hg.python.org/cpython/rev/fcccda3c546f

New changeset ed79da800b4a by Senthil Kumaran in branch '2.7':
merge from 3.2 - Fix closes issue12581 - Increase the urllib.parse test coverage (cases applicable to 2.7). Patch by Petter Haggholm.
http://hg.python.org/cpython/rev/ed79da800b4a
msg140967 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-07-23 10:46
Thanks a lot for the patch, Petter Haggholm.

I was initially hesitant to have separate tests for functions/methods
which are helper functions and not necessarily have documented api.
My thought was that those should be tested as part of the public api.

But once I tried coverage report on the module and saw that
unexercised parts, I thought your current patch does indeed a lot of
value. Aiming for a 100% test coverage of module is always a joy!. :)
msg140988 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-23 14:53
Exhaustion of the iterator is easily solved by simply retaining a reference to it and iterating that (which is what I had in mind).  However, I had not thought about the problem of an *in*exhaustable iterator, and to cover that case len is indeed better.
History
Date User Action Args
2022-04-11 14:57:19adminsetgithub: 56790
2011-07-23 14:53:40r.david.murraysetmessages: + msg140988
2011-07-23 10:46:55orsenthilsetmessages: + msg140967
2011-07-23 10:43:11python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg140963

resolution: fixed
stage: resolved
2011-07-23 10:19:54orsenthilsetmessages: + msg140961
2011-07-23 09:37:27eric.araujosetmessages: + msg140950
2011-07-22 04:48:26haggholmsetfiles: + urlparsetest.patch

messages: + msg140860
2011-07-21 19:39:45ezio.melottisetnosy: + ezio.melotti
2011-07-20 15:44:32eric.araujosetnosy: + eric.araujo

versions: + Python 2.7, Python 3.2
2011-07-19 07:21:42orsenthilsetassignee: orsenthil
messages: + msg140652
2011-07-19 06:21:09haggholmsetfiles: + urlparsetest.patch

messages: + msg140651
2011-07-18 13:28:58r.david.murraysetnosy: + orsenthil
2011-07-18 13:28:39r.david.murraysetnosy: + r.david.murray
messages: + msg140575
2011-07-18 05:46:28haggholmcreate