Message 57705 - Python tracker


Author	vincentk
Recipients	dalke, ijmorlan, jjlee, paulj, skip.montanaro, vincentk
Date	2007-11-20.18:57:04
Message-id	<1195585025.36.0.234264397575.issue1462525@psf.upfronthosting.co.za>

Content
Much like urlparse, uriparse does not fail on input that does not
represent a valid URI, or at least not early or reliably enough.
Specifically, I noticed that urisplit does not fail on input strings
with a missing scheme, such as "foo.com/bar".
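
To illustrate the behaviour, here is a minimal sketch using Python 3's urllib.parse (the successor to the urlparse module). It only demonstrates the standard-library side of the comparison; uriparse itself is not used here, and the assumption is that its urisplit behaves analogously for this input.

```python
from urllib.parse import urlsplit

# A string with no scheme is accepted silently instead of being rejected.
parts = urlsplit("foo.com/bar")

# Both scheme and netloc come back empty; the whole input lands in the path.
print(repr(parts.scheme))
print(repr(parts.netloc))
print(repr(parts.path))
```

No exception is raised, so a caller has to inspect the result to notice that the input was not a complete URI.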

I see no (straightforward) solution to this problem, short of using a
proper parser library such as Haskell's Parsec (I unfortunately know of
no Python equivalent), but I thought I should report the issue
nevertheless.

The following might work as a quick fix. Replace

    regex.match(foo,bar).groups()

with something like:

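    import re

    mm = re.match(regex, uri)
    # re.match returns None on failure, so test that before using the match;
    # it anchors at position 0, so comparing end() to len(uri) suffices
    if mm is None or mm.end() != len(uri):
        raise ValueError("uri regex did not match complete input")

    p = mm.groups()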
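
The quick fix above can be sketched as a self-contained function. This is only an illustration under stated assumptions: the pattern is the generic one from RFC 3986, Appendix B, standing in for whatever regex uriparse actually uses, and strict_urisplit is a hypothetical name. That pattern is permissive enough to match "foo.com/bar" in full, so the length check alone would probably not catch the missing scheme; the sketch therefore adds an explicit scheme check as well.

```python
import re

# Assumption: the generic URI regex from RFC 3986, Appendix B,
# used here in place of uriparse's own pattern.
URI_REGEX = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def strict_urisplit(uri):
    """Hypothetical helper: split a URI, raising ValueError on bad input."""
    mm = URI_REGEX.match(uri)
    # Reject input the regex did not consume completely.
    if mm is None or mm.end() != len(uri):
        raise ValueError("uri regex did not match complete input")
    scheme, authority, path, query, fragment = mm.group(2, 4, 5, 7, 9)
    # The pattern makes the scheme optional, so check for it explicitly.
    if scheme is None:
        raise ValueError("missing scheme in %r" % (uri,))
    return scheme, authority, path, query, fragment
```

With this sketch, strict_urisplit("http://foo.com/bar") returns its five components, while strict_urisplit("foo.com/bar") raises ValueError. Whether a missing scheme should be an error at all (relative references are legal in RFC 3986) is a design choice left to the caller.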
