
classification
Title: Problem with urllib and urllib2 in urlopen?
Type: behavior
Stage:
Components: Extension Modules
Versions: Python 2.5

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: BitTorment, ambarish, benjamin.peterson, orsenthil
Priority: normal
Keywords:

Created on 2008-05-15 17:34 by ambarish, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg66872 - Author: Ambarish Malpani (ambarish) Date: 2008-05-15 17:34
I have the following code:

import urllib
u = 'http://www.mercurynews.com/ci_9216417'
h = urllib.urlopen(u).read()
print h
# Get an empty string
#(can use urllib2 also - get the same behavior)

If I visit the same page with my browser, I get the contents of the page
(after some redirects...)
msg66881 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-05-15 21:45
This is what happens on the trunk:

>>> import urllib
>>> u = 'http://www.mercurynews.com/ci_9216417'
>>> h = urllib.urlopen(u).read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/temp/python/trunk/Lib/ssl.py", line 333, in read
    data = self._sslobj.read(recv_size)
ssl.SSLError: [Errno 8] _ssl.c:1276: EOF occurred in violation of protocol
msg66888 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2008-05-16 02:36
Here is my analysis:

>>> import urllib2
>>> url = "http://www.mercurynews.com/ci_9216417"
>>> content = urllib2.urlopen(url).read()
>>> print content

>>> opened = urllib2.urlopen(url)
>>> print opened.geturl()
https://secure.passport.mnginteractive.com/mngi/servletDispatch/ErightsPassportServlet.dyn?url=http://www.mercurynews.com/ci_9216417?nclick_check=1&forced=true

# This redirection is an HTTP 302.
# The browser is "unable" to load the redirect target when it is opened directly:
https://secure.passport.mnginteractive.com/mngi/servletDispatch/ErightsPassportServlet.dyn?url=http://www.mercurynews.com/ci_9216417?nclick_check=1&forced=true
# So it is logical that urllib/urllib2 returns an empty body when reading this site.

I wouldn't entirely say this is urllib2's fault before understanding how
Firefox handles the redirection from the first site.
1) Open the URL given in the Location header of the 302 response directly;
you will see the same behaviour as when following the 302.

It seems to be more of an issue at the server end; we need to know how
Firefox handles it in the first place.
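
To make the redirect analysis above easier to repeat, here is a minimal sketch that records every redirect hop and then reads the final body. It is written for Python 3, where urllib.request replaces urllib2, and it is not the code used in this report. The names RecordingRedirectHandler, DemoSite, fetch_with_trace and demo are hypothetical, and the local DemoSite server is only an assumed stand-in for a site that answers with a 302 and then an empty page.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical helper: remember every redirect hop the opener follows."""

    def __init__(self):
        super().__init__()
        self.hops = []  # list of (status_code, absolute_new_url)

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Record the hop, then defer to the standard redirect logic.
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


class DemoSite(BaseHTTPRequestHandler):
    """Assumed stand-in server: /start answers 302, /final answers 200 with no body."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
        else:
            self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_trace(url):
    """Return (hops, final_url, body) for url."""
    recorder = RecordingRedirectHandler()
    # ProxyHandler({}) disables proxies so the local demo server is reached directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}), recorder)
    with opener.open(url, timeout=5) as response:
        return recorder.hops, response.geturl(), response.read()


def demo():
    """Run the stand-in server on a free local port and fetch /start from it."""
    server = HTTPServer(("127.0.0.1", 0), DemoSite)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return base, fetch_with_trace(base + "/start")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, (hops, final_url, body) = demo()
    print(hops, final_url, repr(body))
```

An empty body together with a recorded 302 hop suggests the library followed the redirect correctly and the final page itself returned nothing, which matches the reading that the problem is on the server side.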
msg67003 - Author: Martin McNickle (BitTorment) Date: 2008-05-17 15:59
I verified the behaviour but this is a problem with that particular
site, not with urllib/urllib2.

Should be closed.
History
Date                 User               Action  Args
2022-04-11 14:56:34  admin              set     github: 47117
2008-05-17 16:50:13  benjamin.peterson  set     status: open -> closed; resolution: not a bug
2008-05-17 15:59:37  BitTorment         set     nosy: + BitTorment; messages: + msg67003
2008-05-16 02:36:50  orsenthil          set     nosy: + orsenthil; messages: + msg66888
2008-05-15 21:45:24  benjamin.peterson  set     nosy: + benjamin.peterson; messages: + msg66881
2008-05-15 17:34:25  ambarish           create