Author sourcejedi
Recipients sourcejedi
Date 2016-04-24.15:58:51
Message-id <1461513531.9.0.223983001389.issue26838@psf.upfronthosting.co.za>
In-reply-to
Content
python3-3.4.3-5.fc23-x86_64

This is as far as I have dug, starting from <https://github.com/kurtmckee/feedparser/issues/30>.  I experimented with using setCharacterStream() instead of setByteStream().

setCharacterStream() is documented, but using it fails.

>>> help(InputSource)
 |  setCharacterStream(self, charfile)
 |      Set the character stream for this input source. (The stream
 |      must be a Python 2.0 Unicode-wrapped file-like that performs
 |      conversion to Unicode strings.)
 |      
 |      If there is a character stream specified, the SAX parser will
 |      ignore any byte stream and will not attempt to open a URI
 |      connection to the system identifier.
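
A minimal sketch of the setup described above, assuming only the standard library (no feedparser or drv_libxml2). The helper name try_prepare is hypothetical and exists only for illustration. It reports whether prepare_input_source() accepts an InputSource that carries just a character stream; the outcome depends on the Python version in use.

```python
import io
from xml.sax import saxutils, xmlreader


def try_prepare(xml_text):
    """Hypothetical helper: build an InputSource that has only a
    character stream and pass it to prepare_input_source()."""
    source = xmlreader.InputSource()
    # Only the character stream is set; byte stream and system id stay None.
    source.setCharacterStream(io.StringIO(xml_text))
    try:
        saxutils.prepare_input_source(source)
    except (AttributeError, TypeError) as exc:
        # Raised when the function tries to resolve a system id of None.
        return "failed: " + type(exc).__name__
    return "ok"


print(try_prepare("<feed/>"))
```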

Actually using an InputSource set up this way errors out as follows:

  File "/home/alan/.local/lib/python3.4/site-packages/feedparser-5.2.1-py3.4.egg/feedparser/api.py", line 236, in parse
  File "/usr/lib64/python3.4/site-packages/drv_libxml2.py", line 146, in parse
    source = saxutils.prepare_input_source(source)
  File "/usr/lib64/python3.4/xml/sax/saxutils.py", line 355, in prepare_input_source
    sysidfilename = os.path.join(basehead, sysid)
  File "/usr/lib64/python3.4/posixpath.py", line 79, in join
    if b.startswith(sep):
AttributeError: 'NoneType' object has no attribute 'startswith'

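This happens because prepare_input_source() never consults the character stream. It only checks getByteStream(), so with no byte stream set it falls through to resolving the system id, which is None here: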

def prepare_input_source(source, base=""):
    """This function takes an InputSource and an optional base URL and
    returns a fully resolved InputSource object ready for reading."""

    if isinstance(source, str):
        source = xmlreader.InputSource(source)
    elif hasattr(source, "read"):
        f = source
        source = xmlreader.InputSource()
        source.setByteStream(f)
        if hasattr(f, "name") and isinstance(f.name, str):
            source.setSystemId(f.name)

    if source.getByteStream() is None:
        sysid = source.getSystemId()
        basehead = os.path.dirname(os.path.normpath(base))
        sysidfilename = os.path.join(basehead, sysid)
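
One possible caller-side workaround, sketched under the assumption that an InputSource with a character stream needs no further resolution. The wrapper name prepare_input_source_checked is hypothetical; it is not part of xml.sax and is not a proposed patch, only an illustration of the missing check.

```python
import io
from xml.sax import saxutils, xmlreader


def prepare_input_source_checked(source, base=""):
    """Hypothetical wrapper: return the source untouched when it already
    has a character stream, otherwise defer to the stdlib function."""
    if (isinstance(source, xmlreader.InputSource)
            and source.getCharacterStream() is not None):
        # Assumption: a character stream is enough for the parser to read from.
        return source
    return saxutils.prepare_input_source(source, base)


source = xmlreader.InputSource()
source.setCharacterStream(io.StringIO("<feed/>"))
prepared = prepare_input_source_checked(source)
```

This only avoids the failing system id lookup. Whether a given SAX driver then actually reads from the character stream is a separate question.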
History
Date                 User        Action  Args
2016-04-24 15:58:51  sourcejedi  set     recipients: + sourcejedi
2016-04-24 15:58:51  sourcejedi  set     messageid: <1461513531.9.0.223983001389.issue26838@psf.upfronthosting.co.za>
2016-04-24 15:58:51  sourcejedi  link    issue26838 messages
2016-04-24 15:58:51  sourcejedi  create