xml.sax.saxutils.prepare_input_source ignores character stream in InputSource #45824

Closed
ygale mannequin opened this issue Nov 21, 2007 · 7 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@ygale
Mannequin

ygale mannequin commented Nov 21, 2007

BPO 1483
Nosy @terryjreedy, @serhiy-storchaka
Dependencies
  • bpo-17089: Expat parser parses strings only when XML encoding is UTF-8
Files
  • test_prepare_input_source.py: Almost full set of tests for prepare_input_source()

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-04-02.18:09:58.593>
    created_at = <Date 2007-11-21.14:03:15.714>
    labels = ['type-bug', 'library']
    title = 'xml.sax.saxutils.prepare_input_source ignores character stream in InputSource'
    updated_at = <Date 2015-04-02.18:09:58.592>
    user = 'https://bugs.python.org/ygale'

    bugs.python.org fields:

    activity = <Date 2015-04-02.18:09:58.592>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-04-02.18:09:58.593>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2007-11-21.14:03:15.714>
    creator = 'ygale'
    dependencies = ['17089']
    files = ['9536']
    hgrepos = []
    issue_num = 1483
    keywords = []
    message_count = 7.0
    messages = ['57737', '57749', '62808', '62902', '107425', '116791', '239938']
    nosy_count = 3.0
    nosy_names = ['terry.reedy', 'ygale', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue1483'
    versions = ['Python 3.1', 'Python 2.7', 'Python 3.2']

    @ygale
    Mannequin Author

    ygale mannequin commented Nov 21, 2007

    In the documentation for xml.sax.xmlreader.InputSource objects
    (section 8.12.4 of the Library Reference) we find that
    users of InputSource objects should use the following
    sequence to get their input data:

    1. If the InputSource has a character stream, use that.
    2. Otherwise, if the InputSource has a byte stream, use that.
    3. Otherwise, open a URI connection to the system ID.

    prepare_input_source() skips step 1.
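
The three-step order above can be sketched in a few lines. This is only an illustration written for Python 3: `pick_input` is a hypothetical helper, not part of `xml.sax`, and `example.xml` is a made-up system ID. Only the `InputSource` getters and setters are real API.

```python
import io
from xml.sax.xmlreader import InputSource


def pick_input(source):
    """Return (kind, value) following the documented lookup order.

    Hypothetical helper for illustration; not part of xml.sax.
    """
    char_stream = source.getCharacterStream()
    if char_stream is not None:
        return "character stream", char_stream   # step 1
    byte_stream = source.getByteStream()
    if byte_stream is not None:
        return "byte stream", byte_stream        # step 2
    # Step 3: nothing attached, so the caller must open the system ID.
    return "system id", source.getSystemId()


src = InputSource("example.xml")  # made-up system ID
src.setByteStream(io.BytesIO(b"<doc/>"))
src.setCharacterStream(io.StringIO("<doc/>"))
kind, _ = pick_input(src)  # the character stream wins over the byte stream
```

A source with both streams set resolves to the character stream, which is the case the report says prepare_input_source() does not honor.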

    This is a one-line fix in Lib/xml/sax/saxutils.py:

    - if source.getByteStream() is None:
    + if source.getCharacterStream is None and source.getByteStream() is None:

    @ygale ygale mannequin added the stdlib Python modules in the Lib dir label Nov 21, 2007
    @ygale
    Mannequin Author

    ygale mannequin commented Nov 22, 2007

    Oops, obvious typo, sorry:

    - if source.getByteStream() is None:
    + if source.getCharacterStream() is None and source.getByteStream() is None:
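
The missing parentheses in the first version of the patch matter: without the call, `source.getCharacterStream` is a bound method object, which is never `None`, so the whole condition is always false and the system ID would never be opened. A small Python 3 sketch of the difference (the variable names are illustrative only):

```python
from xml.sax.xmlreader import InputSource

source = InputSource()  # no character stream and no byte stream attached

# First patch (typo): compares the bound method itself to None.
typo_condition = (source.getCharacterStream is None
                  and source.getByteStream() is None)

# Corrected patch: calls the method and compares its return value.
fixed_condition = (source.getCharacterStream() is None
                   and source.getByteStream() is None)
```

Here `typo_condition` is false even though no stream is set, while `fixed_condition` is true.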

    @akuchling
    Member

    Could you please provide a simple little test case for the bug? I'd
    like to add a test when I commit the change, but you can probably boil
    the problem down into a test faster than I can.

    @ygale
    Mannequin Author

    ygale mannequin commented Feb 24, 2008

    Sure. Here is a simple test case:

      def testUseCharacterStream(self):
        '''If the source is an InputSource with a character stream, use it.'''
        src = xml.sax.xmlreader.InputSource(temp_file_name)
        src.setCharacterStream(StringIO.StringIO(u"foo"))
        prep = xml.sax.saxutils.prepare_input_source(src)
        self.failIf(prep.getCharacterStream() is None, "ignored character stream")

    If "temp_file_name" is omitted, you'll get an
    AttributeError, and if you put it in but the
    file doesn't exist, you'll get an IOError.
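
For readers on Python 3, a rough port of the test above might look like the following. It is a sketch, not the test that was committed: the class name is made up, and it assumes an interpreter in which prepare_input_source() already checks the character stream, so no system ID (and no temporary file) is needed.

```python
import io
import unittest
from xml.sax.saxutils import prepare_input_source
from xml.sax.xmlreader import InputSource


class PrepareInputSourceTest(unittest.TestCase):  # hypothetical class name
    def test_use_character_stream(self):
        """If the source is an InputSource with a character stream, use it."""
        stream = io.StringIO("<doc/>")
        src = InputSource()  # deliberately no system ID
        src.setCharacterStream(stream)
        prep = prepare_input_source(src)
        # The character stream must survive untouched ...
        self.assertIs(prep.getCharacterStream(), stream)
        # ... and no byte stream should have been opened in its place.
        self.assertIsNone(prep.getByteStream())
```

It can be run with `python -m unittest` from the directory containing the file.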

    I'm attaching an almost full set of tests.
    It omits the case of a URL. You can easily
    put that in if you have a handy function that
    converts a file path to a file URL, with all
    the fidgety stuff you need for Windows. (Does that
    already exist somewhere?)
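
On Python 3, one option for the path-to-URL conversion is `pathlib.Path.as_uri()`, which handles the platform-specific details for absolute paths. A minimal sketch, using a temporary file purely as a stand-in for the test fixture:

```python
import tempfile
from pathlib import Path

with tempfile.NamedTemporaryFile(suffix=".xml") as f:
    # as_uri() requires an absolute path, hence resolve().
    file_url = Path(f.name).resolve().as_uri()
```

The resulting `file_url` could then be used as the system ID of an InputSource to cover the URL case.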

    Unfortunately, I now see that the problem
    is a bit deeper than this. There are two more
    related bugs that need to be fixed before
    this really works.

    See bpo-2174 and bpo-2175.

    @jafo jafo mannequin assigned akuchling Mar 20, 2008
    @terryjreedy
    Member

    Are this and the other issues still problems in 2.7 (rc out now) and 3.1?

    @BreamoreBoy
    Mannequin

    BreamoreBoy mannequin commented Sep 18, 2010

    There's a one-line patch in msg57749 and some unit tests are attached, so would a committer please take a look? Also note that bpo-2174 and bpo-2175 are related.

    @BreamoreBoy BreamoreBoy mannequin added the type-bug An unexpected behavior, bug, or error label Sep 18, 2010
    @akuchling akuchling removed their assignment Nov 12, 2010
    @serhiy-storchaka serhiy-storchaka self-assigned this Jan 16, 2013
    @serhiy-storchaka
    Member

    Fixed in bpo-2175.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022