BOM incorrectly inserted before writing, after seeking in text file #67171

Closed
MarkIngramUK mannequin opened this issue Dec 2, 2014 · 7 comments
Labels
topic-IO type-bug An unexpected behavior, bug, or error

Comments

@MarkIngramUK
Mannequin

MarkIngramUK mannequin commented Dec 2, 2014

BPO 22982
Nosy @amauryfa, @pitrou
Files
  • append-test.py: Test case
  • bom_seek_append.patch
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-04-13.18:05:22.910>
    created_at = <Date 2014-12-02.16:41:42.901>
    labels = ['type-bug', 'expert-IO']
    title = 'BOM incorrectly inserted before writing, after seeking in text file'
    updated_at = <Date 2015-04-13.18:05:22.908>
    user = 'https://bugs.python.org/MarkIngramUK'

    bugs.python.org fields:

    activity = <Date 2015-04-13.18:05:22.908>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-04-13.18:05:22.910>
    closer = 'pitrou'
    components = ['IO']
    creation = <Date 2014-12-02.16:41:42.901>
    creator = 'MarkIngramUK'
    dependencies = []
    files = ['37345', '37378']
    hgrepos = []
    issue_num = 22982
    keywords = ['patch']
    message_count = 7.0
    messages = ['232015', '232025', '232091', '232092', '232263', '240688', '240689']
    nosy_count = 4.0
    nosy_names = ['amaury.forgeotdarc', 'pitrou', 'python-dev', 'MarkIngramUK']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue22982'
    versions = ['Python 3.4', 'Python 3.5']

    @MarkIngramUK
    Mannequin Author

    MarkIngramUK mannequin commented Dec 2, 2014

    If you open a text file for appending and then seek in any way before writing, a BOM is written before your text. See the attached file for an example.

    If you run the test and look at the output file, you'll notice that the UTF-16 BOM is written before each number.

    I'm running a 2014 iMac with Yosemite.
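
The attached append-test.py is not reproduced on this page. The following is a minimal sketch of the kind of test described, under stated assumptions: the helper names `append_after_seeking` and `count_stray_boms`, the file name, and the exact sequence of seeks are hypothetical. It appends digits in separate `open()` calls, seeking before each write, and then counts the BOM characters that remain after the leading one is consumed. On an interpreter without the fix the count would be expected to be non-zero; on one that includes the fix it should be zero.

```python
import io
import os
import tempfile


def append_after_seeking(path, pieces, encoding="utf-16"):
    """Append each piece in its own open() call, seeking before the write."""
    for piece in pieces:
        with open(path, "a", encoding=encoding) as f:
            f.seek(0, io.SEEK_SET)  # any seek is enough, per the report
            f.seek(0, io.SEEK_END)  # return to the end before writing
            f.write(piece)


def count_stray_boms(path, encoding="utf-16"):
    """Count BOM characters left after the decoder consumes the leading one."""
    with open(path, "rb") as f:
        text = f.read().decode(encoding)
    return text.count("\ufeff")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "append-demo.txt")  # hypothetical name
        append_after_seeking(demo_path, [str(n) for n in range(5)])
        print("stray BOMs:", count_stray_boms(demo_path))
```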

    @MarkIngramUK MarkIngramUK mannequin added the topic-IO and type-bug labels Dec 2, 2014
    @amauryfa
    Member

    amauryfa commented Dec 2, 2014

    bpo-5006 was supposed to take care of this, but it has a flaw IMO:
    This statement https://hg.python.org/cpython/file/0744ceb5c0ed/Lib/_pyio.py#l2003 is missing an "and whence!=2".

    @pitrou
    Member

    pitrou commented Dec 3, 2014

    This is a limitation more than a bug. When you seek to the start of the file, the encoder is reset because Python thinks you are going to write there. If you remove the call to file.seek(0, io.SEEK_SET), things work fine.

    @Amaury, whence can only be zero there:
    https://hg.python.org/cpython/file/0744ceb5c0ed/Lib/_pyio.py#l1960
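
The effect of resetting the encoder can be seen at the codec level, independently of any file. This is a sketch only: `encode_chunks` is a hypothetical helper, and how the text layer actually drives the encoder is not shown here. A fresh or reset `utf-16` incremental encoder prefixes its next output with a BOM, while `setstate(0)` marks it as already past the start of the stream.

```python
import codecs


def encode_chunks():
    """Return four encoded chunks showing when the utf-16 encoder emits a BOM."""
    encoder = codecs.getincrementalencoder("utf-16")()
    fresh = encoder.encode("a")           # first output: BOM + "a"
    continued = encoder.encode("b")       # same stream: no BOM
    encoder.reset()                       # back to start-of-stream state
    after_reset = encoder.encode("c")     # BOM is emitted again
    encoder.reset()
    encoder.setstate(0)                   # state 0: BOM already written
    after_setstate = encoder.encode("d")  # no BOM
    return fresh, continued, after_reset, after_setstate


if __name__ == "__main__":
    for chunk in encode_chunks():
        print(chunk.startswith(codecs.BOM_UTF16), chunk)
```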

    @MarkIngramUK
    Mannequin Author

    MarkIngramUK mannequin commented Dec 3, 2014

    It's more than a limitation, because if I call file.seek(0, io.SEEK_END) then the encoder is still reset, and will still write the BOM, even at the end of the file.

    This also means that it's impossible to seek in a text file that you want to append to. I've had to work around this by opening the file as binary, manually writing the BOM, and writing the strings as encoded bytes.
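
The workaround described above might look roughly like the following. This is a sketch under assumptions, not the reporter's actual code: the helper name `append_utf16` and the choice of little-endian byte order are hypothetical. The file is opened in binary append mode, the BOM is written by hand only when the file is empty, and each string is encoded with a BOM-less codec.

```python
import codecs
import os
import tempfile


def append_utf16(path, text):
    """Append text as UTF-16-LE bytes; write the BOM only into an empty file."""
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)  # make the position explicit before tell()
        if f.tell() == 0:
            f.write(codecs.BOM_UTF16_LE)
        # "utf-16-le" never emits a BOM, unlike plain "utf-16".
        f.write(text.encode("utf-16-le"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "workaround-demo.txt")  # hypothetical name
        for n in range(3):
            append_utf16(demo_path, str(n))
        with open(demo_path, encoding="utf-16") as f:
            print(f.read())
```

Because the text layer is bypassed, newline translation and encoding error handling are also left to the caller.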

    @pitrou
    Member

    pitrou commented Dec 7, 2014

    Here is a patch.

    @python-dev
    Mannequin

    python-dev mannequin commented Apr 13, 2015

    New changeset 946740824eaf by Antoine Pitrou in branch '3.4':
    Issue bpo-22982: Improve BOM handling when seeking to multiple positions of a writable text file.
    https://hg.python.org/cpython/rev/946740824eaf

    New changeset 3583e5191b96 by Antoine Pitrou in branch 'default':
    Issue bpo-22982: Improve BOM handling when seeking to multiple positions of a writable text file.
    https://hg.python.org/cpython/rev/3583e5191b96

    @pitrou
    Member

    pitrou commented Apr 13, 2015

    Fix is pushed. Thanks for the report!

    @pitrou pitrou closed this as completed Apr 13, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022