Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

open() shouldn't silently ignore buffering=1 in binary mode #76417

Closed
izbyshev mannequin opened this issue Dec 6, 2017 · 10 comments
Closed

open() shouldn't silently ignore buffering=1 in binary mode #76417

izbyshev mannequin opened this issue Dec 6, 2017 · 10 comments
Labels
3.8 only security fixes topic-IO type-bug An unexpected behavior, bug, or error

Comments

@izbyshev
Copy link
Mannequin

izbyshev mannequin commented Dec 6, 2017

BPO 32236
Nosy @gpshead, @pitrou, @vstinner, @benjaminp, @izbyshev
PRs
  • bpo-32236: Issue RuntimeWarning if buffering=1 for open() in binary mode #4842
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-10-20.00:24:35.987>
    created_at = <Date 2017-12-06.23:46:49.278>
    labels = ['type-bug', '3.8', 'expert-IO']
    title = "open() shouldn't silently ignore buffering=1 in binary mode"
    updated_at = <Date 2018-10-20.00:24:35.935>
    user = 'https://github.com/izbyshev'

    bugs.python.org fields:

    activity = <Date 2018-10-20.00:24:35.935>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-10-20.00:24:35.987>
    closer = 'vstinner'
    components = ['IO']
    creation = <Date 2017-12-06.23:46:49.278>
    creator = 'izbyshev'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32236
    keywords = ['patch']
    message_count = 10.0
    messages = ['307775', '307968', '308000', '308227', '308571', '310672', '324962', '325088', '328098', '328100']
    nosy_count = 6.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'vstinner', 'benjamin.peterson', 'stutzbach', 'izbyshev']
    pr_nums = ['4842']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue32236'
    versions = ['Python 3.8']

    @izbyshev
    Copy link
    Mannequin Author

    izbyshev mannequin commented Dec 6, 2017

    The fact that "buffering=1" is usable only in text mode is documented for open(). In binary mode, setting buffering to 1 is silently ignored and equivalent to using default buffer size. I argue that such behavior is:

    1. Inconsistent
      For example, attempts to use buffering=0 in text mode or to specify encoding in binary mode raise ValueError.

    2. Dangerous
      Consider the following code fragment running on *nix (inspired by real-world code):

    with open("fifo", "wb", buffering=1) as out:
        for line in lines:
            out.write(line)

    "fifo" refers to a FIFO (named pipe). For each line, len(line) <= PIPE_BUF. Because of line buffering, such code must be able to safely assume that each write() is atomic, so that even in case of multiple writers no line read by a FIFO reader will ever get intermixed with another line. And it's so in Python 2. After migration to Python 3 (assuming that line is now bytes), this code still works on Linux because of two factors:
    a) PIPE_BUF is 4096, and so is the default IO buffer size (usually)
    b) When a write() is going to overflow the buffer, BufferedWriter first flushes and then processes the new write() (based on my experiment). So, each OS write() is called with complete lines only and is atomic per (a).

    But PIPE_BUF is 512 on Mac OS X (I don't know about default buffer size). So, we are likely to have a subtle 2-to-3 migration issue.

    I suggest to raise a ValueError if buffering=1 is used for binary mode. Note that bpo-10344 (msg244354) and bpo-21332 would have been uncovered earlier if it was done from the beginning.

    @izbyshev izbyshev mannequin added 3.7 (EOL) end of life topic-IO type-bug An unexpected behavior, bug, or error labels Dec 6, 2017
    @pitrou
    Copy link
    Member

    pitrou commented Dec 10, 2017

    I suggest to raise a ValueError if buffering=1 is used for binary mode.

    Either that, or we instead accept buffering=1 as a regular buffer size. But since buffering=1 means something else for text mode, maybe you're right that it's better to raise in binary mode, to avoid any possible confusion.

    @izbyshev
    Copy link
    Mannequin Author

    izbyshev mannequin commented Dec 10, 2017

    I'm in favor of raising an exception because it'll expose existing code with incorrect assumptions. I'll check whether it breaks any tests and submit a PR.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 13, 2017

    After looking at the PR, I think it would be a bit too strong to raise an error. Perhaps emit a warning instead?

    @izbyshev
    Copy link
    Mannequin Author

    izbyshev mannequin commented Dec 18, 2017

    I had similar thoughts when I was fixing tests that broke due to ValueError. I've updated the PR to issue a RuntimeWarning instead.

    @izbyshev
    Copy link
    Mannequin Author

    izbyshev mannequin commented Jan 25, 2018

    Any feedback on the updated PR?

    @gpshead
    Copy link
    Member

    gpshead commented Sep 10, 2018

    My problem with a warning is the standard one: People who see a warning are often end users of python applications (who don't even have to know what Python is, let alone know anything about the code). For that reason, never add a warning to a stable branch - only new releases (meaning 3.8 here).

    Given that this isn't not a deprecation of meaningful API behavior but is highlighting questionably undefined nonsense behavior, users complaining upon obtaining 3.8 should ultimately reach library and application developers who use the API wrong to update their call sites to explicitly ask for what they intended instead of being ambiguious.

    FYI - the subprocess.py related changes in your PR are correct.

    @gpshead gpshead added 3.8 only security fixes and removed 3.7 (EOL) end of life labels Sep 10, 2018
    @izbyshev
    Copy link
    Mannequin Author

    izbyshev mannequin commented Sep 11, 2018

    Thank you, Gregory. I didn't intend to add the warning to stable branches -- it's just that 3.7 hasn't been released yet when this report was submitted.

    @vstinner
    Copy link
    Member

    New changeset a267056 by Victor Stinner (Alexey Izbyshev) in branch 'master':
    bpo-32236: open() emits RuntimeWarning if buffering=1 for binary mode (GH-4842)
    a267056

    @vstinner
    Copy link
    Member

    I don't think that it would be a good idea to start emitting a new warning in a minor release like the future Python 3.7.2, so I suggest to not backport the change.

    I close the issue.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes topic-IO type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants