Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document that the null character '\0' terminates a struct format spec #79895

Closed
mr-nfamous mannequin opened this issue Jan 10, 2019 · 12 comments
Closed

Document that the null character '\0' terminates a struct format spec #79895

mr-nfamous mannequin opened this issue Jan 10, 2019 · 12 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mr-nfamous
Copy link
Mannequin

mr-nfamous mannequin commented Jan 10, 2019

BPO 35714
Nosy @mdickinson, @stevendaprano, @serhiy-storchaka, @mr-nfamous, @ZackerySpytz, @miss-islington, @iritkatriel
PRs
  • bpo-35714: Reject null characters in struct format strings #16928
  • [3.9] bpo-35714: Reject null characters in struct format strings (GH-16928) #20373
  • [3.8] bpo-35714: Reject null characters in struct format strings (GH-16928) #20374
  • [3.7] bpo-35714: Reject null characters in struct format strings (GH-16928) #20375
  • [3.6] bpo-35714: Reject null characters in struct format strings (GH-16928) #20376
  • [3.8] bpo-35714: Reject null characters in struct format strings (GH-16928) #20419
  • [3.7] [3.8] bpo-35714: Reject null characters in struct format strings (GH-16928) (GH-20419) #20420
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-06-19.09:17:15.207>
    created_at = <Date 2019-01-10.22:42:10.256>
    labels = ['3.7', '3.8', 'type-bug', 'library', 'docs']
    title = "Document that the null character '\\0' terminates a struct format spec"
    updated_at = <Date 2021-06-19.09:17:15.207>
    user = 'https://github.com/mr-nfamous'

    bugs.python.org fields:

    activity = <Date 2021-06-19.09:17:15.207>
    actor = 'mark.dickinson'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2021-06-19.09:17:15.207>
    closer = 'mark.dickinson'
    components = ['Documentation', 'Library (Lib)']
    creation = <Date 2019-01-10.22:42:10.256>
    creator = 'bup'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35714
    keywords = ['patch']
    message_count = 12.0
    messages = ['333424', '333430', '333441', '355407', '355410', '369859', '369950', '369951', '369959', '369961', '396078', '396121']
    nosy_count = 8.0
    nosy_names = ['mark.dickinson', 'steven.daprano', 'docs@python', 'serhiy.storchaka', 'bup', 'ZackerySpytz', 'miss-islington', 'iritkatriel']
    pr_nums = ['16928', '20373', '20374', '20375', '20376', '20419', '20420']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue35714'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

    @mr-nfamous
    Copy link
    Mannequin Author

    mr-nfamous mannequin commented Jan 10, 2019

    ie.:
        >>> from struct import calcsize
        >>> calcsize('\144\u0064\000xf\U00000031000\60d\121\U00000051')
        16

    I'm sure some people think it's obvious or even expect the null character to signal EOF but it probably isn't obvious at all to those without experience in lower level languages. It actually seems like Python goes out of its way to make sure everything treats the null character no more special than the letter "H", which is good.

    At first glance I'd think something like this was just another trivial quirk of the language and not bring it up, but because the documentation doesn't mention it I actually got stuck on something related for half an hour when unit testing some dynamically generated format specs.

    Without going into unnecessary detail, what happened was that a typo in another tangentially related part of the test was enabling the generation of a rogue null byte. I'm bad at those "find face in the crowd" puzzles and this was hardly different, being literally camouflaged within a 300 character format spec containing a random mixture of escaped and non-escaped source characters in the forms: \Uffffffff, \uffff, \777, \xff, \x00, + latin/ascii.

    If I'm not the only one who sees this as a slightly bigger deal than poor documentation, the fix is trivial with an extra call to PyBytes_GET_SIZE when null is found. But just because I can't think of a use case in allowing the null character to precede other characters in the format string doesn't mean there isn't one, which is why only documentation is currently selected.

    @mr-nfamous mr-nfamous mannequin added 3.7 (EOL) end of life 3.8 only security fixes labels Jan 10, 2019
    @mr-nfamous mr-nfamous mannequin assigned docspython Jan 10, 2019
    @mr-nfamous mr-nfamous mannequin added docs Documentation in the Doc dir stdlib Python modules in the Lib dir labels Jan 10, 2019
    @stevendaprano
    Copy link
    Member

    I'm not sure whether having NULLs terminate a struct format string is a feature or a bug.

    Given that nearly every other string in Python treat NULLs as ordinary characters, I'm inclined to say this is a bug. Or at least an unnecessary restriction that ought to be lifted.

    @stevendaprano stevendaprano added the type-bug An unexpected behavior, bug, or error label Jan 11, 2019
    @serhiy-storchaka
    Copy link
    Member

    I think the null character is illegal character in the format string, and struct functions should raise a struct.error for it.

    @mdickinson
    Copy link
    Member

    I agree with Serhiy. Any other unrecognised character would raise an error. The null character should do the same.

    @ZackerySpytz
    Copy link
    Mannequin

    ZackerySpytz mannequin commented Oct 26, 2019

    I've created a patch to reject null characters in the format string.

    @serhiy-storchaka
    Copy link
    Member

    New changeset 3f59b55 by Zackery Spytz in branch 'master':
    bpo-35714: Reject null characters in struct format strings (GH-16928)
    3f59b55

    @miss-islington
    Copy link
    Contributor

    New changeset 5221a10 by Miss Islington (bot) in branch '3.9':
    bpo-35714: Reject null characters in struct format strings (GH-16928)
    5221a10

    @serhiy-storchaka
    Copy link
    Member

    Zackery, do you mind to create a backport to 3.8?

    @serhiy-storchaka
    Copy link
    Member

    New changeset 5ff5edf by Zackery Spytz in branch '3.8':
    [3.8] bpo-35714: Reject null characters in struct format strings (GH-16928) (GH-20419)
    5ff5edf

    @miss-islington
    Copy link
    Contributor

    New changeset 4ea8028 by Miss Islington (bot) in branch '3.7':
    [3.8] bpo-35714: Reject null characters in struct format strings (GH-16928) (GH-20419)
    4ea8028

    @iritkatriel
    Copy link
    Member

    This seems resolved, can it be closed?

    @mdickinson
    Copy link
    Member

    Yes, this looks closeable. Thank you!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants