shlex not posix compliant when parsing "foo#bar" #51860

Open
jjdmol2 mannequin opened this issue Dec 31, 2009 · 7 comments
Labels
3.9 (only security fixes), 3.10 (only security fixes), stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@jjdmol2
Mannequin

jjdmol2 mannequin commented Dec 31, 2009

BPO 7611
Nosy @terryjreedy, @merwok, @meadori
Files
  • lexer_test.py: test to show shlex behaviour
  • shlex_posix.diff
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2009-12-31.08:42:42.098>
    labels = ['type-feature', 'library', '3.9', '3.10']
    title = 'shlex not posix compliant when parsing "foo#bar"'
    updated_at = <Date 2020-11-11.17:08:24.800>
    user = 'https://bugs.python.org/jjdmol2'

    bugs.python.org fields:

    activity = <Date 2020-11-11.17:08:24.800>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2009-12-31.08:42:42.098>
    creator = 'jjdmol2'
    dependencies = []
    files = ['15709', '15718']
    hgrepos = []
    issue_num = 7611
    keywords = ['patch']
    message_count = 7.0
    messages = ['97081', '97082', '97125', '112740', '148270', '148292', '148456']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'ferringb', 'eric.araujo', 'meador.inge', 'jjdmol2', 'cadf']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue7611'
    versions = ['Python 3.9', 'Python 3.10']

    @jjdmol2
    Mannequin Author

    jjdmol2 mannequin commented Dec 31, 2009

    The shlex parser parses "foo#bar" as "foo", discarding the rest as a
    comment. This is actually one of the test cases, even in POSIX mode.

    However, POSIX (see below) only allows comments to start at the
    beginning of a token, so "foo#bar" has to result in a "foo#bar" token.
    To easily see this, do "echo foo#bar" in bash, versus "echo foo #bar".

    Fixing this might break some applications that rely on this broken
    behaviour, even though they're not strictly POSIX compliant.

    POSIX 2008, Rationale C.2.3 (which refers to Shell & Utilities 2.3(10)):

    The (10) rule about '#' as the current character is the first in the
    sequence in which a new token is being assembled. The '#' starts a
    comment only when it is at the beginning of a token. This rule is also
    written to indicate that the search for the end-of-comment does not
    consider escaped <newline> specially, so that a comment cannot be
    continued to the next line.
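The rule quoted above can be approximated in a few lines. This is only a rough sketch, not the attached patch and not how shlex works internally: `strip_posix_comment` and `posix_split` are hypothetical helper names, and the scan ignores shell operators such as `;` or `|`, which in a real shell also end a token.

```python
import shlex


def strip_posix_comment(line):
    """Drop a trailing comment, treating '#' as a comment starter only
    when it is the first character of a token (hypothetical helper)."""
    quote = None            # quote character currently open, if any
    escaped = False         # previous character was a backslash escape
    at_token_start = True   # no character of the current token seen yet
    for i, ch in enumerate(line):
        if escaped:
            # The escaped character is literal and belongs to the token.
            escaped = False
            at_token_start = False
        elif quote == "'":
            # Inside single quotes only the closing quote is special.
            if ch == "'":
                quote = None
        elif quote == '"':
            if ch == "\\":
                escaped = True
            elif ch == '"':
                quote = None
        elif ch == "\\":
            escaped = True
            at_token_start = False
        elif ch in "'\"":
            quote = ch
            at_token_start = False
        elif ch.isspace():
            at_token_start = True
        elif ch == "#" and at_token_start:
            return line[:i]
        else:
            at_token_start = False
    return line


def posix_split(line):
    """Tokenize with shlex after removing the comment ourselves
    (shlex.split leaves comment handling off by default)."""
    return shlex.split(strip_posix_comment(line))


for text in ["foo#bar", "foo #bar", "a '#' # c"]:
    print("%s -> %s" % (text, posix_split(text)))
```

Under these assumptions "foo#bar" stays a single token, while "foo #bar" loses the comment, which matches what `echo` shows in bash.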

    jjdmol2 (mannequin) added the labels type-bug (An unexpected behavior, bug, or error) and stdlib (Python modules in the Lib dir) on Dec 31, 2009
    @jjdmol2
    Mannequin Author

    jjdmol2 mannequin commented Dec 31, 2009

    Attached a program which shows the relevant behaviour:

    import shlex

    tests = ["foo#bar", "foo #bar"]

    for t in tests:
        # Tokenize in POSIX mode; '#' is in shlex's default set of comment characters.
        print("%s -> %s" % (t, list(shlex.shlex(t, posix=True))))

    results in

    $ python lexer_test.py
    foo#bar -> ['foo']
    foo #bar -> ['foo']

    (expected of course is ['foo#bar'] on the first line).
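For comparison, here is a small sketch of what `shlex.split` returns for the same two inputs with comment handling on and off. It only uses the documented `comments` argument; the `samples` and `results` names are just for illustration. Turning comments off keeps "foo#bar" intact, but it also keeps "#bar" as an ordinary token in the second case, so it is not a substitute for the POSIX rule.

```python
import shlex

samples = ["foo#bar", "foo #bar"]
results = {}

for text in samples:
    # comments=True enables '#' handling; the default (False) disables it.
    with_comments = shlex.split(text, comments=True)
    without_comments = shlex.split(text)
    results[text] = (with_comments, without_comments)
    print("%s: comments=True -> %s, comments=False -> %s"
          % (text, with_comments, without_comments))
```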

    @cadf
    Mannequin

    cadf mannequin commented Jan 2, 2010

    Here's a patch addressing the behavior described.

    @terryjreedy
    Member

    Given that test_shlex.py tests for the current behavior, it is hard to call this a bug in the tracker sense of the term. I would only change this in a new version.

    The manual just says "When operating in POSIX mode, shlex will try to be as close as possible to the POSIX shell parsing rules." but gives no reference to which authority it is following or what the rules are in either case. Manual section 23.2.2. Parsing Rules only discusses the differences between posix and non-posix rules, not the common rules.

    I suspect this module was written well over a decade ago, maybe closer to two. Is it possible that earlier versions were different on this issue? Or is the 2008 version only cosmetically different from some 1990s version?

    terryjreedy added the label type-feature (A feature request or enhancement) and removed the label type-bug (An unexpected behavior, bug, or error) on Aug 4, 2010
    @merwok
    Member

    merwok commented Nov 24, 2011

    > The manual just says "When operating in POSIX mode, shlex will try to be as close as
    > possible to the POSIX shell parsing rules." but gives no reference to which authority it is
    > following or what the rules are in either case.

    I think it actually does: The POSIX specification defines the behavior of a compliant /bin/sh shell.

    See also bpo-1521950.

    @terryjreedy
    Member

    The doc section has no reference, as in a live web link, to any version of the POSIX specification. This is unlike other doc sections that implement various RFCs (which also get updated). The docs also link to specific references for the Unicode version supported, which has changed from version to version.

    The OP quotes (without giving a link) from the 2008 version. POSIX and shlex are much older than that, implying that shlex might conform to an earlier version, just as other modules implement older RFCs that have been superseded.

    @meadori
    Member

    meadori commented Nov 27, 2011

    iritkatriel added the labels 3.9 (only security fixes) and 3.10 (only security fixes) on Nov 11, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022