Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mailcap does not respect precedence in the presence of wildcards #59182

Closed
manu-beffara mannequin opened this issue Jun 1, 2012 · 19 comments
Closed

mailcap does not respect precedence in the presence of wildcards #59182

manu-beffara mannequin opened this issue Jun 1, 2012 · 19 comments
Labels
topic-email type-bug An unexpected behavior, bug, or error

Comments

@manu-beffara
Copy link
Mannequin

manu-beffara mannequin commented Jun 1, 2012

BPO 14977
Nosy @warsaw, @bitdancer, @akheron, @michael-lazar
Files
  • mailcap.patch
  • mailcap_v2.patch
  • mailcap_v3.patch
  • mailcap_v4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-09-10.00:10:59.948>
    created_at = <Date 2012-06-01.09:28:52.694>
    labels = ['type-bug', 'expert-email']
    title = 'mailcap does not respect precedence in the presence of wildcards'
    updated_at = <Date 2016-09-10.00:10:59.946>
    user = 'https://bugs.python.org/manu-beffara'

    bugs.python.org fields:

    activity = <Date 2016-09-10.00:10:59.946>
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-09-10.00:10:59.948>
    closer = 'r.david.murray'
    components = ['email']
    creation = <Date 2012-06-01.09:28:52.694>
    creator = 'manu-beffara'
    dependencies = []
    files = ['43682', '43695', '43706', '43710']
    hgrepos = []
    issue_num = 14977
    keywords = ['patch']
    message_count = 19.0
    messages = ['162064', '162070', '168438', '270143', '270145', '270158', '270226', '270243', '270283', '270284', '270299', '270306', '270320', '270322', '270373', '270423', '272543', '275506', '275507']
    nosy_count = 7.0
    nosy_names = ['barry', 'r.david.murray', 'python-dev', 'petri.lehtinen', 'manu-beffara', 'michael-lazar', 'Fabio Alessandro Locati']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14977'
    versions = ['Python 3.5', 'Python 3.6']

    @manu-beffara
    Copy link
    Mannequin Author

    manu-beffara mannequin commented Jun 1, 2012

    According to RFC 1542, the first matching entry in mailcap files should be used for handling a given type. The mailcap module does not respect this rule when wildcards are used in some rules, because the "lookup" function always checks entries for a specific type (say image/jpeg) before checking the generic ones (image/*). As a consequence, if a user overrides system defaults using a generic type, this choice is not satisfied, and the behaviour is inconsistent with other tools using mailcap (including the standard "run-mailcap").

    @manu-beffara manu-beffara mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 1, 2012
    @bitdancer
    Copy link
    Member

    Hmm. It seems to me this is an edge case as far as backward compatibility goes. On the one hand it does seem like an RFC violation (ie: bug), on the other hand there could be user programs depending on the current behavior. I'm inclined toward treating it as a bug, but I can see it being argued the other way. The fact that the file is designed for interoperability and other tools do take the wildcard first argues in favor of treating it as a bug, I think.

    @bitdancer bitdancer added easy topic-email and removed stdlib Python modules in the Lib dir labels Jun 1, 2012
    @akheron
    Copy link
    Member

    akheron commented Aug 17, 2012

    Sounds like a bug to me.

    It's not too straightforward to fix, though. The order of MIME types is lost because they are stored as keys of a dict. AFAICS, it wouldn't help to use OrderedDict and checking for the wildcard type first if its index is smaller. Example:

    image/; foo %s; test=/bin/false
    image/jpeg; bar %s; test=/bin/true
    image/
    ; baz %s; test=/bin/true

    In this case, the image/jpeg entry should be returned, but both image/* entries get grouped together, so they would be evaluated first and the "baz %s" entry would be returned.

    The API expects the result of getcaps() (a dict) to be passed to findmatch(), which makes it very hard to change the caps representation. The only way I can think of would be to change the representation to a dict subclass with extra semantics. Plain dict would still have to be supported for backwards compatibility, though.

    @akheron akheron removed the easy label Aug 17, 2012
    @michael-lazar
    Copy link
    Mannequin

    michael-lazar mannequin commented Jul 11, 2016

    Hello. In my opinion this is a pretty major deficiency. I was trying to add definitions to my ~/.mailcap file (which should take priority over system mailcap files) but they weren't getting applied because of the wildcard bug. This was prohibiting me from using mailcap, so I wrote a patch and submitted it to PyPI. I've never contributed before; what would be the first step towards submitting this to python?

    https://github.com/michael-lazar/mailcap_fix
    https://pypi.python.org/pypi/mailcap-fix/0.1.0

    @bitdancer
    Copy link
    Member

    Submit it as a proposed patch here for review, please. git or hg diff against the default branch, or a pointer to an hg clone with the patch applied.

    @michael-lazar
    Copy link
    Mannequin

    michael-lazar mannequin commented Jul 11, 2016

    Alright thanks, I've submitted a patch

    @michael-lazar
    Copy link
    Mannequin

    michael-lazar mannequin commented Jul 12, 2016

    Submitting an updated patch; simplified the implementation and updated test_mailcap.py

    @bitdancer
    Copy link
    Member

    I think we need to preserve backward compatibility in the readmailcapfile function even though it isn't technically a public API (make lineno a keyword argument and the behavior dependent on its presence). I haven't experimented with the patch yet, but it looks reasonable.

    @michael-lazar
    Copy link
    Mannequin

    michael-lazar mannequin commented Jul 13, 2016

    I can certainly do that. Although in addition to adding a keyword argument, we would also have to change the return signature to switch between modes like this:

        if lineno is None:
            return caps
        else:
            return caps, lineno

    Overall I'm not a fan of this technique and would like to avoid it if possible. The problem is that we have to keep track of the current line between successive calls to readmailcapfile(). An alternative would be to go back to using lineno as a generator. This is close to what I had in the initial patch.

        lineno = itertools.count()
        caps = readmailcapfile(fp, lineno=lineno)
        caps = readmailcapfile(fp2, lineno=lineno)
        caps = readmailcapfile(fp3, lineno=lineno)
        ...etc

    Happy to hear any insights you have on this.

    @bitdancer
    Copy link
    Member

    How about this: rename the existing readmailcapfile as an internal _readmailcapfile with the new signature. Then add a backward compatible readmailcapfile with the existing signature/return value that uses a dummy value for lineno and throws away the lineno on output. The point here is just to avoid breaking programs that are using the existing api, even though it is an internal one. Yes, it is cruft, but backward compatibility sometimes requires cruft. We could deprecate the old api with a message that says it is an internal method and should not be used, and see if anyone complains about it being deprecated.

    @michael-lazar
    Copy link
    Mannequin

    michael-lazar mannequin commented Jul 13, 2016

    That works for me, patch updated to match your suggestion. So to recap, the proposed fix implements the following changes to the api:

    getcaps()
       The returned dict now has an additional `lineno` field. There's a *slim* chance that this could break behavior for somebody who relies on the output for something other than passing it to findmatch().
    
    readmailcapfiles()
        Marked as deprecated, but the behavior will remain the same.
    
    lookup(caps, ...)
        If the caps dict contains `lineno`, the metadata will be used to sort the matching entries. If the caps dict does not contain `lineno`, for example if it was constructed manually, the behavior will remain the same.
    
    findmatch(caps, ...)
        Same as lookup()

    @bitdancer
    Copy link
    Member

    Looks like you didn't attach the new patch.

    @michael-lazar
    Copy link
    Mannequin

    michael-lazar mannequin commented Jul 13, 2016

    Whoops

    @bitdancer
    Copy link
    Member

    There's actually programmatic way to generate a deprecation warning when the function is called (and then we'd want a test to make sure it gets generated). if you don't feel like working through that hopefully someone else will pick it up. If you want to work on it, you can find examples by grepping for DeprecationWarning.

    @michael-lazar
    Copy link
    Mannequin

    michael-lazar mannequin commented Jul 14, 2016

    Got it, I found some examples and it didn't look too complicated so I took a shot at it. Is there anything else that needs to be added? Does the documentation need to be updated at all?

    btw, thanks for the rapid iteration. I appreciate you taking the time to help guide me though this :)

    @bitdancer
    Copy link
    Member

    I think a doc update is probably worthwhile, but I haven't looked at the docs so I'm not sure.

    I'll try to give this a complete review this weekend, though I'm a bit busy so it may not happen.

    @FabioAlessandroLocati
    Copy link
    Mannequin

    FabioAlessandroLocati mannequin commented Aug 12, 2016

    Any news on this?

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 10, 2016

    New changeset f1bf0abcca0c by R David Murray in branch '3.5':
    bpo-14977: Make mailcap respect the order of the lines in the mailcap file.
    https://hg.python.org/cpython/rev/f1bf0abcca0c

    New changeset efd692c86429 by R David Murray in branch 'default':
    Merge: bpo-14977: Make mailcap respect the order of the lines in the mailcap file.
    https://hg.python.org/cpython/rev/efd692c86429

    @bitdancer
    Copy link
    Member

    Thanks, Michael. I didn't see anything that needed updating in the docs.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-email type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants