mimetypes.guess_extension result changes after mimetypes.init() #49213

Closed
siona mannequin opened this issue Jan 16, 2009 · 35 comments
Assignees
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes easy stdlib Python modules in the Lib dir topic-email type-bug An unexpected behavior, bug, or error

Comments

@siona
Mannequin

siona mannequin commented Jan 16, 2009

BPO 4963
Nosy @warsaw, @terryjreedy, @pitrou, @tiran, @abadger, @bitdancer, @bertjwregeer, @vadmium, @zooba, @maxking, @pombredanne, @davidkhess
PRs
  • bpo-4963: Fix for initialization and non-deterministic behavior issues in mimetypes #3062
  • [3.8] bpo-4963: Fix for initialization and non-deterministic behavior issues in mimetypes (GH-3062) #14375
  • [3.7] bpo-4963: Fix for initialization and non-deterministic behavior issues in mimetypes (GH-3062) #14376
Files
  • mimetypes-init-test.patch
  • issue4963.patch: stable guess_extension patch and test
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/zooba'
    closed_at = <Date 2019-06-25.16:14:49.543>
    created_at = <Date 2009-01-16.12:04:54.057>
    labels = ['easy', 'type-bug', '3.8', 'expert-email', '3.7', 'library', '3.9']
    title = 'mimetypes.guess_extension result changes after mimetypes.init()'
    updated_at = <Date 2021-01-10.02:17:07.465>
    user = 'https://bugs.python.org/siona'

    bugs.python.org fields:

    activity = <Date 2021-01-10.02:17:07.465>
    actor = 'terry.reedy'
    assignee = 'steve.dower'
    closed = True
    closed_date = <Date 2019-06-25.16:14:49.543>
    closer = 'steve.dower'
    components = ['Library (Lib)', 'email']
    creation = <Date 2009-01-16.12:04:54.057>
    creator = 'siona'
    dependencies = []
    files = ['17803', '34821']
    hgrepos = []
    issue_num = 4963
    keywords = ['patch', 'easy']
    message_count = 35.0
    messages = ['79955', '79961', '79962', '108650', '108674', '108704', '108707', '108746', '108755', '108853', '108934', '108967', '182465', '214948', '216104', '216764', '246649', '293248', '293249', '293258', '293261', '293262', '293626', '293628', '300092', '315145', '322534', '346456', '346457', '346534', '346537', '346539', '384730', '384731', '384750']
    nosy_count = 18.0
    nosy_names = ['barry', 'terry.reedy', 'pitrou', 'christian.heimes', 'wichert', 'a.badger', 'r.david.murray', 'siona', 'sascha_silbe', 'l0nwlf', 'X-Istence', 'martin.panter', 'steve.dower', 'wodny', 'maxking', 'pombredanne', 'sivert', 'dhess']
    pr_nums = ['3062', '14375', '14376']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue4963'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

    @siona
    Mannequin Author

    siona mannequin commented Jan 16, 2009

    Asking mimetypes to reload mime.types can cause guess_extension() to
    return a different result if multiple extensions are mapped to that mime
    type:

    >>> import mimetypes
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpe'
    >>> mimetypes.init()
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpeg'
    >>>

    This is because both the forward (extension to type) and inverse (type
    to extension) type mapping dicts are populated by iterating through the
    existing forward (extension to type) dict (types_map), then supplemented
    by reading from mime.types (or any other files given to init()). The
    fully populated forward dict becomes the new types_map. Initially,
    types_map is hard-coded, but when the type mapping dicts are
    repopulated, by explicitly or implicitly calling init() again, it is
    done by iterating over the types_map created by the first init() call,
    not the hard-coded one. If the iteration order for a set of extensions
    with the same type is different in these two versions of the forward
    dict, the order of extensions appearing for that type in the inverse
    dict will change. And so the behavior of guess_all_extensions() and
    hence guess_extension() will change.
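The mechanism described above can be illustrated without the mimetypes module at all. The following is a minimal sketch, assuming a hypothetical helper `build_inverse` that mimics how an inverse map is derived from a forward map; it is not the module's actual code. Two forward dicts with identical contents but different iteration order yield different "first" extensions. On Python 3.7+ dicts iterate in insertion order, which makes the effect easy to simulate; on the interpreters discussed in this report the order came from hashing.

```python
def build_inverse(forward):
    """Build a type -> [extensions] map by iterating over an ext -> type map.

    Hypothetical helper for illustration; the order of each list is
    whatever order the forward dict happens to iterate in.
    """
    inverse = {}
    for ext, mime_type in forward.items():
        exts = inverse.setdefault(mime_type, [])
        if ext not in exts:
            exts.append(ext)
    return inverse


# Same contents, different iteration order.
first_pass = {".jpe": "image/jpeg", ".jpg": "image/jpeg", ".jpeg": "image/jpeg"}
second_pass = {".jpeg": "image/jpeg", ".jpg": "image/jpeg", ".jpe": "image/jpeg"}

assert first_pass == second_pass  # dict equality ignores order

# The "guessed" extension here is simply the first entry of the inverse list.
print(build_inverse(first_pass)["image/jpeg"][0])   # .jpe
print(build_inverse(second_pass)["image/jpeg"][0])  # .jpeg
```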

    @siona siona mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 16, 2009
    @terryjreedy
    Member

    3.0, WinXP
    import mimetypes
    print(mimetypes.guess_extension('image/jpeg'))
    mimetypes.init()
    print(mimetypes.guess_extension('image/jpeg'))
    gives
    .jpe
    .jpe

    I wonder at this answer since .jpg and occasionally .jpeg is standard in
    Windows usage, but the doc is unclear to me as to the actual intent of
    the function.

    @siona
    Mannequin Author

    siona mannequin commented Jan 16, 2009

    Ah, yes, forgot to mention this is on Debian 4.0. I doubt you're going
    to run into it on a Windows system unless you explicitly give init() a
    mime.types file, looking at the knownfiles list used by default.

    @pitrou
    Member

    pitrou commented Jun 25, 2010

    Can't reproduce under Mandriva Linux:

    >>> import mimetypes
    >>> print(mimetypes.guess_extension('image/jpeg'))
    .jpe
    >>> mimetypes.init()
    >>> print(mimetypes.guess_extension('image/jpeg'))
    .jpe

    The fact that it returns ".jpe" rather than ".jpg", however, could be a bug in itself (since the latter will really be expected by everyone, not the former).

    @bitdancer
    Member

    I can't reproduce this either, and without a reproducer we might as well close it.

    Antoine, it is possible that your fix for bpo-5853 inadvertently fixed this, but I don't feel like untangling the logic of the module enough to figure it out :) So I'm going to close it 'works for me'. If S Arrowsmith can reproduce it with 2.7rc2, we can reopen.

    By the way, it produced 'jpe' for me, too...but, then, my system (Gentoo) /etc/mime.types file has 'jpe' as the first filetype for jpeg, so I don't think that association is Python's bug, per se. Though I may eventually have to address it in email6. (Also by the way, I tried switching the order and passing in the modified file explicitly on the explicit init, but that didn't change the behavior).

    @siona
    Mannequin Author

    siona mannequin commented Jun 26, 2010

    Sorry, still there:

    Python 2.7rc2 (r27rc2:82137, Jun 26 2010, 11:27:59) 
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mimetypes
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpe'
    >>> mimetypes.init()
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpeg'

    The fact that it's not reproducible on other Linux systems (I can't reproduce on the RedHat box I have to hand) might suggest there's something odd about Debian's mime.types. But I've just tried it passing init() the mime.types from the (working) RedHat box, and it's still producing the odd behaviour. (And I'm now on Debian 5.0, so it's not a Debian 4.0-specific issue either.) Wish I had a convenient Ubuntu install to try it on.

    Bizarre.

    @l0nwlf
    Mannequin

    l0nwlf mannequin commented Jun 26, 2010

    Can't reproduce.

    16:36:36 l0nwlf-MBP:~$ python2.7
    Python 2.7rc2+ (trunk:82148M, Jun 22 2010, 10:32:46) 
    [GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mimetypes
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpe'
    >>> mimetypes.init()
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpe'
    >>> 

    Results were same in python2.5, 2.6 too. I wonder whether this is machine specific or distro specific.

    @bitdancer
    Member

    S Arrowsmith: can you put a print statement into mimetypes.init, find out what files are loading, and paste the image/jpeg lines from each of those files here? That might provide a clue.

    @bitdancer bitdancer reopened this Jun 26, 2010
    @siona
    Mannequin Author

    siona mannequin commented Jun 26, 2010

    >>> import mimetypes 
    >>> mimetypes.guess_extension('image/jpeg')
    /etc/mime.types
    '.jpe'
    >>> mimetypes.init()
    /etc/mime.types
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpeg'
    >>> 
    $ grep jpeg /etc/mime.types
    image/jpeg					jpeg jpg jpe
    $

    That big chunk of whitespace is 5 tabs. Not very helpful, I fear.

    @siona
    Mannequin Author

    siona mannequin commented Jun 28, 2010

    I've dug into it -- again -- and my original analysis still holds. Getting consistent guess_extension() results across an explicit init() call depends on dict.items() returning keys in the same order on two different dictionaries (the original, hard-coded types_map and the one created by the first, implicit init() call).

    Why should this be different on Debian to other Linuxes, even given same data as a "working" distribution? Is there something in the implementation details of dict.items() which is that distribution dependent?

    (A "fix", BTW, is to insert a call to _default_mime_types() either in init() or in MimeTypes.__init__ before it calls init().)

    @bitdancer
    Member

    It must be that the different key order only happens on the one platform because of the quirky nature of dictionary construction. That is, there is *something* on that platform that is changing where things get hashed when the dictionary is recreated.

    The problem with fixing this is that any fix is going to change the behavior, unless we go to the lengths of recording the order of the initializations in add_type and replay it when init is called a second time. That solution is pretty much a non-starter :)

    The mimetypes docs say that init can be called more than once. They say that a MimeTypes object starts out "with the same database as provided by the rest of the module". The docs explain how the initial database state is created.

    What the docs don't do is say what *happens* when you call init more than once. There are two possibilities: either we (1) restart from the initial state, or we (2) start from the current (possibly modified) state of the database and then add whatever is specified in the init call. (Actually, there's a third possibility: we could also add back in anything from the default init that was deleted; but this halfway version is unlikely to be anyone's intent or expectation.)

    The actual implementation of the mimetypes module does (2) if and only if you pass init a list of files. If you don't then it does something that isn't even the third way above: it reloads *just* the data from the system files it managed to find, without reloading the data from the internal tables.

    Clearly this behavior is... odd. When no files are passed, init should do one of two things: either nothing, or reset the global db state to its initial value.

    It's not so clear what the behavior should be when you pass init one or more files. It is possible, even highly probable, that there is code out there that depends on the fact that doing so is additive.

    Given this analysis, I think that the best fix would be to implement (and document) the following behavior for init:

    If called with no arguments, it rebuilds the module database from scratch

    If called with a list of files, it adds the contents of those files to the module database

    The second is a backward compatibility hack. Ideally it would be deprecated in favor of some sort of load_mime_files method.

    It is possible that the first will also break code, but I think it is less likely, and probably an acceptable risk in a new major release. But I'd be prepared to change it to 'init does nothing' if breakage showed up during RC testing.

    The problem with this "fix" is that it does not, in fact, address the root cause of the OP's bug report. The specific behavior he observes when calling init() would be fixed, but the underlying problem remains. If he were to instead instantiate a new MimeTypes db, then when it "copies" the module database, it will build its own database by running the old database in key order, and once again the results returned by guess_extension might mutate. This means that the new db is *not* a copy of the old db when it starts.

    That problem could be fixed by having MimeTypes.__init__ do a copy of the types_map and types_map_inv data structures instead of rebuilding them from scratch. This would mean shifting the initialization of these structures out of MimeTypes and into init (in the 'reinitialize' code path) or perhaps into _default_mime_types, but I don't see that as a big problem, once init is doing a full reinitialization by default. (There is also the question of whether it should be a 'deep copy', but I don't think that is needed since a user would need to be doing something pretty hackish to run afoul of a shallow-copy-induced problem.)
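The shallow-versus-deep copy question can be made concrete with a small sketch. The `types_map_inv` dict here is a hypothetical example, not the module's data. Both kinds of copy preserve the order of each extension list; the difference is only whether the inner lists are shared.

```python
# Hypothetical inverse map: type -> list of extensions, in a fixed order.
types_map_inv = {"image/jpeg": [".jpg", ".jpeg", ".jpe"]}

# Shallow copy: a new outer dict, but the inner lists are shared.
shallow = types_map_inv.copy()

# Per-list copy: each extension list is duplicated as well.
per_list = {t: list(exts) for t, exts in types_map_inv.items()}

# Mutating a list through the shallow copy also changes the original.
shallow["image/jpeg"].append(".jfif")

print(types_map_inv["image/jpeg"])  # ['.jpg', '.jpeg', '.jpe', '.jfif']
print(per_list["image/jpeg"])       # ['.jpg', '.jpeg', '.jpe']
```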

    Can anyone see flaws in this analysis and proposed solution? I've marked the fix as easy since a python hacker should be able to knock out a solution in a day, but it isn't trivial. And I have no clue how to write a unit test for the MimeTypes.__init__ order-shifting bug.

    I'm also resetting the priority to normal since I consider the ambiguity of what calling init twice actually does to be a bigger issue than it sometimes changing the results of a function with 'guess' in its name :)

    I've attached a patch with a unit test for the 'init doesn't re-init' behavior.

    (By the way, it also appears to me from reading the code that read_mime_types is buggy in that it actually returns a merge of the loaded file with the current module DB state, but I haven't checked that observation.)

    @bitdancer bitdancer added the easy label Jun 29, 2010
    @siona
    Mannequin Author

    siona mannequin commented Jun 30, 2010

    That solution looks sound to me, in particular documenting the semantics of repeated init() calls!

    As for the underlying problem, it seems to me that an alternative to copying the existing structures rather than rebuilding them would be to use OrderedDicts. Although I can't think why it might be a preferable alternative, other than being a bit clearer that order of insertion can affect behaviour.

    @bitdancer
    Member

    I'd forgotten about this issue. I wonder if the dictionary randomization makes the problem worse.

    @wichert
    Mannequin

    wichert mannequin commented Mar 27, 2014

    I can reproduce this on both OS X 10.9 and Ubuntu 12.04:

    >>> import mimetypes
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpe'
    >>> mimetypes.init()
    >>> mimetypes.guess_extension('image/jpeg')
    '.jpeg'

    The same thing happens for Python 3.4:

    Python 3.4.0rc3 (default, Mar 13 2014, 10:48:59) 
    [GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mimetypes
    >>> mimetypes.guess_all_extensions('image/jpeg')
    ['.jpg', '.jpeg', '.jpe']
    >>> mimetypes.init()
    >>> mimetypes.guess_all_extensions('image/jpeg')
    ['.jpeg', '.jpe', '.jpg']

    This also looks related to bpo-1043134

    @abadger
    Mannequin

    abadger mannequin commented Apr 14, 2014

    Took a look at this and was able to reproduce it on Fedora Linux 20 and current cpython head. It is somewhat random though. I'm able to get reasonably consistent failures using image/jpeg and iterating the test case about 20 times.

    Additionally, it looks like the data structure that mimetypes.guess_extension() is reading its extensions from is a list, so it doesn't have to do with dictionary sort order. It has something to do with the way the extensions are read in from the files and then given to add_type().

    Talking to r.david.murray I think that this particular problem can be solved by simply sorting the list of extensions prior to guess_extension taking the first extension off of the list.
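The sorting idea can be approximated from user code as well. The following is a minimal sketch, assuming a hypothetical wrapper named `stable_guess_extension`; it uses only the documented `mimetypes.guess_all_extensions()` call and is not the patch attached to this issue.

```python
import mimetypes


def stable_guess_extension(mime_type, strict=True):
    """Return the alphabetically first known extension, or None.

    Hypothetical wrapper: deterministic, but not necessarily the
    extension most people would expect (for example '.jpe' sorts
    before '.jpg').
    """
    extensions = sorted(mimetypes.guess_all_extensions(mime_type, strict=strict))
    return extensions[0] if extensions else None


print(stable_guess_extension("image/jpeg"))
```

The result still depends on which extensions the local database knows about, so it is stable on a given system rather than identical across systems.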

    The question of what to do when the first extension in the list isn't the best extension should be resolved in bpo-1043134.

    I'll attach a patch with test case for this problem.

    @bitdancer
    Member

    OK, it is great having a test that makes this at least mostly reproducible :)

    Having reloaded my brain on this thing, I'm thinking that the best solution may be indeed to switch to ordered dicts. If we then reorder the hardcoded lists to be in "preferred" order, that should then also solve bpo-1043134.

    @sivert
    Mannequin

    sivert mannequin commented Jul 12, 2015

    I bumped into a similar issue with mimetypes.guess_extension on Arch Linux 64-bit in February. The behavior is still present in python 3.4.3.

    $ python test.py
    .htm
    $ python test.py
    .html
    $ cat test.py
    from mimetypes import guess_extension
    print(guess_extension('text/html'))
    $ python
    Python 3.4.3 (default, Mar 25 2015, 17:13:50)
    [GCC 4.9.2 20150304 (prerelease)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

    @davidkhess
    Mannequin

    davidkhess mannequin commented May 8, 2017

    Concur with @sivert: the result of guess_extension() is non-deterministic across mimetypes module initializations.

    $ python
    Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
    [GCC 4.8.4] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 
    $ python -c 'import mimetypes;print(mimetypes.guess_extension("image/jpeg"))'
    .jpe
    $ python -c 'import mimetypes;print(mimetypes.guess_extension("image/jpeg"))'
    .jpe
    $ python -c 'import mimetypes;print(mimetypes.guess_extension("image/jpeg"))'
    .jpe
    $ python -c 'import mimetypes;print(mimetypes.guess_extension("image/jpeg"))'
    .jpeg
    $ python -c 'import mimetypes;print(mimetypes.guess_extension("image/jpeg"))'
    .jpeg
    $ python -c 'import mimetypes;print(mimetypes.guess_extension("image/jpeg"))'
    .jpe
    $ python -c 'import mimetypes;print(mimetypes.guess_extension("image/jpeg"))'
    .jpg
    $

    @davidkhess
    Mannequin

    davidkhess mannequin commented May 8, 2017

    And the underlying problem causing this:

    $ python -c 'import mimetypes;print(mimetypes.guess_all_extensions("image/jpeg"))'
    ['.jpeg', '.jpg', '.jpe']
    $ python -c 'import mimetypes;print(mimetypes.guess_all_extensions("image/jpeg"))'
    ['.jpg', '.jpe', '.jpeg']
    $ python -c 'import mimetypes;print(mimetypes.guess_all_extensions("image/jpeg"))'
    ['.jpg', '.jpeg', '.jpe']
    $ python -c 'import mimetypes;print(mimetypes.guess_all_extensions("image/jpeg"))'
    ['.jpe', '.jpg', '.jpeg']
    $ python -c 'import mimetypes;print(mimetypes.guess_all_extensions("image/jpeg"))'
    ['.jpeg', '.jpg', '.jpe']
    $ 

    If the module can't know which extension is preferred, perhaps guess_extension should just be deprecated and the results of guess_all_extensions sorted on return?

    At least that would give us some determinism to work with.

    @bitdancer
    Member

    @dhess: do you want to work on the OrderedDict + correctly ordered hardcoded lists solution?

    @bitdancer bitdancer added the 3.7 (EOL) end of life label May 8, 2017
    @vadmium
    Member

    vadmium commented May 8, 2017

    I suggest discussing the non-determinism problem in bpo-1043134 (about determining a canonical extension for each content type). I understood this bug (bpo-4963) is about the behaviour of repeated initialization of the same instance of mimetypes.

    BTW an ordered dictionary wouldn’t help with duplicate dictionary keys; see guess_extension("application/excel").

    @vadmium
    Member

    vadmium commented May 9, 2017

    I understand hash randomization was added after this bug was opened. Here is a demonstration with “video/mp4”, which only has the extension “.mp4” built in. But my /etc/mime.types file lists “mp4 mp4v mpg4”, so after the second initialization the behaviour changes:

    PYTHONHASHSEED=0 python3.5 -c 'from mimetypes import *; print(guess_all_extensions("video/mp4")); init(); print(guess_all_extensions("video/mp4"))'
    ['.mp4', '.mp4v', '.mpg4']
    ['.mpg4', '.mp4', '.mp4v']

    The first extension is always “.mp4”, regardless of hash randomization, due to the built-in list. But after re-initialization, the first extension depends on the order in the internal dictionary.
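The "built-in entry stays first" part of this demonstration can be reproduced without touching /etc/mime.types. This is a minimal sketch that feeds an in-memory stand-in for a mime.types line to a private `MimeTypes` instance; the line contents mirror the ones quoted above, and the exact lists printed depend on the built-in tables of the Python version in use.

```python
import io
from mimetypes import MimeTypes

db = MimeTypes()  # private instance, separate from the module-level database

# Copy the list so later changes to the instance cannot alter it.
before = list(db.guess_all_extensions("video/mp4"))

# Stand-in for a system mime.types line listing extra extensions.
db.readfp(io.StringIO("video/mp4\t\t\tmp4 mp4v mpg4\n"))
after = list(db.guess_all_extensions("video/mp4"))

# Extensions already present keep their position; new ones are appended.
print(before)
print(after)
```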

    Using an ordered dictionary may work as a bug fix, but the whole initialization logic is so complex that it would be good to simplify it in the long term.

    @davidkhess
    Mannequin

    davidkhess mannequin commented May 13, 2017

    Ok, I followed @r.david.murray's advice and decided to take a shot at this.

    First, I noticed that I couldn't reproduce the non-deterministic behavior that I reported above on the latest code (i.e. pre-3.7). After doing some research it appears this was the sequence of events:

    1. Pre-3.3, hashing was stable and this wasn't a problem.
    2. Hash randomization became the default in version 3.3 and this non-determinism showed up.
    3. A new dict implementation was introduced in 3.6 and key orders became stable between runs and this non-determinism was gone. However, as the notes on the new dict implementation indicate, this ordering should not be relied upon.

    I also looked at some other issues:

    • 6626 - The patch here basically rewrote the module. I agreed with the last comment on that issue that it probably doesn't need that.
    • 24527 - Related to the .init() problems discussed here in r.david.murray's excellent analysis of the init behavior.
    • 1043134 - Where the preferred extension issue was addressed via a proposed new map.

    My approach with this patch is to address the init problem, the non-determinism and the preferred extension issue.

    For the init, I made two changes:

    1. I added new references to the initial values of the maps so they could be retained between init() calls. I also modified MimeTypes.__init__ to refer to these.

    2. I modified the init() function to check the files argument as r.david.murray suggested. If it is supplied, then the existing database is used and the files are added to it. If it is not supplied, then the module reinitializes from scratch. I'll update the documentation to reflect this if the commit passes muster.

    For the non-determinism and preferred extension, I changed the two extension type maps to be OrderedDicts. I then sorted the entries to the OrderedDict constructor by mime type and then placed the preferred extension as the first extension to be processed. This guarantees that it will be the extension returned by guess_extension. The OrderedDict also guarantees that guess_all_extensions will always build and return the same value.
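The "preferred extension first" idea can be sketched with plain data structures. This is a toy illustration, not the code from the commit: the table `PREFERRED_FIRST` and the function below are hypothetical, and plain dicts are used because they preserve insertion order on Python 3.7+.

```python
# Hypothetical table: for each type, extensions listed with the preferred one first.
PREFERRED_FIRST = [
    ("image/jpeg", [".jpg", ".jpe", ".jpeg"]),
    ("text/html", [".html", ".htm"]),
]

types_map = {}      # ext -> type
types_map_inv = {}  # type -> [ext, ...] in insertion order

for mime_type, extensions in PREFERRED_FIRST:
    for ext in extensions:
        types_map[ext] = mime_type
        types_map_inv.setdefault(mime_type, []).append(ext)


def guess_extension(mime_type):
    """Toy version: return the first recorded extension, or None."""
    extensions = types_map_inv.get(mime_type, [])
    return extensions[0] if extensions else None


print(guess_extension("image/jpeg"))  # .jpg
print(guess_extension("text/html"))   # .html
```

Because the inverse lists are filled in a fixed order, repeated runs give the same answer regardless of hash randomization.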

    The commit can be reviewed here:

    davidkhess@ecabb1c

    I'll open a PR if and when this approach gets enough positive feedback.

    @davidkhess
    Mannequin

    davidkhess mannequin commented May 14, 2017

    Pushed more commits so here's a branch compare:

    master...davidkhess:fix-issue-4963

    @davidkhess
    Mannequin

    davidkhess mannequin commented Aug 10, 2017

    FYI, PR opened: #3062

    @davidkhess
    Mannequin

    davidkhess mannequin commented Apr 9, 2018

    Are there any committers watching this issue that are able to review the PR?

    #3062

    It's close to 6 months old now with no action on it. I'm willing to help but doing so and then having the PR gather dust is pretty discouraging.

    Thanks in advance!

    @tiran
    Member

    tiran commented Jul 28, 2018

    The PR has a merge conflict.

    @zooba
    Member

    zooba commented Jun 24, 2019

    New changeset 9fc720e by Steve Dower (David K. Hess) in branch 'master':
    bpo-4963: Fix for initialization and non-deterministic behavior issues in mimetypes (GH-3062)
    9fc720e

    @zooba
    Member

    zooba commented Jun 24, 2019

    Sorry for the delays!

    I've merged into master. The backports (3.7, 3.8) are going to need some help, so I'll take a look at them soon unless someone else gets there first.

    @zooba zooba added 3.8 only security fixes 3.9 only security fixes labels Jun 24, 2019
    @zooba
    Member

    zooba commented Jun 25, 2019

    New changeset 25fbe33 by Steve Dower in branch '3.8':
    bpo-4963: Fix for initialization and non-deterministic behavior issues in mimetypes (GH-14375)
    25fbe33

    @davidkhess
    Mannequin

    davidkhess mannequin commented Jun 25, 2019

    Thank you Steve!

    Nice to see this one make it across the finish line.

    @zooba
    Member

    zooba commented Jun 25, 2019

    New changeset 2a99fd9 by Steve Dower in branch '3.7':
    bpo-4963: Fix for initialization and non-deterministic behavior issues in mimetypes (GH-14376)
    2a99fd9

    @zooba zooba closed this as completed Jun 25, 2019
    @zooba zooba self-assigned this Jun 25, 2019
    @pombredanne
    Mannequin

    pombredanne mannequin commented Jan 9, 2021

    The changes introduced by this ticket in 9fc720e#r45794801 are problematic.

    I discovered this when tests started failing on Python 3.7 and up.

    The bug is that calling mimetypes.init(files) will NOT use only my files, but will instead use both my files and knownfiles.
    This was not the case before: knownfiles would be ignored, as expected, when I provided my own list of files.

    This is a breaking API change IMHO and introduces instability: even if I want to ignore knownfiles by providing my own list of files, knownfiles will always be added, and this results in erratic and buggy behaviour because the content of "knownfiles" varies unpredictably with the OS version and other factors.

    The code I am using is here https://github.com/nexB/typecode/blob/ba07c04d23441d3469dc5de911376d408514ebd8/src/typecode/contenttype.py#L308

    I think we should reopen to fix (or create a new ticket)

    @pombredanne
    Mannequin

    pombredanne mannequin commented Jan 9, 2021

    Actually this is problematic on multiple counts:

    1. the behaviour changes and this is a regression
    2. even if that new buggy behaviour was the one to use, it should not give preference to knownfiles over init-provided files, but at least take the provided files first and knownfiles second.
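For callers who want lookups driven by their own entries rather than by the module-level init(), one possible approach is a private `MimeTypes` instance. This is a sketch under the assumption (worth verifying against the Python version in use) that an instance starts from the built-in tables and does not read knownfiles; the MIME type and extensions below are made up for illustration.

```python
import io
from mimetypes import MimeTypes

# Made-up entry standing in for the contents of a caller-supplied file.
custom_entries = io.StringIO("application/x-example\tex1 ex2\n")

db = MimeTypes()  # separate from the module-level database
db.readfp(custom_entries)

print(db.guess_type("report.ex1"))                       # ('application/x-example', None)
print(db.guess_all_extensions("application/x-example"))  # ['.ex1', '.ex2']
```

The built-in defaults are still present in such an instance, so this isolates the lookup from system files but not from the hard-coded tables.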

    @terryjreedy
    Member

    Philippe: I was the first to comment, but had no further involvement with this issue until now. Multiple people, including coredevs, considered the old behavior to be buggy and multiple people, including coredevs, contributed to the fix. You are unlikely to prevail arguing that the change is totally a mistake and should be totally reverted.

    the behaviour changes and this is a regression

    Code bug fixes are supposed to change behavior. We know that the change will break code that depends on the buggy behavior. That is why we include an updated change log with each release.

    The intended change is *not* a regression in itself. A regression in a bug fix is an unintended change to some other behavior. I don't believe (but could be mistaken) that you have argued or shown this.

    Assuming that you are not asking for complete reversion, I suggest that you open a new issue, referencing this one, but taking the current behavior as the given. Propose a revised behavior and argue that it is even better. If you want to argue that the current behavior is buggy (compared to the current docs) so that your proposed revision should be backported, make that a separate argument.
