classification
Title: fnmatch failed with leading caret (^)
Type:
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: cykerway, josh.r, serhiy.storchaka, xtreak
Priority: normal
Keywords:

Created on 2018-11-26 10:58 by cykerway, last changed 2018-11-26 23:30 by cykerway. This issue is now closed.

Messages (4)
msg330417 - Author: Cyker Way (cykerway) Date: 2018-11-26 10:58
In short, `fnmatch.fnmatch` does not match the shell's behavior. To test this, create a directory containing two files, `a.py` and `b.py`. Both `ls [!b].py` and `ls [^b].py` then list `a.py`. However, `fnmatch.fnmatch('a.py', '[!b].py')` returns `True` while `fnmatch.fnmatch('a.py', '[^b].py')` returns `False`.

The problem seems to come from the caret being escaped: https://github.com/python/cpython/blob/master/Lib/fnmatch.py#L124

Going by `man bash`, I don't see why the caret and the exclamation mark should behave differently:

>   ...If the first character following the [ is a !  or a ^ then any character not enclosed is matched...

Could someone please confirm whether this is a bug or intended behavior?
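
The behavior reported above can be reproduced without creating any files. The following is a minimal sketch; it uses `fnmatch.fnmatchcase` instead of `fnmatch.fnmatch` only so that the result does not depend on the platform's case normalization (for these inputs both functions agree). It shows that `[^b]` is treated as an ordinary set containing the two characters `^` and `b`.

```python
import fnmatch

# '[!b]' is a negated set: 'a' is not 'b', so this matches.
print(fnmatch.fnmatchcase('a.py', '[!b].py'))   # True

# '[^b]' is NOT a negated set in fnmatch: it is the set {'^', 'b'}.
print(fnmatch.fnmatchcase('a.py', '[^b].py'))   # False
print(fnmatch.fnmatchcase('^.py', '[^b].py'))   # True
print(fnmatch.fnmatchcase('b.py', '[^b].py'))   # True

# The regular expression produced by translate() contains the escaped caret.
# The exact wrapper around it differs between Python versions.
print(fnmatch.translate('[^b].py'))
```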
msg330430 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2018-11-26 13:10
'^' is not treated as a special character in shell-style wildcards. Use '!' to negate the character set: '[!b].py'.

https://docs.python.org/3/library/fnmatch.html

`man bash` describes the behavior of Bash, not Python.
msg330476 - Author: Josh Rosenberg (josh.r) (Python triager) Date: 2018-11-26 23:10
I finished typing this while Serhiy was closing the issue, but here is some further explanation:

This isn't a bug. fnmatch provides "shell-style" wildcards, but that doesn't mean it supports every shell's extensions to the globbing syntax. It doesn't even claim support for full POSIX globbing syntax. The docs explicitly specify support for only four forms:

*
?
[seq]
[!seq]

There is no support for [^seq]; [^seq] isn't even part of POSIX globbing per glob(7):

"POSIX has declared the effect of a wildcard pattern "[^...]" to be undefined."
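
For scripts that need to accept bash-style `[^seq]`, one possible workaround is to rewrite the pattern into the documented `[!seq]` form before handing it to fnmatch. The sketch below is not part of the standard library: `caret_to_bang` is a hypothetical helper name, and it assumes that a bracket expression is delimited the same way fnmatch delimits it (a `]` directly after the opening `[` or after the negation character is taken literally).

```python
import fnmatch

def caret_to_bang(pattern):
    """Hypothetical helper: rewrite a leading '^' in each bracket
    expression to '!', leaving all other characters untouched."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == '[':
            j = i + 1
            # Skip an optional negation character.
            if j < n and pattern[j] in '!^':
                j += 1
            # A ']' in first position is a literal member of the set.
            if j < n and pattern[j] == ']':
                j += 1
            # Look for the closing bracket.
            while j < n and pattern[j] != ']':
                j += 1
            if j < n:
                body = pattern[i + 1:j]
                if body.startswith('^'):
                    body = '!' + body[1:]
                out.append('[' + body + ']')
                i = j + 1
                continue
            # No closing bracket: '[' is an ordinary character.
        out.append(c)
        i += 1
    return ''.join(out)

print(caret_to_bang('[^b].py'))
print(fnmatch.fnmatchcase('a.py', caret_to_bang('[^b].py')))
print(fnmatch.fnmatchcase('b.py', caret_to_bang('[^b].py')))
```

A caret anywhere other than directly after the opening bracket, as in `[a^]` or `a^b`, is left alone, so patterns that already rely on the literal meaning keep working.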
msg330481 - Author: Cyker Way (cykerway) Date: 2018-11-26 23:30
Thank you for the confirmation. Knowing that fnmatch is not fully POSIX-compliant helps with understanding.

I'm asking because I ran into interoperability issues while writing Python scripts that provide shell-like filename expansion, and the results may surprise users. The glibc fnmatch provides a flag named `FNM_PATHNAME`, which is missing from the Python fnmatch implementation. So I think there is currently no way to tell the Python library whether we are matching a filename or not.

All right, so this is not a bug, but it would probably be a good enhancement.

## TLDR

This is what POSIX says **for filename expansion**, in section 2.13.3:

<http://pubs.opengroup.org/onlinepubs/9699919799.2008edition/>

>   when pattern matching notation is used for filename expansion:
>
>   1.  The <slash> character in a pathname shall be explicitly matched by using one or more <slash> characters in the pattern; it shall neither be matched by the <asterisk> or <question-mark> special characters nor by a bracket expression. <slash> characters in the pattern shall be identified before bracket expressions; thus, a <slash> cannot be included in a pattern bracket expression used for filename expansion. If a <slash> character is found following an unescaped <left-square-bracket> character before a corresponding <right-square-bracket> is found, the open bracket shall be treated as an ordinary character. For example, the pattern "a[b/c]d" does not match such pathnames as abd or a/d. It only matches a pathname of literally a[b/c]d.

Currently, Python's `fnmatch.fnmatch` gives:

    >>> from fnmatch import fnmatch
    >>> fnmatch('abd', 'a[b/c]d')
    True
    >>> fnmatch('a/d', 'a[b/c]d')
    True
    >>> fnmatch('a[b/c]d', 'a[b/c]d')
    False

Ideally, we could call something like `fnmatch('a/d', 'a[b/c]d', fnm_pathname=True)` to get the POSIX behavior.
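
Until such an option exists, the slash rule quoted above can be approximated on top of the current module. The following is only a sketch under stated assumptions: `fnmatch_pathname` is a hypothetical helper, not an fnmatch API, and it does not reproduce glibc's `FNM_PATHNAME` in full (for example, it does nothing about backslash escapes or leading dots). It splits both the name and the pattern on `/` and matches component by component, so `*`, `?` and bracket expressions can never match a slash. It relies on fnmatch treating an unclosed `[` as an ordinary character.

```python
import fnmatch

def fnmatch_pathname(name, pattern):
    """Hypothetical helper: a '/' in name is matched only by a '/' in pattern."""
    name_parts = name.split('/')
    pattern_parts = pattern.split('/')
    # Slashes are identified before bracket expressions, so the number of
    # components must agree before any wildcard matching happens.
    if len(name_parts) != len(pattern_parts):
        return False
    return all(
        fnmatch.fnmatchcase(n, p)
        for n, p in zip(name_parts, pattern_parts)
    )

# The POSIX example: "a[b/c]d" matches only the literal pathname a[b/c]d.
print(fnmatch_pathname('abd', 'a[b/c]d'))      # False
print(fnmatch_pathname('a/d', 'a[b/c]d'))      # False
print(fnmatch_pathname('a[b/c]d', 'a[b/c]d'))  # True

# '*' no longer crosses a directory separator.
print(fnmatch_pathname('a/b', '*'))            # False
print(fnmatch_pathname('a/b', '*/*'))          # True
```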
History
Date                 User              Action  Args
2018-11-26 23:30:16  cykerway          set     messages: + msg330481
2018-11-26 23:10:36  josh.r            set     nosy: + josh.r; messages: + msg330476
2018-11-26 13:10:15  serhiy.storchaka  set     status: open -> closed; nosy: + serhiy.storchaka; messages: + msg330430; resolution: not a bug; stage: resolved
2018-11-26 12:26:00  xtreak            set     nosy: + xtreak
2018-11-26 10:58:12  cykerway          create