classification
Title: Explain ConfigParser 'valid section name' and .SECTCRE
Type: enhancement Stage: needs patch
Components: Documentation Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: docs@python, lukasz.langa, miloskomarcevic, r.david.murray, spaceone, terry.reedy, vstinner, xflr6
Priority: normal Keywords:

Created on 2014-03-14 14:21 by miloskomarcevic, last changed 2019-02-24 22:14 by BreamoreBoy.

Messages (15)
msg213554 - (view) Author: Miloš Komarčević (miloskomarcevic) Date: 2014-03-14 14:21
It would be good if ConfigParser supported square brackets in section names by being greedy when parsing.

For example, section:

[Test[2]_foo]

gets parsed as:

Test[2
msg213609 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-03-15 01:16
This is invalid as a bug report, unless one considers that the presence of '_foo]' on the same line should have raised an exception. As an enhancement request, I think it should be rejected.

The ConfigParser configuration language is based on MS-DOS/Windows .ini files. Similar files are used on Windows. "A configuration file consists of sections, each led by a [section] header," That and "The section name appears on a line by itself, in square brackets ([ and ])." from
  https://en.wikipedia.org/wiki/.ini#Sections
mark [] as delimiters. I am rather sure that no other language/system allows square brackets in the section name. If you know differently, please present evidence. Unescaped delimiters are generally not allowed between delimiters unless there is a semantic reason to have nesting, and there is not one here. Parentheses, angle brackets, and curly brackets (and more from the rest of Unicode) are available to use. The request here is similar to asking that
  'abc'cd'
be parsed as one string. Parsing nested constructs is much more complicated than parsing flat constructs.
msg213678 - (view) Author: Miloš Komarčević (miloskomarcevic) Date: 2014-03-15 20:07
Thanks for the exhaustive explanation.

I did, however, come across a proprietary application that stores its configuration in an INI-like file that exhibits this corner-case behaviour with section names, hence the suggestion for enhancement.
msg213687 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-03-15 21:50
I see. Perhaps it uses a proprietary .ini reader ;-).
msg234954 - (view) Author: Sebastian Bank (xflr6) Date: 2015-01-29 08:33
If this is the intended behaviour, I guess ConfigParser should warn the user if he supplies a section name with a ']' (to prevent failed roundtrips).

See http://bugs.python.org/issue23301

The current behaviour looks like the opposite of Postel's law.
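
The failed roundtrip can be sketched roughly as follows. This is only an illustration under stated assumptions: OldDefaultParser is a hypothetical subclass that pins SECTCRE to a pattern stopping at the first ']', so the outcome does not depend on the default of whichever Python release runs it.

```python
import configparser
import io
import re


class OldDefaultParser(configparser.ConfigParser):
    # Hypothetical subclass: pins the header pattern to "stop at the first ']'".
    SECTCRE = re.compile(r"\[(?P<header>[^]]+)\]")


# Write a section whose name contains ']'.
writer = OldDefaultParser()
writer.add_section("Test[2]_foo")
writer.set("Test[2]_foo", "key", "value")
buf = io.StringIO()
writer.write(buf)

# Read the same text back with the same parser class.
reader = OldDefaultParser()
reader.read_string(buf.getvalue())

written = writer.sections()
read_back = reader.sections()
print(written, read_back)  # the two lists differ
```

The name that was written is not the name that is read back, which is the roundtrip problem described above.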
msg255263 - (view) Author: SpaceOne (spaceone) Date: 2015-11-24 12:32
IMHO your rejection is stupid. User input should always be validated.

At least a ValueError should be raised if add_section() is called with a string containing any of the characters ']\x00\n[', as this will always create a broken configuration.

Otherwise ConfigParser cannot be used to write new config files without having deeper knowledge about the implementation.
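
A minimal sketch of what such a check could look like, written as a subclass. This is not how the standard library behaves; StrictSectionParser and its _FORBIDDEN tuple are hypothetical names, and the set of rejected characters is an assumption chosen for illustration.

```python
import configparser


class StrictSectionParser(configparser.ConfigParser):
    """Hypothetical subclass; not part of the standard library."""

    # Assumed set of characters that cannot survive a write/read cycle.
    _FORBIDDEN = ("]", "\n")

    def add_section(self, section):
        # Only inspect strings; the base class rejects other types itself.
        if isinstance(section, str):
            for ch in self._FORBIDDEN:
                if ch in section:
                    raise ValueError(
                        "section name contains %r: %r" % (ch, section)
                    )
        super().add_section(section)


parser = StrictSectionParser()
parser.add_section("plain")
try:
    parser.add_section("Test[2]_foo")
    rejected = False
except ValueError:
    rejected = True
print(rejected, parser.sections())
```

Whether raising here would break existing users is a separate question.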
msg255269 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-24 15:21
The enhancement request was rejected.  At this point I think it would be better to open a new bug requesting that an error be raised if the supplied section name contains a ']'.  The question there is if there are backward compatibility issues.  That can be discussed on the new issue you open.
msg255277 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-11-24 16:16
Sebastian: you have it backwards.  A paraphrase of Postel's recommendation (calling it a Law is wrong) is 'accept dirty data, emit clean data'.  This is the current behavior.  See https://en.wikipedia.org/wiki/Robustness_principle.  This article also explains the opposite viewpoint, that dirty data should be rejected, as SpaceOne is suggesting.

SpaceOne: unless it is your intention to discourage people from volunteering their time to respond to issues raised on the tracker, you should read what they write more carefully and think more carefully about how you express your opinions.  If you really want a ValueError here, open a new enhancement issue for 3.6.
msg255278 - (view) Author: SpaceOne (spaceone) Date: 2015-11-24 16:32
Sorry about that!
I created http://bugs.python.org/issue25723.
msg255279 - (view) Author: Sebastian Bank (xflr6) Date: 2015-11-24 16:33
Terry: I am not so sure about that interpretation. Do we agree that the INI files are the data/message? ConfigParser refuses to accept dirty INI files (with ']' in section names) but will produce this kind of file.
If we see the arguments given to ConfigParser as the data/message, it does indeed accept dirty data as you say, but still it does not emit clean data in that case, right?
msg255281 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-11-24 16:46
Why the debate on an issue that was closed over 18 months ago?
msg255304 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-11-24 23:37
Discussion continues because my close message was, I now realize, incomplete and therefore unsatisfying.  Ditto for the doc.  So I complete my close message here and reopen the issue to augment the doc.

The discussion has so far glossed over the key question: "What is a legal section name?"  Pulling the answer from the doc was a challenge. It uses 'legal section name', once, as if one should already know. Reading further, I found the answer: there is no (fixed) answer!

The legal section name for a particular parser is determined by its .SECTCRE class attribute.
'''configparser.SECTCRE
    A compiled regular expression used to parse section headers. The default matches [section] to the name "section".'''  (This neglects to say whether the closing ']' is the first or last ']' on the line after the opening '['.) A non-verbose version of the default is
re.compile(r"\[(?P<header>[^]]+)\]").
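
As a small illustration of that pattern (using only the re module, and the non-verbose form quoted above rather than whatever a given release ships as its default), the character class [^]]+ cannot cross a ']', so the header ends at the first ']' after the opening '[':

```python
import re

# Non-verbose form of the pattern quoted above.
first_bracket = re.compile(r"\[(?P<header>[^]]+)\]")

m = first_bracket.match("[Test[2]_foo]")
header = m.group("header")
print(header)  # Test[2
```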

I propose adding near the top of the doc:
"By default, a legal section name can be any string that does not contain '\n' or ']'.  To change this, see configparser.SECTCRE."

So my response to Miloš should have been to set SECTCRE to something like p below.

>>> p = re.compile(r"\[(?P<header>.*)\]")
>>> m = p.search("[Test[2]_foo]")
>>> m.group('header')
'Test[2]_foo'
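
A sketch of how such a pattern could be plugged into a parser, assuming a subclass is acceptable. GreedySectionParser is a hypothetical name; SECTCRE is the class attribute described above.

```python
import configparser
import re


class GreedySectionParser(configparser.ConfigParser):
    # Hypothetical subclass: the header runs to the last ']' on the line.
    SECTCRE = re.compile(r"\[(?P<header>.*)\]")


parser = GreedySectionParser()
parser.read_string("[Test[2]_foo]\nkey = value\n")

sections = parser.sections()
value = parser["Test[2]_foo"]["key"]
print(sections, value)
```

Note that '.*' also accepts an empty header ('[]'); a stricter application might prefer '.+'.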

Additional note: Postel's principle was formulated for internet protocols, which .ini files are not.  In any case, it is not a Python design principle.  Neither is "always check user input", which amounts to 'look before you leap'.  So I will not debate these. However, "Errors should never pass silently." is #10 on the Zen of Python ('import this') and that I do attend to.
msg255325 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-25 07:59
> It would be good if ConfigParser supported square brackets in section names by being greedy when parsing.

I agree with Terry, it's the opposite: we must explicitly reject them to be compatible with other applications. Please move the discussion to issue #25723.
msg255368 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-11-25 17:26
Victor, I reopened this as a doc issue to add the sentence that would have cut short the discussion.  Please leave it.
msg255369 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-25 17:36
Yes, the doc issue is separate from the other bug, since that one will apply only to 3.6, and the doc changes apply to all maintenance releases.
History
Date User Action Args
2019-02-24 22:14:24 BreamoreBoy set nosy: - BreamoreBoy
2015-11-25 17:36:11 r.david.murray set messages: + msg255369
2015-11-25 17:31:23 terry.reedy set title: ConfigParser should nested [] in section names. -> Explain ConfigParser 'valid section name' and .SECTCRE
2015-11-25 17:28:59 terry.reedy set assignee: docs@python -> terry.reedy
2015-11-25 17:26:24 terry.reedy set status: closed -> open
    resolution: fixed ->
    messages: + msg255368
2015-11-25 07:59:57 vstinner set status: open -> closed
    nosy: + vstinner
    messages: + msg255325
    resolution: fixed
2015-11-24 23:37:00 terry.reedy set status: closed -> open
    assignee: docs@python
    components: + Documentation
    versions: + Python 2.7, Python 3.4, Python 3.6
    nosy: + docs@python
    messages: + msg255304
    resolution: rejected -> (no value)
    stage: test needed -> needs patch
2015-11-24 16:46:05 BreamoreBoy set nosy: + BreamoreBoy
    messages: + msg255281
2015-11-24 16:33:05 xflr6 set messages: + msg255279
2015-11-24 16:32:02 spaceone set messages: + msg255278
2015-11-24 16:16:49 terry.reedy set type: behavior -> enhancement
    messages: + msg255277
2015-11-24 15:21:41 r.david.murray set nosy: + r.david.murray
    messages: + msg255269
2015-11-24 12:32:37 spaceone set nosy: + spaceone
    messages: + msg255263
2015-01-29 08:33:57 xflr6 set nosy: + xflr6
    messages: + msg234954
2015-01-22 16:06:52 r.david.murray link issue23301 superseder
2014-03-15 21:50:36 terry.reedy set messages: + msg213687
2014-03-15 20:07:38 miloskomarcevic set messages: + msg213678
2014-03-15 01:16:37 terry.reedy set status: closed
    title: ConfigParser should be greedy when parsing section name -> ConfigParser should nested [] in section names.
    nosy: + terry.reedy
    versions: + Python 3.5, - Python 3.3
    messages: + msg213609
    resolution: rejected
    stage: test needed
2014-03-14 14:40:37 r.david.murray set nosy: + lukasz.langa
2014-03-14 14:21:45 miloskomarcevic create