classification
Title: extend configparser to support mapping access(__*item__)
Type: enhancement Stage: commit review
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: fdrake Nosy List: BreamoreBoy, eric.araujo, fdrake, jkaufman, lukasz.langa, michael.foord, r.david.murray, ysj.ray
Priority: normal Keywords: needs review

Created on 2009-03-03 15:14 by jkaufman, last changed 2010-11-20 15:32 by lukasz.langa. This issue is now closed.

Files
File name Uploaded Description Edit
configparser.patch jkaufman, 2009-03-03 15:14 patch to add support for [] notation to configparser
issue5412.diff lukasz.langa, 2010-11-05 16:57 Patch that introduces mapping protocol access
Messages (26)
msg83075 - (view) Author: Jeff Kaufman (jkaufman) Date: 2009-03-03 15:14
This is a patch against the configparser in the cvs version of 3.1 to
support [] notation:

>>> import configparser_patched
>>> config = configparser_patched.SafeConfigParser()
>>> config.add_section("spam")
>>> config["spam", "eggs"] = "yummy"
>>> config["spam", "eggs"]
'yummy'
>>> del config["spam", "eggs"]
>>> 

The functions are just syntactic sugar for the simple forms of "get",
"set", and "remove_option".
msg109761 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-07-09 16:21
Patch is simple, are people for or against syntactic sugar in configparser?
msg111420 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-07-24 02:34
I'd say we go for it. At the moment most (if not all) "drop-in replacements" for ConfigParser support this functionality. It's backwards compatible and useful. One might argue than it's more pythonic than getfloat() & co.

I'd also add attribute-like syntax but before I present a patch I'd need #7113, #4686 and #1682942 accepted. The last one is waiting for the first two as well.
msg111616 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-07-26 13:42
There are no docs or tests in the patch. I like the functionality though and doubt it will be controversial. The current api is a bit "arcane". So +1 from me.
msg111782 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-07-28 10:26
I’d find more natural to have cp['spam'] return the section (as a dict) and cp['spam']['ham'] return the value.
msg111786 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-07-28 10:55
Éric, first thing: forget about the current patch because it's very much incomplete.

Second thing, while I normally would agree with you about the ['section']['key'] idea, in this case the current syntax has following advantages:

- we can implement a cohesive mapping protocol that extends to get(), del, in, etc. For now get() seems somewhat similar to what dictionaries give you and I would build on that (adding a `default` attribute would be another thing).

- manipulation on the internal structures is much simpler when we have a single key like that. Having config['name'] return the section would make us create another proxy object just to support mutating keys in the section.

I can see arguments for and against this approach but overall, it looks nice.
msg111799 - (view) Author: ysj.ray (ysj.ray) Date: 2010-07-28 12:43
lukasz,

> - manipulation on the internal structures is much simpler when we have a single key like that. Having config['name'] return the section would make us create another proxy object just to support mutating keys in the section.

I'm afraid this could not be a good reason against the config['section']['key'] style. Since implementing this in python is not too difficult. For the more readability this style brings than config['section', 'key'], I think it's worthy.
msg111800 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-07-28 12:45
As in ConfigParser you are always accessing a section and value I'm happy with tuple indexing.
msg111821 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-07-28 15:00
Note that the two versions are not exclusive: We can look for an item if a tuple is given and look for a section if it’s a string. Maybe confusing.

> - we can implement a cohesive mapping protocol that extends to get(),
> del, in, etc. For now get() seems somewhat similar to what
> dictionaries give you

We have the same problem in distutils2 with a class that supports some mapping operations but has an incompatible get method. Luckily we can still break compat there.

The winning argument in my opinion is user convenience, not moderate implementation issues. Is is generally useful that config parsers and sections behave like mutable mappings? Then add the methods with the behavior I proposed. Are there compatibility problems and not much incentive? Then do it your way, and add another mechanism for my wished use. It looks like we have to go for the latter.

> - manipulation on the internal structures is much simpler when we
> have a single key like that. Having config['name'] return the section
> would make us create another proxy object just to support mutating
> keys in the section.

Yeah, dict/DictMixin subclasses that implement checking and conversion wouldn’t be hard to write, but maybe it’s not worth it and we just need a method to convert the config parser to a dict (already easy with sections: dict(cp.items(section))

So if every other config parser supports cp['section', 'key'], I’m +1. I can get a real dict for a section, and I’ll open another report to request methods update and asdict methods for RawConfigParser.
msg111822 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-07-28 15:02
> The winning argument in my opinion is user convenience

Well yes, for me too - as the user will always be operating on (section, key) pairs the extra level of indirection seems pointless.
msg111891 - (view) Author: ysj.ray (ysj.ray) Date: 2010-07-29 04:02
> Note that the two versions are not exclusive: We can look for an item if a tuple is given and look for a section if it’s a string. Maybe confusing.

+1. I think this is a good idea. Getting a section and Getting a key-value are both very common operations when parsing "ini" config file. Both operations should be supported conveniently.
msg112105 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-07-30 20:58
OK, please find attached a preliminary patch for the functionality. This patch is complete in terms of implementation with one significant ommision, __setitem__ on the parser (adding a section), which needs further discussion.

Because of the development state for this functionality, the patch includes some comments that are meant to aid reviewers and the developer (me). They are obviously not supposed to be included in the final patch. There is neither any ReST documentation and only the BasicTest was ported to check for the mapping behaviour as well at the moment. All of these will be corrected once there is agreement upon the interface and its implementation.

The mapping interface implementation is using the parser['section']['value'] notation that seems to be more popular amongst the developers. The alternative implementation presented in the first patch shall not be included because "There should be preferably only one obvious way to do it". Implementing both approaches in a consistent manner would be needlessly tricky and would bloat the code, the tests and the documentation. I believe it's not worth it.

The mapping interface implementation enables getting parser['section'] which returns a mutable view on the section. This means that:
- the values are not copied but they are taken from the original parser on demand
- the values mutated using this view are mutated in the original parser

The implementation is using the existing high-level API so that subclassing the original interface makes the mappings work as expected as well. This has the drawback of having potentially weak performance (especially __len__ and __iter__ on a section). One difference is the explicit lack of support for the `__name__` special key. This is because the existing behaviour of `__name__` is very inconsistent and supporting it would only lead to problems. Details here: http://mail.python.org/pipermail/python-dev/2010-July/102556.html

The mapping interface was implemented so that it behaves as close to an actual dictionary as possible. The existing differences are:
- all sections do include DEFAULTSECT values as well
  -- these values cannot be deleted from the section (because technically they are not there). If they are overriden in the section, deleting causes the default value to be visible again. Trying to delete a default value causes a KeyError
  -- because of this reason, .clear() on a section does not leave the section empty
- trying to delete the DEFAULTSECT throws ValueError
- the mapping interface is complete thanks to using the MutableMapping ABC. However, there are two methods in the old API that hide the dictionary interface:
  -- .get() - this one is unsolvable even when we get to implementing default= for it because invoking as .get('string1', 'string2') would be ambiguous. This difference shall be explicitly documented in the docs.
  -- .items() - this one could potentially be solvable by providing a special case when no arguments are passed. The problem is with existing subclasses (possibly 3rd party) which override this method without handling this case. I'm still on the fence whether to include a special case here.

As for adding a section with the mapping protocol access, I like the current behaviour used for instance in ConfigObj is good for us. They use something like:

parser['section'] = {'key1': 'value', 'key2': 'value'}

This should overwrite existing sections as well. If no-one objects, I'll implement this behaviour as well.
msg113311 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-08-08 20:20
Patch updated for a state when #9452 is applied.

Creating/overwriting sections with __setitem__ on the parser added. More thorough unit test suite for the mapping protocol.

No documentation created yet. 

Fred, if you applied #9452 you might review this patch, too (thanks in advance!). If it's more-less okay, I'll introduce documentation as well.

For the record, another difference in behaviour with a regular dictionary was discovered as well:
- by default, all keys in sections are accessible in a case-insensitive manner
  -- that means that for a section that holds key "a", both are True: "a" in parser["section"], "A" in parser["section"]
  -- but, `for option in parser["section"]` lists only `optionxform`ed  option key names

The default is compatible with the legacy API and of course may be changed by specifying `optionxform = str`.
msg113325 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-08-08 21:18
No time to review the patch, but quick feedback: Your mappings behaving
a bit differently than dicts but in a way compatible with usual
configparser access is +1 for me.
msg113326 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-08-08 21:21
I am mostly gathering information for inclusion in the docs. So the comments about differences in behaviour are rather "for the record". Still, I'm happy that your +1 on them, too :)
msg115654 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-09-05 15:45
Patch updated to avoid fuzzes from other commits.
msg117376 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-09-25 15:31
Patch updated:

 * uses collections because they are built into the executable for 3.2
 * uses itertools for the same reasons
 * in the end doesn't use weakrefs since after recent commits (Raymond removed __del__ from OrderedDict) the reference cycle problem went away on its own

Still needs docs but I didn't manage to make sensible edits last night (the train I took from Hannover arrived 80 minutes later).
msg120459 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-04 23:54
Patch updated yet again:

 * optional .get() arguments are now keyword only. This is a bit
   backwards incompatible but:

   * protects users of mapping protocol access from invoking get() as if
     the last positional argument was the `fallback` value (it would be
     `raw` for (Safe)ConfigParsers and `vars` for RawConfigParsers)

   * so it will improve API compatibility between Raw and others

   * it will make the client code more readable, e.g.:

      anyParser.get('section', 'value', fallback=True)

      vs.

      rawParser.get('section', 'value', {}, True)
      safeParser.get('section', 'value', 0, {}, True)
      parserUsedWrong('section', 'value', True) # NOT a fallback! 

 * mapping protocol access is now quite thoroughly documented

 * renamed `default` arguments to get() to `fallback`. This is *not*
   backwards incompatible because those arguments were just recently
   added by me. I chose to rename them since because of DEFAULTSECT
   there already is a well-defined "default value" concept. Now the
   situation is much clearer, e.g. the docs can simply state: "default
   values have precedence over fallback values." and this is
   unambiguous.

As for some suggestions to just provide the section dictionaries
directly, this won't work because of interpolation and the terrific
DEFAULT section special case, e.g.  if a config file contains a DEFAULT
section and other sections lack a value, it will be returned instead on
get(). Also, DEFAULT values are used in interpolations. On top of that,
the provided section proxies ensure while setting that the database
holds only strings (which it should by design).

There is still one paragraph to be finished (Customizing Parser
Behaviour). This has to wait for tomorrow.
msg120511 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-05 16:57
Documentation complete.
msg120577 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2010-11-06 03:56
Code
====

- The __name__-aversion in the mapping interface is a little
  heavy-handed, but given the implementation of __name__ elsewhere, I
  think this can be revisited separately if anyone cares enough.

  In particular, it should be allowed to give an __name__ value
  explicitly and have it exposed:

      >>> import configparser
      >>> import io

      >>> cp = configparser.RawConfigParser()
      >>> cp.read_file(io.StringIO('[sect]\n__name__ = FooBar\n'))

      >>> cp.get('sect', '__name__')
      'FooBar'

      >>> cp['sect']['__name__']
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File ".../py3k/Lib/configparser.py", line 1132, in __getitem__
          raise ValueError(self._noname)
      ValueError: __name__ special key access and modification not supported through the mapping interface.

  In this example, the __name__ key isn't special, it's just a key.

Documentation
=============

- I dislike the 'fallback'/'default' terminology problem; everywhere
  else, the fallback is called 'default'.  This really points to
  configparser (ne ConfigParser) not having been fully thought out
  before it was released as part of Python.  There's probably nothing
  to be done about it at this point.

  Has anyone attempted to determine how widely used the "defaults"
  (mis-)feature is?  I've not seen a good user for it as implemented
  outside the original application ConfigParser was written for
  (though others probably exist that have a similar need).

- There's no value in saying

      `None` can be provided as a `fallback` value.

  (regardless of the terminology).  None is a value like any other, and
  the docs already indicate it is used if provided.  If anything need be
  said, identify the exception raised if the default isn't provided and
  no value is present.

  Identifying None as a possible value isn't needed or done in any of
  the mapping descriptions, so let's not add it here, even for the
  methods that don't correlate to mapping methods.

- I dislike the documentation style where we indicate arguments are
  keyword-only using a bare * in the signature, and then repeat that in
  prose (this is not the only place this happens).  It's redundant
  (more work to maintain) and more verbose than helpful.

I still need to give the new docs a thorough read.
msg120596 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-11-06 07:44
Email uses 'failobj' instead of 'default'.  I find that kind of odd, but oh well.

I'm not sure how useful a single data point is, but just last month I wrote an application that uses the DEFAULT section.  Each configfile section gives a set of parameters describing certain attributes of an input file, and the DEFAULT section gives the default values for each of those parameters.  A few years ago I had an application that made heavy use of interpolation and provided for config files that would override 'earlier' config files, and I used the 'defaults' parameter to the constructor to achieve this easily.  I'm pretty sure I also added a few defaults programatically, but the memory is a bit hazy (I no longer maintain that application and I don't think it is in use any longer).
msg120929 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-10 19:06
Committed in r86402.

Documentation changes pending, to be made after Fred's detailed review. Fred, you can review using trunk. Reassigned as a reminder.
msg120954 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-11 14:39
In my experience of cfg files, it’s uncommon to have leading spaces (and also, may I add, ugly).  It’s great that configparser supports them now, but I wouldn’t have put them in the basic examples.
msg120957 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-11 15:21
Yup, most config parsers don't support that so there are not so many INI type files in the wild with this kind of formatting. On the other hand, Samba, XULRunner, mke2fs and others all use this.

I mean, you're not forced to use that ugly feature but it's good to have it exposed so people don't later say they knew not that it was possible.

Of course, I'm open for suggestions on better examples :)
msg120973 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-11 21:39
Well, I suggest removing the leading spaces in the examples, except maybe in one place if you think it’s needed.
msg121000 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-12 00:57
Fred, while you're at it have a look at your message here: http://bugs.python.org/issue9421#msg115558 (also documentation remarks on an otherwise closed issue).
History
Date User Action Args
2010-11-20 15:32:13lukasz.langasetstatus: open -> closed
resolution: accepted
2010-11-12 00:57:35lukasz.langasetmessages: + msg121000
2010-11-11 21:39:13eric.araujosetmessages: + msg120973
2010-11-11 15:21:39lukasz.langasetmessages: + msg120957
2010-11-11 14:39:59eric.araujosetmessages: + msg120954
2010-11-10 19:06:11lukasz.langasetkeywords: + needs review, - patch
assignee: lukasz.langa -> fdrake
messages: + msg120929

stage: test needed -> commit review
2010-11-06 07:44:39r.david.murraysetnosy: + r.david.murray
messages: + msg120596
2010-11-06 03:56:14fdrakesetmessages: + msg120577
2010-11-05 16:58:07lukasz.langasetfiles: - issue5412.diff
2010-11-05 16:57:54lukasz.langasetfiles: + issue5412.diff

messages: + msg120511
2010-11-04 23:54:16lukasz.langasetfiles: + issue5412.diff

messages: + msg120459
2010-11-04 23:51:56lukasz.langasetfiles: - issue5412.diff
2010-09-25 15:31:33lukasz.langasetfiles: + issue5412.diff

messages: + msg117376
2010-09-25 15:29:45lukasz.langasetfiles: - issue5412.diff
2010-09-05 15:45:19lukasz.langasetfiles: + issue5412.diff

messages: + msg115654
2010-09-05 15:42:26lukasz.langasetfiles: - issue5412.diff
2010-08-08 21:21:08lukasz.langasetmessages: + msg113326
2010-08-08 21:18:03eric.araujosetmessages: + msg113325
2010-08-08 21:12:28lukasz.langasetassignee: lukasz.langa
2010-08-08 20:21:00lukasz.langasetfiles: + issue5412.diff

messages: + msg113311
2010-08-08 20:13:00lukasz.langasetfiles: - issue5412.diff
2010-08-02 18:44:01fdrakesetnosy: + fdrake
2010-07-30 20:58:58lukasz.langasetfiles: + issue5412.diff

messages: + msg112105
2010-07-29 04:02:27ysj.raysetmessages: + msg111891
2010-07-28 15:02:42michael.foordsetmessages: + msg111822
2010-07-28 15:00:18eric.araujosetmessages: + msg111821
title: extend configparser to support [] syntax -> extend configparser to support mapping access(__*item__)
2010-07-28 12:45:00michael.foordsetmessages: + msg111800
2010-07-28 12:43:36ysj.raysetnosy: + ysj.ray
messages: + msg111799
2010-07-28 10:55:09lukasz.langasetmessages: + msg111786
2010-07-28 10:26:03eric.araujosetnosy: + eric.araujo
messages: + msg111782
2010-07-26 13:42:44michael.foordsetnosy: + michael.foord
messages: + msg111616
2010-07-24 02:34:27lukasz.langasetnosy: + lukasz.langa
messages: + msg111420
2010-07-09 16:21:05BreamoreBoysetversions: + Python 3.2, - Python 3.1
nosy: + BreamoreBoy

messages: + msg109761

stage: test needed
2009-03-03 15:14:34jkaufmancreate