
classification
Title: string.Template should allow inspection of identifiers
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.11
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, ben11kehoe, miss-islington, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2022-01-08 20:12 by ben11kehoe, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
PR 30493 (status: merged), linked by ben11kehoe on 2022-01-09 03:43
Messages (11)
msg410112 - Author: Ben Kehoe (ben11kehoe) Date: 2022-01-08 20:12
Currently, the only thing that can be done with a string.Template instance and a mapping is either attempt to substitute with substitute() and catch a KeyError if some identifier has not been provided in the mapping, or substitute with safe_substitute() and not know whether all identifiers were provided.
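
A short sketch of the limitation described above, using only the standard library. The template text and the partial mapping are made up for illustration.

```python
from string import Template

template = Template("Dear $name, your order $order_id ships on $date.")
partial = {"name": "Ada"}  # illustrative mapping; two identifiers are missing

# substitute() stops at the first missing identifier and reports only that one.
try:
    template.substitute(partial)
except KeyError as exc:
    first_missing = exc.args[0]

# safe_substitute() never raises for missing keys; unresolved placeholders are
# simply left in the output, so the caller is not told what was missing.
rendered = template.safe_substitute(partial)

print(first_missing)
print(rendered)
```

Neither call reports that both order_id and date are absent, which is the gap this proposal is about.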

I propose adding a method that returns the identifiers in the template. Because the template string and pattern are exposed, this is already possible as a separate function:

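def get_identifiers(template):
    # Collect the 'named' ($name) and 'braced' (${name}) groups of every match;
    # escaped ($$) and invalid matches yield None and are filtered out.
    return list(
        set(
            filter(
                lambda v: v is not None,
                (mo.group('named') or mo.group('braced')
                 for mo in template.pattern.finditer(template.template))
            )
        )
    )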

However, this function is not easy for a user of string.Template to construct without learning how the template pattern works (which is documented but intended to be learned only when subclassing or modifying id patterns).

As a method on string.Template, this would enable use cases like more comprehensive error handling (e.g., finding all missing mapping keys at once) or interactive prompting.
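
As a rough illustration of the "all missing keys at once" use case, the sketch below reuses the same idea as the function above. The helper name missing_keys, the template text and the mapping are hypothetical and chosen only for the example.

```python
from string import Template

def get_identifiers(template):
    # Same idea as the function above: collect the 'named' and 'braced' groups.
    names = (mo.group("named") or mo.group("braced")
             for mo in template.pattern.finditer(template.template))
    return {name for name in names if name is not None}

def missing_keys(template, mapping):
    # Hypothetical helper. Sorted so the error message is deterministic,
    # because set iteration order is arbitrary.
    return sorted(get_identifiers(template) - set(mapping))

template = Template("$greeting, ${name}! You owe $$${amount} to $name.")
missing = missing_keys(template, {"greeting": "Hello"})
message = "missing template values: " + ", ".join(missing)
print(message)
```

The escaped "$$" is not reported as an identifier, and the repeated $name appears only once in the message.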
msg410127 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2022-01-09 01:11
I've never personally needed this, but I could see where it could come in handy.

I wonder if __iter__() would be the right API for that?  I wonder then if we should also implement __contains__()?

Would you be interested in creating a PR for the feature?
msg410128 - Author: Ben Kehoe (ben11kehoe) Date: 2022-01-09 02:51
Happy to make a PR! In my mind I had been thinking it would be the get_identifiers() method with the implementation above, returning a list.

As for __iter__, I'm less clear on what that would look like:

t = string.Template(...)
for identifier in t:
  # what would I do here?
  # would it include repeats if they appear more than once in the template?

I guess there are two ways to think about it: one is "what identifiers are in this template?" which I think should return a list with no repeats, which I can then iterate over or check if a value is in it. The other is, "what are the contents of the template?" in the style of string.Formatter.parse().

Given that string.Template is supposed to be the "simple, no-frills" thing in comparison to string.Formatter, I see less use for the latter option.
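
For comparison, this is roughly what the "contents of the template" style looks like with string.Formatter.parse(), which yields one tuple per literal/field pair rather than a list of names. The format string is made up for illustration.

```python
from string import Formatter

# parse() yields (literal_text, field_name, format_spec, conversion) tuples.
parts = list(Formatter().parse("Hello {name}, you are {age:d} years old"))

# Recovering just the field names takes an extra filtering step.
field_names = [field for _, field, _, _ in parts if field is not None]

for part in parts:
    print(part)
print(field_names)
```

The caller gets the full structure of the string, including literal text, which is more than is needed to answer "what identifiers are in this template?".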
msg410129 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2022-01-09 03:28
I think you’re right that the iterator API isn’t very helpful.  I also agree that you probably really want to answer the “what identifiers are in this template?” question.  As for repeats, there are two ways to think about it.  You could return all the identifiers in the order in which they’re found in the template (and you can unique-ify them if you want by passing that list to set()).  But maybe you don’t really need that either.

get_identifiers() works for me!

msg410131 - Author: Ben Kehoe (ben11kehoe) Date: 2022-01-09 03:49
I opened a PR. By default, it raises an exception if there's an invalid identifier; there's a keyword argument raise_on_invalid to control that.

The implementation I have adds them to a set first, which means the order is not guaranteed. I'm of two minds about this: if there's a big template, you want to gather the identifiers in a set so uniqueness is checked immediately and O(1) and without duplication. On the other hand, if templates are never very big, you could build a list (in order) and check it, O(N) style, or build a duplicate list and set in parallel. Or build a big list and check it at the end.

I kind of think ordering doesn't matter? What would someone do with that information?
msg410144 - Author: Ben Kehoe (ben11kehoe) Date: 2022-01-09 12:47
Having slept on it, I realized that if I were presenting interactive prompts for a template, I would expect the prompts to be in the order that the identifiers appear in the template. Accordingly, I've updated the PR to maintain ordering.
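
One way to get both uniqueness and first-appearance order without a parallel list and set is to lean on dict ordering. This is only a sketch of the trade-off discussed above, not necessarily what the PR does; the function name ordered_identifiers is hypothetical.

```python
from string import Template

def ordered_identifiers(template):
    names = (mo.group("named") or mo.group("braced")
             for mo in template.pattern.finditer(template.template))
    # dicts preserve insertion order (Python 3.7+), so fromkeys() drops
    # repeats with O(1) lookups while keeping first-appearance order.
    return list(dict.fromkeys(name for name in names if name is not None))

ids = ordered_identifiers(Template("$b then $a then ${b} and $c"))
print(ids)
```

The repeated ${b} is dropped and the remaining names come back in template order.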
msg410148 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2022-01-09 13:53
What are the use cases for this feature?
msg410172 - Author: Ben Kehoe (ben11kehoe) Date: 2022-01-09 22:20
The point is to be able to programmatically determine what is needed for a
successful substitute() call. A basic use case for this is better error
messages; calling substitute() with an incomplete mapping will tell you
only the first missing identifier it encounters; if you know all the
identifiers you can raise an error about all the missing identifiers.
Another error handling use case is checking whether the template is valid,
without needing to provide a complete mapping. A use case unrelated to
error handling that I’ve encountered a few times is interactive prompting
for template values, which you can only do if you can get a list of the
identifiers in the template.
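
A minimal sketch of the interactive prompting use case. The names identifiers_in and fill_interactively are hypothetical, and the ask parameter is an assumption added so the example can run without a terminal; in real use it would default to input().

```python
from string import Template

def identifiers_in(template):
    # Unique identifiers in order of first appearance (pattern-based sketch).
    names = (mo.group("named") or mo.group("braced")
             for mo in template.pattern.finditer(template.template))
    return list(dict.fromkeys(n for n in names if n is not None))

def fill_interactively(template, ask=input):
    # Ask once per identifier, in template order, then substitute.
    values = {name: ask(f"{name}: ") for name in identifiers_in(template)}
    return template.substitute(values)

# Canned answers stand in for a user typing at the prompts.
canned = {"host: ": "example.org", "port: ": "8080"}
result = fill_interactively(Template("http://$host:$port/"), ask=canned.__getitem__)
print(result)
```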
msg410174 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2022-01-09 22:39
The simplest way of collecting template names is to use a defaultdict:

>>> import collections, string
>>> d = collections.defaultdict(str)
>>> string.Template('$a $b').substitute(d)
' '
>>> d.keys()
dict_keys(['a', 'b'])

You can use a custom mapping if you need special handling of absent keys.
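
A sketch of what such a custom mapping might look like. The class name RecordingMapping and the "<name>" placeholder format are hypothetical choices for the example; the only mechanism relied on is dict's __missing__ hook.

```python
import string

class RecordingMapping(dict):
    """Hypothetical helper: records keys that substitute() looks up but does not find."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.missing = []

    def __missing__(self, key):
        # Called by dict.__getitem__ for absent keys; the key is not stored.
        if key not in self.missing:
            self.missing.append(key)
        return "<" + key + ">"  # visible placeholder instead of an empty string

values = RecordingMapping(a="1")
text = string.Template("$a $b ${c} $b").substitute(values)
print(text)
print(values.missing)
```

This still requires performing a substitution to learn the names, which is the indirection the next message objects to.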
msg410175 - Author: Ben Kehoe (ben11kehoe) Date: 2022-01-09 22:48
That doesn’t really seem like a Pythonic way of extracting that
information? Nor does it seem like it would be an obvious trick for the
average developer to come up with. A method that provides the information
directly seems useful.
msg410322 - Author: miss-islington (miss-islington) Date: 2022-01-11 19:15
New changeset dce642f24418c58e67fa31a686575c980c31dd37 by Ben Kehoe in branch 'main':
bpo-46307: Add string.Template.get_identifiers() method (GH-30493)
https://github.com/python/cpython/commit/dce642f24418c58e67fa31a686575c980c31dd37
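
For readers on Python 3.11 or later, the merged method can be called directly. The sketch below guards on the interpreter version so it also runs, without doing anything useful, on older versions; the template text is made up for illustration.

```python
import string
import sys

template = string.Template("Hello $name, your ${item} costs $$5. Bye $name.")

if sys.version_info >= (3, 11):
    # Identifiers come back as a list, without repeats, in order of
    # first appearance; the escaped "$$" is not an identifier.
    identifiers = template.get_identifiers()
else:
    identifiers = None  # the method does not exist before Python 3.11

print(identifiers)
```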
History
Date                 User              Action  Args
2022-04-11 14:59:54  admin             set     github: 90465
2022-01-12 22:12:21  AlexWaygood       set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2022-01-11 19:15:51  miss-islington    set     nosy: + miss-islington; messages: + msg410322
2022-01-09 22:48:02  ben11kehoe        set     messages: + msg410175
2022-01-09 22:39:56  serhiy.storchaka  set     messages: + msg410174
2022-01-09 22:20:41  ben11kehoe        set     messages: + msg410172
2022-01-09 13:53:49  serhiy.storchaka  set     nosy: + serhiy.storchaka; messages: + msg410148
2022-01-09 12:47:25  ben11kehoe        set     messages: + msg410144
2022-01-09 03:49:10  ben11kehoe        set     messages: + msg410131
2022-01-09 03:43:54  ben11kehoe        set     keywords: + patch; stage: patch review; pull_requests: + pull_request28698
2022-01-09 03:28:16  barry             set     messages: + msg410129
2022-01-09 02:51:18  ben11kehoe        set     messages: + msg410128
2022-01-09 01:11:38  barry             set     messages: + msg410127
2022-01-08 23:46:34  rhettinger        set     nosy: + barry
2022-01-08 20:12:29  ben11kehoe        create