classification
Title: Remove unicode_format.h from stringlib
Type:
Stage:
Components: Interpreter Core
Versions: Python 3.8
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: eric.smith
Nosy List: anthonypjshaw, eric.smith
Priority: normal
Keywords: patch

Created on 2015-08-25 21:30 by eric.smith, last changed 2019-05-06 20:42 by anthonypjshaw.

Pull Requests
URL Status Linked Edit
PR 13137 closed anthonypjshaw, 2019-05-06 19:46
Messages (6)
msg249160 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2015-08-25 21:30
Objects/stringlib/unicode_format.h does not belong in stringlib. Back when it was originally written for 2.x, it used stringlib to provide the str and unicode versions of str.format, str.__format__, int.__format__, etc.

However, in 3.x, and especially with PEP 393 (Flexible String Representation), not only is the stringlib functionality no longer needed, it's not used at all.

My suggestion is to just copy the source into Objects/unicodeobject.c, which is the only place it's used. Then delete the stringlib file.

The only downside of including it in unicodeobject.c is that it makes our largest C file about 8% larger:

wc -l says:
1284  Objects/stringlib/unicode_format.h
15414 Objects/unicodeobject.c
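
As a quick check of the "about 8%" figure, the arithmetic can be sketched from the two wc -l counts above. The variable names are only illustrative, and the combined total is an upper bound, since it assumes every line of the header is copied over unchanged:

```python
# Line counts taken from the wc -l output quoted above.
header_lines = 1284          # Objects/stringlib/unicode_format.h
unicodeobject_lines = 15414  # Objects/unicodeobject.c

# Relative growth of unicodeobject.c if the header is copied in verbatim.
growth = header_lines / unicodeobject_lines
combined = header_lines + unicodeobject_lines

print(f"growth: {growth:.1%}")          # growth: 8.3%
print(f"combined line count: {combined}")  # combined line count: 16698
```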

There's some argument to be made to separate out the int.__format__, float.__format__ etc. code, and move them to some other library. I don't think they're a huge part of unicode_format.h. And to separate them out would require creating some _Py_* functions to do their work. But it's probably the right thing to do. I'll investigate.
msg249212 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2015-08-26 18:43
Actually, int.__format__, etc. are not in this file. So that's good.

The things that are in this file but are unrelated to unicodeobject.c are the support routines for implementing string.Formatter. I think I'll move those elsewhere, as a first step.
msg341626 - (view) Author: anthony shaw (anthonypjshaw) * (Python triager) Date: 2019-05-06 19:47
Eric, there have been further changes to Objects/stringlib/unicode_format.h since your original note. I've raised a PR that implements the intent of that 2015 note.

The situation also hasn't changed: unicode_format.h is still only used in unicodeobject.c.
msg341627 - (view) Author: anthony shaw (anthonypjshaw) * (Python triager) Date: 2019-05-06 19:48
> The things that are in this file but are unrelated to unicodeobject.c are the support routines for implementing string.Formatter.

I'm not sure which functions that refers to. If you could let me know, I'd be happy to add those to the PR.
msg341632 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-05-06 19:56
I think I meant things like PyFieldNameIter_Type, but it would require some analysis.
msg341642 - (view) Author: anthony shaw (anthonypjshaw) * (Python triager) Date: 2019-05-06 20:42
The code is mostly:

FieldNameIterator * related functions
FormatterIterator * related functions
MarkupIterator * related functions

There are also a few other utility functions in there.
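
For context, here is a minimal sketch of the Python-level behaviour that these iterators appear to back. It uses only the public string.Formatter API. The link between parse() and the MarkupIterator code, and between get_field() and the FieldNameIterator code, is my reading of the CPython source and should be treated as an assumption. The format string and sample objects are made up for illustration.

```python
import string
from types import SimpleNamespace

fmt = string.Formatter()

# parse() splits a format string into
# (literal_text, field_name, format_spec, conversion) tuples.
# Assumption: in CPython this is the work done by the markup iterator.
parsed = list(fmt.parse("x={point.x:>4} tag={tags[0]!r}"))
print(parsed)

# get_field() resolves a compound field name such as "point.x" or "tags[0]"
# by walking attribute and index lookups.
# Assumption: in CPython the splitting is done by the field name iterator.
kwargs = {"point": SimpleNamespace(x=3), "tags": ["alpha", "beta"]}
attr_value, attr_first = fmt.get_field("point.x", (), kwargs)
item_value, item_first = fmt.get_field("tags[0]", (), kwargs)
print(attr_value, attr_first)
print(item_value, item_first)
```

Whatever happens to the header, this observable behaviour of string.Formatter would need to stay the same, so a check along these lines could serve as a cheap regression test for any move.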
History
Date | User | Action | Args
2019-05-06 20:42:20 | anthonypjshaw | set | messages: + msg341642
2019-05-06 19:56:21 | eric.smith | set | messages: + msg341632; versions: + Python 3.8, - Python 3.6
2019-05-06 19:48:38 | anthonypjshaw | set | messages: + msg341627
2019-05-06 19:47:45 | anthonypjshaw | set | nosy: + anthonypjshaw; messages: + msg341626; stage: patch review ->
2019-05-06 19:46:31 | anthonypjshaw | set | keywords: + patch; stage: patch review; pull_requests: + pull_request13050
2015-08-26 18:43:03 | eric.smith | set | messages: + msg249212
2015-08-25 21:53:48 | eric.smith | set | title: Remove unicode_fornat.h from stringlib -> Remove unicode_format.h from stringlib
2015-08-25 21:30:11 | eric.smith | create |