Message240927
Here's an updated patch for #1:
Existing Patch:
- Move tokenizer.h from Parser/ to Include/
- Add PyAPI_FUNC to export tokenizer functions
New:
- Removed unused, undefined PyTokenizer_RestoreEncoding
- Include PyTokenizer_State with limited ABI compatibility (but still undocumented)
- Namespace the struct name (PyTokenizer_State)
- Documentation
I'd like particular attention to the documentation for the tokenizer -- I'm not entirely confident that I have documented the functions correctly! In particular, I'm not sure how PyTokenizer_FromString handles encodings.
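For comparison while reviewing the encoding question, here is a minimal sketch of how the pure-Python `tokenize` module resolves a source encoding (BOM, then PEP 263 coding cookie, then a UTF-8 default). This is not the C tokenizer, and I am only assuming that PyTokenizer_FromString is meant to follow similar rules; the helper name `declared_encoding` and the sample inputs are made up for illustration.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding the pure-Python tokenize module detects.

    Hypothetical helper for illustration only; it does not call into
    the C tokenizer discussed in this issue.
    """
    # detect_encoding() wants a readline callable that yields bytes.
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# Sample inputs (assumptions, not taken from the patch).
with_cookie = b"# -*- coding: latin-1 -*-\nx = 1\n"
without_cookie = b"x = 1\n"
with_bom = b"\xef\xbb\xbfx = 1\n"

for sample in (with_cookie, without_cookie, with_bom):
    print(declared_encoding(sample))
```

If the C function turns out to behave differently for any of these three cases, that difference is probably what the documentation needs to spell out.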
There's a further iteration possible here, but it's beyond my understanding of the tokenizer and of possible uses of the API. That would be to expose some of the tokenizer state fields and document them, either as part of the limited ABI or even the stable API. In particular, there are about a half-dozen struct fields used by the parser, and those would be good candidates for addition to the public API.
If that's desirable, I'd prefer to merge a revision of my patch first, and keep the issue open for subsequent improvement.
Date | User | Action | Args
2015-04-14 16:08:00 | djmitche | set | recipients: + djmitche, effbot, amaury.forgeotdarc, kirkshorts, meador.inge, Andrew.C
2015-04-14 16:08:00 | djmitche | set | messageid: <1429027680.06.0.0199156518725.issue3353@psf.upfronthosting.co.za>
2015-04-14 16:08:00 | djmitche | link | issue3353 messages
2015-04-14 16:07:59 | djmitche | create |