Issue1436130
Created on 2006-02-21 19:32 by doerwalter, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | |
---|---|
File name | Uploaded |
codecs.diff | doerwalter, 2006-02-21 19:32 |
codecs2.diff | doerwalter, 2006-02-28 12:08 |
codecs3.diff | doerwalter, 2006-03-01 14:47 |
codecs4.diff | doerwalter, 2006-03-03 17:39 |
Messages (13) | |||
---|---|---|---|
msg49559 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-02-21 19:32 | |
This patch extends the codec machinery with incremental codecs: stateful codecs that don't use a stream API. It adds the following: codecs.CodecInfo (a subclass of tuple), which is now the return value of codecs.lookup(); codecs.IncrementalEncoder and codecs.IncrementalDecoder (the basic interface classes); codecs.BufferedIncrementalDecoder (a class that can be used to implement decoders that must handle incomplete input); and codecs.iterencode() and codecs.iterdecode() (generators that use the incremental codecs to encode/decode an input iterable). On the C level, PyCodec_IncrementalEncoder() and PyCodec_IncrementalDecoder() are added. |
|||
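As a rough illustration of the interface described above, the sketch below decodes UTF-8 input that is split in the middle of a character. It is written against the current Python 3 standard library rather than the 2006 patch itself, and `codecs.getincrementaldecoder()` is the stdlib helper used here to obtain the decoder class (the message above does not mention it).

```python
import codecs

# "é" is two bytes in UTF-8; split it across two chunks on purpose.
chunks = [b"caf", b"\xc3", b"\xa9"]

# The decoder keeps the incomplete byte between calls; no stream object is involved.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(chunk) for chunk in chunks]
pieces.append(decoder.decode(b"", final=True))  # flush any remaining state
text = "".join(pieces)

# codecs.lookup() returns a CodecInfo, which is still a tuple subclass.
info = codecs.lookup("utf-8")
print(text, info.name, isinstance(info, tuple))
```

The second chunk produces an empty string because the decoder buffers the lone lead byte until the rest of the character arrives.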
msg49560 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-02-28 12:08 | |
This second version of the patch enhances codecs.iterencode() and codecs.iterdecode(), so that additional keyword arguments are passed through to the Incremental(De|En)coder constructor. |
|||
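A minimal sketch of the pass-through, assuming the current Python 3 behaviour of `codecs.iterencode()` and `codecs.iterdecode()`: the `errors` argument given to the generator ends up in the incremental codec's constructor. The sample data is made up for illustration.

```python
import codecs

# One chunk contains a byte that is not valid ASCII.
chunks = [b"ab", b"\xff", b"cd"]
decoded = "".join(codecs.iterdecode(chunks, "ascii", errors="replace"))

# The same idea for encoding: an unencodable character becomes a character reference.
encoded = b"".join(
    codecs.iterencode(["a", "\u20ac"], "ascii", errors="xmlcharrefreplace")
)
print(repr(decoded), encoded)
```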
msg49561 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-03-01 14:47 | |
This third version of the patch fixes a bug that occurred when the iterator passed to iterencode() or iterdecode() is empty, and updates the docstring in encodings/__init__.py. |
|||
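The empty-iterator case can be checked quickly. This sketch only shows the behaviour of today's Python 3 standard library with UTF-8; it does not reproduce the original bug.

```python
import codecs

# With no input chunks, both generators finish without yielding anything.
empty_decoded = list(codecs.iterdecode([], "utf-8"))
empty_encoded = list(codecs.iterencode(iter(()), "utf-8"))
print(empty_decoded, empty_encoded)
```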
msg49562 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2006-03-02 23:03 | |
Very nice! This is a much better approach than the feed-style path you wanted to take previously. Minor nits: Please separate out the unrelated changes to the IDNA codec into a new patch and assign that to Martin for review. Is it possible to make IncrementalEncoder/Decoder instances iterable per se (without the need to go through the helper functions iterencode/iterdecode)? Thanks. |
|||
msg49563 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-03-03 17:39 | |
This fourth version of the patch removes the changes to Lib/encodings/idna.py (only the addition of the IncrementalEncoder/IncrementalDecoder and the changed getregentry() remain). The removed idna.py changes probably only make sense as a separate patch once this one is in. > Is it possible to make IncrementalEncoder/Decoder instances iterable per se (without the need to go through the helper functions iterencode/iterdecode)? For an IncrementalEncoder/Decoder to be iterable, it would need some iterable from which it gets its input. But this has the same limitation as the stream API: the user is forced to provide the input as a service that the encoder/decoder uses, which requires support for a certain API. The only change would be that it is now an iterator API instead of a stream API. The incremental codecs invert the call logic: the user no longer has to provide a callback service to the codec, but calls the codec directly. This gives much more flexibility. |
|||
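The inverted call logic might look like the following in practice. `LineCollector` and `data_received()` are hypothetical names invented for this sketch; the point is that the caller pushes bytes in whenever they arrive and calls the decoder directly, instead of wrapping its data source in a stream or an iterator.

```python
import codecs


class LineCollector:
    """Hypothetical push-style consumer that assembles decoded lines."""

    def __init__(self, encoding="utf-8"):
        self._decoder = codecs.getincrementaldecoder(encoding)()
        self._pending = ""
        self.lines = []

    def data_received(self, chunk, final=False):
        # The caller drives the codec; the codec never pulls from a source.
        self._pending += self._decoder.decode(chunk, final=final)
        # Everything before the last newline is complete; keep the rest.
        *complete, self._pending = self._pending.split("\n")
        self.lines.extend(complete)
        if final and self._pending:
            self.lines.append(self._pending)
            self._pending = ""


collector = LineCollector()
# Chunk boundaries fall inside multi-byte characters and inside lines.
for chunk in (b"gr\xc3", b"\xbc\xc3\x9f", b"e\nwor", b"ld"):
    collector.data_received(chunk)
collector.data_received(b"", final=True)
print(collector.lines)
```

The same method could be called from a socket loop, a GUI event handler, or a protocol callback without any of them having to look like a file or an iterable.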
msg49564 - (view) | Author: Neal Norwitz (nnorwitz) * | Date: 2006-03-15 08:01 | |
MAL, do you have any more issues with this patch? Should it be assigned to Martin? MAL, Walter, can you review patches 1443155 and 1449471, which I think are related? Should they go in? The first alpha is coming up soon and I'd like to get these patches in ASAP. |
|||
msg49565 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2006-03-15 11:13 | |
The patch looks OK, except for some minor glitches such as this mess :-) ... + if not isinstance(entry, codecs.CodecInfo): + if not 4 <= len(entry) <= 7: + raise CodecRegistryError,\ + 'module "%s" (%s) failed to register' % \ + (mod.__name__, mod.__file__) + if not callable(entry[0]) or \ + not callable(entry[1]) or \ + (entry[2] is not None and not callable(entry[2])) or \ + (entry[3] is not None and not callable(entry[3])) or \ + (len(entry) > 4 and entry[4] is not None and not callable(entry[4])) or \ + (len(entry) > 5 and entry[5] is not None and not callable(entry[5])): raise CodecRegistryError,\ - 'incompatible codecs in module "%s" (%s)' % \ - (mod.__name__, mod.__file__) + 'incompatible codecs in module "%s" (%s)' % \ + (mod.__name__, mod.__file__) + if len(entry)<7 or entry[6] is None: + entry += (None,)*(6-len(entry)) + (mod.__name__.split(".", 1)[1],) + entry = codecs.CodecInfo(*entry) Nevertheless, it can be cleaned up after checkin, so please go ahead with it. Regarding the idna.py patch, I think you should create a new patch item for it and assign it to Martin. Thanks. Neal, I don't have time to review the two CJK patches. |
|||
msg49566 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2006-03-15 11:28 | |
1449471 isn't related to incremental codecs. It contains a simple patch to the Visual Studio project file. I think Walter is the right person to review whether 1443155 conforms to his interface design. :-) (Thank you in advance!) |
|||
msg49567 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-03-15 11:43 | |
Checked in as r43045. Now what do we do with the funny code in encodings.search_function()? Of course we could always *require* the search function to return a CodecInfo object (but only after the CJK codecs are updated, and even then we should have some form of backwards compatibility). |
|||
msg49568 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2006-03-15 12:14 | |
It's only the coding style that looks a bit funny. Requiring CodecInfo objects is not a good idea: that way you'd make it impossible to write codecs that work in both Python 2.5 and 2.4. |
|||
msg49569 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-03-18 15:26 | |
MAL, do you have any suggestions on improving the code in encodings.search_function()? |
|||
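One possible shape for such a cleanup, offered only as a sketch and not as what was eventually checked in: pad the legacy tuple to seven slots first, then validate the fields by name. `normalize_entry()` is a hypothetical helper, `LookupError` stands in for `CodecRegistryError`, and the code targets Python 3 rather than the 2.5-era source quoted above.

```python
import codecs


def normalize_entry(entry, default_name):
    """Return a CodecInfo for a CodecInfo or a legacy 4-to-7 item tuple."""
    if isinstance(entry, codecs.CodecInfo):
        return entry
    if not 4 <= len(entry) <= 7:
        raise LookupError("codec entry must have between 4 and 7 items")
    # Pad to the full seven slots so every field can be addressed by name.
    padded = tuple(entry) + (None,) * (7 - len(entry))
    encode, decode, reader, writer, inc_encoder, inc_decoder, name = padded
    if not callable(encode) or not callable(decode):
        raise LookupError("encode and decode must be callable")
    for optional in (reader, writer, inc_encoder, inc_decoder):
        if optional is not None and not callable(optional):
            raise LookupError("optional entries must be callable or None")
    return codecs.CodecInfo(
        encode, decode, reader, writer, inc_encoder, inc_decoder,
        name if name is not None else default_name,
    )


# A legacy 4-tuple, as a pre-2.5 codec module might have returned it.
legacy = (codecs.utf_8_encode, codecs.utf_8_decode, None, None)
info = normalize_entry(legacy, "demo")
print(info.name, info.encode("x"))
```

Accepting both forms keeps the backwards compatibility that MAL asks for, since a codec module can keep returning a plain tuple.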
msg49570 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-03-18 15:53 | |
OK, I've submitted a new patch (#1453235) for the idna simplification. |
|||
msg49571 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2006-04-15 15:12 | |
Closing the patch. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:15 | admin | set | github: 42929 |
2006-02-21 19:32:00 | doerwalter | create |