This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Author vstinner
Recipients ezio.melotti, serhiy.storchaka, vstinner, xiang.zhang
Date 2016-11-18.15:56:24
Message-id <1479484584.63.0.41418328566.issue28531@psf.upfronthosting.co.za>
In-reply-to
Content
Serhiy Storchaka: "The performance of the UTF-7 codec is not important."

Right.


"Actually I'm going to propose replacing it with a Python implementation."

Oh. Sadly, PyUnicode_DecodeUTF7() is part of the stable ABI. Do you want to call the Python codec from the C function for backward compatibility?

I dislike UTF-7 because it's complex, but its codec is not as heavily optimized as the UTF-8 codec, so the code stays fairly small and is not too expensive to maintain.


"This encoder was omitted from _PyBytesWriter-using optimizations on purpose."

Ah? I don't recall that. When I wrote _PyBytesWriter, I skipped UTF-7 because I don't know this codec well and I preferred to keep the code unchanged to avoid bugs :-)


"The patch complicates the implementation."

Hum, I have to disagree. For me, the patched code is no more complex than the current code. The main change is that it adds code checking the kind to better estimate the output length. It's not hard to understand the link between the Unicode kind and max_char_size.
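
To illustrate the idea of picking a per-character size bound from the string's kind, here is a minimal Python sketch. It is not the patch itself: the helper names (unicode_kind, estimate_utf7_size) are hypothetical, and the per-kind factors are back-of-the-envelope assumptions rather than values taken from the patch or from CPython.

```python
# Illustrative sketch only. The factors below are assumptions made for this
# example; they are not the values used by the patch or by CPython.
ASSUMED_MAX_BYTES_PER_CHAR = {1: 5, 2: 5, 4: 8}


def unicode_kind(text: str) -> int:
    """Return 1, 2 or 4: the bytes per code point a compact str would need."""
    max_char = max(map(ord, text), default=0)
    if max_char < 0x100:
        return 1
    if max_char < 0x10000:
        return 2
    return 4


def estimate_utf7_size(text: str) -> int:
    """Guess an upper bound for the UTF-7 encoded size, based on the kind."""
    return len(text) * ASSUMED_MAX_BYTES_PER_CHAR[unicode_kind(text)]


# Compare the guess with what the real codec produces for a few samples.
for sample in ("hello", "\xe9", "a\u20acb", "\U0001F600"):
    encoded = sample.encode("utf-7")
    print(unicode_kind(sample), len(encoded), estimate_utf7_size(sample))
```

The point is only that a string without non-BMP characters can use a smaller worst-case factor than one that needs surrogate pairs, so the initial buffer can be sized more tightly.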


I vote +1 on this patch because I consider that it makes the code simpler, not because it makes the codec faster (I don't really care about UTF-7 codec performance).

But again (as in issue #28398), it's up to you Serhiy: I'm also OK with leaving the code unchanged if you are against the patch.
History
Date                 User      Action  Args
2016-11-18 15:56:24  vstinner  set     recipients: + vstinner, ezio.melotti, serhiy.storchaka, xiang.zhang
2016-11-18 15:56:24  vstinner  set     messageid: <1479484584.63.0.41418328566.issue28531@psf.upfronthosting.co.za>
2016-11-18 15:56:24  vstinner  link    issue28531 messages
2016-11-18 15:56:24  vstinner  create