Rietveld Code Review Tool

#14579: Possible vulnerability in the utf-16 decoder after error handling

Created: 5 years, 9 months ago by storchaka+cpython
Modified: 5 years, 6 months ago
Reviewers: martin
CC: loewis, Georg, AntoinePitrou, haypo, Benjamin Peterson, ezio.melotti, Arfrever, asvetlov, henri_nerv.fi, devnull_psf.upfronthosting.co.za, sidhpurwala.huzaifa_gmail.com, storchaka
Visibility: Public.

Patch Set 1
Patch Set 2 (total comments: 1)
Patch Set 3
Patch Set 4 (total comments: 8)
Patch Set 5
Patch Set 6
Patch Set 7
Patch Set 8
Patch Set 9

Lib/test/test_codecs.py: 2 chunks, +26 lines, -4 lines, 0 comments

Messages

Total messages: 7
loewis
http://bugs.python.org/review/14579/diff/4646/16500 File Objects/unicodeobject.c (right): http://bugs.python.org/review/14579/diff/4646/16500#newcode3610 Objects/unicodeobject.c:3610: re-created or resized the unicode object. */ I think ...
5 years, 9 months ago #1
storchaka_gmail.com
This is a copy of the comment from `decode_utf8_errors` (under the `data` is here meant ...
5 years, 9 months ago #2
loewis
http://bugs.python.org/review/14579/diff/4702/16716 File Objects/unicodeobject.c (right): http://bugs.python.org/review/14579/diff/4702/16716#newcode3478 Objects/unicodeobject.c:3478: e2 = e - 1; I propose to drop ...
5 years, 8 months ago #3
storchaka_gmail.com
http://bugs.python.org/review/14579/diff/4702/16716 File Objects/unicodeobject.c (right): http://bugs.python.org/review/14579/diff/4702/16716#newcode3478 Objects/unicodeobject.c:3478: e2 = e - 1; On 2012/04/24 14:18:05, loewis ...
5 years, 8 months ago #4
storchaka_gmail.com
http://bugs.python.org/review/14579/diff/4702/16716 File Objects/unicodeobject.c (right): http://bugs.python.org/review/14579/diff/4702/16716#newcode3478 Objects/unicodeobject.c:3478: e2 = e - 1; I'm really in a ...
5 years, 8 months ago #5
loewis
http://bugs.python.org/review/14579/diff/4702/16716 File Objects/unicodeobject.c (right): http://bugs.python.org/review/14579/diff/4702/16716#newcode3478 Objects/unicodeobject.c:3478: e2 = e - 1; On 2012/04/24 21:41:21, storchaka ...
5 years, 8 months ago #6
storchaka_gmail.com
5 years, 8 months ago #7
> > It will affect the performance.
> How much?

I see a 10% degradation for code points >= U+8000. This is just a bugfix, and
a bugfix without degradation is possible.

> In any case, readability is *very* important to Python's source code,
> much much much much much more than performance. Please accept that.

I understand this, which is why I limit myself to minimal changes, or offer a
few options for you to choose from.

> > e2 points to the byte following the last byte, which can begin a 2-byte
> > UTF-16 code. In the old code it is just e.
> 
> Maybe this is an English language problem here: e2 *follows* the last byte????
> e2 is e-1 (is it not?), so it must be before e. I didn't notice that e is now
> q+size, so ISTM that *e2 /is/ the last byte.

If size is even, e2 points to the last used byte. If size is odd, e2 points to
the first unused byte. In either case decoding stops when q >= e2. q[0] and
q[1] are both inside the input buffer if e - q >= 2 (that is, q < e2).
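
The bound described above can be illustrated with a small standalone sketch.
This is not code from the patch: count_code_units is a hypothetical helper,
and it assumes e = buf + size as mentioned in the quoted comment. It only
counts complete 2-byte code units, to show that the loop condition q < e2
keeps both q[0] and q[1] inside the buffer for even and odd sizes alike.

```c
#include <stddef.h>

/* Hypothetical helper (not from the patch): counts the complete 2-byte
   UTF-16 code units in buf[0..size), using the q < e2 bound. */
size_t
count_code_units(const unsigned char *buf, size_t size)
{
    const unsigned char *q = buf;
    const unsigned char *e, *e2;
    size_t n = 0;

    if (size == 0)
        return 0;       /* avoid forming a pointer before the buffer */
    e = buf + size;     /* one past the last input byte */
    e2 = e - 1;         /* last used byte if size is even,
                           first unused byte if size is odd */
    while (q < e2) {    /* same condition as e - q >= 2 */
        /* q[0] and q[1] are both inside the buffer here */
        q += 2;
        n++;
    }
    return n;
}
```

With a 5-byte buffer the loop runs twice and leaves the trailing odd byte
untouched; with a 4-byte buffer it also runs twice and stops with q == e.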

But forget about it. I made a new patch.