Message23493
Logged In: YES
user_id=89016
> I checked the decoding_fgets function (and the enclosed call
> to fp_readl). The patch is more problematic than I thought
> since decoding_fgets not only takes a pointer to the token
> state but also a pointer to a destination string buffer.
> Reallocating the buffer within fp_readl would mean a very
> very bad hack since you'd have to reallocate "foreign"
> memory based on a pointer comparison (comparing the passed
> string buffer's pointer against tok->inp || tok->buf...).
Maybe all pointers pointing into the buffer should be moved
into a struct?
> As it stands now, patching the tokenizer would mean changing
> the function signatures or otherwise change the structure
> (more error prone).
All the affected functions seem to be static, so at least in
this regard there shouldn't be any problem.
> Another possible solution would be to
> provide a specialized readline() function for which the
> assumption that at most size bytes are returned is correct.
All the codecs would have to provide such a readline().
BTW, the more I look at your patch the more I think
that it gets us as close to the old behaviour as we
can get.
> Oh and about that UTF-8 decoding. readline()'s size
> restriction works on the already decoded string (at least it
> should), so that shouldn't be an issue.
I don't understand that. fp_readl() does the following
two calls:
buf = PyObject_Call(tok->decoding_readline, args, NULL);
utf8 = PyUnicode_AsUTF8String(buf);
and puts the resulting byte string into the char * passed
in, so even if we fix the readline call the UTF-8 encoded
string might still overflow the available space. How can
tokenizer.c be sure how much the foo->utf8 transcoding
shrinks or expands the string?
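To make the concern concrete, here is a small Python illustration (not
code from tokenizer.c) of how the byte count changes once a decoded line
is re-encoded as UTF-8. The helper name utf8_size is made up for this
example; the two sample lines are arbitrary.

```python
# Illustration only: byte length before and after the decode -> UTF-8 step.

def utf8_size(raw, source_encoding):
    """Return (number of characters, number of UTF-8 bytes) for raw bytes."""
    text = raw.decode(source_encoding)
    return len(text), len(text.encode("utf-8"))

# Ten 'é' characters: 10 bytes in Latin-1, but 20 bytes in UTF-8.
latin1_line = b"\xe9" * 10
# Ten 'a' characters: 20 bytes in UTF-16-LE, but only 10 bytes in UTF-8.
utf16_line = ("a" * 10).encode("utf-16-le")

print(utf8_size(latin1_line, "latin-1"))    # (10, 20)
print(utf8_size(utf16_line, "utf-16-le"))   # (10, 10)
```

So a limit expressed in source bytes or in characters says little about
the UTF-8 size; the only safe general bound is 4 UTF-8 bytes per code
point.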
> Maybe another
> optional parameter should be added to readline() called
> limit=None which doesn't limit the returned string by
> default, but does so if the parameter is a positive number.
But limit to what?
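One possible answer, purely as a sketch: limit to the number of UTF-8
bytes that fit into the destination buffer. The function below is
hypothetical (it is not part of any codec or of the patch) and only
shows that such a cut can be made on a character boundary.

```python
# Hypothetical helper: cut a decoded line so that its UTF-8 encoding
# fits into max_bytes, without splitting a character.

def split_to_utf8_budget(text, max_bytes):
    """Return (head, rest) where head.encode('utf-8') is <= max_bytes long."""
    used = 0
    for index, char in enumerate(text):
        width = len(char.encode("utf-8"))
        if used + width > max_bytes:
            # The next character would overflow the budget: stop before it.
            return text[:index], text[index:]
        used += width
    return text, ""

head, rest = split_to_utf8_budget("a\u00e9\u20ac", 3)
print(repr(head), repr(rest))
```

The caller would still have to keep the unread rest somewhere for the
next call, which is the same buffering problem in a different place.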
Date | User | Action | Args
2007-08-23 14:28:04 | admin | link | issue1076985 messages
2007-08-23 14:28:04 | admin | create |