Author: dark-storm
Date: 2004-12-03.15:42:49

Content:

I checked the decoding_fgets function (and the enclosed call
to fp_readl). The patch is more problematic than I thought,
since decoding_fgets takes not only a pointer to the token
state but also a pointer to a destination string buffer.
Reallocating that buffer from within fp_readl would be a very
ugly hack, since you'd have to reallocate "foreign" memory
based on a pointer comparison (comparing the passed string
buffer's pointer against tok->inp || tok->buf...).

As it stands now, patching the tokenizer would mean changing
the function signatures or otherwise restructuring the code
(more error-prone). Another possible solution would be to
provide a specialized readline() function for which the
assumption that at most size bytes are returned actually holds.
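
To make the idea concrete, here is a minimal Python sketch of
such a readline() wrapper (LimitedLineReader is a made-up name,
and the real change would of course have to go into the
tokenizer's C code):

    import io

    class LimitedLineReader:
        """Hypothetical wrapper: readline(size) never returns more
        than size characters; any excess is buffered and handed out
        by the next call instead of being dropped."""

        def __init__(self, fp):
            self.fp = fp
            self.pending = ""  # overflow left over from a previous call

        def readline(self, size=-1):
            line = self.pending if self.pending else self.fp.readline()
            if 0 <= size < len(line):
                self.pending = line[size:]  # keep the remainder for later
                return line[:size]
            self.pending = ""
            return line

    r = LimitedLineReader(io.StringIO("spam and eggs\n"))
    print(repr(r.readline(4)))  # 'spam'
    print(repr(r.readline()))   # ' and eggs\n'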

Oh, and about the UTF-8 decoding: readline()'s size
restriction applies to the already decoded string (at least it
should), so that shouldn't be an issue. Maybe an optional
limit=None parameter should be added to readline(): by default
it wouldn't limit the returned string, but it would if the
parameter is a positive number.
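
Roughly the semantics I have in mind (no such limit parameter
exists today; this is just a sketch of the proposal, written as
a free function over a file-like object):

    import io

    def readline(stream, limit=None):
        # Proposed semantics: limit=None (the default) imposes no
        # restriction; a positive limit caps the length of the
        # *decoded* string that is returned.
        line = stream.readline()
        if limit is not None and limit > 0:
            return line[:limit]
        return line

    buf = io.StringIO("spam and eggs\n")
    print(repr(readline(buf)))           # 'spam and eggs\n'
    buf.seek(0)
    print(repr(readline(buf, limit=4)))  # 'spam'

In practice the truncated remainder would have to be buffered,
as in the previous sketch, rather than silently dropped.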

Just my $0.02.