Author lemburg
Recipients ezio.melotti, lemburg, rhansen
Date 2010-01-02.14:46:41
Message-id <4B3F5C50.2010607@egenix.com>
In-reply-to <1262316507.32.0.63969926675.issue7615@psf.upfronthosting.co.za>
Content
Richard Hansen wrote:
> 
> New submission from Richard Hansen <rhansen@bbn.com>:
> 
> The description of the unicode_escape codec says that it produces "a
> string that is suitable as Unicode literal in Python source code." [1] 
> Unfortunately, this is not true as it does not escape quotes.  For example:
> 
>   print u'a\'b"c\'\'\'d"""e'.encode('unicode_escape')
> 
> outputs:
> 
>   a'b"c'''d"""e

Indeed. Python only uses the decoder of that codec internally.

> I have attached a patch that fixes this issue by escaping single quotes.
>  With the patch applied, the output is:
> 
>   a\'b"c\'\'\'d"""e
> 
> I chose to only escape single quotes because:
>   1.  it simplifies the patch, and
>   2.  it matches string_escape's behavior.

If we change this, the encoder should escape both single and double
quotes, simply because it is not known whether the resulting literal
will be delimited by single or double quotes.
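To illustrate the point, here is a minimal sketch, assuming Python 3
(where encode() returns bytes). The helper escape_for_literal is
hypothetical; it is neither part of the codec nor of the attached patch.
It escapes both quote characters after encoding and checks that the
result round-trips under every quoting style:

```python
import ast


def escape_for_literal(text):
    """Hypothetical helper: unicode_escape output with both quote kinds escaped."""
    # unicode_escape doubles backslashes but leaves ' and " untouched.
    escaped = text.encode('unicode_escape').decode('ascii')
    # Every quote still present is therefore unescaped, so a plain
    # replace cannot double-escape anything.
    return escaped.replace("'", "\\'").replace('"', '\\"')


sample = 'a\'b"c\'\'\'d"""e'
body = escape_for_literal(sample)

# The same escaped body is valid inside any of the four quoting styles.
for quote in ("'", '"', "'''", '"""'):
    literal = quote + body + quote
    assert ast.literal_eval(literal) == sample
```

Escaping only single quotes would fail this check as soon as the
literal is wrapped in double quotes.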

The raw_unicode_escape codec would have to be fixed as well.
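A quick check of that codec, again assuming Python 3, shows that it
leaves both kinds of quotes alone as well and only escapes characters
outside Latin-1:

```python
sample = 'a\'b"c\u20ac'

# raw_unicode_escape writes non-Latin-1 characters as \uXXXX and
# passes quotes through unchanged.
raw = sample.encode('raw_unicode_escape')
assert raw == b'a\'b"c\\u20ac'
```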
History
Date User Action Args
2010-01-02 14:46:44  lemburg  set     recipients: + lemburg, ezio.melotti, rhansen
2010-01-02 14:46:42  lemburg  link    issue7615 messages
2010-01-02 14:46:42  lemburg  create