Message89623
The attached patch includes Neil's original additions to test_xml_etree.py.
I also noticed that _encode_entity wasn't being called in ElementTree in
py3k. The important bit there is the nested function escape_entities(),
which works in conjunction with _escape and _escape_map.
In 2.x, _encode_entity() is used after _encode() raises a Unicode
exception [1], so I figured it would make sense to take the core
functionality of escape_entities() and integrate it into _encode in the
same fashion -- when an exception is raised.
Basically, I:
- changed _escape regexp from using "[\u0080-\uffff]" to "[\x80-\xff]"
- extracted _encode_entity.escape_entities() and made it
_escape_entities of module scope
- removed _encode_entity()
- added UnicodeEncodeError exception handling in _encode()
I'm not sure what the expected outcome is supposed to be when the text
is not type bytes but str. With this patch, the output has
b"t&#227;t" rather than b"tãt".
Hope this is a step in the right direction.
[1] ElementTree.py:814, ElementTree.py:829, python 2.7 HEAD r50941
Date | User | Action | Args
2009-06-23 05:37:28 | jcsalterego | set | recipients: + jcsalterego, effbot, pitrou, hodgestar, Neil Muller
2009-06-23 05:37:27 | jcsalterego | set | messageid: <1245735447.77.0.911688551388.issue6233@psf.upfronthosting.co.za>
2009-06-23 05:37:25 | jcsalterego | link | issue6233 messages
2009-06-23 05:37:23 | jcsalterego | create |