classification
Title: xmlrpc library returns string which contain null ( \x00 )
Type: behavior Stage: test needed
Components: XML Versions: Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Steven.Hartland, haypo, loewis
Priority: normal Keywords: patch

Created on 2010-01-17 19:59 by Steven.Hartland, last changed 2010-01-21 02:26 by Steven.Hartland.

Files
File name Uploaded Description
xmlrpc_byte_string.patch haypo, 2010-01-21 02:15
Messages (4)
msg97972 - Author: Steven Hartland (Steven.Hartland) Date: 2010-01-17 19:59
When a SimpleXMLRPCServer returns data that includes strings containing a \x00 character, the null byte is sent as-is inside the response, which makes the XML invalid.

The expected result is that the data should be treated as binary and base64 encoded.

The bug appears to be in the core xmlrpc library which relies on type( value ) to determine the data type. This returns str for a string even if it includes the null char.
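A minimal sketch of the behaviour described above, assuming Python 3, where the module is named xmlrpc.client (the report is against Python 2.6, where it is xmlrpclib). The sample value is made up for illustration. It marshals a string containing \x00 and then tries to parse the result back:

```python
import xmlrpc.client
from xml.parsers.expat import ExpatError

# Hypothetical sample value: an ordinary str that happens to contain a NUL.
value = "spam\x00eggs"

# The marshaller picks the encoder from the Python type, so this is
# written as <string>...</string> with the NUL left in place.
payload = xmlrpc.client.dumps((value,), methodresponse=True)
nul_in_payload = "\x00" in payload

# Parsing the response back is where the problem becomes visible.
try:
    xmlrpc.client.loads(payload)
    parse_error = None
except ExpatError as exc:
    parse_error = exc

print("NUL in payload:", nul_in_payload)
print("parse error:", parse_error)
```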
msg98095 - Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-21 02:09
Marshaller.dump_string() encodes a byte string in <string>...</string> using the escape() function. A byte string can also be encoded in base64 using <base64>...</base64>. This type is described in the XML-RPC specification, but I don't know whether all XML-RPC implementations understand it.
http://www.xmlrpc.com/spec

Should we change the default type to base64, or only fall back to base64 if the byte string cannot be encoded in XML? Testing whether a byte string can be encoded in XML can be slow, and setting the default type to base64 may cause compatibility issues :-/
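For reference, the base64 route is already available when the caller wraps the data explicitly. The sketch below assumes Python 3's xmlrpc.client (xmlrpclib in 2.6 has the same Binary class) and a made-up sample value. It shows the opt-in form, not the automatic fallback being discussed:

```python
import xmlrpc.client

# Hypothetical sample payload containing a NUL byte.
raw = b"spam\x00eggs"

# Wrapping in Binary makes the marshaller emit <base64>...</base64>.
payload = xmlrpc.client.dumps((xmlrpc.client.Binary(raw),), methodresponse=True)

# With default settings, loads() returns a Binary object; .data holds the bytes.
params, method = xmlrpc.client.loads(payload)
decoded = params[0].data

print(payload)
print(decoded == raw)
```

The drawback is the one raised above: the receiving side gets a base64 value rather than a string, so every peer has to understand that type.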
msg98096 - Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-21 02:15
Here is an example patch using the following test:

   all(32 <= ord(byte) <= 127 for byte in value)

I don't know how much slower the patch is, but at least it doesn't raise an "ExpatError: not well-formed (invalid token): ...".
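The attached patch is not reproduced here. As a rough illustration of the fallback idea only, the following sketch applies the same range test outside the marshaller. It assumes Python 3's xmlrpc.client, and the helper names is_printable_ascii and to_xmlrpc_value are hypothetical, not part of the library or the patch:

```python
import xmlrpc.client

def is_printable_ascii(data: bytes) -> bool:
    # Same range as the test quoted above; iterating bytes in
    # Python 3 yields ints, so ord() is not needed.
    return all(32 <= byte <= 127 for byte in data)

def to_xmlrpc_value(data: bytes):
    # Hypothetical helper: send as a plain string when the test passes,
    # otherwise fall back to base64 via Binary.
    if is_printable_ascii(data):
        return data.decode("ascii")
    return xmlrpc.client.Binary(data)

safe = xmlrpc.client.dumps((to_xmlrpc_value(b"hello"),))
unsafe = xmlrpc.client.dumps((to_xmlrpc_value(b"he\x00llo"),))

print(safe)
print(unsafe)
```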
msg98097 - Author: Steven Hartland (Steven.Hartland) Date: 2010-01-21 02:26
One thing that springs to mind: how valid is that test when applied to UTF-8 data?
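To make the question concrete, here is a small sketch (Python 3, with the hypothetical name passes_range_test) that runs the quoted range test over a few UTF-8 encoded samples. Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so any non-ASCII text fails the test, as do tab and newline, which are legal in XML:

```python
def passes_range_test(data: bytes) -> bool:
    # The range test from the message above, written for Python 3 bytes.
    return all(32 <= byte <= 127 for byte in data)

# Made-up sample strings, encoded as UTF-8.
samples = {
    "plain ascii": "cafe".encode("utf-8"),
    "accented": "caf\u00e9".encode("utf-8"),  # b'caf\xc3\xa9'
    "with newline": "a\nb".encode("utf-8"),
}

results = {name: passes_range_test(data) for name, data in samples.items()}
for name, ok in results.items():
    print(name, "->", "string" if ok else "base64")
```

Under this test, valid UTF-8 text with any non-ASCII character would be sent as base64 even though it could be represented in XML.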
History
Date User Action Args
2010-01-21 02:26:22 Steven.Hartland set messages: + msg98097
2010-01-21 02:15:02 haypo set files: + xmlrpc_byte_string.patch
keywords: + patch
messages: + msg98096
2010-01-21 02:09:11 haypo set nosy: + haypo
messages: + msg98095
2010-01-17 20:17:37 brian.curtin set priority: normal
nosy: + loewis
stage: test needed
2010-01-17 19:59:27 Steven.Hartland create