Issue 467384: provide a documented serialization func

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/35270

classification

Title:	provide a documented serialization func
Type:	enhancement	Stage:
Components:	None	Versions:

process

Status:	closed	Resolution:	out of date
Dependencies:		Superseder:
Assigned To:		Nosy List:	gpolo, gvanrossum, loewis, phr, skip.montanaro, tim.peters
Priority:	normal	Keywords:

Created on 2001-10-03 02:25 by phr, last changed 2022-04-10 16:04 by admin. This issue is now closed.

Messages (29)
msg53262 - (view)	Author: paul rubin (phr)	Date: 2001-10-03 02:25
It would be nice if there was a documented library function for serializing Python basic objects (numbers, strings, dictionaries, and lists). By documented I mean the protocol is specified in the documentation, precisely enough to write interoperating implementations in other languages. Code-wise, the marshal.dumps and loads functions do what I want, but their data format is (according to the documentation) intentionally not specified, because the format might change in future Python versions. Maybe that doc was written long enough ago that it's ok to freeze the marshal format now, and document it? I just mean for the basic types listed above. Stuff like code objects don't have to be specified. In fact it would be nice if there was a flag to the loads and dumps functions to refuse to marshal/ unmarshal those objects. Pickle/cpickle aren't really appropriate for what I'm asking, since they're complicated (they try to handle class instances, circular structure, etc.) and anyway they're not documented either. The XDR library is sort of ok, but it's written in Python (i.e. slow) and it doesn't automatically handle compound objects. Thanks
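The requested "refuse code objects" flag can be approximated in user code by checking types around the existing marshal calls. The following is a minimal sketch, not part of any library: the names `check_basic`, `dumps_basic` and `loads_basic` are hypothetical, the whitelist is an assumption based on the types listed in the request, and there is no cycle detection. Note that the check on loading only runs after `marshal.loads` has already parsed the data, so it does not make unmarshalling untrusted input safe.

```python
import marshal


def check_basic(obj):
    """Raise TypeError unless obj is built only from basic types."""
    if isinstance(obj, (int, float, str, bytes)):
        return
    if isinstance(obj, list):
        for item in obj:
            check_basic(item)
        return
    if isinstance(obj, dict):
        for key, value in obj.items():
            check_basic(key)
            check_basic(value)
        return
    raise TypeError("refusing to handle %s" % type(obj).__name__)


def dumps_basic(obj):
    """Serialize obj with marshal after verifying it holds only basic types."""
    check_basic(obj)
    return marshal.dumps(obj)


def loads_basic(data):
    """Deserialize with marshal, then reject anything outside the whitelist."""
    obj = marshal.loads(data)
    check_basic(obj)
    return obj
```

Usage: `loads_basic(dumps_basic({"n": 2**100, "items": [1, 2.5, "x"]}))` returns an equal dictionary, while passing a code object to `dumps_basic` raises `TypeError`.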
msg53263 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2001-10-06 00:10
Logged In: YES user_id=21627 So what's wrong with xmlrpclib?
msg53264 - (view)	Author: paul rubin (phr)	Date: 2001-10-12 05:12
Logged In: YES user_id=72053 I haven't looked at xmlrpclib, but I'm looking for a simple, compact, binary representation, not something that needs a complicated parser and expands the data by an order of magnitude.
msg53265 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2001-10-12 09:39
Logged In: YES user_id=21627 Well, then I guess you need to specify your requirements more clearly. XML-RPC was precisely developed to be something simple for primitive types and structures that is sufficiently well-specified to allow interoperation between various languages. I don't see why extending the data 'by an order of magnitude' would be a problem per se, nor do I see why 'requiring a complicated parser' is a problem if the implementation already does all the unpacking for you under the hood. Furthermore, I believe it is simply not true that XML-RPC expands the representation by an order of magnitude. For example, the Python Integer object 1 takes 12 bytes in its internal representation (plus the overhead that malloc requires); the XML-RPC representation '<int>1</int>' also uses 12 bytes. In short, you need to say as precisely as possible what it is that you want, or you won't get it. Also, it may be that you have conflicting requirements (e.g. 'compact, binary', and 'simple, easily processible in different languages'); then you won't get it either. For a marshalling format that is accessible from different languages, you had better specify it first, and then implement it.
msg53266 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2001-10-12 14:33
Logged In: YES user_id=6380 Paul, I don't understand the application that you are envisioning. If you think that the marshal format is what you want, why don't you write a PEP that specifies the format? That would solve the documentation problem.
msg53267 - (view)	Author: paul rubin (phr)	Date: 2001-10-12 19:29
Logged In: YES user_id=72053 I just want to be able to do convenient transfers of python data to other programs including over the network. XMLRPC is excessive bloat in my opinion. Sending a number like 12345678 should take at most 5 bytes (a type byte and a 4-byte int) instead of <int>12345678</int>. For long ints (300 digits) it's even worse. The marshal format is fine, and writing a PEP would solve the doc problem, but the current marshal doc says the non-specification is intentional. Writing it in a PEP means not just documenting--it means asking the language maintainers to freeze the marshal format of certain types, instead of reserving the right to change the format in future versions. Writing the PEP only makes sense if you're willing to freeze the format for those types (the other types can stay undocumented). Is that ok with you? Thanks Paul
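The size argument in this message can be checked with a short sketch. The one-byte tag `b"i"` and the helper names are assumptions made for illustration; they are not the marshal format or any XML-RPC library call.

```python
import struct


def pack_int32(n):
    """Hypothetical compact form: one type byte plus a 4-byte little-endian int."""
    return b"i" + struct.pack("<i", n)


def xmlrpc_int(n):
    """The XML-RPC element text for an integer value, as bytes."""
    return ("<int>%d</int>" % n).encode("ascii")


print(len(pack_int32(12345678)))   # 5
print(len(xmlrpc_int(12345678)))   # 19
```

For this value the textual form is roughly four times larger than the 5-byte binary form.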
msg53268 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2001-10-12 19:37
Logged In: YES user_id=6380 If the PEP makes a reasonable case for freezing the spec, yes. I wonder why you can't use decimal? Are you talking really large volumes? The PEP needs to motivate this with an example, preferably plucked from real life!
msg53269 - (view)	Author: paul rubin (phr)	Date: 2001-10-12 20:16
Logged In: YES user_id=72053 Decimal is bad not just because of the data expansion but because the arithmetic to convert a decimal string to binary can be expensive (all that multiplication). I'd rather use hex than decimal for that reason. One envisioned application is communicating with a cryptography coprocessor: an 8-bit microcontroller (with a public key accelerator) connected to the host computer through a slow serial port. Most of the ints involved would be around 300 decimal digits. A simple binary format is a lot easier to deal with in that environment than something like xmlrpc. Also, the format would be used for data persistence, so again, unnecessary data expansion isn't desirable. I looked at XMLRPC and it's not designed for this purpose. It's intended as an RPC protocol over HTTP and isn't well suited for object persistence. Also, it doesn't support integers over 32 bits, and binary strings must be base64 encoded (more bloat). Finally, it's not included with Python, so I'd have to bundle an implementation written in Python (i.e. slow) with my application (I don't know whether Fred's implementation is Python or C). I think the marshal format hasn't changed since before Python 1.5, so basing serialization on marshal would mean applications could interoperate with older versions of Python as well as newer ones, which helps Python's maturity. (Maturity of a program means, among other things, that users rarely need to be told they need the latest version in order to use some feature). Really, the marshal functions are written the way they're written because that's the simplest and most natural way of doing this kind of thing. So the proposal is mainly to make them available for user applications, rather than only for system internals.
msg53270 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2001-10-12 20:24
Logged In: YES user_id=6380 This helps tremendously. I think that marshal is probably overkill. Rather, you need helper routines to convert longs to and from binary. You can do everything else using the struct module, and it's probably easier to write your own protocol using that and these helpers. I suggest that the best place to add these helpers is the binascii module, which already has a bunch of similar things (e.g. hexlify and crc32). Note the xmlrpc is bundled with Python 2.2. Looking forward to your patch (much simpler to get accepted than a PEP :-).
msg53271 - (view)	Author: Tim Peters (tim.peters) *	Date: 2001-10-12 21:09
Logged In: YES user_id=31435 I'm not sure this is making progress. Paul, if you want to use marshal, you already can: the pack and unpack routines are exposed in Python via the marshal module. Freezing the representation isn't a particularly appealing idea; e.g., if anyone is likely to complain about the speed of Python's longs, it's you <wink>, and the current marshal format for longs is just a raw dump of Python's internal long representation -- but the most obvious everything-benefits way to speed Python longs is to increase the size of the "digits" used in its internal representation. If that's ever done, the marshal format would want to change too. It's easy enough to code your own storage format for longs, e.g.

>>> def tobin(i):
...     import binascii
...     ashex = hex(long(i))[2:-1] # chop '0x' and trailing 'L'
...     if len(ashex) & 1:
...         ashex = '0' + ashex
...     return binascii.unhexlify(ashex)

implements "base 256" for unsigned longs, and the runtime cannot be improved by rewriting in C except by a constant factor (the Python spelling has the right O() behavior).
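The snippet in this message is Python 2 code (`long`, trailing `'L'`). In Python 3 the same "base 256" conversion for non-negative integers can be sketched with the built-in `int.to_bytes` and `int.from_bytes`. The function names `to_base256` and `from_base256` are hypothetical, and big-endian byte order is assumed to match the hex-based original.

```python
def to_base256(n):
    """Return the big-endian base-256 digits of a non-negative int."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    # Zero still occupies one byte, as in the hex-based version.
    length = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(length, "big")


def from_base256(data):
    """Inverse of to_base256."""
    return int.from_bytes(data, "big")


print(to_base256(256))                 # b'\x01\x00'
print(from_base256(to_base256(2**64)) == 2**64)  # True
```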
msg53272 - (view)	Author: Skip Montanaro (skip.montanaro) *	Date: 2001-10-12 21:41
Logged In: YES user_id=44345 If you head in the direction of documenting marshal with the aim of potentially interoperating with other languages, I think it would be a good idea to create a Python-independent marshal library. This would facilitate incorporation into other languages. Such a library probably wouldn't be able to do everything marshal can (there isn't an obvious C equivalent of Python's dictionary object, for example), but would still help nail down compatibility issues for the basic scalar types.
msg53273 - (view)	Author: paul rubin (phr)	Date: 2001-10-13 03:08
Logged In: YES user_id=72053 Skip - C has struct objects which are sort of like Python dictionaries. XMLRPC represents structs as name-value pairs, for example. And "other languages" doesn't necessarily mean C. The marshaller should be able to represent the non-Python-specific serializable objects, not just scalars. Basically this means strings, integers (of any length), dictionaries, lists, and floats (hmm--unicode?), but not necessarily stuff like code objects. Having an independent marshal library is ok, I guess, though I don't feel it's necessary to create more implementation work. And one benefit of using the existing marshaller is that it's already available in most versions of Python that people are running (Red Hat 7.1 still comes with Python 1.5 for example). Tim - yes, I originally used a binascii.hexlify hack similar to yours and it worked ok, but it was ugly. I also had to handle strings (generate a length count followed by the contents) and then dictionaries (name-value pairs) and finally felt I shouldn't need to rewrite the marshaller like that. There's already a built-in library function that does everything I need, very efficiently in native code, in one call, and being able to use it is in the "batteries included" spirit. Also, the current long int marshalling format is just a digit count (16-bit digits) followed by the digits in binary. If the digit width changes, the marshalling format doesn't have to change--the marshalling code should still be able to use the same external representation without excessive contortions and without slowing down. (You'll see that it's already not a simple memory dump, but a structure read and written one byte at a time through layers of subroutines). Changing widths while keeping the old format means putting a minor kludge in the marshalling code, but no user will ever notice it. 
As for the speed of Python longs, my stuff's runtime is dominated by modular exponentiations <wink> and I'm already using gmpy for those when it's available (but I don't depend on it). The speedup with gmpy is substantial, but the speed with ordinary Python longs is quite acceptable on my PIII (the StrongARM is another story--probably the C compiler's fault). Examining Python/marshal.c, I don't see any objects of the types I've mentioned that are likely to need to change representations--do you? Btw I notice that the pickle module represents long ints as decimal strings even in "binary" mode, but I'll resist opening another bug for that, for now.
msg53274 - (view)	Author: Tim Peters (tim.peters) *	Date: 2001-10-13 03:43
Logged In: YES user_id=31435 The marshal long format actually uses 15-bit digits, each stored in 16 bits (the high bit of the high byte of which is always 0). That would be a PITA to preserve even if Python just moved to 16-bit digits. marshal's purpose is for efficient loading of .pyc files, where that odd format makes good sense; since it wasn't designed to be a general-purpose data transmission format (and has many shortcomings for such use), I don't want to see a tail wagging the dog here. Cross-release compatibility is taken seriously in pickle, and pickle handles many more cases than marshal, although pickle's author (as you've discovered) didn't give a hoot about efficient storage of longs. I'd rather add an efficient long format to pickle than hobble marshal (although because pickle does take x-release compatibility seriously, it has to continue accepting the "longs as decimal strings" format forever).
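The 15-bit digit layout described here can be inspected from Python. The sketch below assumes the layout used by CPython's marshal for integers that do not fit in 32 bits: a type byte `'l'`, a signed 32-bit little-endian digit count (negative for negative values), then one 16-bit little-endian word per digit. The marshal format is deliberately undocumented, so this is an observation about one implementation and not a specification; `marshal_long_digits` is a hypothetical helper name.

```python
import marshal
import struct


def marshal_long_digits(n):
    """Return (signed digit count, digits) from marshal's encoding of a big int.

    Assumes CPython's layout for the 'l' type code; raises ValueError if
    marshal chose a different encoding (for example for small integers).
    """
    data = marshal.dumps(n, 2)          # version 2 avoids reference flags
    if data[0] & 0x7F != ord("l"):
        raise ValueError("value was not marshalled as a long")
    (count,) = struct.unpack_from("<i", data, 1)
    digits = struct.unpack_from("<%dH" % abs(count), data, 5)
    return count, digits


count, digits = marshal_long_digits(2**64 + 12345)
# Each stored word should stay below 2**15, i.e. its high bit is 0.
print(count, all(d < 2**15 for d in digits))
# Reassemble the value from its 15-bit digits, least significant first.
print(sum(d << (15 * i) for i, d in enumerate(digits)) == 2**64 + 12345)
```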
msg53275 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2001-10-13 08:14
Logged In: YES user_id=21627 I would have never guessed that arbitrarily long ints are a requirement in your application... For that application, I'd recommend using ASN.1 BER as a well-documented, efficient, binary marshalling format. I don't think any other format marshals arbitrarily large integers in a more compact form. You can find an implementation of that in http://www.cnri.reston.va.us/software/pisces/ or http://www.enteract.com/~asl2/software/PyZ3950/asn1.py or http://sourceforge.net/projects/pysnmp (ber.py) I'd be in favour of having a BER support library in the Python core, but somebody would have to contribute such a thing, of course.
msg53276 - (view)	Author: Tim Peters (tim.peters) *	Date: 2001-10-13 08:39
Logged In: YES user_id=31435 Martin, Paul suggested BER previously in 465045. I suspect he's going to suggest this for every module one by one, until somebody bites <wink>. I doubt he wants genuine ASN.1 BER, though, as that's a complicated beast, and he only cares about ints with a measly few hundred bits; regardless, a Python long can't have more digits than can be counted in a C int.
msg53277 - (view)	Author: paul rubin (phr)	Date: 2001-10-13 10:24
Logged In: YES user_id=72053

1) if Python longs are currently implemented as vectors of 15-bit digits (yikes--why on earth would anyone do that) and marshalled like that, then I agree that THAT much weirdness doesn't need to be propagated to future versions. Wow! I never looked at the long int code closely, but the marshal code certainly didn't reflect that. It's still possible to freeze the current marshal format and let future versions define a new mechanism for loading .pyc's. From my own self-interest (of wanting to distribute apps that work across versions) that idea attracts me, but it's probably not the right thing in the long run. Better may be to fix the long int format right away and THEN document/freeze it. (Use a new format byte so the 2.2 demarshaller can still read 2.1 .pyc files). By "fix" I mean use a simple packed binary format, no 15 bit digits, no BER, and the length prefix should be a byte or bit count, not multibyte "digits".

2) Unfortunately it's not easy in portable C with 32 bit longs to use digits wider than 16 bits--multiplication becomes too complicated. If the compiler supports wide ints (long long int) then conditionalized code to use them might or might not be deemed worthwhile. Python's long int arithmetic (unlike Perl's Math::BigInt class) is fast enough to be useable for real applications and I don't expect it to go to the extremes that gmpy does (highly tuned algorithms for everything, asm code for many cpu's, etc). So currently I use gmpy when it's available and fall back on longs if gmpy won't import--this works pretty well so far.

3) I like the idea of a BER/DER library for Python but I don't feel like being the guy who writes it. I'd probably use it if it was available, though maybe not for this purpose. (I'd like to handle X509 certificates in Python). BER really isn't the most efficient way to store long ints, by the way, since it puts just 7 useful bits in a byte.

4) My suggestion of BER in 465045 was motivated slightly differently, which was to add a feature from Perl's pack/unpack function that's missing from Python's struct.pack/unpack. I understand a little better now what the struct module is for, so binascii may be a better place for such a thing. However, I believe Python really needs a pack/unpack module that does all the stuff that Perl's does. Data conversion like that is an area where Perl is still beating Python pretty badly. (Again, I don't feel like being the one who writes the module).

5) Sorry I didn't notice Guido's post of 20:24 earlier (several arrived at once). I guess I'm willing to submit a patch for binascii to read and write longs in binary. It's slightly humorous to put it in binascii since it's a binary-binary conversion with no ascii involved, but the function fits well there in any case. I'd still rather use marshal, since I want to write out more kinds of data than longs, and with a long->binary conversion function I'd still need to supply Python code to traverse dictionaries and lists, and encode strings. Btw, the struct module doesn't have any way to encode strings with a length, except the Pascal format which is limited to 256 bytes and is therefore useless for many things.
msg53278 - (view)	Author: Tim Peters (tim.peters) *	Date: 2001-10-13 19:07
Logged In: YES user_id=31435 I don't buy the argument that pickle is "complicated", as you weren't going to document the parts of the marshal format you didn't care about either. A subset of pickle is just as easy to document and implement across languages as a subset of marshal, but with the key benefit that the pickle format is stable across releases. So if you want a structure packer, pickle is the obvious choice; it just lacks an efficient (in time and space) scheme for storing longs now. And unlike marshal, it isn't a dead end when you decide your app needs something fancier -- pickle already handles just about everything that can be pickled, and is designed to be extensible to user-defined types too, so you can painlessly expand your view of what the "interesting" subset is as your ambitions grow. I don't really know what you mean by "BER". The ASN.1 std <http://www.itu.int/ITU-T/studygroups/com17/languages/X.690_1297.pdf> section 8.3 is quite clear that all 8 bits are used in each byte for integer representations -- it's a giant 2's-comp integer, with a variable-length length prefix, redundant sign bytes are forbidden, and there's nothing special about the last byte. I agree with Martin that ASN.1 BER is as compact a standardized bigint representation as there is.
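The INTEGER rules summarized in this message (tag, variable-length length prefix, minimal two's-complement content using all 8 bits per byte) can be sketched as follows. This is an illustrative encoder and decoder for the INTEGER type with definite lengths only, written from the description above; the function names are hypothetical and it is not a general BER library.

```python
def ber_encode_integer(n):
    """Encode an int as a BER INTEGER: tag 0x02, length, two's-complement value."""
    # Smallest byte count that still leaves room for the sign bit.
    magnitude_bits = (n if n >= 0 else ~n).bit_length()
    content = n.to_bytes(magnitude_bits // 8 + 1, "big", signed=True)
    if len(content) < 128:
        length = bytes([len(content)])            # short form
    else:
        size = (len(content).bit_length() + 7) // 8
        length = bytes([0x80 | size]) + len(content).to_bytes(size, "big")
    return b"\x02" + length + content


def ber_decode_integer(data):
    """Decode a single BER INTEGER produced by ber_encode_integer."""
    if data[0] != 0x02:
        raise ValueError("not an INTEGER tag")
    first = data[1]
    if first < 0x80:
        length, start = first, 2
    else:
        size = first & 0x7F
        length = int.from_bytes(data[2:2 + size], "big")
        start = 2 + size
    content = data[start:start + length]
    if length == 0 or len(content) != length:
        raise ValueError("truncated or empty INTEGER")
    return int.from_bytes(content, "big", signed=True)


print(ber_encode_integer(128).hex())    # 02020080
print(ber_encode_integer(-129).hex())   # 0202ff7f
```

With this layout a 300-digit integer needs about 125 content bytes plus two bytes of tag and length.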
msg53279 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2001-10-13 21:04
Logged In: YES user_id=21627

7-bit vs. 8-bit: You were confusing tag encoding and INTEGER value encoding.

No way to encode a string with a length: suppose you want a 32 bit length, what's wrong with struct.pack("l",len(s))+s

15-bit representation: I believe the add and sub implementations make use of the guarantee that a short won't overflow if the input fits into 15 bits.

Tim: BER = Basic Encoding Rules (as in the subtitle of X.690)

Even after all this discussion, I still cannot see why the existing libraries (including those offered for free by third parties) are not sufficient. It appears that Paul wants, among other things, that marshal becomes documented; it also appears that this won't happen. What the other things are that Paul wants, I cannot tell, so I recommend to close this report with "won't fix". Paul, if you have a specific change that you want to be made, or a specific problem that you want to point out, please submit a new report.

This issue "provide a documented serialization func" really ought to be closed as "Fixed"; xmlrpclib is already part of standard library and fits the original problem description:
- it is a library for serializing Python basic objects
- it is documented in the sense that the protocol is specified precisely enough to write interoperating implementations in other languages.
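The length-prefix idea from this message can be written out as a small Python 3 sketch. It uses the explicit `"<I"` format (4-byte little-endian unsigned) instead of native `"l"`, because in native mode `"l"` follows the platform's C `long` and is 8 bytes on many 64-bit Unix systems. The helper names `pack_bytes` and `unpack_bytes` are hypothetical.

```python
import struct


def pack_bytes(s):
    """Prefix a byte string with its length as a 4-byte little-endian integer."""
    return struct.pack("<I", len(s)) + s


def unpack_bytes(buf, offset=0):
    """Read one length-prefixed byte string; return (value, next offset)."""
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("buffer is shorter than the declared length")
    return buf[start:end], end


stream = pack_bytes(b"hello") + pack_bytes(b"x" * 1000)
first, pos = unpack_bytes(stream)
second, pos = unpack_bytes(stream, pos)
print(first, len(second), pos == len(stream))  # b'hello' 1000 True
```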
msg53280 - (view)	Author: Tim Peters (tim.peters) *	Date: 2001-10-13 22:25
Logged In: YES user_id=31435 Martin, you're right about add and sub, but it's a shallow assumption easy to relax (basically just declare carry/borrow as twodigits instead of digit). I'd be more worried about the stwodigits type, but since nothing is actually broken here I'm not keen to fritter away time proving bounds on the temps in bigint division. How we implement bigints internally is off topic anyway (provided we're not trying to hijack internal implementation formats for unintended purposes). About BER, yes, and the URL I included is to a freely downloadable copy of the X.690 std; section 8.3 spells out the INTEGER rules. They aren't at all the rules Paul sketched, hence "I don't really know what you [Paul] mean by 'BER'". For the rest, while xmlrpclib may meet the letter of what Paul asked for at first, it's clear to me that it doesn't meet what he really wants. My suggestion remains to add a new, efficient bigint format to pickle, which would meet everything except Paul's desire to have a special gimmick limited to his specific application and without having to write one himself. The internal API functions _PyLong_AsByteArray and _PyLong_FromByteArray already do the heavy lifting in both directions (to or from base 256, unsigned or complemented, big- or little-endian).
msg53281 - (view)	Author: paul rubin (phr)	Date: 2001-10-13 23:29
Logged In: YES user_id=72053 A pickle subset ("gherkin"?) could possibly also fill this need, if it was documented, even though pickle format is considerably more complicated than marshal format (it uses marshal.dumps for binary output, actually taking apart the marshalled strings). It was obvious in seconds how marshal.c works but after 30 minutes of looking at pickle.py I'm still not sure I understand it. It looks like the unpickler can construct arbitrary class instances and import arbitrary modules, which makes a security hole if the pickled strings are potentially hostile, but I might not be reading it right. Also, the unpickler must implement constant folding (the memo scheme), which complicates it somewhat, though it's not that bad. The idea of leaving the marshal formats of some Python-specific objects undocumented isn't to get out of documenting stuff, but to leave those formats open to later change. Re BER/DER, Burt Kaliski's "Layman's Guide" is pretty readable (http://borg.isc.ucsb.edu/aka/Auth/ASN1layman.htm). You're right about using all 8 bits in BER integers--it looks like the 7 bit representation is only used for OID components (I didn't realize that til checking on it just now). BER might be ok for what I'm doing--I'm not sure right now since I don't understand ASN1 that well. It looks not in the spirit of marshal/pickle though: to encode a compound object it looks like you need an ASN1 spec of EXACTLY what you expect to find in the object.
msg53282 - (view)	Author: paul rubin (phr)	Date: 2001-10-14 00:49
Logged In: YES user_id=72053 I agree with Tim that the internal implementation of long arithmetic isn't relevant to this--it was just surprising, and means the current marshal format isn't all that natural for external use. I don't have a particular agenda to get marshal documented, beyond that it would happen to solve my immediate problem. Alternatives are fine too. The ones suggested so far just don't seem to do the job, viz.: xmlrpc does NOT serialize basic Python objects--in particular it doesn't serialize integers longer than 32 bits. I can't consider using pickle until I've convinced myself that it doesn't make security holes, and so far it looks like the opposite. (Can someone tell me I'm not reading it right?). Yes, of course, it's not that difficult to write Python code to do everything I want. It's just surprising that I should need to do that. I mean, imagine if there was no integer addition function (no "+" operator) and the maintainers said "that's ok, to add a and b, just use 'a - (-b)'". It's not a showstopping obstacle, but I'm surprised to get so much grief for suggesting making the operation more convenient, since it's an obvious thing to want to do (as evidenced by there already being so many overlapping serialization functions: marshal, pickle, rpclib, the Serialization class from Vaults of Parnassus, three different ASN1 implementations you mentioned, etc). I can't see anywhere where I've requested a "special gimmick". Yes, an efficient bigint representation in pickle is nice and ought to be added, but I can live without it. I can NOT live with security holes, but wanting security shouldn't be considered a special gimmick! With binary bigints, a documented format, and a way to 100% stop the unpickler from ever calling eval or apply on untrusted data, I wouldn't mind using pickle despite its additional complexity compared to marshal. 
I don't want to depend on third party modules unless I bundle them with my application (again not a showstopper, but it's not in Python's "batteries included" spirit to need them at all). Telling a user "to run this app, first download modules vreeble from <url1> and frob from <url2>" where url1 and url2 usually turn out to be broken links by the time the user sees them is not the right way to distribute an app. (It happens I'm going to sometimes tell the user "to run this app, first destroy your handheld computer's OS by reflashing the firmware..." but it's the principle of the thing, you know). Anyway, the 15 bit bigint representation is reason enough to not want to freeze the current marshal format. Maybe a future marshaller can use a cleaner bigint format and at that point perhaps the issue can be revisited.
msg53283 - (view)	Author: Tim Peters (tim.peters) *	Date: 2001-10-14 20:36
Logged In: YES user_id=31435 Ack -- Paul, you add a new hitherto secret <wink> requirement with each reply. marshal isn't secure at all: because its purpose is to load .pyc files, marshal creates Python code objects out of any bytes you happen to feed it following a "code object" tag. That's a hole big enough to swallow the solar system. In 2.2, marshal refuses to unpack code objects in restricted execution mode, but not before 2.2, and it never refuses in unrestricted mode. In contrast, pickle doesn't know anything about code objects, so doesn't have this hole. The pickle docs are clear about this, too, spelling out that marshal's code-object abilities create "the possibility of smuggling Trojan horses into a program". When wondering about security, you should be looking at (and using) cPickle.c instead of pickle.py; cPickle doesn't use marshal at all, nor does it do eval()s etc. Yes, it can reconstruct pickled instances of classes that already exist, but it cannot create new classes. I haven't heard that characterized as an insecurity before, but to each his own level of discomfort. I want to go back to the start: if the question is whether Python is interested in documenting another data transmission format, my answer is no. There are many already (don't forget the ones from the CORBA and ILU worlds either) available from Python, and there's no reason to believe encoders/decoders for a Python-specific format would get implemented in any other language. pickle is Python's generic answer to the Python-specific serialization question. I'd be happy to see patches to improve it (whether for efficient longs, or some stricter notion of security, or even just docs). But I expect any additional Python-specific serialization scheme has an audience of one (if you disagree, fine, write a PEP and get some community consensus).
msg53284 - (view)	Author: paul rubin (phr)	Date: 2001-10-14 22:27
Logged In: YES user_id=72053 My understanding of marshal (I better check it, but I did mention the issue in the original request) is that it can create code objects but it doesn't actually execute the code in them. My implementation currently uses marshal but checks that the stuff marshal returns doesn't contain anything unexpected. Unpickle is different--it looks like it can execute hostile code before the loads call ever returns. By the time you have a chance to check the result, it's too late. cPickle.c appears to work exactly the same way (using eval and creating arbitrary instances, but maybe not calling marshal) as pickle.py. It never would have occurred to me that the unpickler would work that way (and I'm still not convinced I understand it--I better try putting together a test to see if it's really like that). That's why I didn't notice the security issue til we started discussing pickle and I actually looked at the code. I'm sorry if that sounds like I'm adding requirements. I'd have thought it would go without saying that an important utility shouldn't have security holes. I'm ok with using pickle if the doc and security concerns are taken care of. More efficient longs would be helpful but they would break interoperability with old versions and I can probably live without them. It's really sad that longs were shrugged off when the pickle binary format was designed. Now in order to have efficient longs, yet another flag will have to be added to the constructor. Btw, if the unpickle security issue is real (I'm still not convinced!), I feel it should be treated as a major bug and that an announcement should be sent out. Unpickle already anticipates hostile pickled strings in the non-binary format and checks for them (see _is_secure_string) though I'd want to spend an hour or two checking both the `...` code and the evaluator before believing that _is_secure_string is really safe-- and even if it is, it's brittle. 
But it looks like object creation security is an area they didn't think about. Basically I have nothing against pickle in principle, but it has these (fixable) problems, and while marshal is straightforwardly written, both pickle implementations are excessively clever and make me queasy. Anyway, I can go along with the idea that the right solution is to fix pickle--but at present, pickle looks like it's in worse shape than marshal.
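In current Python 3 the concern about the unpickler importing modules and constructing arbitrary instances is usually addressed by overriding `Unpickler.find_class`, the hook the pickle module consults whenever a pickle refers to a global. The sketch below refuses every global, which is an assumption suited to the basic types discussed in this thread; the class and function names are hypothetical. It narrows what a pickle can do, but it is not a security analysis, and the pickle documentation still advises against loading untrusted data.

```python
import io
import pickle


class BasicTypesUnpickler(pickle.Unpickler):
    """Unpickler that rejects any reference to a class, function or module."""

    def find_class(self, module, name):
        # Called for every global lookup; basic containers and scalars
        # are built by dedicated opcodes and never reach this method.
        raise pickle.UnpicklingError(
            "global %s.%s is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Like pickle.loads, but only for pickles that need no globals."""
    return BasicTypesUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps({"key": [1, 2**100, "text", b"raw", 1.5]})
print(restricted_loads(payload) == {"key": [1, 2**100, "text", b"raw", 1.5]})
```

A pickle that references a global, for example `pickle.dumps(len)`, makes `restricted_loads` raise `pickle.UnpicklingError` instead of resolving the name.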
msg53285 - (view)	Author: paul rubin (phr)	Date: 2001-10-16 23:28
Logged In: YES user_id=72053 Tim has opened a doc bug for pickle/marshal security issues as #471893.
msg53286 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2003-05-17 00:25
Logged In: YES user_id=357491 Is this still an issue? If so, shouldn't this be made an RFE?
msg53287 - (view)	Author: paul rubin (phr)	Date: 2003-05-17 01:17
Logged In: YES user_id=72053 Yes, it's still an issue, even more than before since pickle is now explicitly documented to NOT be ok to use with untrusted data. This is already classified as a feature request. I don't know if an RFE is something different than that.
msg53288 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2003-05-17 01:30
Logged In: YES user_id=357491 RFE is specifically a feature request. I have gone ahead and reclassified this as such.
msg64277 - (view)	Author: Guilherme Polo (gpolo) *	Date: 2008-03-21 20:45
Sorry, but is the feature request related to constructing a safe unpickler ? If yes, then I suppose this issue should be closed and an appropriate one be created. Nevertheless, reading the following comment at pickletools.py (trunk) makes me think this feature request won't be done, not in the pickle module at least: "Another independent change with Python 2.3 is the abandonment of any pretense that it might be safe to load pickles received from untrusted parties -- no sufficient security analysis has been done to guarantee this and there isn't a use case that warrants the expense of such an analysis."
msg64285 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-03-21 21:53
There isn't anything actionable in this bug request. It makes much more sense to start a discussion about requirements etc. on python-ideas.

History
Date	User	Action	Args
2022-04-10 16:04:29	admin	set	github: 35270
2008-03-21 21:53:57	gvanrossum	set	status: open -> closed resolution: out of date messages: + msg64285
2008-03-21 20:45:48	gpolo	set	nosy: + gpolo messages: + msg64277
2007-10-07 23:19:04	brett.cannon	set	nosy: - brett.cannon
2001-10-03 02:25:40	phr	create