This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author PaulMcMillan
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, georg.brandl, gvanrossum, gz, jcea, lemburg, pitrou, skrah, terry.reedy, tim.peters, v+python, vstinner
Date 2012-01-08.02:40:39
SpamBayes Score 1.0523804e-12
Marked as misclassified No
Message-id <CAO_YWRUVxD8Sw148ezEfxqV2b-0hhMqYDC8MdjU3pAjh0NxK2w@mail.gmail.com>
In-reply-to <1325982780.36.0.940971927339.issue13703@psf.upfronthosting.co.za>
Content
> Alex, I agree the issue has to do with the origin of the data, but the modules listed are the ones that deal with the data supplied by this particular attack.

They deal directly with the data. Do any of them pass the data
further, or does the data stop with them? A short and very incomplete
list of vulnerable standard lib modules includes: every single parsing
library (json, xml, html, plus all the third party libraries that do
that), all of numpy (because it processes data which probably came
from a user [yes, integers can trigger the vulnerability]), difflib,
the math module, most database adaptors, anything that parses metadata
(including commonly used third party libs like PIL), the tarfile lib
along with other compressed format handlers, the csv module,
robotparser, plistlib, argparse, pretty much everything under the
heading of "18. Internet Data Handling" (email, mailbox, mimetypes,
etc.), "19. Structured Markup Processing Tools", "20. Internet
Protocols and Support", "21. Multimedia Services", "22.
Internationalization", TKinter, and all the os calls that handle
filenames. The list is impossibly large, even if we completely ignore
user code. This MUST be fixed at a language level.

I challenge you to find me 15 standard lib components that are certain
to never handle user-controlled input.

> Note that changing the hash algorithm for a persistent process, even though each process may have a different seed or randomized source, allows attacks for the life of that process, if an attack vector can be created during its lifetime. This is not a problem for systems where each request is handled by a different process, but is a problem for systems where processes are long-running and handle many requests.

This point has been made many times now. I urge you to read the entire
thread on the mailing list. Your implementation is impractical because
your "safe" implementation completely ignores all hash caching (each
entry must be re-hashed for that dict). Your implementation is still
vulnerable in exactly the way you mentioned if you ever have any kind
of long-lived dict in your program thread.

> You have entered the class of people that claim lots of vulnerabilities, without enumeration.

I have enumerated. Stop making this argument.
History
Date User Action Args
2012-01-08 02:40:42PaulMcMillansetrecipients: + PaulMcMillan, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, jcea, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, Arfrever, v+python, alex, skrah, dmalcolm, gz, Arach, Mark.Shannon, Zhiping.Deng, Huzaifa.Sidhpurwala
2012-01-08 02:40:41PaulMcMillanlinkissue13703 messages
2012-01-08 02:40:39PaulMcMillancreate