
Author PaulMcMillan
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, eric.snow, fx5, georg.brandl, grahamd, gregory.p.smith, gvanrossum, gz, jcea, lemburg, mark.dickinson, neologix, pitrou, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
Date 2012-01-24.00:14:30
Message-id <CAO_YWRWeU-ZH5+3r4f7NhzeUDUOph1UCQntMSE7O3VS9C01G=w@mail.gmail.com>
In-reply-to <4F1D62CD.4000408@egenix.com>
Content
> I think you're asking a bit much here :-) A broken app is a broken
> app, no matter how nice Python tries to work around it. If an
> app puts too much trust into user data, it will be vulnerable
> one way or another and regardless of how the user data enters
> the app.

I notice your patch doesn't include fixes for the entire standard
library to work around this problem. Were you planning on writing
those, or leaving that for others?

As a developer, I honestly don't know how I can state with certainty
whether input data is clean, until I actually see the error you
propose. I can't check validity before the fact, the way I can check
for invalid unicode before storing it in my database. Once I see the
error (probably only after my application is attacked, certainly not
during development), it's too late. My application can't know which
particular data triggered the error, so it can't delete it. I'm
reduced to trial and error to remove the offending data, or to writing
code that never stores more than 1000 things in a dictionary. And I
have to accept that the standard library may not work on any
particular data I want to process, and must write code that detects
the error state and somehow magically removes the offending data.
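To illustrate the objection above, here is a minimal toy hash table sketching the counting approach under discussion: insertion raises once a probe chain exceeds a fixed threshold. The class names and the threshold are hypothetical, chosen for illustration only; they are not the actual proposed CPython patch.

```python
class CollisionLimitError(Exception):
    """Raised when an insertion exceeds the collision threshold."""


class CountingTable:
    """Toy open-addressing hash table with a collision limit."""

    def __init__(self, size=8, max_collisions=5):
        self.slots = [None] * size
        self.max_collisions = max_collisions

    def insert(self, key, value):
        index = hash(key) % len(self.slots)
        collisions = 0
        while self.slots[index] is not None and self.slots[index][0] != key:
            collisions += 1
            if collisions > self.max_collisions:
                # The caller only learns *that* the limit was hit, not
                # which previously stored keys built up the long chain,
                # so it cannot tell which data to delete.
                raise CollisionLimitError(key)
            index = (index + 1) % len(self.slots)  # linear probing
        self.slots[index] = (key, value)


table = CountingTable()
for k in range(0, 48, 8):  # 0, 8, ..., 40 all map to slot 0 (size 8)
    table.insert(k, "value")
# The next colliding key pushes the chain past the threshold:
# table.insert(48, "value") raises CollisionLimitError.
```

The point of the sketch is that the exception fires on whichever insert happens to cross the threshold, which depends on everything stored earlier, so the failure cannot be validated away up front.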

The alternative, randomization, simply means that my dictionary
ordering is not stable, something that is already the case.
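For illustration, the per-run instability randomization produces can be observed directly: launching fresh interpreters with different `PYTHONHASHSEED` values yields different hashes for the same string, so any `str`-keyed dict iterates in a different order per run. This is a sketch using the standard library only.

```python
# Show that randomized string hashing varies across interpreter runs:
# the same literal hashes differently under different PYTHONHASHSEED
# values, so dict iteration order over str keys is not stable either.
import os
import subprocess
import sys


def hash_in_subprocess(seed):
    """Run a fresh interpreter with the given seed and report hash('collision')."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('collision'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


# The same seed is reproducible; different seeds give different hashes.
same = hash_in_subprocess(1) == hash_in_subprocess(1)
different = hash_in_subprocess(1) != hash_in_subprocess(2)
```

An application that already avoids depending on dict ordering (as the text argues it must) is unaffected by this instability.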

While I appreciate that the counting approach feels cleaner,
randomization is the only solution that makes practical sense.
History
Date User Action Args
2012-01-24 00:14:32 PaulMcMillan set recipients: + PaulMcMillan, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, gregory.p.smith, jcea, mark.dickinson, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, Jim.Jewett, fx5
2012-01-24 00:14:31 PaulMcMillan link issue13703 messages
2012-01-24 00:14:30 PaulMcMillan create