Message 253452 - Python tracker

Author	rhettinger
Recipients	rhettinger
Date	2015-10-26.02:24:38
Message-id	<1445826280.05.0.307309053044.issue25478@psf.upfronthosting.co.za>
In-reply-to

Content
Allen Downey suggested this at PyCon in Montreal and said it would be useful in his Bayesian statistics courses.  Separately, Peter Norvig created a normalize() function in his probability tutorial at In[45] in http://nbviewer.ipython.org/url/norvig.com/ipython/Probability.ipynb .

I'm creating this tracker item to record thoughts about the idea.  Right now, several things are unclear: whether Counter is the right place to support this operation, how it should be designed, whether it should be an in-place operation or one that creates a new counter, whether it should round to make the result exactly equal to 1.0, and whether it should use math.fsum() for float inputs.

Should it support other target totals besides 1.0?

  >>> Counter(red=11, green=5, blue=4).normalize(100) # percentage
  Counter(red=55, green=25, blue=20)
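
For discussion, here is a minimal sketch of one way this could behave, written as a standalone function rather than a Counter method. The name normalize(), the `total` parameter, and the choice to return a new Counter instead of mutating in place are all assumptions made for illustration, not a settled design. It uses math.fsum() for the total and does no rounding.

```python
from collections import Counter
from math import fsum


def normalize(counts, total=1.0):
    """Return a new Counter whose values are rescaled to sum to `total`.

    Hypothetical helper for illustration; not part of collections.Counter.
    """
    current = fsum(counts.values())  # fsum limits rounding error for float inputs
    if current == 0:
        raise ValueError("cannot normalize a counter whose values sum to zero")
    # Multiply before dividing so that cases like 11 * 100 / 20 stay exact.
    return Counter({key: value * total / current for key, value in counts.items()})


percentages = normalize(Counter(red=11, green=5, blue=4), 100)
probabilities = normalize(Counter(red=11, green=5, blue=4))
```

Note that this version produces float values (55.0 rather than 55), and the result for a target of 1.0 is only guaranteed to sum to 1.0 up to floating-point error, which is exactly the rounding question raised above.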

Also, would it make sense to support something like this?

  sampled_gender_dist = Counter(male=405, female=421)
  world_gender_dist = Counter(male=0.51, female=0.50)
  cs = world_gender_dist.chi_squared(observed=sampled_gender_dist)
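
As a rough sketch of what such a method might compute, the function below evaluates Pearson's chi-squared statistic, the sum of (observed - expected)**2 / expected over categories. The function name, argument order, and the decision to rescale the expected weights to the observed total are assumptions for illustration. It also assumes every expected weight is positive and ignores categories that appear only in the observed counter.

```python
from collections import Counter


def chi_squared(expected, observed):
    """Pearson chi-squared statistic for observed counts vs. expected weights.

    Hypothetical helper for illustration; not part of collections.Counter.
    """
    n = sum(observed.values())       # total number of observations
    weight = sum(expected.values())  # expected weights need not sum to exactly 1
    statistic = 0.0
    for key, share in expected.items():
        e = n * share / weight       # expected count for this category
        statistic += (observed[key] - e) ** 2 / e  # missing keys count as 0
    return statistic


sampled_gender_dist = Counter(male=405, female=421)
world_gender_dist = Counter(male=0.51, female=0.50)
cs = chi_squared(world_gender_dist, sampled_gender_dist)
```

Rescaling by the sum of the expected weights matters here because the example weights add up to 1.01 rather than 1.0.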

Would it be better to just have a general multiply-by-scalar operation for scaling?

  c = Counter(observations)
  c.scale_by(1.0 / sum(c.values()))

Perhaps use an operator?

  c /= sum(c.values())
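
Both spellings can be tried out today with a small subclass. This is only a sketch: the class name ScalableCounter is made up for the example, and both operations are assumed to work in place.

```python
from collections import Counter


class ScalableCounter(Counter):
    """Counter with in-place scaling; a hypothetical subclass for illustration."""

    def scale_by(self, factor):
        # Only existing values are reassigned, so iterating while updating is safe.
        for key in self:
            self[key] *= factor

    def __itruediv__(self, divisor):
        # Supports `c /= x`; the right-hand side is evaluated before this runs.
        for key in self:
            self[key] /= divisor
        return self


c = ScalableCounter(a=1, b=3)
c /= sum(c.values())

d = ScalableCounter(x=2, y=6)
d.scale_by(1.0 / sum(d.values()))
```

The operator form reads compactly, but it leaves open what the other arithmetic operators should mean for non-integer values.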

History
Date	User	Action	Args
2015-10-26 02:24:40	rhettinger	set	recipients: + rhettinger
2015-10-26 02:24:40	rhettinger	set	messageid: <1445826280.05.0.307309053044.issue25478@psf.upfronthosting.co.za>
2015-10-26 02:24:39	rhettinger	link	issue25478 messages
2015-10-26 02:24:38	rhettinger	create