
Author serhiy.storchaka
Recipients aliles, benjamin.peterson, hynek, jcea, loewis, pitrou, serhiy.storchaka, stutzbach
Date 2012-09-17.12:26:17
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1347884778.13.0.815854108199.issue15490@psf.upfronthosting.co.za>
In-reply-to
Content
Counting internal Python objects that can be shared can't "work well", because the concept of "working well" is not well defined here, and because an implementation of any particular well-defined counting algorithm can be completely unmaintainable, which is not well either.

If we write the same 1 MB string into a StringIO twice, should __sizeof__() return roughly (1) 2 MB, (2) 1 MB, or (3) the size of an empty stream, given that external references to the shared string exist? The patch implements the second strategy; it could be simplified to the first, or complicated further into the third. Still more complication would be needed to take the sharing of the EOL strings into account ('\r' and '\n' are always shared; '\r\n' may be).
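
To make the difference between strategies (1) and (2) concrete, here is a minimal sketch of strategy (2): summing sys.getsizeof() over internal string fragments while counting each distinct object (by identity) only once. The fragments list and the sizeof_dedup helper are hypothetical illustrations for this discussion, not the patch's actual code.

    import sys

    def sizeof_dedup(fragments):
        # Strategy (2): count each distinct object (by identity) once,
        # so a string written twice contributes its size only once.
        seen = set()
        total = 0
        for frag in fragments:
            if id(frag) not in seen:
                seen.add(id(frag))
                total += sys.getsizeof(frag)
        return total

    s = "x" * (1024 * 1024)        # ~1 MB ASCII string
    t = "x" * (1024 * 1024)        # equal value, but a distinct object

    print(sizeof_dedup([s, s]))    # shared object counted once: ~1 MB
    print(sizeof_dedup([s, t]))    # distinct objects: ~2 MB, strategy (1)'s answer

Dropping the id()-based de-duplication turns this into strategy (1); strategy (3) would additionally have to inspect reference counts to skip objects that are also referenced from outside.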

> I would personally prefer if the computations were done in Py_ssize_t, not PyObject*

So would I. But on platforms with 64-bit pointers and 32-bit sizes we can allocate more than PY_SIZE_MAX bytes in total (hey, I remember the DOS memory models with 16-bit size_t and 32-bit pointers). We hit overflow even faster if shared objects are allowed to be counted repeatedly. What should we do on overflow: return PY_SIZE_MAX, or ignore the possibility of error?
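
For illustration, the first option (return PY_SIZE_MAX on overflow) amounts to saturating addition: clamp the running total at the maximum instead of letting it wrap. This is only a sketch of that option, not what the patch does; sys.maxsize stands in for PY_SIZE_MAX, and add_saturating is a hypothetical helper.

    import sys

    def add_saturating(a, b, limit=sys.maxsize):
        # Clamp the sum at `limit` instead of wrapping around the way
        # C size_t arithmetic would; `limit` stands in for PY_SIZE_MAX.
        total = a + b
        return limit if total > limit else total

    print(add_saturating(sys.maxsize - 10, 100))   # -> sys.maxsize

In the C implementation the same check would have to be written as a pre-test (a > PY_SIZE_MAX - b), since the wrapped sum itself is what we are trying to avoid.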
History
Date User Action Args
2012-09-17 12:26:18  serhiy.storchaka  set  recipients: + serhiy.storchaka, loewis, jcea, pitrou, benjamin.peterson, stutzbach, aliles, hynek
2012-09-17 12:26:18  serhiy.storchaka  set  messageid: <1347884778.13.0.815854108199.issue15490@psf.upfronthosting.co.za>
2012-09-17 12:26:17  serhiy.storchaka  link  issue15490 messages
2012-09-17 12:26:17  serhiy.storchaka  create