implementation details in sys module #55203

Open
fijall mannequin opened this issue Jan 24, 2011 · 14 comments
Labels
docs Documentation in the Doc dir type-bug An unexpected behavior, bug, or error

Comments

@fijall
Mannequin

fijall mannequin commented Jan 24, 2011

BPO 10994
Nosy @loewis, @brettcannon, @arigo, @terryjreedy, @pitrou, @ambv

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2011-01-24.13:43:39.461>
labels = ['type-bug', 'docs']
title = 'implementation details in sys module'
updated_at = <Date 2013-01-22.12:49:37.730>
user = 'https://bugs.python.org/fijall'

bugs.python.org fields:

activity = <Date 2013-01-22.12:49:37.730>
actor = 'ezio.melotti'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2011-01-24.13:43:39.461>
creator = 'fijall'
dependencies = []
files = []
hgrepos = []
issue_num = 10994
keywords = []
message_count = 14.0
messages = ['126925', '126926', '126927', '126928', '127018', '127023', '127025', '127026', '127027', '127028', '127029', '127878', '127900', '136703']
nosy_count = 8.0
nosy_names = ['loewis', 'brett.cannon', 'arigo', 'terry.reedy', 'pitrou', 'docs@python', 'lukasz.langa', 'fijall']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue10994'
versions = ['Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4']

@fijall
Mannequin Author

fijall mannequin commented Jan 24, 2011

The sys module documentation (as it is online) has some entries that, in my opinion, should be marked as implementation details but are not. Feel free to counter why not.

Some of them carry a note that they should be used for specialized purposes only, but IMO that's not the same as saying they are not mandatory for other implementations.

Temporary list:

_clear_type_cache

dllhandle

getrefcount

getdlopenflags (?)

getsizeof - it might not be well defined on other implementations

setdlopenflags

api_version

@fijall fijall mannequin assigned docspython Jan 24, 2011
@fijall fijall mannequin added docs Documentation in the Doc dir type-bug An unexpected behavior, bug, or error labels Jan 24, 2011
@pitrou
Member

pitrou commented Jan 24, 2011

Well, getsizeof is not better-defined under CPython than elsewhere. It just gives a hint.
Agreed about the others.

@fijall
Mannequin Author

fijall mannequin commented Jan 24, 2011

I suppose wrt getsizeof it's more of "if you provide us with reasonable expectations, we can implement this" than anything else.

@pitrou
Member

pitrou commented Jan 24, 2011

> I suppose wrt getsizeof it's more of "if you provide us with
> reasonable expectations, we can implement this" than anything
> else.

The expectation is that it returns the memory footprint of the given
object, and only it (not taking into account sharing, caching,
dependencies or anything else). For example, an instance will not count
its attribute __dict__. But a str object will count its object header
plus the string payload, if the payload is private.

Of course, you are free to tweak these semantics for the PyPy
implementation.
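
A minimal sketch of the expectation described above, assuming CPython; the `Record` class and variable names are hypothetical and only serve as illustration. It compares two instances that differ only in the size of the string they reference:

```python
import sys


class Record:
    """Hypothetical example class holding a single attribute."""

    def __init__(self, payload):
        self.payload = payload


small = Record("x")
large = Record("x" * 10_000)

# The instance size does not include what its attributes reference,
# so both instances are expected to report the same size on CPython.
size_small = sys.getsizeof(small)
size_large = sys.getsizeof(large)

# The str object itself accounts for its header plus its payload.
payload_size = sys.getsizeof(large.payload)

print(size_small, size_large, payload_size)
```

The exact numbers depend on the CPython version and platform; only the relationship between them is the point here.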

@arigo
Mannequin

arigo mannequin commented Jan 25, 2011

> The expectation is that it returns the memory footprint of the given
> object, and only it (not taking into account sharing, caching,
> dependencies or anything else).

It would be nice if this was a well-defined definition, but unfortunately it is not. For example, string objects may appear different from the user's point of view (e.g. as seen by id() and 'is') but share the implementation's data; they may even share only a part of it (if ropes are enabled). Conversely, for user-defined objects you would typically think not to count the "shape" information, which is usually shared among several instances -- but then you risk a gross under-estimation in the (rarer) cases where it is not shared.

Another way to look at the "official" definition is to return the size of the object itself and none of its dependencies, because in theory they might be shared; but that would make all strings, lists, tuples, dicts, and so on have a getsizeof() of 8 or 12, which is rather useless.

I hope this clarifies fijall's original comment: "it might not be well defined on other implementations."

@pitrou
Member

pitrou commented Jan 25, 2011

> > The expectation is that it returns the memory footprint of the given
> > object, and only it (not taking into account sharing, caching,
> > dependencies or anything else).
>
> It would be nice if this was a well-defined definition, but
> unfortunately it is not.

I didn't claim it was. Actually, if you read the rest of my message, I
did mention that PyPy could tweak the semantics if it made more sense.
So, of course, the more sharing and caching takes place, the less
obvious these semantics are, but even with CPython they are not obvious
anyway. It's not supposed to be an exact measurement for the common
developer, rather a hint that experts can use to tweak their data
structures and algorithms; you need to know details of your VM's
implementation to use that information.

@fijall
Mannequin Author

fijall mannequin commented Jan 25, 2011

I can hardly think of a specification that would help me identify actual sizes, even as a rough estimate. Which experts did you have in mind?

@pitrou
Member

pitrou commented Jan 25, 2011

> Which experts did you have in mind?

People who know how the Python implementation works.

@fijall
Mannequin Author

fijall mannequin commented Jan 25, 2011

> > Which experts did you have in mind?
>
> People who know how the Python implementation works.

I'm serious. What semantics would make sense to anyone? Even if you know the implementation quite well, a single number per object does not provide enough information.

@brettcannon
Member

You could return -1 for everything. =)

In all seriousness, it could simply be proportional. IMO, as long as people can tell that a list takes up less space than a dict, the numbers seem fine to me.

@pitrou
Member

pitrou commented Jan 25, 2011

> Even if you know the implementation quite well, a single number per object
> does not provide enough information.

Enough information for what? It can certainly provide information about
the overhead of that particular object (again, regardless of sharing).

@loewis
Mannequin

loewis mannequin commented Feb 4, 2011

I can propose a specification of getsizeof: if you somehow manage to traverse all objects (without considering an object twice), and sum up the getsizeof results, you should end up with something close to, but smaller than the actual memory consumption. How close is a quality-of-implementation issue (so always returning 0 would be correct-but-useless).

It may be that implementations can also support counting certain hidden memory usage (headers, blocks shared across instances that are not objects themselves). Such functions should have different names and interfaces (e.g. sys.gethiddenblocks(o) may return a list of (address, size) pairs); CPython doesn't provide any such function (although sys.mallocoverhead might be useful).

In any case: I'm not convinced that it is useful to mark functions as CPython-specific in the documentation. This clutters the documentation, and is of interest only for language lawyers. So if implementation details are to be documented, I'd prefer this to happen in a separate document.
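
The proposed specification (visit every object once and sum the getsizeof results) can be sketched roughly as follows. This is an illustration under stated assumptions, not an official recipe: `total_size` is a hypothetical helper, it relies on `gc.get_referents` to find referenced objects, and it skips types, modules and functions so the walk does not wander into the whole interpreter.

```python
import gc
import sys
import types

# Objects of these kinds are not followed, so the walk stays within the data.
_SKIP = (type, types.ModuleType, types.FunctionType)


def total_size(root):
    """Sum sys.getsizeof over everything reachable from root, once each."""
    seen = set()          # ids of objects already counted
    stack = [root]
    total = 0
    while stack:
        obj = stack.pop()
        if isinstance(obj, _SKIP) or id(obj) in seen:
            continue
        seen.add(id(obj))
        total += sys.getsizeof(obj)
        stack.extend(gc.get_referents(obj))
    return total


shared = "x" * 1000
data = [shared, shared]   # the same string is referenced twice

print(total_size(data))
```

Because the shared string is counted only once, the total for `data` is the size of the list plus the size of one string, which matches the "without considering an object twice" requirement.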

@arigo
Mannequin

arigo mannequin commented Feb 4, 2011

Martin: I kind of agree with you, although I guess that for practical reasons if you don't have a reasonable sys.getsizeof() implementation then it's better to raise TypeError than return 0 (like CPython, which may raise "TypeError: Type %.100s doesn't define __sizeof__").

I agree that it's not really useful to mark functions as CPython-specific in the documentation, if only because whenever a new implementation like PyPy comes along, then it's going to have a rather different set of functions that it wants to consider implementation details. I would say that more than half the functions in the sys module marked CPython-specific in the doc are implemented in PyPy just fine, and there is an equal number of functions not marked CPython-specific that have no chance to be implemented in PyPy.
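
As a sketch of the TypeError approach from the caller's side, assuming CPython's documented `sys.getsizeof(object, default)` signature: the `Opaque` class below is hypothetical and simply stands in for a type that cannot report a meaningful size.

```python
import sys


class Opaque:
    """Hypothetical type that cannot report a meaningful size."""

    def __sizeof__(self):
        raise TypeError("size of Opaque objects is not defined")


obj = Opaque()

# Without a default, the TypeError propagates to the caller.
try:
    sys.getsizeof(obj)
    raised = False
except TypeError:
    raised = True

# With a default, getsizeof returns it instead of raising TypeError.
fallback = sys.getsizeof(obj, -1)

print(raised, fallback)
```

This lets portable code distinguish "size unknown" from a real measurement, which returning 0 would not.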

@terryjreedy
Member

The __sizeof__ special attribute shows up in dir(object) but appears not to be documented other than with

>>> help(object.__sizeof__)
Help on method_descriptor:
__sizeof__(...)
    __sizeof__() -> size of object in memory, in bytes

Should it have an entry in Lib 4.12. Special Attributes?

object.__sizeof__
A method used by sys.getsizeof.

It should then show up in the index (missing now) and point people to sys.getsizeof. Looking further, I see that it is mentioned but not indexed in the sys.getsizeof entry.
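
For illustration, a minimal sketch of how a class can define `__sizeof__` and how `sys.getsizeof` uses it, assuming CPython. The `Buffer` class is hypothetical; choosing to include the private bytearray in the reported size is a design decision of this example, not something the documentation prescribes.

```python
import sys


class Buffer:
    """Hypothetical container reporting its header plus a private bytearray."""

    def __init__(self, nbytes):
        self._data = bytearray(nbytes)

    def __sizeof__(self):
        # Base object size plus the privately owned buffer.
        return object.__sizeof__(self) + sys.getsizeof(self._data)


buf = Buffer(4096)
reported = buf.__sizeof__()
measured = sys.getsizeof(buf)

# getsizeof calls __sizeof__ and may add garbage collector overhead on top.
print(reported, measured)
```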

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022