
Author gvanrossum
Recipients brett.cannon, christian.heimes, gvanrossum
Date 2007-10-20.15:36:06
Message-id <ca471dc20710200836s6829a4c2l8c589fa69878cb6f@mail.gmail.com>
In-reply-to <47198279.9070102@cheimes.de>
Content
Thanks for persevering!!!

The dangers of switching between fileno(fp) and fp are well
documented in the C and/or POSIX standards. The problem arises in
PyFile_FromFileEx(), which creates a Python file object from the file
descriptor. The fix only works because we no longer use the FILE
struct after PyTokenizer_FindEncoding() has been called. I think it
would be better to move the lseek() into call_find_module() so the
FILE abstraction is not broken by PyTokenizer_FindEncoding().
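
To make the hazard concrete, here is a minimal sketch in plain C. It is not the actual patch. peek_then_rewind() is a hypothetical helper, and it assumes a seekable file opened for reading. Reading one line through stdio typically pulls a whole buffer from the descriptor, so the descriptor offset ends up well past that line until lseek() resets it.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of CPython): read the first line through
 * stdio, then rewind the underlying descriptor so that a later
 * descriptor-level reader starts at byte 0.
 * Returns the descriptor offset seen right after the stdio read, or -1. */
static off_t peek_then_rewind(FILE *fp, char *line, int size)
{
    int fd;
    off_t after_read;

    assert(fp != NULL && line != NULL && size > 0);
    fd = fileno(fp);

    if (fgets(line, size, fp) == NULL)
        return -1;

    /* stdio usually fills its whole buffer, so this offset is typically
     * past the end of the line that fgets() just returned. */
    after_read = lseek(fd, 0, SEEK_CUR);
    if (after_read == (off_t)-1)
        return -1;

    /* Reset the descriptor. From here on, fp should not be used for I/O,
     * because its buffer no longer matches the descriptor position. */
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;

    return after_read;
}
```

Doing the lseek() in the caller, as suggested above, keeps the function that works on the FILE from having to know about the descriptor at all.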

I think there's still a bug or two lurking in this area: first, each
time you call imp.find_module() you leak a FILE object; second, the
encoding allocated in PyTokenizer_FindEncoding() is leaked.
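
The ownership rule behind both leaks can be sketched as follows. This is an illustration only: sketch_find_encoding() is a hypothetical stand-in that uses malloc(), not the real PyTokenizer_FindEncoding(), and source_is_utf8() is an invented caller. The point is that whoever receives the FILE and the allocated encoding string has to release both on every path.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an encoding finder: returns a malloc()ed copy
 * of the name following "coding:" on the first line, or NULL if there is
 * none. The caller owns the result and must free() it. */
static char *sketch_find_encoding(FILE *fp)
{
    char line[256];
    const char *p;
    size_t n;
    char *enc;

    if (fgets(line, sizeof line, fp) == NULL)
        return NULL;
    p = strstr(line, "coding:");
    if (p == NULL)
        return NULL;
    p += strlen("coding:");
    while (*p == ' ')
        p++;
    n = strcspn(p, " \t\r\n");      /* length of the encoding name */
    if (n == 0)
        return NULL;
    enc = malloc(n + 1);
    if (enc == NULL)
        return NULL;
    memcpy(enc, p, n);
    enc[n] = '\0';
    return enc;
}

/* Hypothetical caller: takes ownership of fp and of the encoding string,
 * and releases both before returning. */
static int source_is_utf8(FILE *fp)
{
    char *enc;
    int result;

    assert(fp != NULL);
    enc = sketch_find_encoding(fp);
    result = (enc != NULL && strcmp(enc, "utf-8") == 0);
    free(enc);      /* free(NULL) is a no-op, so no extra check is needed */
    fclose(fp);     /* the FILE is closed here instead of being leaked */
    return result;
}
```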

You're right that a lot of this could be avoided if we used file
descriptors consistently. It seems find_module() itself doesn't read
the file; it just needs to know that it's possible to open the file.
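
A descriptor-only probe along those lines might look like the sketch below. open_module_fd() is a hypothetical name, not an existing CPython function. It only establishes that the file can be opened for reading and hands back the descriptor, leaving any reading to whoever consumes it.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: check that a module file can be opened for reading
 * without creating a FILE. Returns an open descriptor, or -1 on failure
 * with errno set by open(). The caller must close() the descriptor. */
static int open_module_fd(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY);
}
```

Since no stdio buffer is ever attached to the descriptor, there is no second file position to keep in sync.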

Rewriting every place that uses PyFile_FromFile[Ex] to use file
descriptors instead doesn't seem too hard; there are only a few of them.
History
Date                 User        Action  Args
2007-10-20 15:36:08  gvanrossum  set     spambayes_score: 0.0525689 -> 0.052568886; recipients: + gvanrossum, brett.cannon, christian.heimes
2007-10-20 15:36:07  gvanrossum  link    issue1267 messages
2007-10-20 15:36:07  gvanrossum  create