
Author christian.heimes
Recipients alexandre.vassalotti, christian.heimes, gvanrossum
Date 2007-10-13.23:27:29
Message-id <4711545C.7080408@cheimes.de>
In-reply-to <1192310346.63.0.723987066167.issue1272@psf.upfronthosting.co.za>
Content
Guido van Rossum wrote:
> - You added a removal of hotshot from setup.py to the patch; but that's
> been checked in in the mean time.

Oh, that change shouldn't have made it into the patch. I guess I forgot
to run svn revert on setup.py.

> - Why add an 'errors' argument to the function when it's a fatal error
> to use it?

I wanted the signature of the function to match the other
PyUnicode_Decode* functions. I copied the FatalError from
_PyUnicode_AsDefaultEncodedString().

> - Using 0 to autodetect the length is scary.  Normally we have two APIs
> for that, one ..._FromString and one ...FromStringAndSize.  If you
> really don't want that, please use -1, which is at least an illegal value.

Oh right, -1 is *much* better for autodetect than 0. Which do you
prefer: a second function, or -1 as the autodetect value?
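
To make the -1 option concrete, here is a minimal sketch in plain C. The
helper name resolve_length is hypothetical and not part of the patch or of
CPython; it only illustrates why -1 is a safer sentinel than 0: an explicit
length of 0 keeps its meaning (the empty string), and only the otherwise
illegal value -1 triggers autodetection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration, not CPython API).
   size == -1 means "measure the NUL-terminated string";
   any size >= 0 is taken as an explicit length, including 0. */
ptrdiff_t
resolve_length(const char *s, ptrdiff_t size)
{
    assert(s != NULL);
    assert(size >= -1);          /* other negative values are caller bugs */
    if (size == -1)
        return (ptrdiff_t)strlen(s);
    return size;
}
```

With this convention, a call that passes 0 asks for an empty result, while
a call that passes -1 asks the callee to run strlen(). The two-function
alternative would split the same behavior into a ..._FromString and a
..._FromStringAndSize variant.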

> - Why is there code in codeobject.c::PyCode_New() that still accepts a
> PyString for the filename?

That's my fault; I overlooked it. :/

> - In that file (and possibly others, I didn't check) your code uses
> spaces to indent while the surrounding code uses tabs.  Moreover, your
> space indent seems to assume there are 4 spaces to a tab, but all our
> code (Python and C) is formatted assuming tabs are 8 spaces.  (The
> indent isn't always 8 spaces -- but ASCII TAB characters always are 8,
> for us.)

Some C files like unicodeobject.c use 4 spaces while other files use
tabs for indentation. My editor may have gotten confused by the mix.
I've manually fixed it in the patch, but I may have overlooked a line or two.

> - Why copy the default encoding before mangling it?  With a little extra
> care you will only have to copy it once.  Also, consider not mangling at
> all, but assuming the encoding comes in a canonical form -- several
> other functions assume that, e.g. PyUnicode_Decode() and
> PyUnicode_AsEncodedString().

My C is a bit rusty and I still need to learn new tricks. I'm trying to
see if I can remove the extra copy without causing a problem.
The other part of your question was already answered by Alexandre: the
aliases map is defined in Python code, so it isn't available this early
in the bootstrapping process.
We'd have to completely redesign the assignment of co_filename and
__file__ if we want to use the aliases and other codecs. For example,
we could store a PyString at first and redo all names once the codecs
are set up.
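
As a rough idea of how the extra copy might be avoided, here is a sketch
that copies and normalizes the encoding name in a single pass into a
caller-supplied buffer. It assumes the mangling amounts to lowercasing and
mapping '_' to '-', which is my assumption and not necessarily what the
patch does; normalize_encoding_name is a hypothetical name, not an
existing CPython function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (assumed behavior: lowercase, '_' becomes '-').
   Writes the normalized name into buf in one pass, so no separate
   copy-then-mangle step is needed.
   Returns 1 on success, 0 if buf is too small for the name plus NUL. */
int
normalize_encoding_name(const char *name, char *buf, size_t bufsize)
{
    size_t i;

    assert(name != NULL && buf != NULL);
    if (bufsize == 0)
        return 0;
    for (i = 0; name[i] != '\0'; i++) {
        unsigned char c = (unsigned char)name[i];
        if (i + 1 >= bufsize)    /* keep room for the trailing NUL */
            return 0;
        buf[i] = (c == '_') ? '-' : (char)tolower(c);
    }
    buf[i] = '\0';
    return 1;
}
```

The result can then be compared against a few hard-coded names with
strcmp(), which needs nothing from the Python-level aliases map.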

Christian