Message 147537 - Python tracker

Author	loewis
Recipients	loewis, mhammond, python-dev, santoso.wijaya, sbt, vstinner
Date	2011-11-13.00:16:33
Message-id	<4EBF0C56.3040201@v.loewis.de>
In-reply-to	<1321133772.03.0.698103185151.issue13374@psf.upfronthosting.co.za>

Content
> Is this approach of coercing to unicode and only using the wide api
> "blessed"?

It's not. If people use byte strings, they specifically ask for what
they get; Python shouldn't second-guess the data types.
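The "you get what you ask for" convention can be seen in the os module today. The following is only a sketch, assuming Python 3 and a throwaway temporary directory; the helper name listing_types is made up for illustration, and it shows only the result types, not which Windows API is called underneath.

```python
import os
import tempfile


def listing_types(directory):
    """Return the element types os.listdir yields for a str and a bytes path."""
    as_text = {type(name) for name in os.listdir(directory)}
    # os.fsencode converts the str path to bytes using the filesystem encoding.
    as_bytes = {type(name) for name in os.listdir(os.fsencode(directory))}
    return as_text, as_bytes


with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_types, byte_types = listing_types(tmp)

print(text_types)  # {<class 'str'>}
print(byte_types)  # {<class 'bytes'>}
```

A str argument yields str names and a bytes argument yields bytes names; nothing is coerced behind the caller's back.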

> I certainly think it should be.  If so then one can get
> rid of lots of windows specific code.

How so? This entire handling of file names is Windows-specific;
dealing with different file name data types doesn't make it more
Windows-specific than it already is.

> And are we able to assume that on Windows we have access to wide libc
> functions?

Yes, but Python should avoid using them.

> _wcsicmp(), _snwprintf(), _wputenv() are all used already,
> so I guess we already make that assumption.  It looks like a lot of the
> windows specific code attempts to reimplement basic libc functions using
> the win32 api just to support unicode - presumably there was a time when
> we could not assume that wide libc functions would be available.

No:
a) we try to get rid of MS libc as much as possible. Ideally, some
   future version of Python will not rely on libc at all for Windows.
   If Microsoft had chosen to make the C library a system API,
   we would happily use it. Alas, they chose to make it an API of their
   compiler instead, so we really shouldn't use it.
b) the wide libc functions assume a 16-bit wchar_t type. This is not a
   good match for Python's unicode data type, which readily supports
   32-bit characters.
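
The mismatch in point b) can be made concrete by counting 16-bit code units. This is a rough sketch, assuming Python 3.3 or later (where a str holds full code points regardless of platform); the helper utf16_code_units and the two sample characters are chosen purely for illustration.

```python
def utf16_code_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2


bmp_char = "\u00e9"         # U+00E9, inside the Basic Multilingual Plane
astral_char = "\U0001F40D"  # U+1F40D, outside the BMP

print(len(bmp_char), utf16_code_units(bmp_char))        # 1 1
print(len(astral_char), utf16_code_units(astral_char))  # 1 2
```

The second character is one code point for Python but two code units (a surrogate pair) in a 16-bit wchar_t buffer, so any function that treats each wchar_t as a whole character sees it as two.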

History
Date	User	Action	Args
2011-11-13 00:16:34	loewis	set	recipients: + loewis, mhammond, vstinner, santoso.wijaya, python-dev, sbt
2011-11-13 00:16:33	loewis	link	issue13374 messages
2011-11-13 00:16:33	loewis	create