classification
Title: platform.architecture() gives misleading results for OS X multi-architecture executables
Type: behavior
Stage: patch review
Components: Library (Lib), macOS
Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: lemburg
Nosy List: lemburg, ned.deily, pitrou, ronaldoussoren
Priority: normal
Keywords: patch

Created on 2010-12-19 06:28 by ned.deily, last changed 2017-02-20 19:21 by lemburg.

Files
File name              Uploaded                     Description
issue10735-py3k.patch  ned.deily, 2010-12-19 08:56  Issue10735 py3k patch rev 1
issue10735-27.patch    ned.deily, 2010-12-19 08:57  Issue10735 2.7 patch rev 1
Messages (13)
msg124331 - Author: Ned Deily (ned.deily) (Python committer) Date: 2010-12-19 06:28
OS X Mach-O universal executable files often contain multiple architectures including a combination of 32-bit and 64-bit archs, as with the newer OS X installer variants provided on python.org.  In such cases, the platform.architecture() function always returns '64bit' as the bit architecture regardless of whether the interpreter is running in 32-bit or 64-bit mode.  Thus, there is no documented way to reliably tell whether an interpreter is running in 32- or 64-bit in OS X.  Instead of the platform module, one must resort to hacks like examining sys.maxsize (or sys.maxint) or checking type sizes from the struct module.

$ arch -x86_64 /usr/local/bin/python3.2 -c 'import sys,platform; print(sys.maxsize,platform.architecture())'
9223372036854775807 ('64bit', '')
$ arch -i386 /usr/local/bin/python3.2 -c 'import sys,platform; print(sys.maxsize,platform.architecture())'
2147483647 ('64bit', '')
msg124333 - Author: Ned Deily (ned.deily) (Python committer) Date: 2010-12-19 07:10
The attached patches for py3k (3.2+) and 2.7 correct platform.architecture() to return the bit architecture ('32bit' or '64bit') of the running interpreter in the default case where executable = sys.executable.  The linkage string will also contain information about the Mach-O executable including available bit- and processor-architectures as returned by the MacOS X file command.

An example of the most general results, using a 4-way build:

# ./configure --enable-universalsdk=/Developer/SDKs/MacOSX10.5.sdk --with-universal-archs=all MACOSX_DEPLOYMENT_TARGET=10.5
$ file ./python
./python: Mach-O universal binary with 4 architectures
./python (for architecture i386):	Mach-O executable i386
./python (for architecture ppc7400):	Mach-O executable ppc
./python (for architecture ppc64):	Mach-O 64-bit executable ppc64
./python (for architecture x86_64):	Mach-O 64-bit executable x86_64
$ arch -i386 ./python -c 'import platform;print(platform.architecture())'
('32bit', 'MachO 32bit 64bit i386 ppc ppc64 x86_64')
$ arch -x86_64 ./python -c 'import platform;print(platform.architecture())'
('64bit', 'MachO 32bit 64bit i386 ppc ppc64 x86_64')
$ arch -ppc ./python -c 'import platform;print(platform.architecture())'
('32bit', 'MachO 32bit 64bit i386 ppc ppc64 x86_64')
$ arch -i386 ./python -m platform
Darwin-10.5.0-i386-32bit-MachO_32bit_64bit_i386_ppc_ppc64_x86_64
$ arch -x86_64 ./python -m platform
Darwin-10.5.0-i386-64bit-MachO_32bit_64bit_i386_ppc_ppc64_x86_64
$ ./python -m platform --terse
Darwin-10.5.0
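
As a rough illustration of where a linkage string like the ones above could come from, the sketch below pulls the bit sizes and processor names out of 'file' output. It is not the code in the attached patches: the helper name summarize_macho, the regular expression, and the hard-coded sample text (copied from the 4-way build above) are all assumptions made for this example.

```python
import re

# Sample `file` output copied from the 4-way universal build shown above.
SAMPLE_FILE_OUTPUT = (
    "./python: Mach-O universal binary with 4 architectures\n"
    "./python (for architecture i386):\tMach-O executable i386\n"
    "./python (for architecture ppc7400):\tMach-O executable ppc\n"
    "./python (for architecture ppc64):\tMach-O 64-bit executable ppc64\n"
    "./python (for architecture x86_64):\tMach-O 64-bit executable x86_64\n"
)

# Matches one per-architecture line; group 1 is set only for 64-bit slices.
_SLICE = re.compile(r"Mach-O (64-bit )?executable (\S+)")


def summarize_macho(fileout):
    """Hypothetical helper: return (bit sizes, processor names) as sorted lists."""
    bits, archs = set(), set()
    for match in _SLICE.finditer(fileout):
        bits.add("64bit" if match.group(1) else "32bit")
        archs.add(match.group(2))
    return sorted(bits), sorted(archs)


bits, archs = summarize_macho(SAMPLE_FILE_OUTPUT)
linkage = " ".join(["MachO"] + bits + archs)
print(linkage)  # MachO 32bit 64bit i386 ppc ppc64 x86_64
```

Note that this only describes the file on disk; it says nothing about which slice the running interpreter was started from.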
msg124349 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-12-19 13:35
> Instead of the platform module, one must resort to hacks like examining 
> sys.maxsize

I'm not sure why you think it's a hack. To me, it's, by construction, the right way to check for 64-bitness (and also the easiest since it doesn't involve parsing strings of an unspecified format).
msg124351 - Author: Ned Deily (ned.deily) (Python committer) Date: 2010-12-19 13:49
It's only a hack in the sense that platform.architecture is the documented interface in the std library to report "bits" and, unfortunately, users try to use it to determine whether running in 64-bit or 32-bit mode.  For instance, see here:
http://permalink.gmane.org/gmane.comp.python.general/676626
msg124352 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-12-19 13:56
> It's only a hack in the sense that platform.architecture is the
> documented interface in the std library to report "bits" and,
> unfortunately, users try to use it to determine whether running in
> 64-bit or 32-bit mode.  For instance, see here:
> http://permalink.gmane.org/gmane.comp.python.general/676626

Well, the fact that platform.architecture() returns a free-form string
suggests to me that it could return all kinds of unexpected results
depending on the system (it probably parses the output of some command).
So perhaps the platform docs should warn against this.
msg124376 - Author: Ned Deily (ned.deily) (Python committer) Date: 2010-12-20 00:51
Adding a warning sounds like a good idea.  Is it reasonable to include a recommended cross-platform approach in the platform doc, like either the sys.maxsize test or the struct.calcsize("P") test (which is used as a default fallback in platform.architecture)?  Are there any currently supported platforms where either of those wouldn't work?
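
For reference, the two tests mentioned here might look like the following minimal sketch. The variable names and the exact form of the sys.maxsize comparison are assumptions for illustration; only sys.maxsize and struct.calcsize("P") themselves come from the discussion.

```python
import struct
import sys

# Test 1: sys.maxsize is larger than 2**32 only on a 64-bit build.
is_64bit_by_maxsize = sys.maxsize > 2**32

# Test 2: struct.calcsize("P") is the size in bytes of a C void pointer.
pointer_bits = struct.calcsize("P") * 8

print(is_64bit_by_maxsize, pointer_bits)
```

Both tests describe the running interpreter, not the executable file on disk, which is why they are unaffected by multi-architecture binaries.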
msg124445 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-12-21 18:36
> Adding a warning sounds like a good idea.  Is it reasonable to include
> a recommended cross-platform approach in the platform doc, like either
> the sys.maxsize test or the struct.calcsize("P") test (which is used as
> a default fallback in platform.architecture)?

Yes.

>   Are there any currently supported platforms where either of those
> wouldn't work?

No. I think even on fringe platforms it is unlikely for size_t to not
reflect the native pointer width (although it is theoretically
possible).
msg124448 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-12-21 18:50
I'm committing a doc update in r87421 with a suggestion to use sys.maxsize. I'll let Marc-André decide how to deal with the rest of the patch.
msg130703 - Author: Ronald Oussoren (ronaldoussoren) (Python committer) Date: 2011-03-12 22:50
IMHO the change to 'bits' is bogus, it is supposed to return the bit-size of the executable, not that of the currently running executable.

I'd return all executable bitsizes in bits as '32bit', '64bit' or '32bit,64bit' (as appropriate) and only include the machine architectures in the linkage result.  

And finally the executable file format is 'Mach-O', not 'Mach'.

I can provide an updated patch if Marc-Andre agrees that this would be a useful change.
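
The alternative return shape proposed in this message could be sketched roughly as follows. The helper name describe_fat_binary and its input format (a list of architecture name and pointer size pairs) are hypothetical and exist only for this example; no such function is part of the platform module.

```python
def describe_fat_binary(slices):
    """Hypothetical helper: slices is a list of (arch_name, pointer_bits) pairs."""
    slices = list(slices)
    # All pointer sizes present in the file, joined with commas.
    bits = ",".join(f"{size}bit" for size in sorted({size for _, size in slices}))
    # Only the machine architectures go into the linkage field.
    linkage = " ".join(sorted({arch for arch, _ in slices}))
    return bits, linkage


# The four slices of the 4-way build discussed earlier in this issue.
four_way = [("i386", 32), ("ppc", 32), ("ppc64", 64), ("x86_64", 64)]
print(describe_fat_binary(four_way))        # ('32bit,64bit', 'i386 ppc ppc64 x86_64')
print(describe_fat_binary([("i386", 32)]))  # ('32bit', 'i386')
```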
msg130730 - Author: Ned Deily (ned.deily) (Python committer) Date: 2011-03-13 07:49
"IMHO the change to 'bits' is bogus, it is supposed to return the bit-size of the executable, not that of the currently running executable."

Perhaps but (1) the code currently does return the bit-size of the currently running executable if it can't parse the output from 'file'.  More importantly, (2) not surprisingly, platform.architecture was not designed to deal with the somewhat unusual case presented by OS X multi-architecture files that have multiple bit-sizes.

"I'd return all executable bitsizes in bits as '32bit', '64bit' or '32bit,64bit' (as appropriate) and only include the machine architectures in the linkage result."

That's another way to do it.  My thought was that, while the return values of "bits" and "linkage" are deliberately not specified in the documentation, "bits" is likely the more useful and used of the two and it would be useful to return upwardly compatible values while also providing the current bits of the running interpreter.  That's what most programs are really interested in since, AFAIK, on all other platforms and in all cases except OS X 64-/32-bit universal binaries, there can be no difference between the value for the interpreter and the executable file.  To me, it makes more sense to return the full set of possible values in the linkage string rather than in bits.  And I still think it makes sense to have 'platform.architecture' be an officially blessed API to determine interpreter execution bit size, rather than the unintuitive sys.maxsize or struct.calcsize('P') tests. What led me to write this patch in the first place was that, on more than one occasion in support groups, I found people recommending testing platform.architecture(bits) to determine 32-bit vs 64-bit and it was clear that it was giving the wrong results in this case.

"And finally the executable file format is 'Mach-O', not 'Mach'."

As is, the patch returns "MachO".  The reason for doing that was to make parsing (both human and machine) of the platform string easier; it currently uses '-' as a separator for the various fields:
  $ python3.2 -m platform
  Darwin-10.6.0-i386-64bit
Dropping the '-' in 'Mach-O' was an easy and unambiguous way to continue to differentiate the fields without introducing any incompatibilities.
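
A small sketch of the ambiguity being avoided, assuming the platform string is built by joining fields with '-' and that a consumer splits on the same character. The field values are taken from the examples above; the variable names are made up for illustration.

```python
fields = ["Darwin", "10.5.0", "i386", "64bit"]

with_hyphen = "-".join(fields + ["Mach-O"])
without_hyphen = "-".join(fields + ["MachO"])

print(with_hyphen.split("-"))     # 'Mach' and 'O' come back as separate fields
print(without_hyphen.split("-"))  # the five original fields round-trip
```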
msg130738 - Author: Ronald Oussoren (ronaldoussoren) (Python committer) Date: 2011-03-13 11:58
W.r.t the MachO name: I misread the patch, MachO is fine as the name for the reasons you mention.

I'm not convinced that your hack to make bits return the pointer size of the currently running architecture when testing sys.executable is useful, especially because the behaviour is inconsistent (it doesn't work for other executables) and also does something different than the documented behaviour.

I'd prefer to return all pointer sizes supported by the binary, even if that can be surprising for users not used to fat binaries. This can easily be accomplished by adding the calculation of 'bits' to the elif branch below:

+ elif ('Mach-O executable' in fileout
+            or 'Mach-O 64-bit executable' in fileout):

Using sys.maxsize or struct.calcsize("P") are both good ways of determining the actual size, and if there is a real need we could add a function that explicitly returns the pointer size (although I don't think that such a function is really necessary).
msg288227 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2017-02-20 19:17
I think there's a misunderstanding in what platform.architecture() is meant for. The purpose is to find out more details about the executable you pass to it, e.g. whether it's a 32-bit or 64-bit binary, or whether it's an ELF or PE binary. And it's a best effort API, just as most other platform APIs - this is also the reason why most of them have parameters available to modify the default return values.

It doesn't work with multi-architecture executables. We'd need a new API for this.

Regarding returning multiple architectures in the linkage return value: I'm not sure whether that's a good idea. The architectures are not necessarily of different linkage types. In fact on Macs, the correct value is "Mach-O". The API should probably return this instead of the default empty string.
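
To make the point about the executable argument and the overridable defaults concrete, here is a minimal usage sketch. It relies only on the documented signature platform.architecture(executable=sys.executable, bits='', linkage=''); the variable names are illustrative, and the printed values depend on the system.

```python
import platform
import sys

# Default call: inspects sys.executable, i.e. the interpreter's own binary.
bits, linkage = platform.architecture()
print(bits, repr(linkage))

# Explicit call: name the executable and pass the preset values that are
# returned for any field the function cannot determine. With an empty
# bits preset, the pointer size of the running interpreter is used instead.
bits2, linkage2 = platform.architecture(sys.executable, bits="", linkage="")
print(bits2, repr(linkage2))
```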
msg288228 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2017-02-20 19:21
The term "linkage" is probably a misnomer... "execformat" would be more correct:

 * https://en.wikipedia.org/wiki/Comparison_of_executable_file_formats

Too late to change, I guess.
History
Date                 User            Action  Args
2017-02-20 19:21:51  lemburg         set     messages: + msg288228
2017-02-20 19:17:13  lemburg         set     messages: + msg288227
2017-02-20 16:32:47  ned.deily       set     assignee: ronaldoussoren -> lemburg
                                             versions: + Python 3.6, Python 3.7, - Python 3.3, Python 3.4
2013-07-06 11:47:20  ronaldoussoren  set     versions: + Python 3.3, Python 3.4, - Python 3.2
2011-03-13 11:58:53  ronaldoussoren  set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg130738
2011-03-13 07:49:34  ned.deily       set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg130730
2011-03-12 22:50:04  ronaldoussoren  set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg130703
2010-12-21 18:50:43  pitrou          set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg124448
2010-12-21 18:36:05  pitrou          set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg124445
2010-12-20 00:51:31  ned.deily       set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg124376
2010-12-19 13:56:35  pitrou          set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg124352
2010-12-19 13:49:08  ned.deily       set     nosy: lemburg, ronaldoussoren, pitrou, ned.deily
                                             messages: + msg124351
2010-12-19 13:35:56  pitrou          set     nosy: + pitrou
                                             messages: + msg124349
2010-12-19 08:57:37  ned.deily       set     files: - issue10735-27.patch
                                             nosy: lemburg, ronaldoussoren, ned.deily
2010-12-19 08:57:33  ned.deily       set     files: - issue10735-py3k.patch
                                             nosy: lemburg, ronaldoussoren, ned.deily
2010-12-19 08:57:27  ned.deily       set     files: + issue10735-27.patch
                                             nosy: lemburg, ronaldoussoren, ned.deily
2010-12-19 08:56:57  ned.deily       set     files: + issue10735-py3k.patch
                                             nosy: lemburg, ronaldoussoren, ned.deily
2010-12-19 07:11:39  ned.deily       set     files: + issue10735-27.patch
                                             nosy: lemburg, ronaldoussoren, ned.deily
2010-12-19 07:11:07  ned.deily       set     files: + issue10735-py3k.patch
                                             nosy: lemburg, ronaldoussoren, ned.deily
                                             keywords: + patch
2010-12-19 07:10:37  ned.deily       set     nosy: lemburg, ronaldoussoren, ned.deily
                                             messages: + msg124333
                                             stage: patch review
2010-12-19 06:48:09  r.david.murray  set     nosy: + lemburg
2010-12-19 06:28:26  ned.deily       create