Title: locale.getpreferredencoding() gives bus error on Mac OS X 10.4.11 PPC
Type: crash Stage:
Components: macOS Versions: Python 3.1, Python 3.2, Python 2.7
Status: closed Resolution: duplicate
Dependencies: Superseder: Obsolete default file encoding "mac-roman" on OS X, not influenced by locale env variables
View: 6202
Assigned To: ronaldoussoren Nosy List: cfr, janssen, loewis, ned.deily, pitrou, ronaldoussoren
Priority: critical Keywords:

Created on 2008-07-15 14:22 by cfr, last changed 2010-10-26 18:16 by ned.deily. This issue is now closed.

Messages (24)
msg69683 - (view) Author: (cfr) Date: 2008-07-15 14:22
Darwin Kernel Version 8.11.0: Wed Oct 10 18:26:00 PDT 2007;

Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) 
[GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys, locale
>>> locale.getpreferredencoding()
Bus error

Sample crash report excerpt follows (plenty more available on request!).
Note that the version of python given in the crash report is *not* the
same as the version of python actually in use. I have never had an alpha
version of python installed. The current version is the standard version
of 2.5.2 available as a dmg download from i.e. the universal
framework build for 10.4.

OS Version:     10.4.11 (Build 8S165)
Report Version: 4

Command: Python
Parent:  bash [27154]

Version: 2.5a0 (2.5alpha0)

PID:    4692
Thread: 0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000000

Thread 0 Crashed:
0 	0x907beac0 CFStringGetCStringPtr + 408
1               	0x000f1cd8 PyLocale_getdefaultlocale + 328
2   org.python.python        	0x002b393c PyEval_EvalFrameEx + 17036
3   org.python.python        	0x002b5e50 PyEval_EvalCodeEx + 2096
4   org.python.python        	0x002b3f48 PyEval_EvalFrameEx + 18584
5   org.python.python        	0x002b5e50 PyEval_EvalCodeEx + 2096
6   org.python.python        	0x002b5ff0 PyEval_EvalCode + 48 (ceval.c:500)
7   org.python.python        	0x002dbb24 PyRun_InteractiveOneFlags + 772
8   org.python.python        	0x002dbd30 PyRun_InteractiveLoopFlags + 288 (pythonrun.c:725)
9   org.python.python        	0x002dc3f0 PyRun_AnyFileExFlags + 176
10  org.python.python        	0x002eba9c Py_Main + 3052 (main.c:523)
11  org.python.python        	0x000019bc 0x1000 + 2492
12  org.python.python        	0x000016c0 0x1000 + 1728

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x00000000907beac0 srr1: 0x000000000000d030 vrsave: 0x0000000000000000
    cr: 0x84244224          xer: 0x0000000020000004   lr: 0x00000000907be930  ctr: 0x00000000907be928
    r0: 0x00000000a07bb678   r1: 0x00000000bfffd3a0   r2: 0x00000000a07bb278   r3: 0x0000000000000000
    r4: 0x0000000000000000   r5: 0x00000000bfffd2e0   r6: 0x0000000000000005   r7: 0x0000000000000007
    r8: 0x0000000000702333   r9: 0x000000000000001c  r10: 0x0000000090bb4bb8  r11: 0x00000000000f33d8
   r12: 0x00000000907be928  r13: 0x0000000000058b24  r14: 0x0000000000071e40  r15: 0x000000000006aa20
   r16: 0x0000000000000000  r17: 0x0000000000000001  r18: 0x00000000000732a8  r19: 0x0000000000619410
   r20: 0x0000000000000000  r21: 0x000000000006a9b0  r22: 0x0000000000000000  r23: 0x0000000000058b39
   r24: 0x00000000000f1b90  r25: 0x00000000006001d0  r26: 0x000000000007e300  r27: 0x0000000000000000
   r28: 0x0000000000000000  r29: 0x00000000a07bbb6c  r30: 0x0000000000000000  r31: 0x00000000907be930

Binary Images Description:
    0x1000 -     0x1fff org.python.python 2.5a0 (2.5alpha0)
   0xa2000 -    0xd9fff 
   0xf0000 -    0xf2fff 
  0x205000 -   0x323fff org.python.python 2.5a0 (2.5)
  0x705000 -   0x74afff libncurses.5.dylib 
  0x7de000 -   0x7e1fff 
0x8fe00000 - 0x8fe52fff dyld 46.16	/usr/lib/dyld
0x90000000 - 0x901bcfff libSystem.B.dylib 	/usr/lib/libSystem.B.dylib
0x90214000 - 0x90219fff libmathCommon.A.dylib 
0x907bb000 - 0x90895fff 6.4.11 (368.35)
0x908e0000 - 0x909e2fff libicucore.A.dylib 	/usr/lib/libicucore.A.dylib
0x90a3c000 - 0x90ac0fff libobjc.A.dylib 	/usr/lib/libobjc.A.dylib
0x90aea000 - 0x90b5afff 1.4 (???)
0x90b70000 - 0x90b82fff libauto.dylib 	/usr/lib/libauto.dylib
0x90b89000 - 0x90e60fff 681.17
0x91111000 - 0x9111ffff libz.1.dylib 	/usr/lib/libz.1.dylib
0x91122000 - 0x912ddfff 4.6 (29770)
0x913dc000 - 0x913e5fff 2.1.2
0x913ec000 - 0x913f4fff libbsm.dylib 	/usr/lib/libbsm.dylib
0x913f8000 - 0x91420fff 1.8.3
0x91433000 - 0x9143efff libgcc_s.1.dylib 	/usr/lib/libgcc_s.1.dylib
0x945e4000 - 0x94604fff libmx.A.dylib 	/usr/lib/libmx.A.dylib

I have got as far as I can tracking this issue down but would be happy
to provide further information if somebody would give me (a pointer to)
instructions or a hint.
msg69685 - (view) Author: (cfr) Date: 2008-07-15 14:31
Although the active version of python on my machine is 2.5.2 and I have
never had an alpha version installed, crash reports for python report
the version as "2.5a0 (2.5alpha0)".

Version details: active version of python is from the current
dmg download for Mac OS X 10.4 i.e. the universal framework build. When
starting python, I get:

Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) 
[GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

but in crash reports, I get:

Command: Python
Parent:  bash [27154]

Version: 2.5a0 (2.5alpha0)

and python is given as version 2.5a0 in the binary image listing which

Darwin Kernel Version 8.11.0: Wed Oct 10 18:26:00 PDT 2007;

I think I did have 2.5.1 installed prior to installing 2.5.2 and I also
have two older versions of python installed - 2.4 (also the
build) and 2.3 (as pre-installed by Apple) - but I never installed 2.5.0
or any version/candidate in the 2.5 line prior to 2.5.1.

I'm not sure what further information might be helpful but would be
happy to provide it on request.
msg69687 - (view) Author: (cfr) Date: 2008-07-15 14:40
Please ignore the second message. I thought I was creating a second bug
report and cannot figure out any way to edit it now that I realise my error.
I've just copied that to a second report with an appropriate header as I
am assuming the two issues I'm seeing are distinct.

This bug report is intended to cover the bus error I see triggered by
msg69717 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-07-15 20:51
It would be good to find out what the code inside
PyLocale_getdefaultlocale precisely is doing. I.e. what is the value of
name at that point, and is that perhaps an illegal parameter for
CFStringGetCStringPtr somehow? If the debugger can't help (not even
after compiling Python with --with-pydebug), augment the source code
with a printf, and rebuild Python.
msg69744 - (view) Author: (cfr) Date: 2008-07-16 01:41
I downloaded the current source (2.5.2) and confirmed that (1) python
will build as a framework (for me) and (2) that the problem occurs for
my build, too. I did not build it as a universal binary just in case
that helped but mostly to speed things up.

I then tried to add the --with-pydebug flag to my configure script and
build that way. I used separate build directories for the two builds to
keep the source clean. Unfortunately, make fails with the following
error in that case:

if test ""; then \
        gcc -o Python.framework/Versions/2.5/Python -arch i386 -arch ppc
-dynamiclib \
                -isysroot "" \
                -all_load libpython2.5.a -Wl,-single_module \
/Library/Frameworks/Python.framework/Versions/2.5/Python \
                -compatibility_version 2.5 \
                -current_version 2.5; \
        else \
        /usr/bin/libtool -o Python.framework/Versions/2.5/Python
-dynamic  libpython2.5.a \
                 -lSystem -lSystemStubs -arch_only ppc -install_name
-compatibility_version 2.5 -current_version 2.5 ;\
ld: Undefined symbols:
/usr/bin/libtool: internal link edit command failed
make: *** [Python.framework/Versions/2.5/Python] Error 1

I am guessing that eprintf has something to do with the debug option
because the symbol occurs in the debug version of libpython2.5.a but not
the plain version as far as I can tell. But I'm not sure how to fix it.

I am not even sure I am running the debugger correctly with the existing
version of python. I tried passing "-v -v" and "-d" to python after
reading the man page and that didn't get me any extra information.
Nothing useful-looking, at least. ("-v -v" produced a lot of output
beforehand but not around the point the error occurs.) Is that what you
meant or should I be looking at something else?

I am sorry but I don't know how to augment the source code with printf
and that is such a common term I'm not sure what to google to find
instructions for doing it.
msg69746 - (view) Author: (cfr) Date: 2008-07-16 02:06
I figured out how to do this:

Python 2.5.2 (r252:60911, Jul 16 2008, 01:44:22) 
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pdb
>>> import sys, os, locale
> <string>(1)<module>()
(Pdb) continue
Bus error

though I'm not sure if that is what I was meant to do either. (But it
strikes me as another plausible possibility.)
msg69752 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-07-16 02:54
The Python debugger (pdb) won't help here; you'd have to use the system
debugger (gdb).

Please add the lines

  printf("The value of name is %p\n", name);
  printf("It points to '%s'\n", name);

right before the call to CFStringGetCStringPtr in Modules/_localemodule.c
msg69775 - (view) Author: (cfr) Date: 2008-07-16 12:50

I couldn't get anything from gdb which wasn't already in the crash log -
likely because I don't know how to elicit the information correctly.

Output from a build with the augmented _localemodule.c:

Python 2.5.2 (r252:60911, Jul 16 2008, 01:44:22) 
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys, locale
>>> locale.getpreferredencoding()
The value of name is 0x0
It points to '(null)'
Bus error
msg69860 - (view) Author: (cfr) Date: 2008-07-17 00:05
On the off chance this might be helpful:

I get the same error with python 2.4.3.

Python 2.4.3 (#1, Apr  7 2006, 10:54:33) 
[GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys, locale
>>> locale.getpreferredencoding()
Bus error

I do not get the error with the Apple-supplied python 2.3.5:

Python 2.3.5 (#1, Mar 20 2005, 20:38:20) 
[GCC 3.3 20030304 (Apple Computer, Inc. build 1809)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys, locale
>>> locale.getpreferredencoding()
msg69950 - (view) Author: (cfr) Date: 2008-07-18 12:31
A work-around when using python from a shell environment (e.g. from a
bash shell in Terminal) is to issue

export __CF_USER_TEXT_ENCODING=0x1F5:0:0

before starting python. I haven't yet worked out how to apply this to
GUI apps. I tried editing ~/.MacOSX/environment.plist and
~/.CFUserTextEncoding but neither strategy prevents the crash.

I assume the fix works because it means one of the explicitly listed
encodings matches so things never get as far as the code which triggers
the error.
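
As an illustration of the shell workaround above, the same variable can be set for a single child process instead of the whole shell session. This is only a sketch: `run_with_text_encoding` is a hypothetical helper name, the value "0x1F5:0:0" is the one quoted in this message, and current Python versions may not consult this variable at all, so the effect on the crash applies only to the affected builds.

```python
import os
import subprocess
import sys


def run_with_text_encoding(value="0x1F5:0:0"):
    """Run a child Python with __CF_USER_TEXT_ENCODING set.

    Returns (returncode, reported_encoding). On POSIX a negative return
    code means the child was killed by a signal, e.g. a bus error.
    """
    env = dict(os.environ)
    env["__CF_USER_TEXT_ENCODING"] = value  # value taken from the message above
    result = subprocess.run(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding())"],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    code, encoding = run_with_text_encoding()
    print("exit status:", code, "encoding:", encoding)
```

Running the child in a separate process also means a crash there cannot take down the calling script.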

Without the fix, my environment contained


which does not, apparently, correspond to any of the encodings
explicitly listed in _localemodule.c.

- cfr
msg70164 - (view) Author: (cfr) Date: 2008-07-22 21:41
Altering ~/.CFUserTextEncoding so it has the contents "0:0" and then
rebooting seems to prevent the crash for GUI applications, too.

Would like to know how to fix this properly, of course, since I suspect
that the value on my machine was probably not "0:0" for a reason!
msg70937 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-09 16:54
locale.getpreferredencoding() should certainly not crash but the
question remains of what should be the outcome. I can see several
possibilities:
(1) return the empty string
(2) return None
(3) return "ascii" (!!)
(4) raise an exception (which one?)

(2) sounds the most logical to me, there is no preferred encoding in the
environment so we just return None to indicate that the application has
to choose its own default.
msg70941 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-09 17:29
No, getpreferredencoding should always produce an encoding name. If the
application had an idea what to use, it wouldn't have to ask. So I favor
(3), or, perhaps given that OSX uses UTF-8 in many places,

(5) return "UTF-8"
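
For application code, options (3) and (5) amount to "always hand back some usable encoding name". A minimal sketch of that idea follows, assuming a hypothetical helper `encoding_or_fallback`; the extra check against Python's codec registry is an addition for illustration and was not proposed in this thread.

```python
import codecs


def encoding_or_fallback(detected, fallback="UTF-8"):
    """Return `detected` if Python has a codec for it, else `fallback`.

    `detected` may be None or empty, which models the case where the
    platform lookup produced no name at all.
    """
    if detected:
        try:
            codecs.lookup(detected)  # raises LookupError for unknown names
            return detected
        except LookupError:
            pass
    return fallback


if __name__ == "__main__":
    print(encoding_or_fallback(None))          # no name detected
    print(encoding_or_fallback("mac-roman"))   # known codec
    print(encoding_or_fallback(None, "ascii")) # option (3) instead of (5)
```

Passing `fallback="ascii"` gives the behaviour of option (3); the default corresponds to option (5).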
msg71038 - (view) Author: (cfr) Date: 2008-08-12 00:17
I admit to not understanding the code involved, but I *thought* that the
problem involved cases where there *is* a preferred encoding in the
environment but it is not one of those covered by:
    case kCFStringEncodingMacRoman: return "mac-roman";
    case kCFStringEncodingMacGreek: return "mac-greek";
    case kCFStringEncodingMacCyrillic: return "mac-cyrillic";
    case kCFStringEncodingMacTurkish: return "mac-turkish";
    case kCFStringEncodingMacIcelandic: return "mac-icelandic";
The work around basically ensures the preferred encoding given by the
environment is one of those listed so that the rest of that part of the
code doesn't run. I don't think that my crash, at least, resulted from
no preferred encoding being defined in the environment but maybe
something is going wrong in the locale module because it is not one from
the standard list.

msg71046 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-12 07:19
Lists of possible string encodings are here:


So it would be interesting to know what CFStringGetSystemEncoding
returns on your system. Notice the special value
kCFStringEncodingInvalidId, which it might also return.

I think

  printf("Encoding is %x\n", enc);

should do.

I think mac_getscript is fine as it stands: if name is NULL, it tries
CFStringConvertEncodingToIANACharSetName which should perform a lookup
in the Apple database.
msg71074 - (view) Author: (cfr) Date: 2008-08-13 00:34
Interesting. At least the "39" makes sense. I don't understand the
documentation well enough to know what the "79" is about.

I'm sorry but I can't work out what I should do with:

  printf("Encoding is %x\n", enc);

Am I meant to use this in python, a standard shell or something else? I
tried in a bash shell and a python interpreter (after undoing my "work
around") and both gave errors - a syntax error in the case of bash; a
complaint about printf being unrecognised in python. I also tried
"import os, sys, locale" first just in case.

  bash: syntax error near unexpected token `"Encoding is %x\n",'

  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  NameError: name 'printf' is not defined

Sorry for being dumb about this.
msg71075 - (view) Author: (cfr) Date: 2008-08-13 00:49
Just realised what I'm meant to do with it. Sorry - it is late (early,
actually). Will report back when I get a chance to recompile.
msg71079 - (view) Author: (cfr) Date: 2008-08-13 11:24
It returns 27.
msg71081 - (view) Author: (cfr) Date: 2008-08-13 12:33
I noticed there is an issue with
Japanese Python users on Macs because the relevant codec is removed in
Tools/unicode/Makefile. That file also removes a number of other codecs,
including Mac Celtic. I just wondered if this might be related in some
way because that issue report mentioned problems with getdefaultlocale etc.
msg71102 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-14 00:30
Ok, now that we have established that the user's encoding is supposed to
be mac-celtic, I think I understand the problem: there simply isn't any
IANA charset name for the mac-celtic encoding, so
CFStringConvertEncodingToIANACharSetName doesn't return any.

If we want to support these systems, I think we need to add another
switch case for mac-celtic. That alone won't be sufficient, as we then
also need an implementation of the encoding, i.e. change Tools/unicode
to preserve mac-celtic.

To better detect that case in the future, it might be useful to return
mac-unsupported as the script name if it isn't in the switch statement
and doesn't have an IANA name, and then alias mac-unsupported to ASCII.
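
The lookup order described in this message can be modelled in Python roughly as follows. This is a sketch of the proposal, not the code in Modules/_localemodule.c: `script_name` and the `IANA_NAMES` table are hypothetical stand-ins, and apart from 0x27 (Mac Celtic, confirmed in this thread) the numeric constants are assumed from Apple's headers.

```python
# Encodings handled by the explicit switch statement quoted earlier.
# Numeric values are assumptions, except that 0x27 is Mac Celtic.
SWITCH_CASES = {
    0x00: "mac-roman",
    0x06: "mac-greek",
    0x07: "mac-cyrillic",
    0x23: "mac-turkish",
    0x25: "mac-icelandic",
}

# Hypothetical stand-in for CFStringConvertEncodingToIANACharSetName.
# It deliberately has no entry for 0x27, mirroring the missing IANA name.
IANA_NAMES = {0x08000100: "utf-8"}


def script_name(enc, iana_lookup=IANA_NAMES.get):
    """Return a script name for a CFStringEncoding value, never None."""
    name = SWITCH_CASES.get(enc)
    if name is None:
        name = iana_lookup(enc)  # may return None, like the NULL seen above
    if name is None:
        return "mac-unsupported"  # placeholder name proposed in this message
    return name


if __name__ == "__main__":
    for enc in (0x00, 0x27, 0x08000100):
        print(hex(enc), "->", script_name(enc))
```

The point of the final branch is that the caller always receives a string, so nothing downstream is handed a null pointer.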
msg71104 - (view) Author: (cfr) Date: 2008-08-14 01:56
Do you happen to know why it is returning 27? Is that correct or should
it be returning something else (e.g. 39)?
msg71108 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-14 05:50
0x27==39. It's all fine.
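
The `%x` conversion in the suggested printf prints the value in hexadecimal, which is why 39 showed up as 27. The same arithmetic can be checked from Python (the variable names here are only for illustration):

```python
enc = 39  # decimal value of the Mac Celtic encoding discussed above

shown = "Encoding is %x" % enc  # same conversion as the C printf
parsed = int("27", 16)          # read "27" back as a hexadecimal number

print(shown)
print(parsed == enc)
```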
msg119599 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-26 10:44
Is it still reproducible with 2.7, 3.1 or 3.2?
msg119627 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2010-10-26 18:16
This was fixed by the changes for Issue6202: 2.7 (r73270) and 3.1 (r73268).  They removed the use of the obsolete MacOS encoding APIs and now use standard POSIX detection.
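
For readers unfamiliar with what POSIX detection means here, one common approach is to take the LC_CTYPE locale from the environment and ask for its codeset. The sketch below shows that approach under the assumption of a Unix-like platform where `locale.nl_langinfo` is available; `posix_preferred_encoding` is a hypothetical name, and this is not claimed to be the exact code from the referenced revisions.

```python
import locale


def posix_preferred_encoding():
    """Report the codeset of the user's LC_CTYPE locale (Unix only)."""
    old = locale.setlocale(locale.LC_CTYPE)  # remember the current setting
    try:
        try:
            # Adopt the locale named by LANG / LC_ALL / LC_CTYPE.
            locale.setlocale(locale.LC_CTYPE, "")
        except locale.Error:
            pass  # unusable environment setting: keep the current locale
        return locale.nl_langinfo(locale.CODESET)
    finally:
        locale.setlocale(locale.LC_CTYPE, old)  # restore the caller's locale


if __name__ == "__main__":
    print(posix_preferred_encoding())
```

Because the result depends only on the locale environment variables, it does not go through the CoreFoundation lookup that crashed in this report.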
Date                 User            Action  Args
2010-10-26 18:16:58  ned.deily       set     status: open -> closed
                                             nosy: + ned.deily
                                             messages: + msg119627
                                             superseder: Obsolete default file encoding "mac-roman" on OS X, not influenced by locale env variables
                                             resolution: duplicate
2010-10-26 10:44:55  pitrou          set     messages: + msg119599
2010-08-03 18:01:52  terry.reedy     set     versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6, Python 2.5, Python 3.0
2010-06-22 10:32:05  ronaldoussoren  set     assignee: ronaldoussoren
                                             components: + macOS, - None
                                             nosy: loewis, ronaldoussoren, janssen, pitrou, cfr
2010-05-17 21:25:26  pitrou          set     nosy: + ronaldoussoren, janssen
2008-08-14 05:50:08  loewis          set     messages: + msg71108
2008-08-14 01:56:28  cfr             set     messages: + msg71104
2008-08-14 00:30:10  loewis          set     messages: + msg71102
2008-08-13 12:33:10  cfr             set     messages: + msg71081
2008-08-13 11:24:39  cfr             set     messages: + msg71079
2008-08-13 00:49:36  cfr             set     messages: + msg71075
2008-08-13 00:35:01  cfr             set     messages: + msg71074
2008-08-12 07:19:40  loewis          set     messages: + msg71046
2008-08-12 00:17:42  cfr             set     messages: + msg71038
2008-08-09 17:29:20  loewis          set     messages: + msg70941
2008-08-09 16:55:59  pitrou          set     versions: + Python 2.6, Python 3.0
2008-08-09 16:54:03  pitrou          set     priority: critical
                                             nosy: + pitrou
                                             messages: + msg70937
2008-07-22 21:41:42  cfr             set     messages: + msg70164
2008-07-18 12:31:58  cfr             set     messages: + msg69950
2008-07-17 00:05:21  cfr             set     messages: + msg69860
2008-07-16 12:50:30  cfr             set     messages: + msg69775
2008-07-16 02:54:54  loewis          set     messages: + msg69752
2008-07-16 02:06:57  cfr             set     messages: + msg69746
2008-07-16 01:41:18  cfr             set     messages: + msg69744
2008-07-15 20:51:40  loewis          set     nosy: + loewis
                                             messages: + msg69717
2008-07-15 14:40:13  cfr             set     messages: + msg69687
2008-07-15 14:35:09  cfr             set     type: behavior -> crash
                                             title: python version incorrectly reported in crash reports on Mac OS X 10.4.11 PPC -> locale.getpreferredencoding() gives bus error on Mac OS X 10.4.11 PPC
2008-07-15 14:31:52  cfr             set     type: crash -> behavior
                                             messages: + msg69685
                                             components: + None
                                             title: locale.getpreferredencoding() gives bus error on Mac OS X 10.4.11 PPC -> python version incorrectly reported in crash reports on Mac OS X 10.4.11 PPC
2008-07-15 14:22:30  cfr             create