classification
Title: Support --disable-unicode
Type:
Stage:
Components: Build
Versions:
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: lemburg
Nosy List: lemburg, loewis
Priority: normal
Keywords: patch

Created on 2001-07-29 21:13 by loewis, last changed 2022-04-10 16:04 by admin. This issue is now closed.

Files
File name Uploaded Description
nounicode.patch loewis, 2001-08-02 06:09 v3
revised-nounicode.patch lemburg, 2001-08-02 17:21 revised patch (v4)
test.patch nobody, 2001-08-03 07:44
pickle.patch loewis, 2001-08-03 07:45
Messages (10)
msg37109 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2001-07-29 21:13
This patch implements the option --disable-unicode.
In particular, it:
- does not compile unicodeobject, unicodectype, _codecsmodule, and unicodedata if Unicode is disabled
- checks for Py_Unicode in all places that use Unicode functions
- disables Unicode literals, the builtin functions, and the string encode and decode methods
- avoids Unicode literals in a few places in the libraries
- adds the types.StringTypes list

Most of the test suite passes with these changes. A 
number of tests fail, mostly because they use Unicode 
literals.
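
The last item in the list, types.StringTypes, gives library code a single name to test against whether or not a Unicode type exists. A minimal sketch of that idea follows; the names `string_types` and `is_string` are illustrative and not taken from the patch. The fallback branch is also the one taken on Python 3, where the `unicode` builtin is absent.

```python
# Build a tuple of string types that works with or without a `unicode` builtin.
try:
    string_types = (str, unicode)  # Unicode support compiled in (Python 2)
except NameError:
    string_types = (str,)  # no `unicode` builtin: a --disable-unicode build, or Python 3


def is_string(obj):
    """Return True if obj is an instance of any available string type."""
    return isinstance(obj, string_types)


print(is_string("abc"))
print(is_string(42))
```

Callers then write `isinstance(x, string_types)` once instead of repeating the feature check at every call site.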
msg37110 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2001-07-30 13:06
Nice work, Martin!

Some comments:
- I think that we could save some of the #ifdefs by simply assuming that an optimizing compiler will not generate code for "if (0)", which is what "if (PyUnicode_Check(obj))" reduces to; this would make the code more readable
- _codecsmodule.c should not be disabled by the configure option... codecs are useful for non-Unicode applications as well
- the PyString_Encode/Decode() APIs should not be disabled for the same reason
- the tokenizer/compiler should generate errors with an explicit message stating that the Python version was compiled without Unicode support
- ditto for the Unicode parser markers (I think that open() on Windows will fail without "es"... ?)
msg37111 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2001-07-30 14:30
This patch already makes use of the assumption that
PyUnicode_Check will always return 0. In all the remaining
cases, the code also calls some function of the Unicode
module, which would result in a compile-time error since the
functions are no longer declared. Even if they were declared,
it would probably result in a linker error, since not all
compilers will remove the entire code block. That approach
can be used only where the if-block does not call any
Unicode functions directly.

I can try to re-enable the _codecs module, although only
register and lookup would remain.

I cannot re-enable PyString_Decode/Encode, since they use 
PyUnicode_GetDefaultEncoding, which is not available since
unicodeobject.c is not compiled.

I will try to have the tokenizer generate more specific
error messages.

Support for "es", "et" is still there; they only work for
strings, though, and they never call any codecs.
msg37112 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2001-07-30 14:39
Ok, I see your point about the API references.

About PyString_Encode/Decode: on platforms without Unicode, the encoding should not have a default, so passing NULL as encoding should result in an error. I am not even sure whether it should have a default on Unicode builds... probably not.

Trimming down _codecsmodule.c to register and lookup is OK; there are a few codecs in 2.2 which don't use Unicode at all.
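
The register and lookup functions are enough to plug in a codec that never touches Unicode. The sketch below uses the present-day `codecs` API (`codecs.CodecInfo`), not the 2.2 interface, in which a search function returned a plain tuple. The codec name `demoreverse` and its byte-reversing behaviour are hypothetical, chosen only for illustration.

```python
import codecs


def reverse_encode(data, errors="strict"):
    # Byte-to-byte transform: returns (output, number of input bytes consumed).
    return bytes(data)[::-1], len(data)


def reverse_decode(data, errors="strict"):
    # Reversal is its own inverse, so decoding mirrors encoding.
    return bytes(data)[::-1], len(data)


def search(name):
    # A search function returns a CodecInfo for names it knows, else None.
    if name == "demoreverse":
        return codecs.CodecInfo(reverse_encode, reverse_decode, name="demoreverse")
    return None


codecs.register(search)

info = codecs.lookup("demoreverse")
encoded, consumed = info.encode(b"abc")
print(encoded, consumed)
```

Nothing in this round trip depends on a Unicode type, which is the case made above for keeping register and lookup available in a non-Unicode build.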
msg37113 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2001-07-31 07:48
The new version of the patch implements all features that 
have been discussed.
msg37114 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2001-07-31 07:56
Replaced patch, since it contained unrelated fragments.
msg37115 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2001-08-02 06:09
Updated patch after merger with descr_branch.
msg37116 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2001-08-02 16:29
Uploaded a revised patch. The test suite still fails -- it
would be nice if you could work this out; I don't want to
check the patch in before the test suite runs through
without failures.

Thanks.
msg37117 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2001-08-03 07:45
I've added an additional test.patch file, which only 
records the changes to Lib/test. With this patch, I get 
the following failures:
test_grammar test___all__ test_charmapcodec test_codecs 
test_gettext test_minidom test_pyexpat test_sax 
test_string test_ucn test_unicode test_unicodedata 
test_urllib test_zipfile1

I don't think this list can be reduced much further 
without seriously impacting the strength of the test suite.

To reduce the number of failures to this list, I also had 
to modify pickle.py to not use Unicode literals anymore. 
I'm not sure whether this is a good idea, as it impacts 
performance; the pickle.patch is attached separately.
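
Unicode literals are the sticking point because, as the thread describes it, a u"..." literal is rejected when the module is compiled on a non-Unicode build, so it cannot be guarded at run time, whereas a call to the unicode constructor can. A minimal sketch of that pattern, assuming a `have_unicode` flag and a hypothetical `run_checks` helper (neither is quoted from test.patch or pickle.patch):

```python
# Detect Unicode support once, at import time.
try:
    unicode
    have_unicode = True
except NameError:
    have_unicode = False  # a --disable-unicode build, or Python 3


def run_checks():
    """Run the plain-string check always, the Unicode check only if supported."""
    results = ["abc".upper() == "ABC"]
    if have_unicode:
        # Built through the constructor so this module contains no u"..." literal.
        results.append(unicode("abc", "ascii").upper() == unicode("ABC", "ascii"))
    return results


print(all(run_checks()))
```

Building the value at run time costs a call on each evaluation, which is presumably the performance concern raised about pickle.py.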

msg37118 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2001-08-17 18:44
Committed as

Makefile.pre.in:1.53
configure:1.240
configure.in:1.248
setup.py:1.50
Include/intobject.h:2.22
Include/longobject.h:2.21
Include/object.h:2.86
Include/unicodeobject.h:2.31
Lib/ConfigParser.py:1.36
Lib/copy.py:1.20
Lib/site.py:1.35
Lib/types.py:1.20
Lib/test/pickletester.py:1.7
Lib/test/string_tests.py:1.10
Lib/test/test_b1.py:1.38
Lib/test/test_contains.py:1.8
Lib/test/test_format.py:1.12
Lib/test/test_iter.py:1.18
Lib/test/test_pprint.py:1.5
Lib/test/test_sre.py:1.27
Lib/test/test_support.py:1.25
Lib/test/test_winreg.py:1.10
Misc/NEWS:1.207
Modules/_codecsmodule.c:2.9
Modules/_sre.c:2.63
Modules/_tkinter.c:1.119
Modules/cPickle.c:2.62
Modules/pyexpat.c:2.48
Objects/abstract.c:2.72
Objects/complexobject.c:2.39
Objects/floatobject.c:2.86
Objects/intobject.c:2.62
Objects/longobject.c:1.92
Objects/object.c:2.139
Objects/stringobject.c:2.124
Python/bltinmodule.c:2.227
Python/compile.c:2.218
Python/getargs.c:2.62
Python/marshal.c:1.65
Python/modsupport.c:2.58
Python/pythonrun.c:2.147
Python/sysmodule.c:2.92

History
Date User Action Args
2022-04-10 16:04:15 admin set github: 34855
2001-07-29 21:13:36 loewis create