|
msg72036 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-08-27 18:05 |
The explanation is quite simple: in Py_Main, the arguments are converted
from wide to byte strings, but the required length of the byte string is
assumed to be equal to that of the wide string. That is too short as soon
as a character needs more than one byte in the target encoding.
Which gives:
$ ./python -c "print('à')"
Fatal Python error: not enough memory to copy -c argument
Segmentation fault (core dumped)
$ ./python -m à
Fatal Python error: not enough memory to copy -m argument
Segmentation fault (core dumped)
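A minimal sketch of the length mismatch, written in Python rather than in the C of Py_Main. It assumes UTF-8 as the byte encoding (the real target encoding depends on the locale), and the helper name byte_length is hypothetical, used only for illustration.

```python
def byte_length(text, encoding="utf-8"):
    """Return how many bytes are needed to encode text.

    The UTF-8 default is an assumption made for this sketch.
    """
    return len(text.encode(encoding))


for arg in ["print('a')", "print('à')", "ሀ"]:
    # Compare the number of characters with the number of encoded bytes.
    # A buffer sized from the character count is too small whenever the
    # second number is larger than the first.
    print(repr(arg), len(arg), byte_length(arg))
```

For pure ASCII arguments the two counts agree, which is presumably why the assumption went unnoticed until a non-ASCII argument was passed.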
|
|
msg72040 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-08-27 19:17 |
Here is a patch which works under Linux. Under Windows it doesn't choke
when converting arguments anymore, but it fails later in the process (in
the parser for '-c', in the importing logic for '-m').
Here is an example:
$ ./python -c "print(ord('ሀ'))"
4608
$ cat > ሀ.py
print(__file__)
$ ./python -m ሀ
/home/antoine/py3k/mbstowcs/ሀ.py
$ ./python ሀ.py
ሀ.py
|
|
msg72682 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2008-09-06 18:41 |
Hmm. I suppose anything is better than segfaulting. I think the patch is
fine for now, though.
|
|
msg72692 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-09-06 20:47 |
Committed in r66269.
|
|
msg72714 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-09-06 22:20 |
This patch corrects the "-m" case on Windows: the path has to be
decoded/recoded using the filesystem encoding, and not the default UTF-8.
Review is needed, of course.
|
|
msg72720 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-09-06 22:50 |
Looks good and works under Linux.
One small nit: you could just as well use "NN(ssi)" for the
Py_BuildValue and remove Py_DECREF(fob), so as to be more consistent.
|
|
msg72771 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-09-08 08:59 |
Updated patch.
|
|
msg72773 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-09-08 12:24 |
./python -c "print('à')"
does not work on my Linux machine with the latest py3k (r66303), most likely
because my terminal uses a latin-1 encoding: wcstombs will convert the
argument back to the terminal encoding, whereas PyRun_SimpleString
expects a UTF-8 string.
I am attaching another patch, which propagates the wchar_t as far as possible
and encodes it as UTF-8; it includes a test.
This also corrects the Windows case.
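The encoding mismatch can be sketched in Python, under the assumption that the terminal encoding is latin-1 and the consumer expects UTF-8. The helper decodes_as_utf8 is hypothetical and only stands in for a consumer that requires UTF-8 input; it is not how PyRun_SimpleString is implemented.

```python
source = "print('à')"

# Bytes as they would look after conversion to a latin-1 terminal encoding.
latin1_bytes = source.encode("latin-1")
# Bytes as a UTF-8 consumer expects them.
utf8_bytes = source.encode("utf-8")


def decodes_as_utf8(data):
    """Return True if data is valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1_bytes, decodes_as_utf8(latin1_bytes))
print(utf8_bytes, decodes_as_utf8(utf8_bytes))
```

The latin-1 form of 'à' is a single byte that is not a valid UTF-8 sequence on its own, so handing it to a UTF-8 consumer fails instead of printing the character.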
|
|
msg72800 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2008-09-08 22:31 |
I think the patch is good; go ahead.
|
|
msg72826 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-09-09 07:07 |
Applied both patches as r66331.
|
|
msg72828 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-09-09 07:37 |
Unfortunately, my patch does not work: see the compile warnings in "main.c":
http://www.python.org/dev/buildbot/3.0/x86%20osx.5%203.0/builds/344/step-compile/0
I reverted the change, and will try something else...
|
|
msg73533 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-09-21 21:36 |
Today I learned something: wchar_t can be 2 or 4 bytes, Py_UNICODE can be
2 or 4 bytes, and all combinations are possible.
My mistake was to use PyUnicode_FromUnicode on a wchar_t*; PyUnicode_FromWideChar is the obvious function to use.
I have attached a new patch (command_unicode_2.patch) for review.
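A rough way to see the size issue from Python, as an illustration only: ctypes.c_wchar maps to the platform's C wchar_t, and the UTF-16 and UTF-32 encodings stand in here for 2-byte and 4-byte code units. This is not the code from the patch.

```python
import ctypes

# Size of the platform's wchar_t: typically 2 on Windows, 4 on most Linux builds.
wchar_size = ctypes.sizeof(ctypes.c_wchar)
print("sizeof(wchar_t) =", wchar_size)

ch = "ሀ"  # U+1200, the character used earlier in this thread

# The same code point takes a different number of bytes per unit width,
# so a buffer of one unit type cannot simply be reinterpreted as the other.
utf16 = ch.encode("utf-16-le")
utf32 = ch.encode("utf-32-le")
print(len(utf16), len(utf32))
```

When the two unit widths differ, passing a wchar_t buffer to a function that expects the other width reads the data with the wrong stride, which is why a conversion function that knows about wchar_t is needed.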
|
|
msg75764 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-11-11 22:23 |
Raising to release blocker, just to trigger another review...
|
|
msg75765 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2008-11-11 22:47 |
Go ahead.
|
|
msg75768 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-11-11 23:05 |
Fixed as r67190. Thanks for the review.
|
|
| Date | User | Action | Args |
| 2008-11-11 23:05:34 | amaury.forgeotdarc | set | status: open -> closed; resolution: fixed; messages: + msg75768 |
| 2008-11-11 22:47:35 | benjamin.peterson | set | keywords: - needs review; messages: + msg75765 |
| 2008-11-11 22:23:05 | amaury.forgeotdarc | set | priority: critical -> release blocker; messages: + msg75764 |
| 2008-09-21 21:36:56 | amaury.forgeotdarc | set | priority: high -> critical; keywords: + needs review; messages: + msg73533; files: + command_unicode_2.patch |
| 2008-09-09 07:37:16 | amaury.forgeotdarc | set | status: closed -> open; resolution: fixed -> (no value); messages: + msg72828 |
| 2008-09-09 07:07:13 | amaury.forgeotdarc | set | status: open -> closed; resolution: fixed; messages: + msg72826 |
| 2008-09-08 22:31:28 | benjamin.peterson | set | messages: + msg72800 |
| 2008-09-08 12:24:52 | amaury.forgeotdarc | set | files: + command_unicode.patch; messages: + msg72773 |
| 2008-09-08 09:00:00 | amaury.forgeotdarc | set | files: + find_module_unicode_2.patch; messages: + msg72771 |
| 2008-09-06 22:50:45 | pitrou | set | messages: + msg72720 |
| 2008-09-06 22:20:19 | amaury.forgeotdarc | set | files: + find_module_unicode.patch; nosy: + amaury.forgeotdarc; messages: + msg72714 |
| 2008-09-06 20:47:48 | pitrou | set | priority: deferred blocker -> high; type: crash -> behavior; messages: + msg72692; title: py3k aborts if "-c" or "-m" is given a non-ascii value -> py3k fails under Windows if "-c" or "-m" is given a non-ascii value |
| 2008-09-06 18:41:36 | benjamin.peterson | set | keywords: - needs review; nosy: + benjamin.peterson; messages: + msg72682 |
| 2008-09-04 01:20:10 | benjamin.peterson | set | priority: release blocker -> deferred blocker |
| 2008-08-28 00:18:27 | pitrou | set | keywords: + needs review; nosy: + loewis |
| 2008-08-27 20:43:27 | amaury.forgeotdarc | set | priority: release blocker |
| 2008-08-27 19:17:16 | pitrou | set | files: + convert_args.patch; keywords: + patch; messages: + msg72040 |
| 2008-08-27 18:05:40 | pitrou | create | |