Message 225235
This is how shell quoting in commands.mkarg() is implemented:
def mkarg(x):
    if '\'' not in x:
        return ' \'' + x + '\''
    s = ' "'
    for c in x:
        if c in '\\$"`':
            s = s + '\\'
        s = s + c
    s = s + '"'
    return s
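Unfortunately, this is not compatible with the way bash splits arguments in some locales.
The problem is that in a few East Asian encodings (at least BIG5, BIG5-HKSCS, GB18030, GBK), the 0x5C byte (backslash in ASCII) can be the second byte of a two-byte character, and bash apparently decodes the command string before splitting it into words. A backslash that mkarg() inserts right after such a lead byte is therefore consumed as part of a multibyte character, and the double quote it was meant to escape ends the quoted string.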
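To make the byte-level effect concrete, here is a minimal sketch. It assumes Python 3 and uses a hypothetical helper, mkarg_bytes(), that ports the mkarg() logic above to bytes. The payload is an assumption chosen for illustration (a 0x81 lead byte directly before each double quote); it is not necessarily what test-mkargs.py contains.

```python
def mkarg_bytes(x: bytes) -> bytes:
    """Byte-level port of the mkarg() logic quoted above (hypothetical helper)."""
    if b"'" not in x:
        return b" '" + x + b"'"
    s = b' "'
    for i in range(len(x)):
        c = x[i:i + 1]  # one-byte slice; indexing bytes directly would give an int
        if c in b'\\$"`':
            s += b'\\'  # escape the special character with a backslash
        s += c
    return s + b'"'


# Assumed payload: a GBK lead byte (0x81) placed directly before each double quote.
# The single quotes force mkarg() onto its double-quote branch.
payload = b"\x81\" ; echo 'injected' ; \x81\""

quoted = mkarg_bytes(payload)
print(quoted)                # raw bytes: every " is preceded by a backslash
print(quoted.decode("gbk"))  # GBK view: 0x81 0x5C decode as a single character
```

Viewed as raw bytes, each double quote in the payload is escaped. Decoded as GBK, the inserted backslashes disappear into two-byte characters, so the quotes are left bare and the text between them would be parsed as separate commands.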
PoC:
$ sh --version | head -n1
GNU bash, version 4.3.22(1)-release (i486-pc-linux-gnu)
$ LC_ALL=C python test-mkargs.py
crw-rw-rw- 1 root root 1, 3 Aug 12 16:00 /dev/null
ls: cannot access " ; python -c 'import this' | grep . | shuf | head -n1 | cowsay -y ; ": No such file or directory
$ LC_ALL=zh_CN.GBK python test-mkargs.py
crw-rw-rw- 1 root root 1, 3 8月 12 16:00 /dev/null
ls: 无法访问乗: No such file or directory
 ________________________________
< Simple is better than complex. >
 --------------------------------
        \   ^__^
         \  (..)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
sh: 乗: 未找到命令
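One possible direction, sketched here under assumptions rather than proposed as the definitive fix, is to quote with single quotes only, so that no backslash is ever inserted. This is the strategy shlex.quote() uses for strings. The helper name quote_single() is hypothetical, the code assumes Python 3 bytes, and the payload is an assumed one with a 0x81 lead byte before each double quote.

```python
def quote_single(x: bytes) -> bytes:
    """Hypothetical single-quote-only quoting; never emits a backslash."""
    # An embedded ' is handled by closing the quote, emitting a
    # double-quoted ', and reopening the quote:  '  ->  '"'"'
    return b" '" + x.replace(b"'", b"'\"'\"'") + b"'"


# Assumed payload with a GBK lead byte (0x81) before each double quote.
payload = b"\x81\" ; echo 'injected' ; \x81\""

quoted = quote_single(payload)
print(quoted)
```

The only bytes this adds are 0x27 (') and 0x22 ("), neither of which is a valid trail byte in the encodings listed above, so they should not be absorbed into a multibyte character. How bash treats the resulting invalid sequences in each affected locale has not been verified here.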
Date | User | Action | Args
2014-08-12 18:13:05 | jwilk | set | recipients: + jwilk
2014-08-12 18:13:05 | jwilk | set | messageid: <1407867185.09.0.871560290901.issue22187@psf.upfronthosting.co.za>
2014-08-12 18:13:05 | jwilk | link | issue22187 messages
2014-08-12 18:13:04 | jwilk | create |