classification
Title: Setting long domain of locale.dgettext() crashes Python interpreter
Type: crash
Stage:
Components: Library (Lib)
Versions: Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: christian.heimes, serhiy.storchaka, xxm
Priority: normal
Keywords:

Created on 2021-03-23 01:54 by xxm, last changed 2022-04-11 14:59 by admin.

Messages (4)
msg389363 - Author: Xinmeng Xia (xxm) Date: 2021-03-23 01:54
Passing a very long string as the first argument (the domain) of locale.dgettext() crashes the Python interpreter.

======================================================
Python 3.10.0a6 (default, Mar 19 2021, 11:45:56) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale;locale.dgettext('abs'*10000000,'')
Segmentation fault (core dumped)
======================================================

System: Ubuntu 16.04

BTW, the API of the locale module seems to be inconsistent between Ubuntu and macOS. E.g., Python on macOS has no locale.dgettext().
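
Because the locale module only exposes the C library's gettext interface on systems that provide it, a portable script may want to check for dgettext() before calling it. A minimal sketch; the helper name translate_or_passthrough and the domain name used below are hypothetical and not part of the locale module:

```python
import locale


def translate_or_passthrough(domain, message):
    """Return locale.dgettext(domain, message) if available, else message.

    Hypothetical helper: locale.dgettext() exists only on platforms
    whose C library provides the gettext interface.
    """
    dgettext = getattr(locale, "dgettext", None)
    if dgettext is None:
        # No gettext interface on this platform: return the text untranslated.
        return message
    return dgettext(domain, message)


# "no-such-domain-xyz" is a made-up domain with no message catalog.
print(translate_or_passthrough("no-such-domain-xyz", "hello"))
```

This only works around the missing attribute; it does nothing about the crash with a long domain name.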
msg390279 - Author: Xinmeng Xia (xxm) Date: 2021-04-06 05:42
Attached are the results from gdb and valgrind. (No error is reported for the short input locale.dgettext('abs'*10,'').)


$ gdb ./python
(gdb) run
>>> locale.dgettext('abs'*10000000,'')

Program received signal SIGSEGV, Segmentation fault.
__dcigettext (
    domainname=domainname@entry=0xadb030 "absabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsab"..., msgid1=msgid1@entry=0x7ffff7fc09a0 "", msgid2=msgid2@entry=0x0, 
    plural=plural@entry=0, n=n@entry=0, category=category@entry=5) at dcigettext.c:675
675	dcigettext.c: No such file or directory.
(gdb)


valgrind
~$ PYTHONMALLOC=malloc_debug valgrind python
Memcheck, a memory error detector
==4870== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4870== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==4870== Command: /home/xxm/Desktop/apifuzz/Python-3.10.0a6/python
==4870== 
Python 3.10.0a6 (default, Mar 19 2021, 11:45:56) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> locale.dgettext('abs'*10000000,'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'locale' is not defined
>>> import locale
>>> locale.dgettext('abs'*10000000,'')
==4870== Warning: client switching stacks?  SP change: 0x1ffefff5c0 --> 0x1ffd363220
==4870==          to suppress, use: --max-stackframe=30000032 or greater
==4870== Invalid write of size 8
==4870==    at 0x5797E88: __dcigettext (dcigettext.c:675)
==4870==  Address 0x1ffd363218 is on thread 1's stack
==4870== 
==4870== 
==4870== Process terminating with default action of signal 11 (SIGSEGV)
==4870==  Access not within mapped region at address 0x1FFD363218
==4870==    at 0x5797E88: __dcigettext (dcigettext.c:675)
==4870==  If you believe this happened as a result of a stack
==4870==  overflow in your program's main thread (unlikely but
==4870==  possible), you can try to increase the size of the
==4870==  main thread stack using the --main-stacksize= flag.
==4870==  The main thread stack size used in this run was 8388608.
==4870== Invalid write of size 8
==4870==    at 0x4A2867A: _vgnU_freeres (vg_preloaded.c:57)
==4870==  Address 0x1ffd363210 is on thread 1's stack
==4870== 
==4870== 
==4870== Process terminating with default action of signal 11 (SIGSEGV)
==4870==  Access not within mapped region at address 0x1FFD363210
==4870==    at 0x4A2867A: _vgnU_freeres (vg_preloaded.c:57)
==4870==  If you believe this happened as a result of a stack
==4870==  overflow in your program's main thread (unlikely but
==4870==  possible), you can try to increase the size of the
==4870==  main thread stack using the --main-stacksize= flag.
==4870==  The main thread stack size used in this run was 8388608.
==4870== 
==4870== HEAP SUMMARY:
==4870==     in use at exit: 35,310,749 bytes in 35,706 blocks
==4870==   total heap usage: 87,221 allocs, 51,515 frees, 44,733,752 bytes allocated
==4870== 
==4870== LEAK SUMMARY:
==4870==    definitely lost: 0 bytes in 0 blocks
==4870==    indirectly lost: 0 bytes in 0 blocks
==4870==      possibly lost: 35,173,680 bytes in 34,899 blocks
==4870==    still reachable: 137,069 bytes in 807 blocks
==4870==         suppressed: 0 bytes in 0 blocks
==4870== Rerun with --leak-check=full to see details of leaked memory
==4870== 
==4870== For lists of detected and suppressed errors, rerun with: -s
==4870== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
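
To reproduce the crash without losing the current session, the call can be run in a child interpreter and the exit status inspected from the parent. This is only a sketch: the helper names run_isolated and describe_exit are hypothetical, and whether the child actually segfaults depends on the platform's C library and stack limit.

```python
import signal
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter and return the result."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was
        # terminated by signal number -returncode.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


if __name__ == "__main__":
    # Same call as in the report; it runs only in the child process.
    crashing = "import locale; locale.dgettext('abs' * 10000000, '')"
    result = run_isolated(crashing)
    print(describe_exit(result.returncode))
```

On a system where the call crashes, the parent should report that the child was killed by a signal instead of dying itself.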
msg390282 - Author: Christian Heimes (christian.heimes) (Python committer) Date: 2021-04-06 06:51
The crash occurs inside glibc's dgettext() implementation. Its man page does not list any limitation for domain or msgid length. This looks like a bug in glibc.

#0  0x00007ffff7c57a8f in __dcigettext () from /lib64/libc.so.6
#1  0x000000000058a235 in _locale_dgettext_impl (in=0x7fffea64d8e0 "", 
    domain=0x7fffe874e040 "absabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsab"..., module=<optimized out>) at ./Modules/_localemodule.c:662
msg390285 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2021-04-06 07:25
__dcigettext() contains:

  domainname_len = strlen (domainname);
  xdomainname = (char *) alloca (strlen (categoryname)
				 + domainname_len + 5);

It tries to allocate a buffer on the stack with alloca(), and for a very long domain name this causes a stack overflow.
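
The numbers in the valgrind log are consistent with this. The sketch below redoes the arithmetic of the alloca() call quoted above; it assumes that category 5 in the gdb trace is LC_MESSAGES, so that categoryname is the 11-character string "LC_MESSAGES". The function name alloca_request is hypothetical.

```python
# Assumption: category=5 in the gdb trace corresponds to LC_MESSAGES.
CATEGORY_NAME = "LC_MESSAGES"

# Main thread stack size reported by valgrind in the log above.
MAIN_STACK_SIZE = 8388608


def alloca_request(domain_len, category_name=CATEGORY_NAME):
    """Bytes requested by alloca(strlen(categoryname) + domainname_len + 5)."""
    return len(category_name) + domain_len + 5


domain_len = len("abs" * 10000000)
request = alloca_request(domain_len)

# Stack pointer change reported by valgrind ("SP change: ... --> ...").
sp_change = 0x1FFEFFF5C0 - 0x1FFD363220

print("alloca request:", request)
print("stack size:    ", MAIN_STACK_SIZE)
print("SP change:     ", sp_change)
print("request exceeds stack:", request > MAIN_STACK_SIZE)
```

The request is several times larger than the 8 MiB main thread stack, and the stack pointer change reported by valgrind matches its --max-stackframe=30000032 suggestion.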

There is no portable way to recover from a stack overflow or to detect it ahead of time. We could add an arbitrary limit on the length of the domain name, but that would not guarantee anything. This is just one more way to crash Python from Python code.
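
For illustration, an arbitrary limit of the kind mentioned above could be applied in user code as sketched below. The cap MAX_DOMAIN_LENGTH and the wrapper name checked_dgettext are hypothetical; neither is used by CPython or glibc. As noted, such a check guarantees nothing, because the stack space actually available at call time is unknown.

```python
import locale

# Arbitrary, hypothetical cap chosen only for illustration.
MAX_DOMAIN_LENGTH = 4096


def checked_dgettext(domain, message, max_len=MAX_DOMAIN_LENGTH):
    """Call locale.dgettext() only if the encoded domain name is short.

    This reduces the chance of exhausting the stack inside the C
    library but cannot rule it out.
    """
    if domain is not None and len(domain.encode()) > max_len:
        raise ValueError(
            "domain name longer than %d bytes, refusing to call dgettext()"
            % max_len
        )
    # Raises AttributeError on platforms without locale.dgettext().
    return locale.dgettext(domain, message)


try:
    checked_dgettext("abs" * 10000000, "")
except ValueError as exc:
    print("rejected:", exc)
```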
History
Date                 User              Action  Args
2022-04-11 14:59:43  admin             set     github: 87765
2021-04-06 07:25:32  serhiy.storchaka  set     nosy: + serhiy.storchaka; messages: + msg390285
2021-04-06 06:51:18  christian.heimes  set     nosy: + christian.heimes; messages: + msg390282
2021-04-06 05:42:07  xxm               set     messages: + msg390279
2021-03-23 01:54:58  xxm               create