Author meador.inge
Recipients ajaksu2, akuchling, benjamin.peterson, brett.cannon, kristjan.jonsson, loewis, meador.inge
Date 2010-01-27.04:31:08
Message-id <1264566673.25.0.797987507369.issue3367@psf.upfronthosting.co.za>
In-reply-to
Content
I think this was fixed by checkins r76689 and r76230, made by Benjamin. Since we are using "exec ''" as the reproduction case, the token state is set up in 'PyTokenizer_FromString', which leaves 'tok->inp == ""'.  The code before these checkins (see attached revert patch) caused the following else branch in 'tok_nextc' to be taken:
   char *end = strchr(tok->inp, '\n');
   if (end != NULL)
      end++;
   else {
      end = strchr(tok->inp, '\0');
      if (end == tok->inp) {
         tok->done = E_EOF;
         return EOF;
      }
   }
   if (tok->start == NULL)
      tok->buf = tok->cur;
   tok->line_start = tok->cur;
   tok->lineno++;
   tok->inp = end;
   return Py_CHARMASK(*tok->cur++);
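because under these circumstances 'tok->inp == ""'.  The 'strchr' search for a newline returns NULL and 'end == tok->inp', so the function returns EOF before reaching the assignment, and 'tok->line_start' is never assigned. This propagated back out to 'parsetok:159' followed by 'parsetok:187', where 'tok->line_start' is read uninitialized.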
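
The early return can be reproduced in isolation. The following is only a minimal sketch, not the real tokenizer: 'struct sketch_tok', 'sketch_init', 'sketch_nextc' and 'SKETCH_EOF' are hypothetical names made up for illustration, the fields are a small subset of the real token state, and 'line_start' is set to NULL up front so that "never assigned" can be observed safely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_EOF (-1)  /* hypothetical stand-in for EOF */

/* Hypothetical, stripped-down stand-in for the tokenizer state. */
struct sketch_tok {
    const char *cur;         /* next character to hand out */
    const char *inp;         /* end of the data consumed so far */
    const char *line_start;  /* NULL here means "never assigned" */
    int lineno;
};

static void sketch_init(struct sketch_tok *tok, const char *input)
{
    assert(input != NULL);
    tok->cur = input;
    tok->inp = input;
    tok->line_start = NULL;
    tok->lineno = 0;
}

/* Mirrors only the quoted else branch, not the whole of tok_nextc. */
static int sketch_nextc(struct sketch_tok *tok)
{
    const char *end = strchr(tok->inp, '\n');
    if (end != NULL)
        end++;
    else {
        end = strchr(tok->inp, '\0');
        if (end == tok->inp)
            return SKETCH_EOF;  /* early return: line_start untouched */
    }
    tok->line_start = tok->cur;
    tok->lineno++;
    tok->inp = end;
    return (unsigned char)*tok->cur++;
}
```

With an empty string the sketch returns before 'line_start' is written. With an input that ends in a newline, the sketch reaches the assignment. The real code path after the checkins may differ in detail from this simplified model.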

With r76689 and r76230, a call to 'translate_newlines' was added in 'decode_str', which is called from 'PyTokenizer_FromString' when the token state is created.  The 'translate_newlines' call appends a newline to the input buffer, which results in 'tok->inp == "\n"'.  Thus when 'tok_nextc' is called, the initial if branch is taken instead of the else, and 'tok->line_start' is initialized properly.

I also verified the current trunk with valgrind, which no longer reports this particular issue (the one error it still reports is in '__setenv', reached through 'posix_putenv', not in the tokenizer):

euclid:trunk minge$ valgrind ./python.exe -c "exec ''"
==77940== Memcheck, a memory error detector
==77940== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==77940== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==77940== Command: ./python.exe -c exec\ ''
==77940== 
--77940-- ./python.exe:
--77940-- dSYM directory has wrong UUID; consider using --dsymutil=yes
==77940== Conditional jump or move depends on uninitialised value(s)
==77940==    at 0x29D99D: __setenv (in /usr/lib/libSystem.B.dylib)
==77940==    by 0x2E9354: putenv$UNIX2003 (in /usr/lib/libSystem.B.dylib)
==77940==    by 0x165217: posix_putenv (in ./python.exe)
==77940==    by 0x6E422: PyCFunction_Call (in ./python.exe)
==77940==    by 0x10E971: call_function (in ./python.exe)
==77940==    by 0x1095FE: PyEval_EvalFrameEx (in ./python.exe)
==77940==    by 0x10EE3B: fast_function (in ./python.exe)
==77940==    by 0x10EB47: call_function (in ./python.exe)
==77940==    by 0x1095FE: PyEval_EvalFrameEx (in ./python.exe)
==77940==    by 0x10C073: PyEval_EvalCodeEx (in ./python.exe)
==77940==    by 0x10EF3C: fast_function (in ./python.exe)
==77940==    by 0x10EB47: call_function (in ./python.exe)
==77940== 
[15652 refs]
==77940== 
==77940== HEAP SUMMARY:
==77940==     in use at exit: 590,354 bytes in 4,795 blocks
==77940==   total heap usage: 34,635 allocs, 29,840 frees, 6,689,168 bytes allocated
==77940== 
==77940== LEAK SUMMARY:
==77940==    definitely lost: 0 bytes in 0 blocks
==77940==    indirectly lost: 0 bytes in 0 blocks
==77940==      possibly lost: 451,997 bytes in 4,461 blocks
==77940==    still reachable: 137,793 bytes in 321 blocks
==77940==         suppressed: 564 bytes in 13 blocks
==77940== Rerun with --leak-check=full to see details of leaked memory
==77940== 
==77940== For counts of detected and suppressed errors, rerun with: -v
==77940== Use --track-origins=yes to see where uninitialised values come from
==77940== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)