Author swgillespie
Recipients William Bowling, serhiy.storchaka, swgillespie
Date 2016-02-21.22:27:15
Went ahead and did it since I had the time. The issue arises when doing one token of lookahead to see whether an 'async' at the top level begins an 'async def' function or is an identifier. A shallow copy of the current token is made and given to another call to tok_get, which frees the token's buffer if a decoding error occurs. Since the shallow copy cloned the token's buffer pointer, the still-live token is left holding a dangling pointer to its buffer, and that buffer gets freed again later on.
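
To make the aliasing concrete, here is a minimal sketch of the pattern described above. It is not the CPython code: FakeBuf, FakeTok, fake_free and failing_get are hypothetical stand-ins, and the "free" is only counted rather than performed so the example stays safe to run.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from CPython. */
typedef struct {
    int free_count;           /* how many times this buffer was "freed" */
} FakeBuf;

typedef struct {
    FakeBuf *buf;             /* buffer owned by the tokenizer state */
} FakeTok;

/* Pretend to free a buffer; a count above 1 models a double free. */
static void fake_free(FakeBuf *b) {
    if (b != NULL)
        b->free_count++;
}

/* Models a tok_get-style call that hits a decoding error: it frees the
   buffer and clears only the pointer in the struct it was handed. */
static void failing_get(FakeTok *tok) {
    fake_free(tok->buf);
    tok->buf = NULL;
}

/* Lookahead on a shallow copy, then normal cleanup of the original.
   Returns the number of times the shared buffer was freed. */
static int lookahead_then_cleanup(void) {
    FakeBuf storage = {0};
    FakeTok tok = { &storage };
    FakeTok ahead = tok;      /* shallow copy: ahead.buf aliases tok.buf */

    failing_get(&ahead);      /* frees the buffer, nulls ahead.buf only */
    fake_free(tok.buf);       /* tok.buf is stale, so this is a second free */
    return storage.free_count;
}
```

Calling lookahead_then_cleanup() in this sketch reports two frees of the same buffer, which is the double free in miniature.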

By explicitly nulling out the original token's buffer pointer (as tok_get does on the copy) whenever the copied token's buffer pointer comes back nulled out, we avoid the double free and present the correct syntax error:

$ ./python 
  File "", line 1
SyntaxError: Non-UTF-8 code starting with '\xef' in file on line 2, but no encoding declared; see for details
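
The shape of the fix can be sketched as follows, this time with real malloc/free. Again this is only an illustration under assumptions: MiniTok, release_buf and failing_get are hypothetical names, and the actual patch operates on the tokenizer's own state rather than on a struct like this one.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the tokenizer state; not CPython's struct. */
typedef struct {
    char *buf;
} MiniTok;

static int free_calls = 0;    /* counts real calls to free() */

/* Free the buffer once and clear the pointer, as tok_get is described
   as doing on a decoding error. */
static void release_buf(MiniTok *tok) {
    if (tok->buf != NULL) {
        free(tok->buf);
        free_calls++;
        tok->buf = NULL;
    }
}

/* Models a lookahead call that fails with a decoding error. */
static void failing_get(MiniTok *tok) {
    release_buf(tok);
}

/* Returns the number of free() calls, or -1 if allocation failed. */
static int lookahead_with_fix(void) {
    MiniTok tok;
    free_calls = 0;
    tok.buf = malloc(16);
    if (tok.buf == NULL)
        return -1;
    strcpy(tok.buf, "async");

    MiniTok ahead = tok;      /* shallow copy shares tok.buf */
    failing_get(&ahead);      /* frees the buffer, nulls ahead.buf */

    if (ahead.buf == NULL)
        tok.buf = NULL;       /* the fix: mirror the cleared pointer */

    release_buf(&tok);        /* now a no-op instead of a double free */
    return free_calls;
}
```

With the pointer mirrored back, the later cleanup sees NULL and the buffer is freed exactly once.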

William Bowling's second program is also fixed with this change, with one additional wrinkle: if a token contains a null byte as the
first character, an invalid write occurs when we attempt to replace the null character with a newline. This fix checks to make sure
that this is not the case before performing the newline insertion.
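
A guard of this kind might look like the sketch below. The helper name, its signature and its return convention are assumptions made for illustration, not the code in the patch. The point is only that a line whose first byte is NUL has length zero, so indexing its "last character" would land before the start of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: make sure a non-empty line ends in '\n'.
   cap is the total capacity of line in bytes.
   Returns 1 if a newline was inserted, 0 otherwise. */
static int ensure_trailing_newline(char *line, size_t cap) {
    size_t len = strlen(line);

    if (len == 0)             /* first byte is NUL: line[len - 1] would */
        return 0;             /* index before the buffer, so skip it    */
    if (line[len - 1] == '\n')
        return 0;             /* already terminated */
    if (len + 1 >= cap)
        return 0;             /* no room for '\n' plus the terminator */

    line[len] = '\n';         /* overwrite the old terminator */
    line[len + 1] = '\0';
    return 1;
}
```

For a buffer holding "abc" the helper appends the newline; for a buffer that starts with a null byte it leaves the contents untouched.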

With this change, both of William Bowling's programs pass valgrind and present the appropriate syntax error. I tried to add this to the coroutine syntax tests, but every way of loading the file other than giving it to ./python itself fails (correctly) because the program contains a null byte.

Date                 User         Action  Args
2016-02-21 22:27:16  swgillespie  set     recipients: + swgillespie, serhiy.storchaka, William Bowling
2016-02-21 22:27:15  swgillespie  set     messageid: <>
2016-02-21 22:27:15  swgillespie  link    issue26000 messages
2016-02-21 22:27:15  swgillespie  create