classification
Title: unterminated string literal tokenization error messages could be better
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, alex, ammar2, benjamin.peterson, miss-islington, pablogsal, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2020-04-03 17:58 by benjamin.peterson, last changed 2021-01-20 21:56 by BTaskaya. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 19346 merged BTaskaya, 2020-04-03 18:37
Messages (10)
msg365713 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2020-04-03 17:58
It has been pointed out to me that the errors the tokenizer produces for unterminated strings, "EOL while scanning string literal" and "EOF while scanning triple-quoted string literal", contain parsing jargon that makes it difficult for new users to understand the problem, which is most likely a missing quote.
msg365730 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-03 22:07
It could be even better. Inside the tokenizer we know where the string literal starts and what quotes it uses. The line and the offset of the *start* of the literal can be set in a SyntaxError.
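
To see what location a SyntaxError currently carries for an unterminated literal, a small sketch like the following may help. The helper name describe_syntax_error is hypothetical; compile() and the msg, lineno and offset attributes of SyntaxError are standard. The exact message text and the offset value depend on the Python version in use, so none are asserted here.

```python
def describe_syntax_error(source):
    """Compile `source` and return (msg, lineno, offset) if it fails to parse.

    Hypothetical helper for illustration only. Returns None when the
    source compiles without a SyntaxError.
    """
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        # msg is the human-readable message; lineno and offset say where
        # the interpreter places the caret (both are 1-based when set).
        return exc.msg, exc.lineno, exc.offset
    return None


# Usage: the values printed vary between Python versions.
print(describe_syntax_error('message = "sadsa\n'))
print(describe_syntax_error('message = "ok"\n'))
```

Comparing the reported offset with the column of the opening quote shows whether a given interpreter points at the start of the literal or at the end of the line.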
msg365743 - Author: Alex Gaynor (alex) * (Python committer) Date: 2020-04-04 03:56
Here's my suggestion:

End of line reached without finding the end of string literal. Are you missing a closing quote?
msg365749 - Author: Ammar Askar (ammar2) * (Python committer) Date: 2020-04-04 04:50
Just re-posting this here from the open PR. Rust's handling of this seems nice and beginner friendly:

  error: unterminated double quote string
   --> src/main.rs:2:19
    |
  2 |       let message = "Hello world
    |  ___________________^
  3 | |     println!(message);
  4 | | }
    | |_^

As Serhiy suggested, it points to the /start/ of the string rather than the EOL, and the message seems nice too.
msg365765 - Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2020-04-04 14:39
>>> message = "sadsa
  File "<stdin>", line 1
    message = "sadsa
              ^
SyntaxError: unterminated double quote
msg366036 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-09 07:46
I'm afraid there may be confusion between triple-, double- and single-quoted string literals. So I suggest changing the error messages to just "unterminated triple-quoted string literal" and "unterminated string literal" (or "unterminated single-quoted string literal"). The terms "triple-quoted" and "single-quoted" are used several times in the documentation. The term "double-quoted" is used only once, and I suppose with a different meaning.
msg366303 - Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2020-04-13 11:00
Fair point. I changed the error messages to what you suggested:

>>> (300) * 6 + ca(e, 2 +    "dsadsa)
  File "<stdin>", line 1
    (300) * 6 + ca(e, 2 +    "dsadsa)
                             ^
SyntaxError: unterminated string literal

>>> (300) * 6 + ca(e, 2 +    'dsadsa)
  File "<stdin>", line 1
    (300) * 6 + ca(e, 2 +    'dsadsa)
                             ^
SyntaxError: unterminated string literal


>>> (300) * 6 + ca(e, 2 +    """dsadsa
... 
  File "<stdin>", line 1
    (300) * 6 + ca(e, 2 +    """dsadsa
                             ^
SyntaxError: unterminated triple-quoted string literal
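
The idea discussed above (remember where the literal starts and which quote style it uses, then report that position) can be sketched with a toy single-line scanner. This is not CPython's tokenizer: the function name find_unterminated_string is hypothetical, columns are 0-based, and comments, string prefixes and multi-line input are deliberately ignored.

```python
def find_unterminated_string(line):
    """Toy scanner: return (message, start_column) for the first string
    literal in `line` that is never closed, or None if all are closed.

    Illustration only; it does not handle comments or string prefixes.
    """
    i = 0
    n = len(line)
    while i < n:
        ch = line[i]
        if ch in "'\"":
            start = i  # remember where the literal begins
            # A run of three identical quotes opens a triple-quoted literal.
            quote = ch * 3 if line.startswith(ch * 3, i) else ch
            i += len(quote)
            closed = False
            while i < n:
                if line[i] == "\\":
                    i += 2  # skip the escaped character
                    continue
                if line.startswith(quote, i):
                    i += len(quote)
                    closed = True
                    break
                i += 1
            if not closed:
                if len(quote) == 3:
                    kind = "triple-quoted string literal"
                else:
                    kind = "string literal"
                return "unterminated " + kind, start
        else:
            i += 1
    return None


# Usage: print a caret under the start of the literal.
line = '(300) * 6 + ca(e, 2 +    "dsadsa)'
result = find_unterminated_string(line)
if result is not None:
    message, column = result
    print(line)
    print(" " * column + "^")
    print("SyntaxError:", message)
```

Single and double quotes share one message here, which mirrors the wording suggested in the thread.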
msg369541 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-05-21 20:42
New changeset 72e0aa2fd2b9c6da2caa5a9ef54f6495fc2890b0 by Batuhan Taskaya in branch 'master':
bpo-40176: Improve error messages for trailing comma on from import (GH-20294)
https://github.com/python/cpython/commit/72e0aa2fd2b9c6da2caa5a9ef54f6495fc2890b0
msg369542 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-05-21 21:05
New changeset 275d7e1080d0007a82965d1ac510abb0ae8d7821 by Pablo Galindo in branch '3.9':
[3.9] bpo-40176: Improve error messages for trailing comma on from import (GH-20294) (GH-20302)
https://github.com/python/cpython/commit/275d7e1080d0007a82965d1ac510abb0ae8d7821
msg385374 - Author: miss-islington (miss-islington) Date: 2021-01-20 21:38
New changeset a698d52c3975c80b45b139b2f08402ec514dce75 by Batuhan Taskaya in branch 'master':
bpo-40176: Improve error messages for unclosed string literals (GH-19346)
https://github.com/python/cpython/commit/a698d52c3975c80b45b139b2f08402ec514dce75
History
Date User Action Args
2021-01-20 21:56:51 BTaskaya set status: open -> closed
stage: patch review -> resolved
resolution: fixed
versions: + Python 3.10, - Python 3.9
2021-01-20 21:38:57 miss-islington set nosy: + miss-islington
messages: + msg385374
2020-05-21 21:17:38 pablogsal set pull_requests: - pull_request19576
2020-05-21 21:05:02 pablogsal set messages: + msg369542
2020-05-21 21:04:57 pablogsal set pull_requests: + pull_request19576
2020-05-21 20:58:18 pablogsal set pull_requests: - pull_request19573
2020-05-21 20:43:30 pablogsal set pull_requests: + pull_request19573
2020-05-21 20:42:01 pablogsal set nosy: + pablogsal
messages: + msg369541
2020-05-21 18:17:44 BTaskaya set pull_requests: - pull_request19569
2020-05-21 18:01:35 BTaskaya set pull_requests: + pull_request19569
2020-04-13 11:00:32 BTaskaya set messages: + msg366303
2020-04-09 07:46:22 serhiy.storchaka set messages: + msg366036
2020-04-04 14:39:21 BTaskaya set messages: + msg365765
2020-04-04 04:50:34 ammar2 set nosy: + ammar2
messages: + msg365749
2020-04-04 03:56:15 alex set nosy: + alex
messages: + msg365743
2020-04-03 22:07:34 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg365730
2020-04-03 18:37:31 BTaskaya set keywords: + patch
nosy: + BTaskaya
pull_requests: + pull_request18710
stage: patch review
2020-04-03 17:58:44 benjamin.peterson create