msg269022 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-21 20:34 |
Attached patch deprecates invalid escape sequences in unicode strings. The point of this is to prevent issues such as #27356 (and possibly other similar ones) in the future.
Without the patch:
>>> "hello \world"
'hello \\world'
With the patch:
>>> "hello \world"
DeprecationWarning: invalid escape sequence 'w'
I'll need some help (patch isn't mergeable yet):
test_doctest fails on my machine with the patch (and -W), and I don't know how to fix it. test_ast fails an assertion (!PyErr_Occurred() in PyObject_Call in abstract.c) when -W is on, and I also don't know how to fix it (I don't even know what causes it).
Of course, I went ahead and fixed all instances of invalid escape sequences in the stdlib (that I could find) so that no DeprecationWarning is encountered.
Lastly, I thought about also doing this to bytes, but I ran into some issues with some invalid escapes such as \u, and _codecs.escape_decode would trigger the warning when passed br"\8" (for example). Ultimately, I decided to leave bytes alone for now, since it's mostly on the lower-level side of things. If there's interest I can add it back.
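
A minimal sketch of how the warning described above can be observed from Python code, assuming an interpreter that includes this change. The helper name `escape_warnings` is hypothetical, and the filter also accepts `SyntaxWarning` on the assumption that some releases report the same problem under that category.

```python
import warnings

def escape_warnings(source):
    """Compile `source` and return the warnings recorded for its literals."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make silent-by-default warnings visible
        compile(source, "<example>", "exec")
    return [w for w in caught
            if issubclass(w.category, (DeprecationWarning, SyntaxWarning))]

# The first source text contains a backslash followed by "w", as in the report;
# the second doubles the backslash, which is a valid escape.
bad = escape_warnings(r'greeting = "hello \world"')
good = escape_warnings(r'greeting = "hello \\world"')
print(len(bad) > 0, len(good))
```

Recording the warnings instead of printing them keeps the check independent of the default warning filters.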
|
msg269114 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2016-06-23 14:41 |
Have you searched the python-dev and python-ideas archives for the previous discussions of this issue? I don't remember for sure, but I think Guido might have made a ruling (not that the discussion couldn't be reopened if he has, but, well...)
|
msg269119 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-23 15:26 |
Now I have! I found nothing on Python-Dev, but apparently it's been discussed on Python-ideas before: https://mail.python.org/pipermail/python-ideas/2015-August/035031.html Guido hasn't participated in that discussion, and most of it was "This will break people's code", with people both for and against the idea, without an apparent consensus.
Should I try a second round on Python-ideas, to try and get a consensus (or a BDFL ruling)?
|
msg269122 - (view) |
Author: Antti Haapala (ztane) * |
Date: 2016-06-23 15:59 |
It is handy to be able to use `\w` and `\d` in non-raw-string *regular expressions* without too much backslashitis. This seems to be in use in the Python standard library as well, for example in csv.py.
|
msg269152 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-24 02:59 |
Yes, it's in use in an awful lot of places (see my patch). The proper fix is to use raw strings, or, if you need actual escapes in the same string, manually escape them. However, as you'll see by looking at the patch, the vast majority of cases are fixed by prepending a single 'r' to the front of the string. In fact, only csv.py and html/parser.py needed finer-grained escaping.
I think that the argument "It works in non-raw strings" is weak. I've always used raw strings for regular expressions, and this patch would simply move this from being a style issue to being a syntax one (and I think it's fine :).
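
To illustrate the two fixes mentioned above (a raw string, or manually doubled backslashes), here is a small sketch. The patterns and sample inputs are made up for illustration and are not taken from the patch.

```python
import re

raw_pattern = r"\w+@\d+"        # raw string: backslashes reach `re` untouched
escaped_pattern = "\\w+@\\d+"   # same text, each backslash doubled by hand

# When a real escape (here a tab) is needed in the same string as regex
# classes, only the regex backslashes are doubled.
mixed_pattern = "\\d+\t\\w+"

same = raw_pattern == escaped_pattern
match = re.fullmatch(raw_pattern, "user@42")
fields = re.fullmatch(mixed_pattern, "7\tseven")
print(same, match is not None, fields is not None)
```

Both spellings produce the same pattern text, so the choice between them is about readability rather than behaviour.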
|
msg269155 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-06-24 04:22 |
There was a long discussion on Python-Dev. [1] Guido took part in it.
[1] http://comments.gmane.org/gmane.comp.python.devel/151612
|
msg269156 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-24 04:43 |
Thanks, didn't find that one. Apparently Guido's stance is "Make this a silent warning, then we can discuss about preventing it later", which happens to be what I'm doing here.
|
msg269158 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-24 05:46 |
I found the cause of the failed assertion, an invalid escape sequence slipped through in a file. Patch attached (also with Serhiy's comments).
It worries me a little though that pure Python code can cause a hard crash. Ok, it worries me a lot. Please don't merge this until it's fixed. I'm guessing this is a combination of unittest catching warnings and compiling the faulty source file. Why a malformed node (i.e. one that raised a DeprecationWarning) managed to pass through unharmed is beyond me.
|
msg269322 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2016-06-26 22:19 |
I am okay with making it a silent warning.
Can we do it in two stages though? It doesn't have to be two releases, I just mean two separate commits: (1) fix all places in the stdlib that violate this principle; (2) separately commit the code that causes the silent deprecation (and tests for it).
What exactly was the hard crash you got? Do you think it was a bug in your own C code or in existing C code?
|
msg269323 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-26 23:04 |
I originally considered making two different patches, so there you go. deprecate_invalid_escapes_only_1.patch has the deprecation plus a test, and invalid_stdlib_escapes_1.patch fixes all invalid escapes in the stdlib.
My code was the cause, although not directly; it was 'assert(!PyErr_Occurred())' at the beginning of PyObject_Call in Objects/abstract.c which failed.
This happened when I ran the whole test suite (although just running test_ast was fine to reproduce it) with the '-W error' command line switch. One stdlib module (I don't remember which one) had one single invalid escape sequence in it, and then test_ast.ASTValidatorTests.test_stdlib_validates triggered the failed assertion. Fixing the invalid escape removes the failure and all tests pass.
One can reliably reproduce the crash with the patch by adding a string with an invalid escape in any of the stdlib files (and running with '-W error'):
No invalid sequence:
>>> import unittest, test.test_ast
>>> unittest.main(test.test_ast)
..............................................................................
----------------------------------------------------------------------
Ran 78 tests in 5.538s
OK
With an invalid sequence in a file:
>>> import unittest, test.test_ast
>>> unittest.main(test.test_ast)
............................................Fatal Python error: a function returned a result with an error set
DeprecationWarning: invalid escape sequence 'w'
During handling of the above exception, another exception occurred:
SystemError: <built-in function compile> returned a result with an error set
Current thread 0x00001ba0 (most recent call first):
File "E:\GitHub\cpython\lib\ast.py", line 35 in parse
File "E:\GitHub\cpython\lib\test\test_ast.py", line 944 in test_stdlib_validates
File "E:\GitHub\cpython\lib\unittest\case.py", line 600 in run
File "E:\GitHub\cpython\lib\unittest\case.py", line 648 in __call__
File "E:\GitHub\cpython\lib\unittest\suite.py", line 122 in run
File "E:\GitHub\cpython\lib\unittest\suite.py", line 84 in __call__
File "E:\GitHub\cpython\lib\unittest\suite.py", line 122 in run
File "E:\GitHub\cpython\lib\unittest\suite.py", line 84 in __call__
File "E:\GitHub\cpython\lib\unittest\runner.py", line 176 in run
File "E:\GitHub\cpython\lib\unittest\main.py", line 255 in runTests
File "E:\GitHub\cpython\lib\unittest\main.py", line 94 in __init__
File "<stdin>", line 1 in <module>
Then I get the usual "Python has stopped working" Windows prompt (strangely enough, before I'd get a prompt saying "Assertion failed" with the line, but not this time).
I'm not sure where the error lies exactly. Should I open another issue for that?
|
msg269326 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2016-06-27 00:05 |
Hm, if you manage to trigger an assert() in the C code by writing some evil
Python code, the C code is considered broken (unless it was using ctypes or
one or two other explicit "void-the-warranty" exceptions).
Maybe someone who has worked more with the C code recently could help you
dig into this more; my memory is unreliable when it comes to these details.
Maybe assert() calls are disabled by default? In general the error "...
returned a result with an error set" means there's a problem at the C level
where a function should have either returned an object or returned NULL
with the per-thread exception state set, but it was found to return an
object *and* set the exception state. IIRC only debug mode checks for that,
so such a bug occasionally creeps into the code. But you shouldn't assume
everything is fine until you've tracked down the cause.
|
msg269329 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-27 00:26 |
Ah right, assert() is only enabled in debug mode, I forgot that. My (very uneducated) guess is that compile() got the error (which was a warning) but then decided to return a value anyway, and the next thing that tries to call anything crashes Python. I opened #27394 to get some experts' advice.
|
msg269332 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-27 01:40 |
Aaand I feel pretty stupid; I didn't check the return value of PyErr_WarnFormat, so it was my mistake. Attached new patch, actually done right this time.
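
The failure above only showed up once warnings were escalated to errors. A rough Python-level sketch of that escalation follows; the helper name `rejected_under_werror` is hypothetical, and because the exception type that surfaces may differ between interpreter versions, the sketch catches both the warning classes and `SyntaxError`.

```python
import warnings

def rejected_under_werror(source):
    """Return True if compiling `source` fails once warnings are errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # roughly what '-W error' does
        try:
            compile(source, "<example>", "exec")
        except (Warning, SyntaxError):
            return True
    return False

print(rejected_under_werror(r"p = '\w'"))   # invalid escape in a normal string
print(rejected_under_werror(r"p = r'\w'"))  # raw string, nothing to warn about
```

With a correct implementation, the escalated warning propagates as an ordinary exception from `compile()` rather than leaving an error set behind a successful return.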
|
msg269333 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-06-27 01:42 |
Hello Emanuel, I think I have fixed your problem with -Werror, by handling the exception returned by PyErr_WarnFormat() (see my patch). Thanks for separating the actual change from the escape violation fixes; it made it easier to spot the real problem :)
Also, I like the general idea of the change. It would be good to update the documentation as well (e.g. What’s New, and <https://docs.python.org/3.6/reference/lexical_analysis.html#string-and-bytes-literals>).
It would be good to do the same for byte string literals, at least to keep things consistent. What did you try so far? Do you have a partial patch for it?
|
msg269334 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-06-27 01:43 |
Hah, we posted the same fix almost at the same time :)
|
msg269335 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-27 01:53 |
Indeed, we did, thanks for letting me know my mistake :) I didn't get very far into making bytes literals disallow invalid sequences, as I ran into issues with _codecs.escape_decode throwing the warning even when the literal was fine, and I think I stopped there and figured I'd at least post that patch and see if people are interested in extending that modification to bytes (turns out so).
I forgot about docs, will do so soon, but I'll try to extend the patch for bytes first. I'll see if I can make literals warn but not e.g. _codecs.escape_decode (or anything else, really).
Thanks!
|
msg269340 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-06-27 02:59 |
Code samples in the documentation should also be fixed, like at <https://docs.python.org/3.6/library/re.html#re.split>. I think you can run “make -C Doc doctest” or something similar, which may help find some of these.
Also, playing with your current patch, it seems to affect the “unicode-escape” codec. Not sure if that is a problem, but it probably deserves also documenting the change.
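
A short sketch of the codec behaviour mentioned above, using the documented `codecs.decode` entry point. The inputs are illustrative; the sketch records any warnings rather than assuming which category, if any, a given interpreter emits.

```python
import codecs
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    valid = codecs.decode(rb"\x41\n", "unicode_escape")  # recognised escapes
    unknown = codecs.decode(rb"[\8]", "unicode_escape")  # "\8" is not an escape

print(repr(valid), repr(unknown))
print([w.category.__name__ for w in caught])
```

The unrecognised sequence is passed through with its backslash intact, which is the same result a string literal would give.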
|
msg269358 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-06-27 08:20 |
Guido: "I am okay with making it a silent warning."
The current patch raises a DeprecationWarning which is silent by default, but seen using python3 -Wd. What is the "long term" plan: always raise an *exception* in Python 3.7? Which exception?
Another option is to always emit a SyntaxWarning, but not raise an exception in the long term. It is possible to get an exception using python3 -Werror.
There is also FutureWarning: "Base class for warnings about constructs that will change semantically in the future" or RuntimeWarning "Base class for warnings about dubious runtime behavior".
|
msg269368 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-06-27 11:00 |
DeprecationWarning is used when we want to remove a feature. It becomes an error in the future. FutureWarning is used when we want to change the meaning of a feature instead of removing it. For example re.split(':*', 'a:bc') emits a FutureWarning and returns ['a', 'bc'] because there is a plan to make it return ['', 'a', 'b', 'c', ''].
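
As a hedged illustration of how the two categories can be treated differently, the sketch below silences `DeprecationWarning` with an explicit filter while leaving `FutureWarning` visible. The functions `legacy_api` and `changing_api` are hypothetical and exist only for this example.

```python
import warnings

def legacy_api():
    # Hypothetical feature that is scheduled for removal.
    warnings.warn("legacy_api() will be removed", DeprecationWarning, stacklevel=2)
    return 1

def changing_api():
    # Hypothetical feature whose result is going to change meaning.
    warnings.warn("changing_api() will return a list", FutureWarning, stacklevel=2)
    return 2

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.simplefilter("ignore", DeprecationWarning)  # the "silent" category
    legacy_api()
    changing_api()

categories = [w.category for w in caught]
print([c.__name__ for c in categories])
```

The filters are set explicitly here so the outcome does not depend on the interpreter's default filter list.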
I think "a silent warning" means that it should emit a DeprecationWarning or a PendingDeprecationWarning. Since there is no haste, we should use a two-release deprecation period. After this a deprecation can be changed to a SyntaxWarning in 3.8 and to a UnicodeDecodeError (for strings) and a ValueError (for bytes) in 4.0. The latter are converted to SyntaxError by the parser. At the end we should get the same behavior as for truncated \x and \u escapes.
>>> '\u'
File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape
>>> b'\x'
File "<stdin>", line 1
SyntaxError: (value error) invalid \x escape at position 0
Maybe change the parser to convert warnings to a SyntaxWarning?
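
The existing behaviour for truncated \x and \u escapes shown in the session above can also be checked programmatically. This is a small sketch; the helper name `compiles` is hypothetical.

```python
def compiles(source):
    """Return True if `source` is accepted as an expression by compile()."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

print(compiles(r"'\u'"))      # truncated \uXXXX escape in a str literal
print(compiles(r"b'\x'"))     # truncated \x escape in a bytes literal
print(compiles(r"'\u0041'"))  # complete escape
```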
|
msg269372 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-27 11:30 |
I think ultimately a SyntaxError should be fine. I don't know *when* it becomes appropriate to change a warning into an error; I was thinking 3.7 but, as Serhiy said, there's no rush. I think waiting five release cycles is overkill though, that means the error won't be until 8 years from now (assuming release cycle periods don't change)! I think at most 3.8 should be fine for making this a full-on syntax error.
|
msg269373 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-06-27 12:45 |
@ebarry: To move faster, you should also work with linters (pylint, pychecker, pyflakes, pycodestyle, flake8, ...) to log a warning that helps projects prepare for this change. Linters are used on Python 2-only projects, so it will help them to be prepared for the final Python 3.<n> which will raise an exception.
|
msg269376 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2016-06-27 13:28 |
Yes, this change is likely to break a lot of code, so an extended deprecation period (certainly longer than 3.7, which Guido has already mandated) is the minimum. Guido hasn't agreed to making it an error yet, as far as I can see ;)
|
msg269382 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2016-06-27 15:00 |
I think ultimately it has to become an error (otherwise I wouldn't
have agreed to the warning, silent or not). But because there's so
much 3rd party code that depends on it we indeed need to take
"several" releases before we go there.
Contacting the PyCQA folks would also be a great idea -- can anyone
volunteer to do so?
|
msg269388 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-27 17:02 |
Easing transition is always a good idea. I'll contact the PyCQA people later today when I'm back home.
On second thought, it makes sense to wait more than two release cycles before making this an error. I don't really have a strong opinion on when exactly that should happen.
|
msg269413 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-06-28 01:15 |
Just brought this to the attention of the code-quality mailing list, so linter maintainers should (hopefully!) catch up soon.
Also new patch, I forgot to add '\c' in the tests.
|
msg269416 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-06-28 02:39 |
Forgot to say I reviewed invalid_stdlib_escapes_1.patch the other day and can’t see any problems.
|
msg270765 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-07-18 15:50 |
Here's a new patch which also deprecates invalid escape sequences in bytes. Tests included with test_codecs.
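
A brief sketch of what the bytes side of the deprecation looks like from Python, assuming an interpreter that includes it. The helper name `literal_warns` is hypothetical; it only evaluates fixed literal source text.

```python
import warnings

def literal_warns(source):
    """Evaluate a literal from source text; report its value and any warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        value = eval(compile(source, "<example>", "eval"))
    return value, bool(caught)

value_bad, warned_bad = literal_warns(r"b'\d'")   # unrecognised escape
value_ok, warned_ok = literal_warns(r"b'\x41'")   # recognised escape
print(value_bad, warned_bad)
print(value_ok, warned_ok)
```

The unrecognised escape still produces the backslash and the following character; only the warning is new.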
Patch includes and supersedes deprecate_invalid_escapes_only_3.patch, and I have not found a single instance of an invalid escape sequence other than in test_codecs, so this should be fine now.
|
msg272439 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-08-11 11:56 |
I am trying out your patch at the moment. There are plenty of test suite failures; I ran the test suite with approximately the following:
./python -bWerror -m test -Wr -j0 -u network -x test_{mailbox,shelve,faulthandler,multiprocessing_main_handling,venv,warnings}
Importing modules sometimes fails or generates the warning, but this goes away if the file is not out of date. E.g. run “touch Lib/test/test_codecs.py”, and then make sure you next import that module with -Wall or -Werror enabled.
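
The dependence on whether the file needs recompiling follows from the warning being issued at compile time only. A minimal sketch of that, using an in-memory code object in place of a cached bytecode file (an assumption made to keep the example self-contained):

```python
import warnings

source = r"pattern = '\d+'"

with warnings.catch_warnings(record=True) as at_compile:
    warnings.simplefilter("always")
    code = compile(source, "<example>", "exec")  # the warning is issued here

with warnings.catch_warnings(record=True) as at_exec:
    warnings.simplefilter("always")
    namespace = {}
    exec(code, namespace)  # already compiled: nothing is re-parsed

print(len(at_compile) > 0, len(at_exec), namespace["pattern"])
```

Running an already compiled code object does not repeat the check, which matches the observation that an up-to-date cached file hides the warning.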
374 tests OK.
10 tests failed:
test___all__ test_ast test_codecs test_doctest test_fstring
test_idle test_strlit test_trace test_unicode
test_zipimport_support
I started pasting some of the failures here, but gave up as more and more failed. Let me know if you want the full details.
======================================================================
ERROR: test_coverage (test.test_trace.TestCoverage)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/test_trace.py", line 312, in test_coverage
self._coverage(tracer)
File "/media/disk/home/proj/python/cpython/Lib/test/test_trace.py", line 307, in _coverage
r.write_results(show_missing=True, summary=True, coverdir=TESTFN)
File "/media/disk/home/proj/python/cpython/Lib/trace.py", line 284, in write_results
lnotab = _find_executable_linenos(filename)
File "/media/disk/home/proj/python/cpython/Lib/trace.py", line 403, in _find_executable_linenos
code = compile(prog, filename, "exec")
DeprecationWarning: invalid escape sequence 'w'
**********************************************************************
File "/media/disk/home/proj/python/cpython/Lib/test/test_doctest.py", line 288, in test.test_doctest.test_DocTest
Failed example:
docstring = '''
>>> print(12)
12
Non-example text.
>>> print('another\example')
another
example
'''
Exception raised:
Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/doctest.py", line 1330, in __run
compileflags, 1), test.globs)
DeprecationWarning: invalid escape sequence 'e'
**********************************************************************
[Many subsequent NameError exceptions from test_doctest]
**********************************************************************
File "/tmp/tmphzbypj98/test_zip.zip/test_zipped_doctest.py", line 288, in test_zipped_doctest.test_DocTest
Failed example:
docstring = '''
>>> print(12)
12
Non-example text.
>>> print('another\example')
another
example
'''
Exception raised:
Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/doctest.py", line 1330, in __run
compileflags, 1), test.globs)
DeprecationWarning: invalid escape sequence 'e'
**********************************************************************
[More failures]
======================================================================
FAIL: test_all (test.test___all__.AllTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/test___all__.py", line 105, in test_all
self.check_all(modname)
File "/media/disk/home/proj/python/cpython/Lib/test/test___all__.py", line 28, in check_all
raise FailedImport(modname)
File "/media/disk/home/proj/python/cpython/Lib/contextlib.py", line 89, in __exit__
next(self.gen)
File "/media/disk/home/proj/python/cpython/Lib/test/support/__init__.py", line 1130, in _filterwarnings
raise AssertionError("unhandled warning %s" % reraise[0])
AssertionError: unhandled warning {message : DeprecationWarning("invalid escape sequence '('",), category : 'DeprecationWarning', filename : '/media/disk/home/proj/python/cpython/Lib/importlib/_bootstrap.py', lineno : 222, line : None}
======================================================================
ERROR: test_escape_order (test.test_fstring.TestCase) (str='f\'{"a"\\!r}\'')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/test_fstring.py", line 20, in assertAllRaise
eval(str)
DeprecationWarning: invalid escape sequence '!'
======================================================================
ERROR: test_escape (test.test_codecs.EscapeDecodeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/test_codecs.py", line 1218, in test_escape
decode(b"\\" + b)
OverflowError: character argument not in range(0x110000)
======================================================================
ERROR: test_escape_decode (test.test_codecs.UnicodeEscapeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/test_codecs.py", line 2467, in test_escape_decode
check(br"[\8]", r"[\8]")
File "/media/disk/home/proj/python/cpython/Lib/test/test_codecs.py", line 26, in check
self.assertEqual(coder(input), (expect, len(input)))
DeprecationWarning: invalid escape sequence '8'
test test_unicode crashed -- Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/libregrtest/runtest.py", line 167, in runtest_inner
the_module = importlib.import_module(abstest)
File "/media/disk/home/proj/python/cpython/Lib/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 996, in _gcd_import
File "<frozen importlib._bootstrap>", line 979, in _find_and_load
File "<frozen importlib._bootstrap>", line 968, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 663, in exec_module
File "<frozen importlib._bootstrap_external>", line 770, in get_code
File "<frozen importlib._bootstrap_external>", line 730, in source_to_code
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
DeprecationWarning: invalid escape sequence '?'
|
msg272441 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-08-11 12:15 |
Hmm, that's odd, I recall some of the failures from testing, and thought I fixed them. Some of these are brand new, though, so thanks! I'll run and fix the tests (and modules as well); should likely have a patch by the weekend :)
|
msg272696 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-08-14 21:16 |
Here's a new pair of patches for this. There are some small tweaks to the tests, and I properly fixed all instances of invalid escapes (I also made some strings into raw-strings at some places where it's not needed, solely for consistency with surrounding lines or functions). The patch that fixes the invalid escapes is four times larger than the previous one.
I would also advise adding a bit to PEP 8 recommending that strings used in regular expressions always be raw strings, even if there's no need to, as a lot (at least 70%) of the invalid escapes fixed were used in regexes.
|
msg274119 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-01 12:41 |
Ping. I'd like to get this merged in time for 3.6. Is there anything I can do to speed up the review?
Since the change itself is very straightforward, I think it would make sense to merge it now and then fix the invalid escapes that are found during the beta phase.
|
msg274120 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-09-01 13:01 |
I think "invalid escape sequence '\?'" would look cleaner than "invalid escape sequence '?'".
|
msg274126 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-01 13:19 |
Thanks Serhiy; it does look better to me too!
|
msg274332 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-09-04 03:29 |
Left some comments for invalid_stdlib_escapes_2.patch
|
msg274475 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-06 00:13 |
Updated and rebased patch. There are a few file tweaks here and there to stay up to date, otherwise it's mostly the same.
Martin, it may look like I've ignored your comments, but I'm trying to keep the patches as simple as possible, and so I don't want to go further than to make strings into raw strings (also the alignment issue you pointed out). I'd rather have the other issues addressed in another issue, as I want to get this merged in time for the feature freeze. The other issues (some which were already present) can be taken care of during the beta phase.
|
msg274806 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-07 12:45 |
Rebased patch after Victor's commit in #16334. Also regenerated invalid_stdlib_escapes_3 in the hopes that Rietveld picks it up.
|
msg274837 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2016-09-07 17:01 |
+1 on getting this in. Who can help reviewing and merging before beta 1?
|
msg274999 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-08 11:26 |
Thank you R. David for the review, here's a new patch with the one change.
|
msg275009 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-09-08 12:50 |
I suggest not changing fixcid.py. It is not correct and there is a separate issue for it (issue27952).
|
msg275010 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-08 13:00 |
All right, since you'll work on it I'm leaving it out. Removed it and test_bytes (which you already fixed, thanks!) from new patch.
|
msg275084 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-09-08 18:00 |
New changeset b4cc62473c13 by R David Murray in branch 'default':
#27364: fix "incorrect" uses of escape character in the stdlib.
https://hg.python.org/cpython/rev/b4cc62473c13
|
msg275111 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2016-09-08 18:46 |
Here's a copy of Emanuel's deprecation patch with a versionchanged note in the lexical docs and a whatsnew entry.
|
msg275123 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-09-08 19:34 |
New changeset 38802c38cfe1 by R David Murray in branch 'default':
#27364: Deprecate invalid escape strings in str/byutes.
https://hg.python.org/cpython/rev/38802c38cfe1
|
msg275124 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-08 19:35 |
Thank you David for taking the time to review and commit this :)
|
msg275125 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2016-09-08 19:36 |
Thanks Emanuel. No bets on how much hate mail we get for this :)
|
msg275219 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2016-09-08 23:51 |
Thank you all for persisting on this. I have seen numerous beginners be puzzled why normal (cooked) strings using '\' for Windows paths sometimes work and sometimes 'mysteriously' do not, as in the initially referenced issue. I also think it better to consistently use 'r' for REs with '\' intended to be passed through to re. (And I pushed some of the IDLE code that was patched.)
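
The Windows path confusion described above comes from some backslash sequences being valid escapes. A small sketch with a made-up path:

```python
cooked = "C:\temp\new"     # "\t" and "\n" are valid escapes: a tab and a newline
raw = r"C:\temp\new"       # raw string keeps both backslashes
escaped = "C:\\temp\\new"  # doubling the backslashes gives the same text

print(repr(cooked))
print(raw == escaped, cooked == raw)
```

Because "\t" and "\n" are recognised, the first spelling silently changes the path and produces no warning, whereas a path such as "C:\data" happens to survive unchanged.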
|
msg275237 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-09-09 02:37 |
New changeset 60085c8f01fe by R David Murray in branch 'default':
#27364: Credit Emanuel Barry in NEWS item.
https://hg.python.org/cpython/rev/60085c8f01fe
|
msg275298 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-09-09 09:55 |
New changeset 98a57845c8cc by Martin Panter in branch 'default':
Issue #27364: Raw strings to avoid deprecated escaping in com2ann.py
https://hg.python.org/cpython/rev/98a57845c8cc
|
msg275757 - (view) |
Author: (yan12125) * |
Date: 2016-09-11 09:35 |
Currently the deprecation message is not so useful when fixing lots of files in a large project. For example, I have two files foo.py and bar.py:
# foo.py
import bar
# bar.py
print('\d')
It gives:
$ python3.6 -W error foo.py
Traceback (most recent call last):
File "foo.py", line 1, in <module>
import bar
DeprecationWarning: invalid escape sequence '\d'
Things are worse when __import__, imp or importlib are involved. I have to add some code to show which module is imported.
It would be better to have at least filenames and line numbers:
$ ./python -W error foo.py
Traceback (most recent call last):
File "foo.py", line 1, in <module>
import bar
File "/home/yen/Projects/cpython/build/bar.py", line 1
print('\d')
^
SyntaxError: (deprecated usage) invalid escape sequence '\d'
I made a naive attempt that prints more information. Raising SyntaxError may not be a good idea, though.
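
One way to locate offending literals without relying on the traceback is to compile each file's source with warnings recorded and read the location from the warning records. This is only a sketch: the helper `find_invalid_escapes` is hypothetical, and it assumes an interpreter that attaches the filename and line of the literal to the warning.

```python
import warnings

def find_invalid_escapes(source, filename):
    """Return (filename, lineno, message) for each warning raised by compile()."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, filename, "exec")
    return [(w.filename, w.lineno, str(w.message)) for w in caught]

# Stand-in for the contents of bar.py; the escape is on the second line.
bar_source = "import os\nprint('\\d')\n"
found = find_invalid_escapes(bar_source, "bar.py")
for name, lineno, message in found:
    print(f"{name}:{lineno}: {message}")
```

In a real project the source text would be read from each file on disk and the file's path passed as `filename`.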
|
msg276016 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2016-09-12 10:28 |
Fair enough, but please open a new issue for that.
@Terry - you're welcome; that's exactly the reason I pushed for it :)
|
msg276287 - (view) |
Author: (yan12125) * |
Date: 2016-09-13 15:14 |
Opened a new issue at Issue28128.
|
msg298112 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2017-07-11 03:39 |
One consequence of this change is that now any string that has a backslash needs to be escaped or raw, leading to changes like this on (https://github.com/cherrypy/cherrypy/pull/1610/commits/1d8c03ea8c5fe90f29bbea267300b97c78391c24#diff-be33a4f55d59dfc70fc6452482f3a7a4) where the diagram in the docstring is the culprit. An escaped backslash is not viable in this case, so a raw string is required.
This particular example strikes me as counter-intuitive, though maybe I just need to adjust my intuition.
Was the intention for a docstring like above to use raw strings?
|
msg298114 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2017-07-11 04:14 |
Yes.
|
msg298115 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-07-11 04:15 |
Yes, this was the intention. One common error is using "\n" in non-raw docstrings. This change doesn't prevent this error, but increases the chances of catching it when there are other backslashes in the docstring.
|
msg298170 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2017-07-11 17:53 |
Also note that we have fixed a number of bugs in the stdlib code where a raw string was not used for a docstring when it should have been. And when I say bugs, I mean both formatting problems in pydoc, and doctest bugs. There may even have been a case where it produced a code bug, but I'm not sure I'm recalling that correctly :)
So yes, requiring that a docstring containing backslashes be marked as a raw string is very intentional.
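
A minimal sketch of why a raw docstring matters for doctests. The function `join_lines` is hypothetical; the raw prefix keeps the backslash in the expected output so that it matches the repr printed by the interpreter.

```python
import doctest

def join_lines(a, b):
    r"""Join two strings with a newline.

    >>> print(join_lines("x", "y"))
    x
    y
    >>> join_lines("x", "y")
    'x\ny'
    """
    return a + "\n" + b

# Parse and run the docstring examples directly, without a module finder.
test = doctest.DocTestParser().get_doctest(
    join_lines.__doc__, {"join_lines": join_lines}, "join_lines", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

Without the `r` prefix, the `\n` in the expected output would become a real newline inside the docstring and the second example would no longer match.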
|
msg312576 - (view) |
Author: Anilyka Barry (abarry) * |
Date: 2018-02-22 18:36 |
I have created Issue32912 as a follow-up to this issue for 3.8.
|
|
Date | User | Action | Args |
2022-04-11 14:58:32 | admin | set | github: 71551 |
2018-02-22 18:36:32 | abarry | set | messages:
+ msg312576 |
2017-07-11 17:53:19 | r.david.murray | set | messages:
+ msg298170 |
2017-07-11 04:15:13 | serhiy.storchaka | set | messages:
+ msg298115 |
2017-07-11 04:14:52 | gvanrossum | set | messages:
+ msg298114 |
2017-07-11 03:39:49 | jaraco | set | nosy:
+ jaraco messages:
+ msg298112
|
2016-09-13 15:14:02 | yan12125 | set | messages:
+ msg276287 |
2016-09-12 10:28:13 | abarry | set | messages:
+ msg276016 |
2016-09-11 09:35:50 | yan12125 | set | files:
+ verbose-deprecation.diff nosy:
+ yan12125 messages:
+ msg275757
|
2016-09-09 09:55:37 | python-dev | set | messages:
+ msg275298 |
2016-09-09 02:37:46 | python-dev | set | messages:
+ msg275237 |
2016-09-08 23:51:40 | terry.reedy | set | nosy:
+ terry.reedy messages:
+ msg275219
|
2016-09-08 19:36:22 | r.david.murray | set | messages:
+ msg275125 |
2016-09-08 19:35:58 | abarry | set | status: open -> closed resolution: fixed messages:
+ msg275124
stage: patch review -> resolved |
2016-09-08 19:34:36 | python-dev | set | messages:
+ msg275123 |
2016-09-08 18:46:32 | r.david.murray | set | files:
+ deprecate_invalid_escapes_both_5.patch
messages:
+ msg275111 |
2016-09-08 18:00:28 | python-dev | set | nosy:
+ python-dev messages:
+ msg275084
|
2016-09-08 13:00:52 | abarry | set | files:
+ invalid_stdlib_escapes_5.patch
messages:
+ msg275010 |
2016-09-08 12:50:10 | serhiy.storchaka | set | messages:
+ msg275009 |
2016-09-08 11:26:36 | abarry | set | files:
+ invalid_stdlib_escapes_4.patch
messages:
+ msg274999 |
2016-09-08 01:58:16 | abarry | set | files:
+ invalid_stdlib_escapes_3_rebased_2.patch |
2016-09-07 17:01:25 | gvanrossum | set | messages:
+ msg274837 |
2016-09-07 12:49:20 | abarry | set | files:
+ invalid_stdlib_escapes_3_regenerated.patch |
2016-09-07 12:49:02 | abarry | set | files:
- invalid_stdlib_escapes_3_regen.patch |
2016-09-07 12:45:18 | abarry | set | files:
+ invalid_stdlib_escapes_3_regen.patch |
2016-09-07 12:45:05 | abarry | set | files:
+ deprecate_invalid_escapes_both_4.patch
messages:
+ msg274806 |
2016-09-06 00:13:39 | abarry | set | files:
+ invalid_stdlib_escapes_3.patch
messages:
+ msg274475 title: Deprecate invalid unicode escape sequences -> Deprecate invalid escape sequences in str/bytes |
2016-09-04 03:29:06 | martin.panter | set | messages:
+ msg274332 |
2016-09-01 13:19:49 | abarry | set | files:
+ deprecate_invalid_escapes_both_3.patch
messages:
+ msg274126 |
2016-09-01 13:01:26 | serhiy.storchaka | set | messages:
+ msg274120 |
2016-09-01 12:41:46 | abarry | set | messages:
+ msg274119 |
2016-08-23 07:20:44 | jayvdb | set | nosy:
+ jayvdb
|
2016-08-14 21:17:11 | abarry | set | files:
+ deprecate_invalid_escapes_both_2.patch |
2016-08-14 21:16:57 | abarry | set | files:
+ invalid_stdlib_escapes_2.patch
messages:
+ msg272696 |
2016-08-11 12:15:41 | abarry | set | messages:
+ msg272441 |
2016-08-11 11:56:28 | martin.panter | set | messages:
+ msg272439 |
2016-07-18 15:50:03 | abarry | set | files:
+ deprecate_invalid_escapes_both_1.patch
messages:
+ msg270765 |
2016-06-28 02:39:31 | martin.panter | set | messages:
+ msg269416 |
2016-06-28 01:15:04 | abarry | set | files:
+ deprecate_invalid_escapes_only_3.patch
messages:
+ msg269413 |
2016-06-27 17:02:24 | abarry | set | messages:
+ msg269388 |
2016-06-27 15:00:28 | gvanrossum | set | messages:
+ msg269382 |
2016-06-27 13:28:36 | r.david.murray | set | messages:
+ msg269376 |
2016-06-27 12:45:51 | vstinner | set | messages:
+ msg269373 |
2016-06-27 11:30:10 | abarry | set | messages:
+ msg269372 |
2016-06-27 11:00:51 | serhiy.storchaka | set | messages:
+ msg269368 |
2016-06-27 08:20:44 | vstinner | set | messages:
+ msg269358 |
2016-06-27 02:59:05 | martin.panter | set | messages:
+ msg269340 |
2016-06-27 01:53:54 | abarry | set | messages:
+ msg269335 |
2016-06-27 01:43:55 | martin.panter | set | messages:
+ msg269334 |
2016-06-27 01:42:52 | martin.panter | set | files:
+ deprecate_invalid_escapes_only_2.patch nosy:
+ martin.panter messages:
+ msg269333
|
2016-06-27 01:40:30 | abarry | set | files:
+ deprecate_invalid_escapes_only_2.patch
messages:
+ msg269332 |
2016-06-27 00:26:59 | abarry | set | messages:
+ msg269329 |
2016-06-27 00:05:31 | gvanrossum | set | messages:
+ msg269326 |
2016-06-26 23:04:39 | abarry | set | files:
+ invalid_stdlib_escapes_1.patch |
2016-06-26 23:04:24 | abarry | set | files:
+ deprecate_invalid_escapes_only_1.patch
messages:
+ msg269323 |
2016-06-26 22:19:53 | gvanrossum | set | messages:
+ msg269322 |
2016-06-25 05:34:47 | serhiy.storchaka | set | nosy:
+ gvanrossum
|
2016-06-24 05:47:03 | abarry | set | files:
+ deprecate_invalid_unicode_escapes_2.patch
messages:
+ msg269158 |
2016-06-24 04:43:28 | abarry | set | messages:
+ msg269156 |
2016-06-24 04:22:39 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg269155
|
2016-06-24 02:59:45 | abarry | set | messages:
+ msg269152 |
2016-06-23 15:59:29 | ztane | set | nosy:
+ ztane messages:
+ msg269122
|
2016-06-23 15:26:16 | abarry | set | messages:
+ msg269119 |
2016-06-23 14:41:14 | r.david.murray | set | nosy:
+ r.david.murray messages:
+ msg269114
|
2016-06-21 20:34:19 | abarry | create | |