This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: python core in string substring search
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.6, Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ammar2, pashkasan, rhettinger, ronaldoussoren
Priority: normal
Keywords:

Created on 2018-10-01 05:55 by pashkasan, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (8)
msg326764 - (view) Author: (pashkasan) Date: 2018-10-01 05:55
Searching for a substring in a string:
is this the correct behavior?
Sample code:

str = """
Content-Length: 3192
Connection: close
Cookie: _secure_admin_session_id=2a5dc26329de17ca4eafxxxxxxxxxxxxe;

-----------------------------1477319126846
Content-Disposition: form-data; name="utf8"
"""
str2 = """
xxxx

zzzzz


tttttt
"""

if "\r\n" in str:
        print ("str found")
else:
        print ("str not found")


if "\r\n" in str2:
        print ("str2 found")
else:
        print ("str2 not found")


if str.find("\n\r"):
        print ("str found")
else:
        print ("str not found")

output

[root@scw-6ec0de ~]# python a.py
str not found
str2 not found
str found
[root@scw-6ec0de ~]# python3 a.py
str not found
str2 not found
str found
[root@scw-6ec0de ~]# python --version
Python 2.7.15
[root@scw-6ec0de ~]# python3 --version
Python 3.6.4
[root@scw-6ec0de ~]#
msg326768 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2018-10-01 08:36
What do you think the problem is? The output of the script is what I'd expect it to be.

Note that str.find() returns -1 when the needle is not present (and the first offset where the needle is found when it is present).
msg326773 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-10-01 09:21
The test code for the third case looks incorrect in two places:

Given:       if str.find("\n\r"):                  
                                ^-- should compare to -1     
                          ^^^^----- these are reversed

Corrected:   if str.find("\r\n") != -1:

Also note that *str* is a confusing variable name because it shadows the builtin *str*.
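
A minimal sketch of the two points above, using a hypothetical sample string named `text` (not the reporter's data) so that the builtin `str` is not shadowed. It shows why a bare `if text.find(...)` gives misleading results: -1 is truthy and 0 is falsy.

```python
# Hypothetical sample text; only its "\n" line endings matter here.
text = "Connection: close\nCookie: abc\n"

missing = text.find("\r\n")   # needle absent -> -1
at_start = text.find("Conn")  # needle at offset 0 -> 0

# -1 is truthy and 0 is falsy, so a bare `if text.find(...)` misleads.
print(missing, bool(missing))    # -1 True
print(at_start, bool(at_start))  # 0 False

# Either of these expresses "is the needle present?" correctly.
print(text.find("\r\n") != -1)   # False
print("\r\n" in text)            # False
```

When only presence matters and the offset is not needed, the `in` operator is usually the clearer choice.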
msg326774 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2018-10-01 09:42
I suspect that there may also be confusion about the line ending in multiline strings. Is there any documentation on the fact that this is always "\n" and not "\r\n" (even on Windows)?  

The closest I've found is the Lexical Analysis part of the language reference (<https://docs.python.org/3.7/reference/lexical_analysis.htm>) which states that lines in source code are separated by NEWLINE (that is "\n").
msg326776 - (view) Author: (pashkasan) Date: 2018-10-01 09:57
>I suspect that there may also be confusion about the line ending in multiline strings. Is there any documentation on the fact that this is always "\n" and not "\r\n" (even on Windows)?  

The problem is that "\r\n" is not found in the multiline source strings str and str2, even though it is there.
I tested on Windows and Unix.
msg326777 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2018-10-01 10:01
@pashkasan: It is correct that '\r\n' is not found in those strings. Python source code is always parsed after conversion to "Unix" line endings, with "\n" between lines. This is basically the same as the "universal newlines" feature of the open function <https://docs.python.org/3/library/functions.html#open>.

The output you get is therefore expected behaviour. See also Raymond's note about how to use str.find() and the other typo in that line.
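
The newline conversion described above can be observed without a Windows editor. This is a rough sketch, assuming CPython 3: it compiles source text that deliberately contains "\r\n" line endings. The name `<crlf-demo>` and the variable names are made up for illustration.

```python
# Source text for a tiny module, written with explicit "\r\n" line endings,
# as an editor on Windows might save it. The variable name is hypothetical.
source = 'text = """line one\r\nline two"""\r\n'

namespace = {}
exec(compile(source, "<crlf-demo>", "exec"), namespace)
text = namespace["text"]

print(repr(text))       # 'line one\nline two'
print("\r\n" in text)   # False
print("\n" in text)     # True

# An explicit escape sequence inside the literal is a different matter:
# it is part of the string's value and is preserved.
escaped = "line one\r\nline two"
print("\r\n" in escaped)  # True
```

So a literal line break inside a triple-quoted string always ends up as "\n", while a written-out "\r\n" escape stays as it is.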
msg326791 - (view) Author: (pashkasan) Date: 2018-10-01 10:42
str2 = open('sample.txt', 'rU').read()


if "\n" in str2:
        print ("\\n found")
else:
        print ("\\n not found")

if "\r" in str2:
        print ("\\r found")
else:
        print ("\\r not found")


if "\r\n" in str2:
        print ("\\r\\n found")
else:
        print ("\\r\\n not found")

print str2

output
http://prntscr.com/l0sc11

Strange that the print() output has \r\n.
Do I have to open the file in binary mode or something so that \r is not stripped?
msg326800 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2018-10-01 13:47
Please read this excerpt from the documentation Ronald linked for open:

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
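
The reading rules quoted above can be tried out with a short sketch. It assumes Python 3, where the builtin open() accepts the newline parameter. The file name and contents are hypothetical.

```python
import os
import tempfile

# Hypothetical file content with Windows-style line endings.
raw = b"first\r\nsecond\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(raw)

    # Default text mode (newline=None): "\r\n" is translated to "\n".
    with open(path, "r", encoding="ascii") as f:
        translated = f.read()

    # newline="": universal newlines are recognised but not translated.
    with open(path, "r", encoding="ascii", newline="") as f:
        untranslated = f.read()

    # Binary mode: bytes come back exactly as stored.
    with open(path, "rb") as f:
        binary = f.read()

print(repr(translated))    # 'first\nsecond\n'
print(repr(untranslated))  # 'first\r\nsecond\r\n'
print(repr(binary))        # b'first\r\nsecond\r\n'
```

Either newline='' or binary mode keeps the "\r" characters. The former still returns text, the latter returns bytes.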
History
Date                 User            Action  Args
2022-04-11 14:59:06  admin           set     github: 79040
2018-10-01 13:47:48  ammar2          set     status: open -> closed; messages: + msg326800; nosy: + ammar2
2018-10-01 10:42:44  pashkasan       set     messages: + msg326791
2018-10-01 10:01:11  ronaldoussoren  set     type: behavior; resolution: not a bug; messages: + msg326777; stage: resolved
2018-10-01 09:57:00  pashkasan       set     messages: + msg326776
2018-10-01 09:42:16  ronaldoussoren  set     messages: + msg326774
2018-10-01 09:21:21  rhettinger      set     nosy: + rhettinger; messages: + msg326773
2018-10-01 08:36:25  ronaldoussoren  set     nosy: + ronaldoussoren; messages: + msg326768
2018-10-01 05:55:43  pashkasan       create