classification
Title: re.compile fails with some bytes patterns
Type: behavior
Stage:
Components: Regular Expressions
Versions: Python 3.0
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: gvanrossum
Nosy List: benjamin.peterson, gvanrossum, pitrou
Priority: release blocker
Keywords: patch

Created on 2008-06-28 23:11 by pitrou, last changed 2008-07-22 17:54 by pitrou. This issue is now closed.

Files
File name Uploaded Description
rebytes.patch pitrou, 2008-06-28 23:25
Messages (5)
msg68925 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-06-28 23:11
Some patterns can be compiled in str form but not in bytes form. This
was overlooked because the test suite wasn't correctly adapted for py3k:

>>> re.compile('[\\1]')
<_sre.SRE_Pattern object at 0xb7be1410>
>>> re.compile('\\09')
<_sre.SRE_Pattern object at 0xb7c4f2f0>
>>> re.compile('\\n')
<_sre.SRE_Pattern object at 0xb7be1f50>

but:

>>> re.compile(b'[\\1]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 188, in compile
    return _compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 240, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_compile.py", line 497, in compile
    p = sre_parse.parse(p, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 685, in parse
    p = _parse_sub(source, pattern, 0)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 320, in _parse_sub
    itemsappend(_parse(source, state))
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 409, in _parse
    this = sourceget()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 215, in get
    self.__next()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 204, in __next
    char = char + c
TypeError: Can't convert 'int' object to str implicitly
>>> re.compile(b'\\09')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 188, in compile
    return _compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 240, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_compile.py", line 497, in compile
    p = sre_parse.parse(p, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 678, in parse
    source = Tokenizer(str)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 187, in __init__
    self.__next()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 204, in __next
    char = char + c
TypeError: Can't convert 'int' object to str implicitly
>>> re.compile(b'\\n')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 188, in compile
    return _compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/re.py", line 240, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_compile.py", line 497, in compile
    p = sre_parse.parse(p, flags)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 678, in parse
    source = Tokenizer(str)
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 187, in __init__
    self.__next()
  File "/home/antoine/py3k/reunicode/Lib/sre_parse.py", line 204, in __next
    char = char + c
TypeError: Can't convert 'int' object to str implicitly
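For anyone wondering where the int comes from: the TypeError on `char = char + c` is consistent with the general py3k rule that indexing a bytes object gives an int, whereas slicing gives bytes. The snippet below is only a minimal illustration of that rule, not the tokenizer code itself; the names `pattern`, `concat_fails` and `escape` are made up for the example.

```python
# Illustration only: not the sre_parse code, just the bytes behavior behind the error.
pattern = b'\\n'              # two bytes: a backslash followed by 'n'

first = pattern[0]            # indexing bytes yields an int (92 is the backslash)
first_slice = pattern[0:1]    # slicing bytes yields a bytes object of length 1


def concat_fails(prefix, item):
    """Return True if `prefix + item` raises TypeError (hypothetical helper)."""
    try:
        prefix + item
    except TypeError:
        return True
    return False


# A str plus the int obtained by indexing cannot be concatenated.
str_plus_int_fails = concat_fails("\\", pattern[1])

# Converting each int explicitly with chr() gives a str that can be concatenated.
escape = chr(pattern[0]) + chr(pattern[1])
```

Any code path that builds a str and then appends an element taken from a bytes pattern by index would hit this, which fits all three tracebacks ending on the same line.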
msg68927 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-06-28 23:25
Here is a patch fixing both the bug and the test suite.
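As a rough idea of the kind of check involved (this is a sketch, not the contents of rebytes.patch), each pattern from the report can be compiled in both str and bytes form and matched against the input it should accept. The list `CASES` and the helper `check_both_forms` are hypothetical names, and the expected inputs assume the usual escape rules: `\1` inside a character class and `\0` are octal escapes, and `\n` is a newline.

```python
import re

# (bytes pattern, bytes input it is expected to match); taken from the report above.
CASES = [
    (b'[\\1]', b'\x01'),    # octal escape \1 inside a character class
    (b'\\09', b'\x009'),    # octal escape \0 followed by a literal '9'
    (b'\\n', b'\n'),        # newline escape
]


def check_both_forms(bytes_pattern, bytes_input):
    """Compile the pattern as str and as bytes; return whether both forms match."""
    # latin-1 maps each byte to the code point of the same value.
    str_pattern = bytes_pattern.decode('latin-1')
    str_input = bytes_input.decode('latin-1')
    str_ok = re.compile(str_pattern).match(str_input) is not None
    bytes_ok = re.compile(bytes_pattern).match(bytes_input) is not None
    return str_ok and bytes_ok


results = [check_both_forms(p, s) for p, s in CASES]
```

On an interpreter with the bug, the bytes compile step would raise the TypeError shown in the first message instead of returning a result.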
msg69699 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2008-07-15 18:10
Can we make sure this is fixed for beta 3? (Beta 2 would be great too,
but it's getting late.)
msg69700 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2008-07-15 18:28
Antoine's patch (with a 3-character fix) looks just fine to me. Guido,
I'm assigning this to you because svn annotate tells me you made this
change.
msg70160 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2008-07-22 17:54
I think the fix is trivial enough. Committed in r65185.
History
Date User Action Args
2008-07-22 17:54:31  pitrou  set  status: open -> closed; resolution: fixed; messages: + msg70160
2008-07-18 03:45:55  barry  set  priority: deferred blocker -> release blocker
2008-07-15 18:28:05  benjamin.peterson  set  assignee: gvanrossum; messages: + msg69700; nosy: + benjamin.peterson
2008-07-15 18:10:15  gvanrossum  set  priority: deferred blocker; nosy: + gvanrossum; messages: + msg69699
2008-06-28 23:25:22  pitrou  set  files: + rebytes.patch; keywords: + patch; messages: + msg68927
2008-06-28 23:11:55  pitrou  create