Title: Importing bs4 fails with -3 option in Python 2.7.15
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 2.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: fschulze, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2018-05-25 09:17 by fschulze, last changed 2018-05-31 04:42 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked
PR 7119 merged serhiy.storchaka, 2018-05-25 15:35
Messages (7)
msg317664 - Author: Florian Schulze (fschulze) Date: 2018-05-25 09:17
Since Python 2.7.15, importing bs4 (BeautifulSoup4) fails when the Python binary is run with the -3 option:

    'You are trying to run the Python 2 version of Beautiful Soup under Python 3. This will not work.'<>'You need to convert the code, either by installing it (`python install`) or by running 2to3 (`2to3 -w bs4`).'
SyntaxError: unknown parsing error

With 2.7.14 this works fine and I get the expected deprecation warning.

A workaround is to import the package without -3 first, so the .pyc file is generated and used in subsequent imports.
msg317675 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-25 14:03
The Python interpreter itself contains no references to Beautiful Soup. This error is generated by third-party code. Please report the bug on the Beautiful Soup bug tracker.
msg317678 - Author: Florian Schulze (fschulze) Date: 2018-05-25 14:31
Yes, it's third party code, but that worked up until Python 2.7.14 and only broke with 2.7.15, which is why I reported it here.

The expected behaviour is not a SyntaxError but a DeprecationWarning, which is what you get before Python 2.7.15. So this is a regression.
msg317681 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-25 15:34
Ah, I was fooled by the message about Beautiful Soup. It is not an error message; it is part of the source line printed for the SyntaxError. It seems the Beautiful Soup sources intentionally contain code that is invalid in Python 3.

There is a bug in Python. The tokenizer returns the error code E_OK, which it should never return when reporting an error. This confuses the AST parser, which raises a SyntaxError with the message "unknown parsing error".

This bug is reproduced when running Python with both the -3 and -We options and parsing the "<>" operator.

$ ./python -3 -We -c '[] <> []'
  File "<string>", line 1
    [] <> []
SyntaxError: unknown parsing error

But it is reproduced with 2.7.14 too.
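
The reproduction in this message can be scripted so that several interpreter builds are checked the same way. The following is a minimal sketch and not part of the patch: the helper names repro_command and shows_unknown_parsing_error and the default interpreter name python2.7 are assumptions for illustration. Only the -3 -We -c '[] <> []' invocation and the "unknown parsing error" text come from the report.

```python
import subprocess


# Hypothetical helper names; only the flags and the snippet come from the report.
def repro_command(python="python2.7"):
    """Build the argv that triggers the bug on an affected 2.7 build."""
    return [python, "-3", "-We", "-c", "[] <> []"]


def shows_unknown_parsing_error(python="python2.7"):
    """Run the reproduction and check stderr for the misleading message."""
    proc = subprocess.Popen(repro_command(python), stderr=subprocess.PIPE)
    _, err = proc.communicate()
    return b"unknown parsing error" in err
```

Calling shows_unknown_parsing_error("./python") against a local build reports whether that build still prints the misleading message. What a fixed build prints instead is not stated in this thread, so the sketch checks only for the presence of the old text.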
msg317697 - Author: Florian Schulze (fschulze) Date: 2018-05-25 18:27
You are right, this actually happens with Python 2.7.14 as well. I was fooled by a warnings.filterwarnings call that matches a path in my codebase. That filter matched in my Python 2.7.15 testing but not in my Python 2.7.14 testing, because the latter was done in a temporary path that the filter did not match. I should have tried to narrow it down further first.

I'm glad you found the necessary -We option.

Thanks for looking into this and for the patch!

I'm currently unable to test the patch though.
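
The path-dependent filter described in this message can be illustrated with the standard warnings module. This is a sketch under assumptions: the paths, the regular expression, the warning text, and the helper name classify are all hypothetical, and the actual filter in the reporter's codebase is not shown in this thread. It relies on warnings.warn_explicit deriving the module name from the file name (minus a trailing ".py") when no module is given, so a module pattern in filterwarnings can effectively match a file path.

```python
import warnings


def classify(filename, module_pattern):
    """Return 'error' if the filter escalates the warning, else 'warning'."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        warnings.simplefilter("default")
        # Escalate DeprecationWarning only where the module name matches.
        warnings.filterwarnings(
            "error", category=DeprecationWarning, module=module_pattern
        )
        try:
            # Illustrative message; module defaults to filename minus ".py".
            warnings.warn_explicit(
                "illustrative deprecation", DeprecationWarning, filename, 1
            )
        except DeprecationWarning:
            return "error"
        return "warning" if caught else "ignored"


# Hypothetical paths: one inside the project tree, one in a scratch directory.
print(classify("/home/user/myproject/pkg/mod.py", r".*/myproject/"))
print(classify("/tmp/scratch/mod.py", r".*/myproject/"))
```

The same source file is therefore reported as an exception in one location and as an ordinary warning in another, which is consistent with the difference seen between the two test setups.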
msg318245 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-31 04:35
New changeset d5e7556e522f4662ad34b35924b6c76895df340e by Serhiy Storchaka in branch '2.7':
bpo-33645: Fix an "unknown parsing error" in the parser. (GH-7119)
msg318248 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-31 04:42
I'm not sure that this will fix your case (there may be another bug in BeautifulSoup4 or other changes in your environment), but it fixed a bug in Python. Thank you for your report.
Date User Action Args
2018-05-31 04:42:58 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg318248; stage: patch review -> resolved
2018-05-31 04:35:41 serhiy.storchaka set messages: + msg318245
2018-05-25 18:27:07 fschulze set messages: + msg317697
2018-05-25 15:35:55 serhiy.storchaka set keywords: + patch; stage: patch review; pull_requests: + pull_request6753
2018-05-25 15:34:31 serhiy.storchaka set status: closed -> open; resolution: third party -> (no value); messages: + msg317681; stage: resolved -> (no value)
2018-05-25 14:31:59 fschulze set messages: + msg317678
2018-05-25 14:03:32 serhiy.storchaka set status: open -> closed; nosy: + serhiy.storchaka; messages: + msg317675; resolution: third party; stage: resolved
2018-05-25 09:17:42 fschulze create