venv activate.bat is UTF-8 encoded but uses current console codepage #76590

Closed
Jac0 mannequin opened this issue Dec 22, 2017 · 18 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes OS-windows stdlib Python modules in the Lib dir topic-unicode type-bug An unexpected behavior, bug, or error

Comments

@Jac0
Mannequin

Jac0 mannequin commented Dec 22, 2017

BPO 32409
Nosy @pfmoore, @vsajip, @vstinner, @tjguk, @ezio-melotti, @zware, @eryksun, @zooba, @pablogsal, @miss-islington, @samstagern
PRs
  • bpo-32409: Ensures activate.bat can handle Unicode contents #5757
  • [3.7] bpo-32409: Ensures activate.bat can handle Unicode contents (GH-5757) #5765
  • [3.6] bpo-32409: Ensures activate.bat can handle Unicode contents (GH-5757) #5766
  • bpo-32409 Ensures activate.bat can handle Unicode contents #10295
  • [3.7] bpo-32409: Fix regression in activate.bat on international Windows (GH-10295) #10377
  • bpo-32409: Revert "bpo-32409: Fix regression in activate.bat on international Windows (GH-10295)" #10403
  • [3.7] Revert "bpo-32409: Fix regression in activate.bat on international Windows (GH-10295)" (GH-10403) #10405
  • bpo-34144: Fix of venv acvtivate.bat for win 10 #8321
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/zooba'
    closed_at = <Date 2019-07-26.16:39:28.243>
    created_at = <Date 2017-12-22.10:36:15.658>
    labels = ['type-bug', '3.8', 'OS-windows', '3.7', 'library', 'expert-unicode']
    title = 'venv activate.bat is UTF-8 encoded but uses current console codepage'
    updated_at = <Date 2019-07-26.16:39:28.243>
    user = 'https://bugs.python.org/Jac0'

    bugs.python.org fields:

    activity = <Date 2019-07-26.16:39:28.243>
    actor = 'steve.dower'
    assignee = 'steve.dower'
    closed = True
    closed_date = <Date 2019-07-26.16:39:28.243>
    closer = 'steve.dower'
    components = ['Library (Lib)', 'Unicode', 'Windows']
    creation = <Date 2017-12-22.10:36:15.658>
    creator = 'Jac0'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32409
    keywords = ['patch']
    message_count = 18.0
    messages = ['308931', '308934', '312356', '312389', '312390', '312393', '329141', '329142', '329425', '329431', '329440', '329447', '329450', '329451', '329464', '329481', '332324', '348505']
    nosy_count = 12.0
    nosy_names = ['paul.moore', 'vinay.sajip', 'vstinner', 'tim.golden', 'ezio.melotti', 'zach.ware', 'eryksun', 'steve.dower', 'pablogsal', 'Jac0', 'miss-islington', 'Martin Bijl-Schwab']
    pr_nums = ['5757', '5765', '5765', '5766', '10295', '10377', '10403', '10405', '8321']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue32409'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

    @Jac0
    Mannequin Author

    Jac0 mannequin commented Dec 22, 2017

    Let's say I have a folder c:\test-ä in Windows

    Now if I run: py -m venv env
    and activate: env\scripts\activate
    and check: where python

    the result is incorrectly just: C:\Users\Username\AppData\Local\Programs\Python\Python36\python.exe

    If I run: path
    the result is: PATH=C:\test-├ż\env\Scripts;...

    So clearly the encoding is broken for the folder name.

    I can fix this by changing the character encoding of activate.bat to OEM-US and then replacing "test-├ż" with "test-ä".

    If I now activate and run: where python
    the result is (as should be):
    C:\test-ä\env\Scripts\python.exe
    C:\Users\Username\AppData\Local\Programs\Python\Python36\python.exe

    By running: path
    I get: PATH=C:\test-ä\env\Scripts;...

    So it looks good here as well.

    I suggest that whatever creates the activate.bat file is using an incorrect character encoding when writing it. If this is somehow platform-specific, the venv documentation could include a guide on how to fix it.

    @Jac0 Jac0 mannequin added extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error labels Dec 22, 2017
    @eryksun
    Contributor

    eryksun commented Dec 22, 2017

    The CMD shell decodes batch scripts using the attached console's output codepage, which defaults to OEM. OTOH, venv writes the replacement values for the template activate.bat as UTF-8 (codepage 65001), which is correct and should not be downgraded to OEM.
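
The mismatch described above can be reproduced outside the console. This is a minimal Python sketch, assuming the console's OEM codepage is 437 (the real codepage depends on the system locale) and reusing the `C:\test-ä` folder from the report. It only illustrates the decoding step and is not part of venv.

```python
# Sketch only: cp437 is an assumed OEM codepage. Real consoles may use
# a different one depending on the system locale.
path = "C:\\test-ä\\env\\Scripts"

# venv writes the path into activate.bat as UTF-8 bytes.
utf8_bytes = path.encode("utf-8")

# CMD decodes the batch file with the console codepage instead, so the
# two UTF-8 bytes of "ä" (0xC3 0xA4) turn into two unrelated characters.
seen_by_cmd = utf8_bytes.decode("cp437")
# seen_by_cmd == "C:\\test-├ñ\\env\\Scripts"

# Decoding the same bytes as UTF-8 (codepage 65001) recovers the path.
recovered = utf8_bytes.decode("utf-8")
```

The directory named by the garbled string does not exist, which is consistent with `where python` not finding the interpreter in the environment.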

    Instead, the batch script could temporarily switch the console to codepage 65001 and then restore the previous codepage at the end. For example:

    @echo off
    for /f "tokens=2 delims=:" %%a in ('"%SystemRoot%\System32\chcp.com"') do (
        set "CODEPAGE=%%a"
    )
    "%SystemRoot%\System32\chcp.com" 65001 > nul
    

    [rest of script]

    "%SystemRoot%\System32\chcp.com" %CODEPAGE% > nul
    set "CODEPAGE="
    :END
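
The `for /f "tokens=2 delims=:"` line above depends on the shape of the text that `chcp` prints. The following Python sketch approximates that tokenization to show what ends up in `CODEPAGE`. The helper `for_f_token` is hypothetical, and the localized sample string is an assumed example of a differently punctuated message, not output captured from a real system.

```python
import re


def for_f_token(line, delims, index):
    """Return the 1-based token `index` of `line`, roughly as `for /f` would.

    Runs of delimiter characters count as one separator and empty
    tokens are dropped. Returns None if the token does not exist.
    """
    pattern = "[" + re.escape(delims) + "]+"
    tokens = [t for t in re.split(pattern, line) if t]
    return tokens[index - 1] if index <= len(tokens) else None


english = "Active code page: 437"    # chcp output on an English system
localized = "Aktive Codepage: 850."  # assumed sample of a localized format

print(repr(for_f_token(english, ":", 2)))     # ' 437'
print(repr(for_f_token(localized, ":", 2)))   # ' 850.'
print(repr(for_f_token(localized, ":.", 2)))  # ' 850'
```

With only `:` as a delimiter, any trailing punctuation in the message stays attached to the saved value, so the later call that restores the codepage would receive something that is not a plain number. Adding the extra character to the delimiter set is one way to guard against that.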
    

    @eryksun eryksun added 3.7 (EOL) end of life stdlib Python modules in the Lib dir topic-unicode OS-windows and removed extension-modules C modules in the Modules dir labels Dec 22, 2017
    @eryksun eryksun changed the title venv activation doesn't work, if project is in a Windows folder that has latin-1 supplement characters (such as ä,ö,å) in its path venv activate.bat is UTF-8 encoded but uses current console codepage Dec 22, 2017
    @zooba
    Member

    zooba commented Feb 19, 2018

    Eryk's solution seems to be best, so I'll add that.

    @zooba zooba added the 3.8 only security fixes label Feb 19, 2018
    @zooba zooba self-assigned this Feb 19, 2018
    @zooba
    Member

    zooba commented Feb 20, 2018

    New changeset 6240917 by Steve Dower in branch 'master':
    bpo-32409: Ensures activate.bat can handle Unicode contents (GH-5757)
    6240917

    @zooba
    Member

    zooba commented Feb 20, 2018

    New changeset a3d6c1b by Steve Dower (Miss Islington (bot)) in branch '3.7':
    bpo-32409: Ensures activate.bat can handle Unicode contents (GH-5765)
    a3d6c1b

    @zooba zooba closed this as completed Feb 20, 2018
    @zooba
    Member

    zooba commented Feb 20, 2018

    New changeset 8e149ff by Steve Dower (Miss Islington (bot)) in branch '3.6':
    bpo-32409: Ensures activate.bat can handle Unicode contents (GH-5766)
    8e149ff

    @samstagern
    Mannequin

    samstagern mannequin commented Nov 2, 2018

    It does not work as expected on Swiss German (and likely other internationalised) Windows systems. See https://bugs.python.org/issue32409

    @samstagern
    Mannequin

    samstagern mannequin commented Nov 2, 2018

    I meant https://bugs.python.org/issue35148

    @vsajip
    Member

    vsajip commented Nov 7, 2018

    New changeset c64583b by Vinay Sajip (samstagern) in branch 'master':
    bpo-32409: Fix regression in activate.bat on international Windows (GH-10295)
    c64583b

    @vsajip
    Member

    vsajip commented Nov 7, 2018

    New changeset 881e273 by Vinay Sajip (Miss Islington (bot)) in branch '3.7':
    bpo-32409: Fix regression in activate.bat on international Windows (GH-10295) (GH-10377)
    881e273

    @pablogsal
    Member

    @pablogsal pablogsal reopened this Nov 7, 2018
    @samstagern
    Mannequin

    samstagern mannequin commented Nov 7, 2018

    I will have a look.

    @pablogsal
    Member

    New changeset 6843ffe by Pablo Galindo in branch 'master':
    Revert "bpo-32409: Fix regression in activate.bat on international Windows (GH-10295)" (GH-10403)
    6843ffe

    @miss-islington
    Contributor

    New changeset 3ba5e25 by Miss Islington (bot) in branch '3.7':
    Revert "bpo-32409: Fix regression in activate.bat on international Windows (GH-10295)" (GH-10403)
    3ba5e25

    @vstinner
    Member

    vstinner commented Nov 8, 2018

    Pablo Galindo reverted the change because it broke the Windows buildbots, and we have a policy of reverting any change that breaks the buildbots if the regression cannot be fixed "quickly":
    https://pythondev.readthedocs.io/ci.html#revert-on-fail

    Is someone working on investigating the bug? Do you need help to reproduce the bug?

    Copy of the test_venv error:

    ======================================================================
    ERROR: test_unicode_in_batch_file (test.test_venv.BasicTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_venv.py", line 302, in test_unicode_in_batch_file
        out, err = check_output(
      File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_venv.py", line 37, in check_output
        raise subprocess.CalledProcessError(
    TypeError: __init__() takes from 3 to 5 positional arguments but 6 were given

    https://buildbot.python.org/all/#/builders/32/builds/1707

    @zooba
    Member

    zooba commented Nov 8, 2018

    That error is a bug in the test, but it only shows up on an error path anyway. Without removing the extra None we don't get to see the actual error output.
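
The `TypeError` in the traceback can be reproduced in isolation. `subprocess.CalledProcessError` accepts `returncode`, `cmd`, and the optional `output` and `stderr`, so passing an extra `None` ahead of the captured streams gives it one positional argument too many. The two helper names below are hypothetical and only mirror the shape of the call, not the actual code in `test_venv.py`.

```python
import subprocess


def make_error_with_extra_none(returncode, cmd, out, err):
    # Five positional arguments: one more than the constructor accepts,
    # so this raises TypeError before any process output can be reported.
    return subprocess.CalledProcessError(returncode, cmd, None, out, err)


def make_error(returncode, cmd, out, err):
    # Four positional arguments: returncode, cmd, output, stderr.
    return subprocess.CalledProcessError(returncode, cmd, out, err)


try:
    make_error_with_extra_none(1, ["activate.bat"], "stdout text", "stderr text")
except TypeError as exc:
    print("TypeError:", exc)

error = make_error(1, ["activate.bat"], "stdout text", "stderr text")
print(error.returncode, error.output, error.stderr)
```

Because the `TypeError` is raised while the exception is being constructed, it hides the return code and the captured output of the command that actually failed.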

    I can't look into this over the next week or two, but a quick glance at the original PR suggests that a lot of quotes are missing around executable paths, so maybe that was it?

    @vsajip
    Member

    vsajip commented Dec 22, 2018

    See also bpo-35558.

    @zooba
    Member

    zooba commented Jul 26, 2019

    This appears to be completely resolved now, but if it's not, please ping with the details and we'll reopen.

    @zooba zooba closed this as completed Jul 26, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022