py3k fails under Windows if "-c" or "-m" is given a non-ascii value #47955

Closed
pitrou opened this issue Aug 27, 2008 · 15 comments
Labels
release-blocker, type-bug (an unexpected behavior, bug, or error)

Comments

@pitrou
Member

pitrou commented Aug 27, 2008

BPO 3705
Nosy @loewis, @amauryfa, @pitrou, @benjaminp
PRs
  • bpo-35705: Added support of libffi in _ctypes for windows 10 ARM64 #11497
Files
  • convert_args.patch
  • find_module_unicode.patch
  • find_module_unicode_2.patch
  • command_unicode.patch
  • command_unicode_2.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2008-11-11.23:05:34.430>
    created_at = <Date 2008-08-27.18:05:40.472>
    labels = ['type-bug', 'release-blocker']
    title = 'py3k fails under Windows if "-c" or "-m" is given a non-ascii value'
    updated_at = <Date 2019-01-10.12:01:30.676>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2019-01-10.12:01:30.676>
    actor = 'ossdev07'
    assignee = 'none'
    closed = True
    closed_date = <Date 2008-11-11.23:05:34.430>
    closer = 'amaury.forgeotdarc'
    components = []
    creation = <Date 2008-08-27.18:05:40.472>
    creator = 'pitrou'
    dependencies = []
    files = ['11273', '11411', '11422', '11424', '11545']
    hgrepos = []
    issue_num = 3705
    keywords = ['patch']
    message_count = 15.0
    messages = ['72036', '72040', '72682', '72692', '72714', '72720', '72771', '72773', '72800', '72826', '72828', '73533', '75764', '75765', '75768']
    nosy_count = 4.0
    nosy_names = ['loewis', 'amaury.forgeotdarc', 'pitrou', 'benjamin.peterson']
    pr_nums = ['11497', '11497', '11497']
    priority = 'release blocker'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue3705'
    versions = ['Python 3.0']

    @pitrou
    Member Author

    pitrou commented Aug 27, 2008

    The explanation is quite simple: in Py_Main, the arguments are converted
    from wide to byte strings, but the required length of the byte string is
    assumed equal to that of the wide string.

    Which gives:

    $ ./python -c "print('à')"
    Fatal Python error: not enough memory to copy -c argument
    Segmentation fault (core dumped)
    $ ./python -m à
    Fatal Python error: not enough memory to copy -m argument
    Segmentation fault (core dumped)
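
The length mismatch described above can be illustrated outside the interpreter. The following is a minimal Python sketch, not the C code in Py_Main: the helper `encoded_sizes` is a hypothetical name introduced here, and UTF-8 is assumed as the target encoding. It only shows that a buffer sized by character count is too small once a non-ASCII character is encoded.

```python
# Illustration only: a character count is not a byte count once the
# text is encoded with a multi-byte encoding such as UTF-8.
def encoded_sizes(text, encoding="utf-8"):
    """Return (number of characters, number of encoded bytes)."""
    return len(text), len(text.encode(encoding))


ascii_arg = "print('a')"
accented_arg = "print('à')"

ascii_sizes = encoded_sizes(ascii_arg)        # both counts agree
accented_sizes = encoded_sizes(accented_arg)  # 'à' needs one extra byte

print(ascii_sizes, accented_sizes)
```

In C, the usual way to avoid this guess is to ask the conversion routine for the required size before allocating, rather than reusing the wide-string length.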

    @pitrou pitrou added the type-crash (a hard crash of the interpreter, possibly with a core dump) label Aug 27, 2008
    @pitrou
    Member Author

    pitrou commented Aug 27, 2008

    Here is a patch which works under Linux. Under Windows it doesn't choke
    when converting arguments anymore, but it fails later in the process (in
    the parser for '-c', in the importing logic for '-m').

    Here is an example:

    $ ./python -c "print(ord('ሀ'))"
    4608
    $ cat > ሀ.py
    print(__file__)
    
    $ ./python -m ሀ
    /home/antoine/py3k/mbstowcs/ሀ.py
    $ ./python ሀ.py 
    ሀ.py

    @benjaminp
    Contributor

    Hmm. I suppose anything is better than segfaulting. I think the patch is
    fine for now, though.

    @pitrou
    Member Author

    pitrou commented Sep 6, 2008

    Committed in r66269.

    @pitrou pitrou changed the title from 'py3k aborts if "-c" or "-m" is given a non-ascii value' to 'py3k fails under Windows if "-c" or "-m" is given a non-ascii value' Sep 6, 2008
    @pitrou pitrou added the type-bug (an unexpected behavior, bug, or error) label and removed the type-crash (a hard crash of the interpreter, possibly with a core dump) label Sep 6, 2008
    @amauryfa
    Member

    amauryfa commented Sep 6, 2008

    This patch corrects the "-m" case on windows: the path has to be
    decoded/recoded using the filesystem encoding, and not the default utf-8.
    Review is needed, of course.
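
Why the choice of encoding matters for "-m" can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patch itself: `path_bytes` is a hypothetical helper, and "cp1252" merely stands in for a Windows filesystem encoding (the real value is whatever `sys.getfilesystemencoding()` reports on the machine).

```python
# Illustration only: the same name yields different bytes depending on
# whether it is encoded with the filesystem encoding or with UTF-8.
import sys


def path_bytes(name, fs_encoding):
    """Encode a module or file name as a byte-oriented API would see it."""
    return name.encode(fs_encoding)


name = "à"
as_filesystem = path_bytes(name, "cp1252")  # assumed Windows code page
as_utf8 = path_bytes(name, "utf-8")

print("this interpreter uses:", sys.getfilesystemencoding())
print(as_filesystem, as_utf8)
```

A lookup that passes the UTF-8 bytes to an API expecting the filesystem encoding would search for a different file name, which is consistent with the failure described for Windows.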

    @pitrou
    Member Author

    pitrou commented Sep 6, 2008

    Looks good and works under Linux.
    One small nit, you could just as well use "NN(ssi)" for the
    Py_BuildValue and remove Py_DECREF(fob), so as to be more consistent.

    @amauryfa
    Member

    amauryfa commented Sep 8, 2008

    Updated patch.

    @amauryfa
    Member

    amauryfa commented Sep 8, 2008

    ./python -c "print('à')"
    does not work on my Linux machine with the latest py3k (r66303), most likely
    because my terminal uses a latin-1 encoding: wcstombs converts the
    argument back to the terminal encoding, whereas PyRun_SimpleString
    expects a UTF-8 string.

    I am attaching another patch, which propagates the wchar_t string as far as
    possible and encodes it as UTF-8; a test is included.

    This also corrects the Windows case.
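
The latin-1 versus UTF-8 mismatch described in this comment can be reproduced with plain byte strings. The sketch below is an assumption-laden illustration in Python, not the C code path: `decode_as_utf8` is a hypothetical helper that plays the role of a consumer expecting UTF-8 input.

```python
# Illustration only: bytes produced for a latin-1 terminal are not
# valid UTF-8, so a consumer that expects UTF-8 cannot decode them.
def decode_as_utf8(raw):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


command = "print('à')"
latin1_bytes = command.encode("latin-1")  # what a latin-1 locale would produce
utf8_bytes = command.encode("utf-8")      # what a UTF-8 consumer expects

from_latin1 = decode_as_utf8(latin1_bytes)
from_utf8 = decode_as_utf8(utf8_bytes)

print(from_latin1, from_utf8)
```

Keeping the argument as a wide string and encoding it explicitly as UTF-8 removes the dependency on the terminal's locale.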

    @benjaminp
    Contributor

    I think the patch is good; go ahead.

    @amauryfa
    Member

    amauryfa commented Sep 9, 2008

    Applied both patches as r66331.

    @amauryfa amauryfa closed this as completed Sep 9, 2008
    @amauryfa
    Member

    amauryfa commented Sep 9, 2008

    Unfortunately, my patch does not work: see the compile warnings in "main.c":
    http://www.python.org/dev/buildbot/3.0/x86%20osx.5%203.0/builds/344/step-compile/0

    I reverted the change, and will try something else...

    @amauryfa amauryfa reopened this Sep 9, 2008
    @amauryfa
    Member

    Today I learned something: wchar_t can be 2 or 4 bytes, Py_UNICODE can be
    2 or 4 bytes, and all combinations are possible.
    My mistake was to use PyUnicode_FromUnicode on a wchar_t*; PyUnicode_FromWideChar is the obvious function to use.

    Attached a new patch (command_unicode_2.patch) for review.
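
The width difference mentioned above is easy to observe from Python. This is a rough sketch, assuming that 2-byte units behave like UTF-16 and 4-byte units like UTF-32; `code_units` is a hypothetical helper introduced for illustration only.

```python
# Illustration only: the platform's wchar_t width, and how many
# fixed-size code units the same text needs at 2 and 4 bytes per unit.
import ctypes


def code_units(text, encoding, unit_size):
    """Number of fixed-size code units needed to store text in encoding."""
    return len(text.encode(encoding)) // unit_size


wchar_size = ctypes.sizeof(ctypes.c_wchar)  # 2 on Windows, typically 4 on Linux

bmp_char = "\u1200"         # 'ሀ', inside the Basic Multilingual Plane
astral_char = "\U0001F600"  # outside the BMP: a surrogate pair in UTF-16

utf16_units = (code_units(bmp_char, "utf-16-le", 2),
               code_units(astral_char, "utf-16-le", 2))
utf32_units = (code_units(bmp_char, "utf-32-le", 4),
               code_units(astral_char, "utf-32-le", 4))

print(wchar_size, utf16_units, utf32_units)
```

Because the two unit sizes are not interchangeable, a buffer of one width cannot simply be reinterpreted as the other; a conversion routine that knows the source width is needed.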

    @amauryfa
    Member

    Raising to release blocker, just to trigger another review...

    @benjaminp
    Contributor

    Go ahead.

    @amauryfa
    Member

    Fixed as r67190. Thanks for the review.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022