classification
Title: Argparse on Python 3.7.1 (Windows) appends double quotes to string if it ends with backward slash
Type: behavior
Stage: resolved
Components:
Versions: Python 3.7
process
Status: closed
Resolution: third party
Dependencies:
Superseder:
Assigned To:
Nosy List: 888xray999, eryksun, paul.j3, rhettinger
Priority: normal
Keywords:

Created on 2020-03-04 08:40 by 888xray999, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg363339 - Author: Ion Cebotari (888xray999) Date: 2020-03-04 08:40
I have this code for a tool that copies files from one directory to another:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('src_dir')
parser.add_argument('dest_dir')
args = parser.parse_args()

It works fine on Unix, but on Windows, if the destination path ends with a backslash, argparse seems to treat that backslash as escaping a double quote and returns the string with a double quote appended.

For example, calling the script:
(base) PS Z:\test> python.exe .\main.py -d Z:\tmp\test\DJI\ 'C:\unu doi\'
will create the destination path string: C:\unu doi"
The source path, even though it ends with the backslash as well, isn't modified by argparse.

I've worked around this issue by using this validation function for the arguments:

import argparse
import os
import sys

def is_valid_dir_path(string):
    """
    Checks if the path is a valid path

    :param string: The path that needs to be validated
    :return: The validated path
    """
    if sys.platform.startswith('win') and string.endswith('"'):
        string = string[:-1]
    if os.path.isdir(string):
        return string
    else:
        raise NotADirectoryError(string)

parser = argparse.ArgumentParser()
parser.add_argument('src_dir', type=is_valid_dir_path)
parser.add_argument('dest_dir', type=is_valid_dir_path)
args = parser.parse_args()
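
One detail about the validation function above: argparse only turns `ArgumentTypeError`, `TypeError`, and `ValueError` raised by a `type` callable into a usage message, so a `NotADirectoryError` surfaces as a traceback instead. The following is a minimal sketch of a variant that reports the problem through argparse; the name `existing_dir` is hypothetical and the quote-stripping logic is copied from the function above, not something argparse provides.

```python
import argparse
import os
import sys
import tempfile


def existing_dir(value):
    """Return `value` if it names an existing directory.

    Hypothetical helper: strips the stray trailing double quote seen on
    Windows, then raises ArgumentTypeError so argparse prints a usage error.
    """
    if sys.platform.startswith('win') and value.endswith('"'):
        value = value[:-1]
    if not os.path.isdir(value):
        raise argparse.ArgumentTypeError(f'not a directory: {value!r}')
    return value


parser = argparse.ArgumentParser()
parser.add_argument('src_dir', type=existing_dir)
parser.add_argument('dest_dir', type=existing_dir)

# Parse an explicit list so the example does not depend on sys.argv.
tmp = tempfile.gettempdir()
args = parser.parse_args([tmp, tmp])
```

With this variant a bad path should end the program with the usual `error: argument dest_dir: not a directory: ...` message instead of an uncaught exception.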
msg363362 - Author: paul j3 (paul.j3) * (Python triager) Date: 2020-03-04 17:06
Could you show the sys.argv (for both the linux and windows cases)?

print(args) (without your correction) might also help, just to see the 'raw' Namespace.

(I'd have to restart my computer to explore the Windows behavior myself).
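
A minimal sketch of the kind of diagnostic requested here, assuming a stripped-down parser with only the two positional arguments from the original report. The helper name `dump_arguments` is hypothetical. It prints the raw argument list before argparse touches it, then the parsed Namespace, which makes it easy to see whether a value was already altered before parsing.

```python
import argparse
import sys


def dump_arguments(argv):
    """Print the raw argument list and the parsed Namespace, then return it.

    `dump_arguments` is a hypothetical helper name, not part of argparse.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('src_dir')
    parser.add_argument('dest_dir')
    print('raw arguments:', argv)
    args = parser.parse_args(argv)
    print('args namespace:', args)
    return args


# In a real script this would be called as dump_arguments(sys.argv[1:]).
# Here an explicit list is passed, with a destination value that already
# carries a trailing double quote, to show that argparse passes it through.
args = dump_arguments(['Z:\\tmp\\test\\DJI\\', 'C:\\unu doi"'])
```

If the stray quote is already visible in the raw list, the change happened before argparse ran.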
msg363423 - Author: Ion Cebotari (888xray999) Date: 2020-03-05 09:50
Yes, the problem seems to be with sys.argv:

Windows output:
python .\main.py -d Z:\tmp\test\DJI\ 'C:\unu doi\'
sys.argv:
['.\\CamCardOrganizer.py', '-d', 'Z:\\tmp\\test\\DJI\\', 'C:\\unu doi"']
args namespace:
Namespace(capturedatefmt='%y.%m.%d', capturetimefmt_full='%y.%m.%d_%H-%M-%S', capturetimefmt_short='%y.%m.%d_%H', dest_dir='C:\\unu doi"', dry_run=True, output_file='stdout', processes=2, src_dir='Z:\\tmp\\test\\DJI\\', tl_interval_threshold=30, tl_numimages_threshold=75, verbose=False, version=None)

Linux output:
python main.py --verbose ~/tmp/test/DJI/ /tmp/123
sys.argv:
['CamCardOrganizer.py', '--verbose', '/home/ion/tmp/test/DJI/', '/tmp/123']
args namespace:
Namespace(capturedatefmt='%y.%m.%d', capturetimefmt_full='%y.%m.%d_%H-%M-%S', capturetimefmt_short='%y.%m.%d_%H', dest_dir='/tmp/123', dry_run=False, output_file='stdout', processes=2, src_dir='/home/ion/tmp/test/DJI/', tl_interval_threshold=30, tl_numimages_threshold=75, verbose=True, version=None)
msg363453 - Author: paul j3 (paul.j3) * (Python triager) Date: 2020-03-05 20:33
Then this isn't an argparse issue.  Probably not even a Python one.  The Windows shell (which one? cmd.exe? PowerShell? some batch script?) is making the substitution.

I see lots of discussion about Windows use of backslash, both as directory separator and escape.  None seems to exactly apply.

For your application the use of a custom 'type' function might well be appropriate, but it's not something we want to add to argparse.  It's a user specific patch.
msg364246 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2020-03-15 16:39
PowerShell translates single quotes to double quotes when they're used to delimit a string in the command line, which complies with VC++ command-line parsing and CommandLineToArgvW [1]. But PowerShell 5 has a bug here. It translates 'C:\unu doi\' into "C:\unu doi\". A double quote preceded by a backslash is parsed as a literal double quote. It should escape the trailing backslash as two backslashes, i.e. "C:\unu doi\\". PowerShell 6 (pwsh) implements it correctly.

[1]: https://docs.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=vs-2019
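
The parsing rule described above can be illustrated with a short sketch. The function `split_windows_args` is a hypothetical, simplified reimplementation of the backslash and double-quote rules from [1] (it ignores the special handling of the program name and of doubled quotes inside a quoted string), written only to show why `"C:\unu doi\"` yields a trailing literal quote. For the correctly escaped form it uses `subprocess.list2cmdline`, a helper in the standard library that builds a command line following the same rules. The source path in the report contains no spaces, so it was presumably passed without surrounding quotes, which would explain why its trailing backslash survived.

```python
import subprocess


def split_windows_args(cmdline):
    """Split a command line using a simplified form of the MSVC rules.

    Hypothetical helper for illustration only; not the real parser.
    """
    args, current = [], []
    in_quotes = False
    have_arg = False
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == '\\':
            # Count the run of backslashes and look at what follows it.
            start = i
            while i < n and cmdline[i] == '\\':
                i += 1
            count = i - start
            if i < n and cmdline[i] == '"':
                # 2n backslashes + quote -> n backslashes, quote delimits.
                # 2n+1 backslashes + quote -> n backslashes, literal quote.
                current.append('\\' * (count // 2))
                if count % 2:
                    current.append('"')
                else:
                    in_quotes = not in_quotes
                i += 1
            else:
                # Backslashes not followed by a quote are literal.
                current.append('\\' * count)
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in ' \t' and not in_quotes:
            if have_arg:
                args.append(''.join(current))
                current, have_arg = [], False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append(''.join(current))
    return args


# What PowerShell 5 is described as sending: trailing backslash not doubled.
buggy = split_windows_args(r'-d Z:\tmp\test\DJI\ "C:\unu doi\"')
print(buggy)   # ['-d', 'Z:\\tmp\\test\\DJI\\', 'C:\\unu doi"']

# Correct escaping, as produced by subprocess.list2cmdline.
wanted = ['-d', 'Z:\\tmp\\test\\DJI\\', 'C:\\unu doi\\']
cmdline = subprocess.list2cmdline(wanted)
print(cmdline)  # -d Z:\tmp\test\DJI\ "C:\unu doi\\"
fixed = split_windows_args(cmdline)
```

The first result matches the `sys.argv` reported earlier in this issue, and the second round-trips back to the intended arguments.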
History
Date                 User        Action  Args
2022-04-11 14:59:27  admin       set     github: 84026
2020-03-15 16:39:14  eryksun     set     status: open -> closed; nosy: + eryksun; messages: + msg364246; resolution: third party; stage: resolved
2020-03-05 20:33:13  paul.j3     set     messages: + msg363453
2020-03-05 09:50:14  888xray999  set     messages: + msg363423
2020-03-04 17:06:39  paul.j3     set     messages: + msg363362
2020-03-04 15:12:25  rhettinger  set     nosy: + rhettinger, paul.j3
2020-03-04 08:40:21  888xray999  create