Title: [doc] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator.
Components: Documentation, Library (Lib)
Nosy List: docs@python, jack__d, lockywolf, lukasz.langa, miss-islington, ztane
Created on 2016-08-13 06:49 by lockywolf, last changed 2022-04-11 14:58 by admin.

Author: lockywolf Date: 2016-08-13 06:49
Hello, everyone.

I want to report a minor usability issue:

I wanted to use the csv module to load CSVs, and the documentation says that the default dialect for reading CSVs is 'excel'.

However, the delimiter used with this dialect in Python is a comma (','), whereas in fact (even though it's called _comma_ separated values) MS Excel (2016) uses a semicolon (';') as a delimiter.
Therefore, Python's 'excel' dialect doesn't actually read Excel-generated files.
Author: Antti Haapala (ztane) Date: 2016-08-13 07:16
Excel's behaviour has always been locale-dependent. If the user's locale uses , as the decimal mark, then ; is used as the column separator in "C"SV. However, even if you use autodetection with sniff, the delimiter cannot be detected with 100% accuracy; e.g., is the following CSV row comma- or semicolon-separated:
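
As a hypothetical illustration of this ambiguity (the row below is an assumption made up for this sketch, not data from the report), the same line parses cleanly under either delimiter, so nothing in the text alone says which reading was intended:

```python
import csv
import io

# Hypothetical row, chosen only to illustrate the ambiguity.
line = "1,5;2,5\n"

# Read as comma-separated: three fields.
as_comma = next(csv.reader(io.StringIO(line), delimiter=","))

# Read as semicolon-separated: two fields with decimal commas.
as_semicolon = next(csv.reader(io.StringIO(line), delimiter=";"))

print(as_comma)      # ['1', '5;2', '5']
print(as_semicolon)  # ['1,5', '2,5']
```

Both results are valid rows, which is why a heuristic cannot always pick the right one.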


The dialect could be documented better though, as currently it simply says:

    The excel class defines the usual properties of an Excel-generated CSV file. It is registered with the dialect name 'excel'.

And there really should be a separate dialect for Excel-semicolon separated values, as a couple billion people would see ; in their CSV.
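
Until such a dialect exists, one possible user-side sketch is to subclass csv.excel and register the result. The class name ExcelSemicolon and the dialect name 'excel-semicolon' are assumptions made up for this example; neither is part of the csv module:

```python
import csv
import io


class ExcelSemicolon(csv.excel):
    """Hypothetical dialect: same as 'excel', but with ';' as the delimiter."""
    delimiter = ";"


# 'excel-semicolon' is a made-up name, not one shipped with Python.
csv.register_dialect("excel-semicolon", ExcelSemicolon)

# Sample data invented for the example.
data = "name;price\nwidget;1,50\n"
rows = list(csv.reader(io.StringIO(data), dialect="excel-semicolon"))
print(rows)  # [['name', 'price'], ['widget', '1,50']]
```

All other properties (quoting, line terminator, and so on) are inherited from csv.excel, so only the delimiter differs.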
Author: Jack DeVries (jack__d) Date: 2021-06-18 22:13
If you need semicolon delimiters, can't you just pass ``delimiter=';'`` to the reader or writer? I don't think there's a need for a separate dialect class for that, since dialect classes should only provide a baseline for the most broad use cases. Users have plenty of options for extending or customizing behavior without adding more dialect classes.
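
A minimal sketch of that approach, using made-up sample data, might look like this for both reading and writing:

```python
import csv
import io

# Sample data invented for the example: semicolon-separated, decimal commas.
data = "name;price\nwidget;1,50\ngadget;2,75\n"

# Extra keyword arguments such as delimiter are passed through to the reader.
reader = csv.DictReader(io.StringIO(data), delimiter=";")
rows = list(reader)
print(rows[0]["price"])  # 1,50

# The same keyword works for writers.
out = io.StringIO()
writer = csv.writer(out, delimiter=";")
writer.writerow(["name", "price"])
writer.writerow(["widget", "1,50"])
print(repr(out.getvalue()))  # 'name;price\r\nwidget;1,50\r\n'
```

The default 'excel' dialect still supplies everything else, so only the delimiter is overridden.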

I also think the docs around dialects are confusing. I remember being confused by them when I was learning! I made quite a few changes to try to add clarity around dialects to the documentation. Let me know if anybody has feedback!
Author: Łukasz Langa (lukasz.langa) Date: 2021-08-06 20:05
New changeset 0ffdced3b64ba5886fcde64266a31a15712da284 by Jack DeVries in branch 'main':
bpo-27752: improve documentation of csv.Dialect (GH-26795)
Author: miss-islington Date: 2021-08-06 20:31
New changeset 2fd1f21db46b165cf603cf4524b4d14ab41ed1cc by Miss Islington (bot) in branch '3.10':
bpo-27752: improve documentation of csv.Dialect (GH-26795)
Author: Łukasz Langa (lukasz.langa) Date: 2021-08-06 20:33
New changeset 62bce24e32a9c754a23e758a32a7e0ca49602fc5 by Miss Islington (bot) in branch '3.9':
bpo-27752: improve documentation of csv.Dialect (GH-26795) (GH-27644)
Author: Łukasz Langa (lukasz.langa) Date: 2021-08-06 20:34
Thanks for the patch, Jack! ✨ 🍰 ✨
