classification
Title: [doc] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator.
Type: behavior Stage: resolved
Components: Documentation, Library (Lib) Versions: Python 3.11
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, jack__d, lockywolf, lukasz.langa, miss-islington, ztane
Priority: normal Keywords: patch

Created on 2016-08-13 06:49 by lockywolf, last changed 2021-08-06 20:34 by lukasz.langa. This issue is now closed.

Pull Requests
URL Status Linked
PR 26795 merged jack__d, 2021-06-18 22:04
PR 27643 merged miss-islington, 2021-08-06 20:05
PR 27644 merged miss-islington, 2021-08-06 20:05
Messages (7)
msg272579 - (view) Author: lockywolf (lockywolf) Date: 2016-08-13 06:49
Hello, everyone.

I want to report a minor usability issue:

I wanted to use the csv module to load CSVs, and the documentation says that the default dialect for reading CSVs is 'excel'.

However, the delimiter used with this dialect in Python is a comma (','), whereas in fact (even though it's called _comma_ separated values) MS Excel (2016) uses a semicolon (';') as a delimiter.
Therefore, Python's 'excel' dialect doesn't actually read Excel-generated files.
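
A minimal sketch of the behaviour described above. The sample text is a made-up stand-in for what Excel might export under a locale that uses a comma as the decimal mark; it is an assumption for illustration, not a real export.

```python
import csv
import io

# Hypothetical sample: semicolon-separated columns, comma as decimal mark.
sample = "name;price\nwidget;1,5\n"

# The default dialect is 'excel', whose delimiter is a comma.
default_delimiter = csv.excel.delimiter

# Reading with the default dialect splits on commas only, so the
# semicolons stay inside the fields and the decimal comma splits a value.
default_rows = list(csv.reader(io.StringIO(sample)))

print(default_delimiter)
print(default_rows)
```

With this input the header comes back as a single field and the price is cut in two, which is the mismatch the report is about.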
msg272580 - (view) Author: Antti Haapala (ztane) * Date: 2016-08-13 07:16
Excel's behaviour has always been locale-dependent. If the user's locale uses , as the decimal mark, then ; has been used as the column separator in "C"SV. However, even if you use autodetection with sniff, the delimiter is impossible to detect with 100% accuracy. For example, is the following CSV row comma- or semicolon-separated?

    1,2;3;4,5;6,7;8;9
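
To make the ambiguity concrete, here is a small sketch that parses the row above with each candidate delimiter. Both parses succeed, so nothing in the row itself says which one was intended.

```python
import csv
import io

row = "1,2;3;4,5;6,7;8;9"

# Parse the same text once per candidate delimiter.
as_comma = next(csv.reader(io.StringIO(row), delimiter=','))
as_semicolon = next(csv.reader(io.StringIO(row), delimiter=';'))

print(as_comma)      # four fields, some containing semicolons
print(as_semicolon)  # six fields, some containing decimal commas
```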

The dialect could be documented better though, as currently it simply says:

    The excel class defines the usual properties of an Excel-generated CSV file. It is registered with the dialect name 'excel'.

And there really should be a separate dialect for Excel-semicolon separated values, as a couple billion people would see ; in their CSV.
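
For illustration only, a user-defined dialect along these lines might look like the sketch below. The class name `ExcelSemicolon` and the registered name `'excel-semicolon'` are hypothetical; the standard library does not ship a dialect under that name.

```python
import csv
import io


class ExcelSemicolon(csv.excel):
    """Hypothetical dialect: same as csv.excel, but ';' separates columns."""
    delimiter = ';'


# Register the dialect under a made-up name so it can be referred to by string.
csv.register_dialect('excel-semicolon', ExcelSemicolon)

sample = "a;b\n1,5;2,5\n"
rows = list(csv.reader(io.StringIO(sample), dialect='excel-semicolon'))
print(rows)
```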
msg396103 - (view) Author: Jack DeVries (jack__d) * Date: 2021-06-18 22:13
If you need semicolon delimiters, can't you just pass ``delimiter=';'`` to the reader or writer? I don't think there's a need for a separate dialect class for that, since dialect classes should only provide a baseline for the most broad use cases. Users have plenty of options for extending or customizing behavior without adding more dialect classes.
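
A short sketch of that suggestion, using made-up sample data. The `delimiter` keyword overrides the dialect's setting for both readers and writers, including `DictReader`.

```python
import csv
import io

# Hypothetical semicolon-separated input with a decimal comma.
src = io.StringIO("name;price\nwidget;1,5\n")

# DictReader forwards extra keyword arguments to the underlying reader.
records = list(csv.DictReader(src, delimiter=';'))

# Writing works the same way; newline='' keeps the writer's line
# terminator untouched.
out = io.StringIO(newline='')
writer = csv.writer(out, delimiter=';')
writer.writerow(['widget', '1,5'])
written = out.getvalue()

print(records)
print(repr(written))
```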

I also think the docs around dialects are confusing. I remember being confused by them when I was learning! I made quite a few changes to try to add clarity around dialects to the documentation. Let me know if anybody has feedback!
msg399137 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-08-06 20:05
New changeset 0ffdced3b64ba5886fcde64266a31a15712da284 by Jack DeVries in branch 'main':
bpo-27752: improve documentation of csv.Dialect (GH-26795)
https://github.com/python/cpython/commit/0ffdced3b64ba5886fcde64266a31a15712da284
msg399141 - (view) Author: miss-islington (miss-islington) Date: 2021-08-06 20:31
New changeset 2fd1f21db46b165cf603cf4524b4d14ab41ed1cc by Miss Islington (bot) in branch '3.10':
bpo-27752: improve documentation of csv.Dialect (GH-26795)
https://github.com/python/cpython/commit/2fd1f21db46b165cf603cf4524b4d14ab41ed1cc
msg399142 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-08-06 20:33
New changeset 62bce24e32a9c754a23e758a32a7e0ca49602fc5 by Miss Islington (bot) in branch '3.9':
bpo-27752: improve documentation of csv.Dialect (GH-26795) (GH-27644)
https://github.com/python/cpython/commit/62bce24e32a9c754a23e758a32a7e0ca49602fc5
msg399143 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-08-06 20:34
Thanks for the patch, Jack! ✨ 🍰 ✨
History
Date User Action Args
2021-08-06 20:34:41 lukasz.langa set status: open -> closed
resolution: fixed
messages: + msg399143

stage: patch review -> resolved
2021-08-06 20:33:29 lukasz.langa set messages: + msg399142
2021-08-06 20:31:58 miss-islington set messages: + msg399141
2021-08-06 20:05:27 miss-islington set pull_requests: + pull_request26138
2021-08-06 20:05:23 lukasz.langa set nosy: + lukasz.langa
messages: + msg399137
2021-08-06 20:05:23 miss-islington set nosy: + miss-islington
pull_requests: + pull_request26137
2021-06-18 22:13:19 jack__d set messages: + msg396103
2021-06-18 22:04:11 jack__d set keywords: + patch
nosy: + jack__d

pull_requests: + pull_request25378
stage: patch review
2021-06-18 16:00:25 iritkatriel set nosy: + docs@python
title: CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator. -> [doc] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator.
assignee: docs@python
versions: + Python 3.11, - Python 2.7
components: + Documentation
2016-08-13 07:16:04 ztane set nosy: + ztane
messages: + msg272580
2016-08-13 06:49:56 lockywolf create