[doc] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator. #71939

lockywolf · 2016-08-13T06:49:56Z

BPO	27752
Nosy	@ambv, @ztane, @lockywolf, @miss-islington, @jdevries3133
PRs	bpo-27752: improve documentation of csv.Dialect #26795; [3.10] bpo-27752: improve documentation of csv.Dialect (GH-26795) #27643; [3.9] bpo-27752: improve documentation of csv.Dialect (GH-26795) #27644

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2021-08-06.20:34:41.107>
created_at = <Date 2016-08-13.06:49:56.128>
labels = ['3.11', 'type-bug', 'library', 'docs']
title = "[doc] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator."
updated_at = <Date 2021-08-06.20:34:41.106>
user = 'https://github.com/lockywolf'

bugs.python.org fields:

activity = <Date 2021-08-06.20:34:41.106>
actor = 'lukasz.langa'
assignee = 'docs@python'
closed = True
closed_date = <Date 2021-08-06.20:34:41.107>
closer = 'lukasz.langa'
components = ['Documentation', 'Library (Lib)']
creation = <Date 2016-08-13.06:49:56.128>
creator = 'lockywolf'
dependencies = []
files = []
hgrepos = []
issue_num = 27752
keywords = ['patch']
message_count = 7.0
messages = ['272579', '272580', '396103', '399137', '399141', '399142', '399143']
nosy_count = 5.0
nosy_names = ['docs@python', 'lukasz.langa', 'ztane', 'lockywolf', 'miss-islington', 'jack__d']
pr_nums = ['26795', '27643', '27644']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue27752'
versions = ['Python 3.11']

lockywolf · 2016-08-13T06:49:56Z

Hello, everyone.

I want to report a minor usability issue:

I wanted to use the csv module to load CSV files, and the documentation says that the default dialect for reading them is 'excel'.

However, the delimiter used with this dialect in Python is a comma (','), whereas in fact (even though it's called _comma_ separated values) MS Excel (2016) uses a semicolon (';') as a delimiter.
Therefore, Python's 'excel' dialect doesn't actually read Excel-generated files.

ztane · 2016-08-13T07:16:04Z

Excel's behaviour has always been locale-dependent. If the user's locale uses , as the decimal mark, then ; is used as the column separator in "C"SV. However, even autodetection with sniff cannot be 100% accurate. For example, is the following CSV row comma- or semicolon-separated:

1,2;3;4,5;6,7;8;9
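As a rough illustration of the ambiguity, the row above can be parsed with either delimiter and both results are well-formed. This is only a sketch; the variable names are made up for the example.

```python
import csv

# The ambiguous sample row from the comment above.
row = "1,2;3;4,5;6,7;8;9"

# Parse the same text twice, once per candidate delimiter.
as_comma = next(csv.reader([row], delimiter=","))
as_semicolon = next(csv.reader([row], delimiter=";"))

print(as_comma)      # 4 fields, each possibly containing ';'
print(as_semicolon)  # 6 fields, some looking like decimal numbers such as '1,2'
```

Neither parse raises an error, so nothing in the row itself says which reading was intended.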

The dialect could be documented better though, as currently it simply says:

The excel class defines the usual properties of an Excel-generated CSV file. It is registered with the dialect name 'excel'.

And there really should be a separate dialect for Excel-semicolon separated values, as a couple billion people would see ; in their CSV.
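Until such a dialect exists, something similar can be registered by hand. The sketch below is an assumption about what it might look like: the class name ExcelSemicolon and the registered name 'excel-semicolon' are hypothetical and are not part of the csv module.

```python
import csv
import io


class ExcelSemicolon(csv.excel):
    """Hypothetical dialect: same as csv.excel, but with ';' as the delimiter."""
    delimiter = ";"


# Register under a made-up name so it can be selected via dialect="...".
csv.register_dialect("excel-semicolon", ExcelSemicolon)

# Sample data as a locale with ',' as the decimal mark might produce it.
data = io.StringIO("name;price\r\nwidget;1,50\r\n", newline="")
rows = list(csv.DictReader(data, dialect="excel-semicolon"))
print(rows)
```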

jdevries3133 · 2021-06-18T22:13:19Z

If you need semicolon delimiters, can't you just pass delimiter=';' to the reader or writer? I don't think there's a need for a separate dialect class for that, since dialect classes should only provide a baseline for the most broad use cases. Users have plenty of options for extending or customizing behavior without adding more dialect classes.
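For reference, a minimal sketch of that approach, keeping the default 'excel' dialect and overriding only the delimiter. The sample data is invented for the example.

```python
import csv
import io

# Reading: the default 'excel' dialect with only the delimiter overridden.
sample = io.StringIO("a;b;c\r\n1,5;2;3\r\n", newline="")
rows = list(csv.reader(sample, delimiter=";"))
print(rows)

# Writing: the same keyword argument works for csv.writer.
out = io.StringIO(newline="")
writer = csv.writer(out, delimiter=";")
writer.writerow(["1,5", "2", "3"])
print(repr(out.getvalue()))
```

The same delimiter keyword is also accepted by csv.DictReader and csv.DictWriter, which forward it to the underlying reader and writer.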

I also think the docs around dialects are confusing. I remember being confused by them when I was learning! I made quite a few changes to try to add clarity around dialects to the documentation. Let me know if anybody has feedback!

ambv · 2021-08-06T20:05:23Z

New changeset 0ffdced by Jack DeVries in branch 'main':
bpo-27752: improve documentation of csv.Dialect (GH-26795)
0ffdced

miss-islington · 2021-08-06T20:31:58Z

New changeset 2fd1f21 by Miss Islington (bot) in branch '3.10':
bpo-27752: improve documentation of csv.Dialect (GH-26795)
2fd1f21

ambv · 2021-08-06T20:33:30Z

New changeset 62bce24 by Miss Islington (bot) in branch '3.9':
bpo-27752: improve documentation of csv.Dialect (GH-26795) (GH-27644)
62bce24

ambv · 2021-08-06T20:34:41Z

Thanks for the patch, Jack! ✨ 🍰 ✨

lockywolf (mannequin) added the stdlib (Python modules in the Lib dir) and type-bug (An unexpected behavior, bug, or error) labels Aug 13, 2016

iritkatriel added the 3.11 (only security fixes) and docs (Documentation in the Doc dir) labels Jun 18, 2021

iritkatriel changed the title ~~CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator.~~ [doc] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator. Jun 18, 2021

iritkatriel assigned docspython Jun 18, 2021

ambv closed this as completed Aug 6, 2021

ezio-melotti transferred this issue from another repository Apr 10, 2022
