classification
Title: csv.writer converts None to '""\n' when it is first line, otherwise '\n'
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: licht-t, nitishch, r.david.murray, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-12-08 14:43 by licht-t, last changed 2017-12-12 10:56 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 4769 merged licht-t, 2017-12-09 15:18
PR 4810 merged python-dev, 2017-12-12 09:57
Messages (12)
msg307851 - (view) Author: Licht Takeuchi (licht-t) * Date: 2017-12-08 14:43
Inconsistent behavior while writing a single-column CSV.
I have a patch and am waiting for the CLA response.

# Case 1
## Input
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp)
w.writerow([''])
w.writerow(['1'])
fp.close()
```
## Output
```
""
1
```

# Case 2
## Input
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp)
w.writerow(['1'])
w.writerow([''])
fp.close()
```
## Output
```
1

```
msg307939 - (view) Author: Nitish (nitishch) * Date: 2017-12-10 03:18
Which scenario do you think is the wrong behaviour in this case? The first one or the second one?

I don't know much about the csv module, but I thought it was a deliberate choice to quote all empty lines, and hence considered the second scenario buggy. But your pull request seems to fix the first case. Am I missing something here?
msg307940 - (view) Author: Licht Takeuchi (licht-t) * Date: 2017-12-10 05:06
I think the first one is buggy, for two reasons.

1. Both are valid CSV, so the double quoting is unnecessary. Some other applications, e.g. Excel, do not use the double quoting.
Also, the current implementation quotes only if the string is '' and the output is on the first line.

2. '' is not quoted in the two-column case.
## Input:
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=None)
w.writerow(['', ''])
w.writerow(['3', 'a'])
fp.close()
```
## Output:
```
,
3,a
```

These seem inconsistent and the quoting is unnecessary in this case.

# References
http://www.ietf.org/rfc/rfc4180.txt
msg307941 - (view) Author: Licht Takeuchi (licht-t) * Date: 2017-12-10 05:15
The current implementation does not quote in most cases. In other words, a patch that quotes every '' would be the breaking change (note that some applications do not use quoting).
msg307984 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-12-10 20:29
The second case is indeed the bug, as can be seen by running the examples against python2.7.  It looks like this was probably broken by 7901b48a1f89 from issue 23171.
msg307986 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-12-10 20:31
Serhiy, since it was your patch that probably introduced this bug, can you take a look?  Obviously it isn't a very high priority bug, since no one has reported a problem (even this issue isn't reporting the change in behavior as a *problem* :)
msg307997 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-10 22:25
To restore the 3.4 behavior, the single empty field must be quoted. This makes it possible to distinguish a 1-element row with a single empty field from an empty row.
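The distinction described above can be sketched as a small round trip. This is only an illustration, assuming an interpreter that already includes the fix for this issue; the helper name `write_rows` is made up for the example, and an in-memory `io.StringIO` buffer stands in for the `test.csv` file used in the report.

```python
import csv
import io


def write_rows(rows):
    """Write rows with the default csv dialect and return the raw text.

    Hypothetical helper for this example; not part of the csv module.
    """
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


# A row holding one empty field, followed by a row with no fields at all.
text = write_rows([[''], []])
print(repr(text))

# Read the text back. The two rows stay distinguishable only if the
# single empty field was written with quotes.
parsed = list(csv.reader(io.StringIO(text, newline='')))
print(parsed)
```

If the single empty field were written unquoted, both rows would be serialized as a bare line terminator and the reader could not tell them apart.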
msg308009 - (view) Author: Licht Takeuchi (licht-t) * Date: 2017-12-11 00:20
Thanks for your investigation!
Would you mind if I create a new patch?
msg308050 - (view) Author: Licht Takeuchi (licht-t) * Date: 2017-12-11 15:05
The PR is now updated to follow the Python 2.7 behavior!
msg308102 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-12 09:57
New changeset 2001900b0c02a397d8cf1d776a7cc7fcb2a463e3 by Serhiy Storchaka (Licht Takeuchi) in branch 'master':
bpo-32255: Always quote a single empty field when write into a CSV file. (#4769)
https://github.com/python/cpython/commit/2001900b0c02a397d8cf1d776a7cc7fcb2a463e3
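As a rough check of what this changeset is meant to guarantee, the cases from the original report can be replayed against an in-memory buffer. This is a sketch, assuming an interpreter that includes the changeset; `dump` is a hypothetical helper, and `io.StringIO` replaces the on-disk `test.csv` so the line terminators are visible in the `repr`.

```python
import csv
import io


def dump(rows):
    """Serialize rows one at a time with csv.writer and return the text.

    Hypothetical helper for this example; not part of the csv module.
    """
    buf = io.StringIO(newline='')
    writer = csv.writer(buf)
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


# Case 1 and Case 2 from the report: the same two rows in both orders.
case1 = dump([[''], ['1']])
case2 = dump([['1'], ['']])

# The two-column case: empty fields in a multi-field row.
two_col = dump([['', ''], ['3', 'a']])

for label, text in [('case 1', case1), ('case 2', case2), ('two columns', two_col)]:
    print(label, repr(text))
```

With the fix in place, the single empty field should come out quoted regardless of which row it appears on, while empty fields in a multi-column row stay unquoted because the delimiter already makes them unambiguous.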
msg308103 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-12 09:58
Thank you for your contribution Licht!
msg308109 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-12 10:56
New changeset ce5a3cd9b15c9379753aefabd696bff11495cbbb by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6':
bpo-32255: Always quote a single empty field when write into a CSV file. (GH-4769) (#4810)
https://github.com/python/cpython/commit/ce5a3cd9b15c9379753aefabd696bff11495cbbb
History
Date User Action Args
2017-12-12 10:56:58 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2017-12-12 10:56:43 serhiy.storchaka set messages: + msg308109
2017-12-12 09:58:03 serhiy.storchaka set messages: + msg308103
2017-12-12 09:57:18 python-dev set stage: needs patch -> patch review; pull_requests: + pull_request4705
2017-12-12 09:57:09 serhiy.storchaka set messages: + msg308102
2017-12-11 15:05:12 licht-t set messages: + msg308050
2017-12-11 00:20:24 licht-t set messages: + msg308009
2017-12-10 22:25:42 serhiy.storchaka set messages: + msg307997
2017-12-10 20:31:03 r.david.murray set nosy: + serhiy.storchaka; messages: + msg307986
2017-12-10 20:29:10 r.david.murray set versions: - Python 2.7, Python 3.4, Python 3.5, Python 3.8; nosy: + r.david.murray; messages: + msg307984; components: + Library (Lib), - IO; stage: patch review -> needs patch
2017-12-10 05:15:33 licht-t set messages: + msg307941
2017-12-10 05:06:24 licht-t set messages: + msg307940
2017-12-10 03:18:20 nitishch set messages: + msg307939
2017-12-09 15:18:18 licht-t set keywords: + patch; stage: patch review; pull_requests: + pull_request4672
2017-12-08 18:50:14 nitishch set nosy: + nitishch
2017-12-08 14:43:54 licht-t create