classification
Title: csv writer doesn't escape escapechar
Type: behavior Stage: patch review
Components: Extension Modules Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: catalin.iacob, ebreck, eric.araujo, skip.montanaro
Priority: normal Keywords: needs review, patch

Created on 2011-05-25 18:27 by ebreck, last changed 2011-07-07 23:31 by eric.araujo.

Files
File name Uploaded Description
pybug.zip ebreck, 2011-05-25 18:27
0eb420ce6567.diff catalin.iacob, 2011-07-06 21:21 review
Repositories containing patches
http://bitbucket.org/cataliniacob/cpython#issue12178
Messages (4)
msg136881 - Author: Eric Breck (ebreck) Date: 2011-05-25 18:27
Consider the two attached files.  A reader and a writer with the same dialect parameters (escapechar \, quotechar ", doublequote False) read, then write, a CSV cell that looks like "C\\".  It is written back as "C\".  The problem is that when doublequote=False, the escapechar isn't used to escape itself, so the writer produces output that the same dialect would parse differently (\" isn't a \ followed by the end of the string, it's an escaped quotechar within the string).

Execute python err.py first.csv to see.
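
A minimal sketch of the round trip described above, for readers without the attachment. It is not the attached err.py: the helper names and the choice of quoting=csv.QUOTE_ALL are assumptions made for illustration, and only the escapechar, quotechar and doublequote settings come from the report. What the writer emits depends on the Python version in use, so the sketch prints the result rather than stating it.

```python
import csv
import io

# escapechar, quotechar and doublequote come from the report;
# QUOTE_ALL is an assumption so that the cell is written inside quotes.
DIALECT = dict(escapechar="\\", quotechar='"', doublequote=False,
               quoting=csv.QUOTE_ALL)


def read_rows(text):
    """Parse CSV text with the dialect parameters above."""
    return list(csv.reader(io.StringIO(text, newline=""), **DIALECT))


def write_rows(rows):
    """Serialize rows with the same dialect parameters."""
    buf = io.StringIO(newline="")
    csv.writer(buf, **DIALECT).writerows(rows)
    return buf.getvalue()


# The cell "C\\" from the report: C followed by an escaped backslash.
original_text = '"C\\\\"\r\n'
rows = read_rows(original_text)      # one cell: C followed by one backslash
rewritten = write_rows(rows)

print("parsed cell:  ", repr(rows[0][0]))
print("original text:", repr(original_text))
print("rewritten as: ", repr(rewritten))
print("round trip preserved the text:", rewritten == original_text)
```

If the last line prints False, the interpreter in use shows the behavior reported here.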
msg136973 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-05-26 14:59
Thanks for the report.  It would be best if you could attach files as plain text instead of archives.
msg139953 - Author: Catalin Iacob (catalin.iacob) Date: 2011-07-06 21:20
I looked at this and tried to provide a patch + tests. Please review.

The bug is that a writer can call writerow on some input data, but when a reader with the same dialect reads the output back, the result differs from the input. This happens when the input data contains escapechar. Contrary to msg136881, this happens regardless of whether doublequote is True or False.

The docs say "On reading, the escapechar removes any special meaning from the following character". Therefore, I understand that on writing, escapechar must always be escaped by itself. If that doesn't happen, then on reading the data back, escapechar alters the character that follows it instead of being read as a literal escapechar, which is precisely what this bug is about.
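
To make the rule concrete, here is a small pure-Python sketch of it. This is not the C patch: escape_field and round_trip are hypothetical helpers, and always quoting every field is an assumption made to keep the example short. Each field is escaped by hand (escapechar first, then quotechar), and csv.reader with the same escapechar is used to check that the original data comes back.

```python
import csv

ESCAPECHAR = "\\"
QUOTECHAR = '"'


def escape_field(field):
    """Hypothetical helper: quote a field, escaping escapechar by itself."""
    # Escape the escapechar first, so the escapechars added for quotechar
    # in the next step are not doubled again.
    escaped = field.replace(ESCAPECHAR, ESCAPECHAR * 2)
    escaped = escaped.replace(QUOTECHAR, ESCAPECHAR + QUOTECHAR)
    return QUOTECHAR + escaped + QUOTECHAR


def round_trip(fields):
    """Hand-write one CSV line, then parse it with csv.reader."""
    line = ",".join(escape_field(f) for f in fields) + "\r\n"
    reader = csv.reader([line], escapechar=ESCAPECHAR,
                        quotechar=QUOTECHAR, doublequote=False)
    return next(reader)


sample = ["C\\", 'a"b', 'x\\"y', "plain, with comma"]
print(round_trip(sample) == sample)
```

The order of the two replacements matters: escaping quotechar first would add escapechars that the second step then doubles, and the reader would no longer recover the input.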
msg140002 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-07 23:31
Thanks for the patch.  The tests look good at first glance.  I can’t comment on the C code, as I don’t know C.  Hopefully someone will review it; otherwise, if you don’t get feedback in, say, four weeks, you can ask for a review on python-dev.
History
Date User Action Args
2011-07-07 23:31:42 eric.araujo set keywords: + needs review
stage: patch review
messages: + msg140002
versions: - Python 3.1
2011-07-06 21:21:25 catalin.iacob set files: + 0eb420ce6567.diff
keywords: + patch
2011-07-06 21:20:31 catalin.iacob set nosy: + catalin.iacob
messages: + msg139953
hgrepos: + hgrepo38
2011-05-26 14:59:20 eric.araujo set versions: + Python 3.1, Python 3.2, Python 3.3, - Python 2.6
nosy: + eric.araujo, skip.montanaro
messages: + msg136973
components: + Extension Modules, - None
2011-05-25 18:27:10 ebreck create