classification
Title: csv doesn't handle escaped characters properly
Type: behavior Stage:
Components: Extension Modules Versions: Python 2.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: est_python_tracker, vdupras
Priority: normal Keywords:

Created on 2009-11-18 19:11 by est_python_tracker, last changed 2009-11-20 21:52 by terry.reedy. This issue is now closed.

Messages (2)
msg95441 - Author: Eric Torstenson (est_python_tracker) Date: 2009-11-18 19:11
When I use csv with a delimiter and a quote character, an escaped quote
character inside a field causes the next field to become part of the current one:

file = csv.reader(open(filename), delimiter='\t', quotechar="'")
for words in file:
    print words[0-8]

If, say, line 3 contains:
'1709'	'PF01322'	'Cytochrom_C_2'	'Cytochrome_C_2; '	'Cytochrome C\''	'Finn RD, Bateman A'	'anon'	'Sarah Teichmann'

Column 4 will be printed as:
Cytochrome C\'\tFinn RD, Bateman A'

I've checked this with a spreadsheet application, and it opened this
line just fine, but when I used csv to parse, I had to remove that
escaped single quote to get my columns to work out properly for that line.
msg95549 - Author: Virgil Dupras (vdupras) (Python triager) Date: 2009-11-20 14:37
You have to tell the reader how to handle escaping. In your case, you
should pass escapechar="\\" as a keyword argument to csv.reader().
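A minimal sketch of that suggestion, for anyone landing on this issue later. It is written for Python 3 (the original report used 2.6), reads the sample row from an in-memory string rather than a file, and assumes the wrapped sample line above was a single tab-separated row. The helper name parse is hypothetical, introduced only for this illustration.

```python
import csv
import io

# The sample row from the report: tab-delimited, single-quoted fields,
# with a backslash-escaped quote inside the fifth field.
# (Assumption: the line breaks in the report were wrapping, not data.)
line = (
    "'1709'\t'PF01322'\t'Cytochrom_C_2'\t'Cytochrome_C_2; '\t"
    "'Cytochrome C\\''\t'Finn RD, Bateman A'\t'anon'\t'Sarah Teichmann'\n"
)

def parse(text, **kwargs):
    """Parse one row with the reporter's dialect plus any extra options."""
    reader = csv.reader(io.StringIO(text), delimiter='\t', quotechar="'", **kwargs)
    return next(reader)

# Without escapechar the backslash is ordinary data, so \'' is read as a
# doubled quote and the following field is absorbed into this one.
without_escape = parse(line)

# With escapechar the backslash escapes the quote and the fields line up.
with_escape = parse(line, escapechar='\\')

print(len(without_escape), repr(without_escape[4]))
print(len(with_escape), repr(with_escape[4]))
```

With escapechar set, the row splits into eight fields and the fifth one ends in a literal single quote; without it, the fifth and sixth fields are merged as described in the report.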
History
Date                 User                Action  Args
2009-11-20 21:52:31  terry.reedy         set     status: open -> closed; resolution: not a bug
2009-11-20 14:37:12  vdupras             set     nosy: + vdupras; messages: + msg95549
2009-11-18 19:11:54  est_python_tracker  create