classification
Title: csv.Sniffer.sniff on data with doublequotes doesn't set up the dialect properly
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 2.4, Python 2.6
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: skip.montanaro Nosy List: jtate, r.david.murray, skip.montanaro, twb
Priority: normal Keywords: patch

Created on 2009-07-30 20:24 by jtate, last changed 2009-09-28 14:19 by jtate. This issue is now closed.

Files
File name Uploaded Description
foo.py jtate, 2009-07-30 20:24 Test case showing broken dialect detection
test_csv.py.diff twb, 2009-07-31 04:16 Tests csv doublequote sniffing
csv.py.diff twb, 2009-07-31 04:22 Adds support for sniffing doublequote property
Messages (14)
msg91109 - (view) Author: Joseph Tate (jtate) Date: 2009-07-30 20:24
Given the attached code, the Sniffer.sniff routine does not set the
doublequote property. This results in errors during reader operations.
If the doublequote property is set in the dialect, the data is read
properly.

The data was created using oocalc, which was forced to use ASCII quotes
rather than u'\u201c\u201d'.
msg91110 - (view) Author: Joseph Tate (jtate) Date: 2009-07-30 20:25
Note that no exceptions are raised; the reader just returns improperly
parsed records.
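The silent mis-parse described above can be reproduced without the sniffer by passing doublequote to the reader directly. This is a minimal sketch in Python 3 (the issue itself was filed against Python 2.x). The attached foo.py is not reproduced here, so the sample row and the helper name parse are made up for illustration.

```python
import csv
import io

# Made-up sample: the first field contains doubled quotes and a comma.
SAMPLE = '"He said ""hi, there""",x\r\n'


def parse(doublequote):
    """Parse SAMPLE with an explicit doublequote setting (hypothetical helper)."""
    reader = csv.reader(io.StringIO(SAMPLE, newline=""), doublequote=doublequote)
    return list(reader)


good = parse(True)   # doubled quotes collapse to one literal quote
bad = parse(False)   # no exception is raised, but the record is split wrongly

print(good)
print(bad)
```

With doublequote=True the row comes back as two fields. With doublequote=False the reader treats the first inner quote as the end of the quoted part, so the comma inside the quotes is taken as a delimiter and the fields come out wrong, all without an error.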
msg91122 - (view) Author: Thomas W. Barr (twb) Date: 2009-07-31 03:35
The Sniffer.sniff routine doesn't set the doublequote property at all
right now. I'm working on a patch to see if I can add this functionality.
msg91123 - (view) Author: Thomas W. Barr (twb) Date: 2009-07-31 04:16
Test for this issue.
msg91124 - (view) Author: Thomas W. Barr (twb) Date: 2009-07-31 04:17
Patch for the issue. It looks for an extra quote character inside a
quoted field, between the delimiters.
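The heuristic described in this message can be sketched roughly as follows. This is not the code from csv.py.diff, which is not shown in this thread. The function name looks_doublequoted, its parameters, and the regular expression are assumptions written for illustration, and the delimiter and quote character are assumed to be known already.

```python
import re


def looks_doublequoted(sample, delimiter=",", quotechar='"'):
    """Guess whether quoted fields in sample use doubled quotes.

    Hypothetical helper: a field bounded by delimiters (or line edges)
    that holds at least three quote characters must contain a quote
    beyond the opening and closing pair.
    """
    d = re.escape(delimiter)
    q = re.escape(quotechar)
    pattern = re.compile(
        rf"(?:^|{d})[ \t]*{q}[^{d}\n]*{q}[^{d}\n]*{q}[ \t]*(?:{d}|$)",
        re.MULTILINE,
    )
    return pattern.search(sample) is not None


# Made-up samples for illustration.
DOUBLED = '"a ""quoted"" word",b\n"c","d"\n'
PLAIN = '"a","b"\n"c","d"\n'

print(looks_doublequoted(DOUBLED))
print(looks_doublequoted(PLAIN))
```

The sketch is deliberately crude: it does not look past a delimiter that appears inside a quoted field, so such rows can go undetected.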
msg91125 - (view) Author: Thomas W. Barr (twb) Date: 2009-07-31 04:19
The documentation doesn't actually say what parameters are sniffed, so
technically, that doesn't need to be changed. Should this be added?
msg91126 - (view) Author: Thomas W. Barr (twb) Date: 2009-07-31 04:22
Reformatted line in patch.
msg91143 - (view) Author: Thomas W. Barr (twb) Date: 2009-07-31 19:26
Patch uploaded to rietveld: http://codereview.appspot.com/96202/show
msg93113 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2009-09-25 16:41
Thanks. I don't know how to use Rietveld. What am I supposed to
do with that?

S
msg93184 - (view) Author: Thomas W. Barr (twb) Date: 2009-09-27 20:37
I'm not actually sure where we go from here. This is my first attempted
patch to this project, and I was hoping that someone else would be more
knowledgeable about the process ;-)
msg93187 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-09-28 01:23
Thomas, is the patch you uploaded to rietveld the same as the patches
attached to the ticket?  If so, Skip can just ignore the rietveld and
work from the patch files. Rietveld is really more useful for reviews of
longer patches than this one; for patches this size we generally just
use the tracker.
msg93188 - (view) Author: Thomas W. Barr (twb) Date: 2009-09-28 02:08
Got it. Yes, they're the same patch.
msg93189 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2009-09-28 02:13
Applied to trunk as rev 75102.
msg93210 - (view) Author: Joseph Tate (jtate) Date: 2009-09-28 14:19
Thank you, Thomas, for the patch, and Skip, for applying it.
History
Date User Action Args
2009-09-28 14:19:19 jtate set messages: + msg93210
2009-09-28 02:13:18 skip.montanaro set status: open -> closed; resolution: accepted; messages: + msg93189
2009-09-28 02:08:03 twb set messages: + msg93188
2009-09-28 01:23:46 r.david.murray set priority: normal; nosy: + r.david.murray; messages: + msg93187
2009-09-27 20:37:05 twb set messages: + msg93184
2009-09-25 16:41:08 skip.montanaro set messages: + msg93113
2009-09-25 01:20:18 rhettinger set assignee: skip.montanaro; nosy: + skip.montanaro
2009-07-31 19:26:50 twb set messages: + msg91143
2009-07-31 04:22:31 twb set files: + csv.py.diff; messages: + msg91126
2009-07-31 04:20:49 twb set files: - csv.py.diff
2009-07-31 04:19:41 twb set messages: + msg91125
2009-07-31 04:17:23 twb set files: + csv.py.diff; messages: + msg91124
2009-07-31 04:16:15 twb set files: + test_csv.py.diff; keywords: + patch; messages: + msg91123
2009-07-31 03:35:30 twb set type: behavior -> enhancement; messages: + msg91122; nosy: + twb
2009-07-30 20:25:39 jtate set messages: + msg91110
2009-07-30 20:24:51 jtate create