classification
Title: CSV parser fails to iterate properly on 2.6.6
Type:
Stage: resolved
Components: None
Versions: Python 2.6

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: r.david.murray, sleepycal, tim.golden
Priority: normal
Keywords:

Created on 2012-07-20 15:43 by sleepycal, last changed 2012-07-20 17:32 by sleepycal. This issue is now closed.

Messages (7)
msg165938 - Author: Cal Leeming (sleepycal) Date: 2012-07-20 15:43
Getting some extremely strange behavior when attempting to parse a fairly standard CSV in Python 2.6.6.

I've tried many different combinations of dialects, quoting options, line terminators, etc., and none of them produce correct output.

Spent about 2 hours banging my head against a brick wall on this, and struggling to see how the CSV libs could be so fundamentally broken, given that I couldn't find any other related bugs.

I have attempted to parse the following CSV data:

"First","Middle","Last","Nickname","Email","Category"
"Moe","","Howard","Moe","moe@3stooges.com","actor"
"Jerome","Lester","Howard","Curly","curly@3stooges.com","actor"
"Larry","","Fine","Larry","larry@3stooges.com","musician"
"Jerome","","Besser","Joe","joe@3stooges.com","actor"
"Joe","","DeRita","CurlyJoe","curlyjoe@3stooges.com","actor"
"Shemp","","Howard","Shemp","shemp@3stooges.com","actor"

The code used to parse was this:

import csv

datx = open("data.txt", "rb").read()
rows = csv.reader(datx, dialect="wat")
for row in rows:
    print row

The output given is this:

['First']
['', '']
['Middle']
['', '']
['Last']
['', '']
['Nickname']
['', '']
['Email']
['', '']
['Category']
[]
['Moe']
['', '']
['']
['', '']
['Howard']
['', '']
['Moe']
['', '']
['moe@3stooges.com']
['', '']
['actor']
[]
['Jerome']
['', '']
['Lester']
['', '']
['Howard']
['', '']
['Curly']
['', '']
['curly@3stooges.com']
['', '']
['actor']
[]
['Larry']
['', '']
['']
['', '']
['Fine']
['', '']
['Larry']
['', '']
['larry@3stooges.com']
['', '']
['musician']
[]
['Jerome']
['', '']
['']
['', '']
['Besser']
['', '']
['Joe']
['', '']
['joe@3stooges.com']
['', '']
['actor']
[]
['Joe']
['', '']
['']
['', '']
['DeRita']
['', '']
['CurlyJoe']
['', '']
['curlyjoe@3stooges.com']
['', '']
['actor']
[]
['Shemp']
['', '']
['']
['', '']
['Howard']
['', '']
['Shemp']
['', '']
['shemp@3stooges.com']
['', '']
['actor']
[]
msg165939 - Author: Cal Leeming (sleepycal) Date: 2012-07-20 15:45
Sorry, I accidentally pasted the wrong code snippet previously. The correct code snippet is:

import csv

datx = open("data.txt", "rb").read()
rows = csv.reader(datx)
for row in rows:
    print row
msg165940 - Author: Cal Leeming (sleepycal) Date: 2012-07-20 15:46
This bug also seems to show up in 2.7.3.
msg165941 - Author: Cal Leeming (sleepycal) Date: 2012-07-20 15:53
Okay, just found the reason for this. It's because I was calling .read() on the file object, so csv.reader was being handed one big string instead of the file itself.

I really think that the CSVReader should raise an assertion in the event that it is passed an object which has no iterator, or if it is given a string, as this is a fairly easy mistake to make.
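
To make the cause concrete, here is a minimal sketch of the difference between handing csv.reader a string and handing it a file-like object. It is written for Python 3 rather than the 2.6 used in the report, and the in-memory `data` string and `io.StringIO` wrapper are stand-ins assumed here in place of the real data.txt.

```python
import csv
import io

# Hypothetical in-memory stand-in for the first two rows of data.txt.
data = '"First","Middle","Last"\n"Moe","","Howard"\n'

# Passing the whole string: csv.reader iterates over it one character
# at a time and treats every character as a separate "line".
rows_from_string = list(csv.reader(data))

# Passing a file-like object: iteration yields real lines.
rows_from_file = list(csv.reader(io.StringIO(data, newline="")))

# If the text is already in memory, splitting it into lines also works,
# as long as no quoted field contains an embedded newline.
rows_from_lines = list(csv.reader(data.splitlines()))

print(rows_from_string[:4])
print(rows_from_file)
print(rows_from_lines)
```

The string case reproduces the start of the fragmented output shown in the original report, while the other two give one list per CSV row.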
msg165944 - Author: Tim Golden (tim.golden) * (Python committer) Date: 2012-07-20 16:06
It already produces a TypeError with a specific message if the input is
not iterable. You seem to be using a homegrown dialect; with the
conventional list(csv.reader("the quick brown fox")) you very quickly
see that you're iterating over a string.
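
Both points can be checked with a short sketch (Python 3 assumed; the value 42 is just an arbitrary non-iterable chosen for illustration, and the exact TypeError message varies between Python versions, so it is not shown here).

```python
import csv

# A non-iterable argument is rejected up front with a TypeError.
rejected = False
try:
    csv.reader(42)
except TypeError:
    rejected = True

# A string is iterable, so it is accepted, but each character is
# parsed as its own one-field row.
rows = list(csv.reader("the quick brown fox"))

print(rejected)
print(rows[:5])
```

With the default dialect the per-character rows make the mistake obvious at a glance, which is harder to spot when quoted fields and delimiters are involved.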
msg165951 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-07-20 17:29
We don't generally do that kind of type checking.
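
Since the csv module does not reject strings itself, a caller who wants that guard can add it in their own code. The following is only a sketch under assumptions: `strict_reader` is a hypothetical helper, not part of the csv module, and the code targets Python 3.

```python
import csv
import io


def strict_reader(source, **kwargs):
    """Hypothetical wrapper around csv.reader that refuses a bare string."""
    if isinstance(source, (str, bytes)):
        raise TypeError(
            "expected an iterable of lines (e.g. an open file), "
            "got %s" % type(source).__name__
        )
    return csv.reader(source, **kwargs)


text = '"Moe","","Howard"\n'

# A file-like object passes straight through to csv.reader.
rows = list(strict_reader(io.StringIO(text, newline="")))

# A plain string is refused instead of being parsed character by character.
try:
    strict_reader(text)
    refused = False
except TypeError:
    refused = True

print(rows, refused)
```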
msg165952 - Author: Cal Leeming (sleepycal) Date: 2012-07-20 17:32
@david Gotcha - I had a feeling that would be the case. Thank you for the quick replies anyway guys! Hopefully this will help others in the future :)
History
Date                 User            Action  Args
2012-07-20 17:32:06  sleepycal       set     messages: + msg165952
2012-07-20 17:30:11  r.david.murray  set     status: open -> closed; nosy: + r.david.murray; messages: + msg165951; resolution: not a bug; stage: resolved
2012-07-20 16:07:00  tim.golden      set     nosy: + tim.golden; messages: + msg165944
2012-07-20 15:53:40  sleepycal       set     messages: + msg165941
2012-07-20 15:46:20  sleepycal       set     messages: + msg165940
2012-07-20 15:45:22  sleepycal       set     messages: + msg165939
2012-07-20 15:43:01  sleepycal       create