classification
Title: CSV Sniffer fails to report mismatch of column counts
Type: behavior
Stage: test needed
Components: Extension Modules
Versions: Python 2.6
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: andrewmcnamara
Nosy List: andrewmcnamara, skip.montanaro, tekkaman, vvrsalob
Priority: normal
Keywords:

Created on 2006-02-13 22:47 by vvrsalob, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg27509 - Author: Vinko (vvrsalob) Date: 2006-02-13 22:47
If one line of a CSV file is missing one or more
commas, the delimiter detection code of the Sniffer
class fails, setting delimiter to an empty string.

This leads to a totally misleading error when using
has_header(). 

This code shows the problem (Python 2.4.2, FC3 and
Ubuntu Breezy):

import csv

str1 = "a,b,c,d\r\n1,2,foo bar,dead beef\r\nthis,line,is,ok\r\n"
str2 = "a,b,c,d\r\n1,2,foo bar,dead beef\r\nthis,line is,not\r\n"

s = csv.Sniffer()

d1 = s.sniff(str1)
d2 = s.sniff(str2)

for line in str1.split('\r\n'):
    print line.count(',')

print d1.delimiter
print s.has_header(str1)

for line in str2.split('\r\n'):
    print line.count(',')

print d2.delimiter
print s.has_header(str2)
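
The per-line comma counts printed by the script above can be folded into a small pre-check that flags a ragged sample before it reaches the Sniffer. This is only a sketch written for Python 3, not part of the csv module: `mismatched_lines` is a hypothetical helper, and it assumes the candidate delimiter is already known and that no quoted field contains that delimiter.

```python
# Sample data from the report above (str2 has one comma missing on line 3).
str1 = "a,b,c,d\r\n1,2,foo bar,dead beef\r\nthis,line,is,ok\r\n"
str2 = "a,b,c,d\r\n1,2,foo bar,dead beef\r\nthis,line is,not\r\n"


def mismatched_lines(sample, delimiter=","):
    """Return (line_number, count) pairs for lines whose delimiter count
    differs from that of the first non-empty line.

    Hypothetical helper: a naive character count, so it does not
    understand quoting or escaping.
    """
    lines = [line for line in sample.splitlines() if line]
    if not lines:
        return []
    expected = lines[0].count(delimiter)
    return [
        (number, line.count(delimiter))
        for number, line in enumerate(lines, start=1)
        if line.count(delimiter) != expected
    ]


print(mismatched_lines(str1))  # no mismatches in the well-formed sample
print(mismatched_lines(str2))  # line 3 has 2 commas instead of 3
```

An empty result means every line agrees with the first one; anything else points at the lines worth inspecting before trusting a sniffed dialect.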
msg27510 - Author: Simone Leo (tekkaman) Date: 2007-04-11 15:59
Problem is still there as of Python 2.4.3.

Trying to read in a file whose lines have a different number of fields, I get:

Traceback (most recent call last):
  File "../myscript.py", line 59, in ?
    main()
  File "../myscript.py", line 30, in main
    reader = csv.reader(fin, dialect)
TypeError: bad argument type for built-in operation

where "dialect" has been sniffed by feeding the first two lines to the Sniffer.

What I expect is to either:
  1. get different sized rows, with no exception raised
  2. get a csv.Error instead of the TypeError above

Thanks
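
One way to get the first expected behaviour from calling code, without changing the Sniffer, is to treat a failed or implausible sniff as a signal to fall back to a known dialect. The following is a minimal sketch for Python 3 under stated assumptions: `sniff_or_default` and `read_rows` are hypothetical names, the candidate delimiters `",;\t"` and the `csv.excel` fallback are arbitrary choices, and only documented calls (`csv.Sniffer().sniff(sample, delimiters=...)`, `csv.reader`, `csv.Error`) are used.

```python
import csv
import io


def sniff_or_default(sample, candidates=",;\t", default=csv.excel):
    """Return a sniffed dialect, or `default` when sniffing is unreliable."""
    try:
        # Restrict the Sniffer to a known set of candidate delimiters.
        dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # The Sniffer could not settle on any candidate delimiter.
        return default
    if len(dialect.delimiter) != 1:
        # Guard against an empty delimiter like the one reported here.
        return default
    return dialect


def read_rows(text, sample_lines=2):
    """Sniff the first `sample_lines` lines, then parse the whole text."""
    sample = "".join(text.splitlines(True)[:sample_lines])
    dialect = sniff_or_default(sample)
    return list(csv.reader(io.StringIO(text, newline=""), dialect))


ragged = "a,b,c,d\n1,2,foo bar,dead beef\nthis,line is,not\n"
for row in read_rows(ragged):
    print(len(row), row)
```

With this approach rows of different lengths come back as lists of different sizes, and the caller decides whether a short row is an error.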
msg57101 - Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2007-11-04 16:19
This appears to work better in 2.5 and 2.6 (it doesn't crash, though it 
gets the delimiter wrong) but does indeed fail in 2.4.
msg84138 - Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2009-03-25 00:59
Closing as won't fix.  There are bound to be limits to how the Sniffer
class works.  I'm not sure it's worth the effort necessary to fix this
corner case.

(Andrew, reopen if you want to tackle this.)
History
Date                 User            Action  Args
2022-04-11 14:56:15  admin           set     github: 42898
2009-03-25 00:59:37  skip.montanaro  set     status: open -> closed; resolution: wont fix; messages: + msg84138
2009-03-25 00:52:55  skip.montanaro  set     messages: - msg84137
2009-03-25 00:52:46  skip.montanaro  set     messages: + msg84137
2009-03-20 21:46:53  ajaksu2         set     stage: test needed; type: behavior; versions: + Python 2.6, - Python 2.4
2007-11-04 16:19:49  skip.montanaro  set     nosy: + skip.montanaro; messages: + msg57101
2006-02-13 22:47:32  vvrsalob        create