
Author keef604
Recipients Mariatta, keef604, r.david.murray
Date 2017-04-11.01:11:58
Message-id <1491873119.94.0.245999605587.issue30034@psf.upfronthosting.co.za>
In-reply-to
Content
As you say, David, however much we would like the world to stick to a given CSV standard, the reality is that people don't, which is all the more reason for making the csv reader flexible and forgiving.

The csv module can and should be used for more than just "comma-separated-values" files.  I use it for all sorts of different delimited files, and it works very well.  Pandas uses it, as I'm sure do many other packages.  It's such a good module, it would be a pity to restrict its scope to just Excel-related scenarios.  Parsing delimited files is undoubtedly complex, and painfully slow if done with pure Python, so the more that can be done in C the better.
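To illustrate the point about non-comma files, here is a minimal sketch that reads pipe- and tab-delimited text with the standard csv reader. The helper name read_delimited and the sample data are made up for illustration; only csv.reader and its delimiter parameter come from the module itself.

```python
import csv
import io


def read_delimited(text, delimiter):
    """Parse delimited text into a list of rows with the stdlib csv reader."""
    # io.StringIO stands in for an open file; newline handling is left to csv.
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical sample data: the same reader handles different delimiters.
pipe_rows = read_delimited("id|name\n1|Ada\n", "|")
tab_rows = read_delimited("id\tname\n2\tGrace\n", "\t")

print(pipe_rows)
print(tab_rows)
```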

I'm no C programmer, but my guesstimate is that the coding changes I'm proposing are relatively modest.  In the IN_QUOTED_FIELD section (https://github.com/python/cpython/blob/master/Modules/_csv.c#L690), it would mean checking for newline characters if the new "multiline" attribute is False (and probably "strict" is False too).  Of course there is more to this change than just that, but I'm guessing not that much more.
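For context, the sketch below shows the reader's current behaviour, where a newline inside a quoted field is kept as part of the field, next to a rough pure-Python approximation of treating every physical line as one record. The function rows_one_per_line is hypothetical and is not the proposed C change; it simply feeds each line to a fresh reader and relies on the default strict=False, so it only hints at what a "multiline" switch might do.

```python
import csv
import io

# Sample input (made up): the quoted field "b\nc" spans two physical lines.
data = 'a,"b\nc",d\n'

# Current behaviour: the embedded newline stays inside the quoted field,
# so the two physical lines come back as a single record.
multiline_rows = list(csv.reader(io.StringIO(data)))


def rows_one_per_line(text):
    """Hypothetical approximation: parse each physical line as its own record."""
    rows = []
    for line in text.splitlines():
        # A fresh reader per line means an unclosed quote cannot swallow
        # the lines that follow it (assumes the default strict=False).
        rows.extend(csv.reader([line]))
    return rows


per_line_rows = rows_one_per_line(data)

print(multiline_rows)
print(per_line_rows)
```

Creating a reader per line is slow in pure Python, which is part of why the message argues for doing this inside the C module.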
History
Date                 User     Action  Args
2017-04-11 01:12:00  keef604  set     recipients: + keef604, r.david.murray, Mariatta
2017-04-11 01:11:59  keef604  set     messageid: <1491873119.94.0.245999605587.issue30034@psf.upfronthosting.co.za>
2017-04-11 01:11:59  keef604  link    issue30034 messages
2017-04-11 01:11:58  keef604  create