Message 248254 - Python tracker

Author	peter.otten
Recipients	Tiago Wright, peter.otten, r.david.murray, skip.montanaro
Date	2015-08-08.07:49:07
Message-id	<1439020148.66.0.277716626142.issue24787@psf.upfronthosting.co.za>

Content

Have you considered writing your own little sniffer? Getting it right for your actual data is usually easier to achieve than a general solution.

The following simplistic sniffer should work with your samples:

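import csv
import operator

def make_dialect(delimiter):
    # Subclass csv.excel so only the delimiter differs from the defaults.
    class Dialect(csv.excel):
        pass
    Dialect.delimiter = delimiter
    return Dialect

def sniff(sample):
    # Pick the candidate delimiter that occurs most often in the sample.
    count, delimiter = max(
        ((sample.count(delim), delim) for delim in ",\t|;"),
        key=operator.itemgetter(0))
    if count == 0:
        # None of the candidates occur; fall back to a space if there is one.
        if " " in sample:
            delimiter = " "
        else:
            raise csv.Error("Could not determine delimiter")
    return make_dialect(delimiter)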
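
For illustration, here is a minimal sketch of how the sniffer above might be plugged into csv.reader. The read_rows helper and the sample strings are made up for this example; they are not the files from this issue, and the sniffer is repeated only so the snippet runs on its own.

```python
import csv
import io
import operator


def make_dialect(delimiter):
    # Subclass csv.excel so only the delimiter differs from the defaults.
    class Dialect(csv.excel):
        pass
    Dialect.delimiter = delimiter
    return Dialect


def sniff(sample):
    # Same logic as the sniffer above: the most frequent candidate wins.
    count, delimiter = max(
        ((sample.count(delim), delim) for delim in ",\t|;"),
        key=operator.itemgetter(0))
    if count == 0:
        if " " in sample:
            delimiter = " "
        else:
            raise csv.Error("Could not determine delimiter")
    return make_dialect(delimiter)


def read_rows(text):
    # Hypothetical helper: sniff the whole text, then parse it with the result.
    dialect = sniff(text)
    return list(csv.reader(io.StringIO(text, newline=""), dialect))


# Made-up sample data, not the files from the issue.
pipe_rows = read_rows("a|b|c\r\n1|2|3\r\n")
space_rows = read_rows("a b c\r\n1 2 3\r\n")
```

Note that the space fallback only applies when none of the other candidates occur anywhere in the sample, so a space-separated file containing even a single comma would be sniffed as comma-delimited.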

Tiago, if you want to follow that path, we should take the discussion to the general Python mailing list.
