
Author peter.otten
Recipients Tiago Wright, peter.otten, r.david.murray, skip.montanaro
Date 2015-08-08.07:49:07
Message-id <1439020148.66.0.277716626142.issue24787@psf.upfronthosting.co.za>
In-reply-to
Content
Have you considered writing your own little sniffer? Getting it right for your actual data is usually easier to achieve than a general solution.

The following simplistic sniffer should work with your samples:

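import csv
import operator

def make_dialect(delimiter):
    # Derive a dialect from csv.excel that differs only in its delimiter.
    class Dialect(csv.excel):
        pass
    Dialect.delimiter = delimiter
    return Dialect

def sniff(sample):
    # Pick the candidate delimiter that occurs most often in the sample.
    count, delimiter = max(
        ((sample.count(delim), delim) for delim in ",\t|;"),
        key=operator.itemgetter(0))
    if count == 0:
        # None of the candidates occurs; fall back to a space if there is one.
        if " " in sample:
            delimiter = " "
        else:
            raise csv.Error("Could not determine delimiter")
    return make_dialect(delimiter)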
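
For illustration, here is a minimal sketch of how the resulting dialect could be passed to csv.reader. The sample string is made up for this example and is not one of the files from the issue; the two functions are repeated only so that the snippet runs on its own.

```python
import csv
import io
import operator


def make_dialect(delimiter):
    # Same helper as above: csv.excel with a different delimiter.
    class Dialect(csv.excel):
        pass
    Dialect.delimiter = delimiter
    return Dialect


def sniff(sample):
    # Same sniffer as above: the most frequent candidate wins.
    count, delimiter = max(
        ((sample.count(delim), delim) for delim in ",\t|;"),
        key=operator.itemgetter(0))
    if count == 0:
        if " " in sample:
            delimiter = " "
        else:
            raise csv.Error("Could not determine delimiter")
    return make_dialect(delimiter)


# Hypothetical data: pipe-delimited, with a stray comma inside one field.
data = "name|qty\nfoo, bar|3\nbaz|4\n"

dialect = sniff(data)
rows = list(csv.reader(io.StringIO(data), dialect=dialect))

for row in rows:
    print(row)
```

Because the pipe occurs more often than the comma in this sample, the comma is left inside its field. A sample in which the wrong character happens to be the most frequent one would still fool this approach, which is why it only makes sense for data you know.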

Tiago, if you want to follow that path, we should take the discussion to the general Python mailing list.
History
Date  User  Action  Args
2015-08-08 07:49:08  peter.otten  set  recipients: + peter.otten, skip.montanaro, r.david.murray, Tiago Wright
2015-08-08 07:49:08  peter.otten  set  messageid: <1439020148.66.0.277716626142.issue24787@psf.upfronthosting.co.za>
2015-08-08 07:49:08  peter.otten  link  issue24787 messages
2015-08-08 07:49:07  peter.otten  create