
Author skip.montanaro
Recipients amaury.forgeotdarc, jplaverdure, skip.montanaro, tds333
Date 2008-03-29.15:15:26
Message-id <18414.23817.630359.900652@montanaro-dyndns-org.local>
In-reply-to <1206799282.37.0.305030349564.issue2078@psf.upfronthosting.co.za>
Content
>> It works entirely based on character frequencies.

    Amaury> Does it make sense to restrict delimiters to a reasonable set of
    Amaury> characters? Usual punctuations, spaces, tabs... what else?

There is an optional delimiters argument to the sniff() method which
defaults to None.  I would be happier if it defaulted to "the usual
suspects" (NeoOffice refuses to guess, but offers TAB, space, semicolon and
comma as the default separators when importing a CSV file; Excel seems to
just figure it out).  That would change the behavior though.  With no
delimiter set, the sniffer is generally going to find something, even if it
sometimes picks incorrectly.  With a delimiter set that contains none of the
characters actually used in the file, it's going to raise an exception.  I'm
not sure this would be a good tradeoff, and it would just break existing
code.
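
The difference between the two modes can be seen in a small sketch.  The
helper name sniff_delimiter, the SAMPLE data and the USUAL_SUSPECTS string
are made up for illustration and are not part of the csv module; only
csv.Sniffer().sniff() and csv.Error come from the standard library.

```python
import csv

# Hypothetical sample: three rows separated by semicolons.
SAMPLE = "a;b;c\n1;2;3\n4;5;6\n"

# Illustrative "usual suspects" set: TAB, space, semicolon and comma.
USUAL_SUSPECTS = "\t ;,"


def sniff_delimiter(sample, delimiters=None):
    """Return the sniffed delimiter, or None if the sniffer gives up.

    This helper is an assumption made for the example; it only wraps
    csv.Sniffer().sniff() and turns csv.Error into a None result.
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters).delimiter
    except csv.Error:
        # Raised when no candidate delimiter can be determined, for
        # example when the restricted set does not occur in the sample.
        return None


if __name__ == "__main__":
    # No restriction: any character with a consistent frequency qualifies.
    print(repr(sniff_delimiter(SAMPLE)))
    # Restricted to a set that includes the real delimiter.
    print(repr(sniff_delimiter(SAMPLE, USUAL_SUSPECTS)))
    # Restricted to a set that does not occur in the sample at all.
    print(repr(sniff_delimiter(SAMPLE, ",\t")))
```

Code that relies on the unrestricted default would start seeing csv.Error
for files like this one if the default set were narrowed and happened not to
include the separator in use.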

Skip