Author Antoon.Pardon
Recipients Antoon.Pardon, GhislainHivon, dmi.baranov
Date 2014-03-25.09:30:29
I had a look at this and have the following remarks.

1) The file no longer works with Python 3.3. It now produces the following traceback:

Traceback (most recent call last):
  File "", line 36, in <module>
  File "", line 23, in create_file
TypeError: 'str' does not support the buffer interface

2) The problem seems to be in the _guess_quote_and_delimiter method. If you always call _guess_delimiter, the sniffer gives the correct result.

3) As far as I understand, the problem is the first regular expression:
(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?P=delim)

Now if we have a line like the following:

273:MVREGR1:ByEuPo:"Baryton ""Euphonium"" populaire"

The delim group will match the space after Baryton, the space group will match nothing, and the quote group will match ". The non-group pattern (.*?) will then match "Euphonium", followed by the quote group matching " again and the delim group matching the space before populaire.

And so we get the wrong delimiter.
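
To make this easy to check, here is a small sketch that runs the regular expression quoted above against the sample line and prints what each group captures. The re.DOTALL | re.MULTILINE flags are my assumption about how csv.py compiles the pattern; they make no difference for a single-line input like this one.

```python
import re

# The first regular expression, copied from the message above.
# Flags are assumed to mirror the ones used in csv.py.
pattern = re.compile(
    r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?P=delim)',
    re.DOTALL | re.MULTILINE,
)

line = '273:MVREGR1:ByEuPo:"Baryton ""Euphonium"" populaire"'

match = pattern.search(line)
delim = match.group('delim')    # character taken as the delimiter
space = match.group('space')    # optional space after the delimiter
quote = match.group('quote')    # quote character
matched_text = match.group(0)   # the whole span the pattern consumed

print('delim:', repr(delim))
print('space:', repr(space))
print('quote:', repr(quote))
print('match:', repr(matched_text))
```

The ':' candidates fail because no closing quote in the line is followed by ':', so the first successful match starts at the space after Baryton.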
Date                 User           Action  Args
2014-03-25 09:30:30  Antoon.Pardon  set     recipients: + Antoon.Pardon, GhislainHivon, dmi.baranov
2014-03-25 09:30:30  Antoon.Pardon  set     messageid: <>
2014-03-25 09:30:30  Antoon.Pardon  link    issue17829 messages
2014-03-25 09:30:29  Antoon.Pardon  create