Author Yhojann Aguilera
Recipients Yhojann Aguilera, ezio.melotti, vstinner
Date 2019-08-29.22:17:03
Content
Unable to parse a CSV file with a Latin ISO charset.

import csv

with open('./exported.csv', newline='') as csvFileHandler:
    csvHandler = csv.reader(csvFileHandler, delimiter=';', quotechar='"')
    for line in csvHandler:
        pass  # loop body omitted in the original report

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 1032: invalid continuation byte

I tried opening the file in binary mode, but open() says: binary mode doesn't take a newline argument. OK, so I changed the newline argument to bytes, newline=b'', but then it says: open() argument 6 must be str or None, not bytes. OK, so I removed the newline argument, and got: _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?).
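
The last error shows that csv.reader wants an iterator of str. A possible workaround, sketched here under assumptions, is to keep the file in binary mode and decode it on the fly with io.TextIOWrapper. The helper name read_rows, the sample bytes, and the choice of ISO-8859-1 are all hypothetical and not taken from the report; the real encoding of exported.csv is not known for certain.

```python
import csv
import io


def read_rows(binary_stream, encoding="iso-8859-1"):
    """Decode a binary stream on the fly and parse it as ';'-delimited CSV.

    The default encoding is an assumption, not a fact about exported.csv.
    """
    # newline="" leaves line endings untouched so the csv module can
    # handle CRLF terminators itself.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    return list(csv.reader(text_stream, delimiter=";", quotechar='"'))


# Hypothetical sample data standing in for a file opened with mode="rb".
sample = io.BytesIO(b'name;city\r\n"MU\xd1OZ";"Santiago"\r\n')
rows = read_rows(sample)
print(rows)
```

With a real file, the same helper could be called as read_rows(open('./exported.csv', 'rb')), again assuming the encoding guess is right.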

OK, so the csv module does not support binary read mode. Next I tried a Latin ISO encoding:

with open('./exported.csv', mode='r', encoding='ISO-8859', newline='') as csvFileHandler:

UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: character maps to <undefined>

But the file's charset really is Latin ISO:

$ file exported.csv 
exported.csv: ISO-8859 text, with very long lines, with CRLF line terminators

OK, so I changed the encoding to ISO-8859-8:

UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: character maps to <undefined>
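
The members of the ISO-8859 family do not all assign a character to byte 0xd1, which may explain why one member fails where another would succeed. The sketch below probes that single byte against a few candidate codecs. The helper name try_decode and the candidate list are illustrative assumptions; the file's actual encoding still has to be confirmed against its contents.

```python
def try_decode(raw, encodings):
    """Return {encoding: decoded text, or None if decoding fails}."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # The codec has no mapping for this byte sequence.
            results[name] = None
    return results


# Byte reported at position 1032 in the tracebacks above.
candidates = ["utf-8", "iso-8859-8", "iso-8859-1", "iso-8859-15"]
results = try_decode(b"\xd1", candidates)

for name, text in results.items():
    print(name, "->", repr(text))
```

In ISO-8859-1 the byte 0xd1 is the letter Ñ, so a Western European codec is a plausible guess for a file containing Spanish text, though this is only a guess.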

I am unable to load the file. Why not offer an option to work in binary mode? The delimiters can be represented as byte values.
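
Until such an option exists, something close to binary handling can be approximated with the latin-1 codec, which maps each of the 256 byte values to exactly one code point, so decoding never fails and encoding the fields back recovers the original bytes. This is a minimal sketch, not a csv module feature: the function name read_rows_as_bytes and the sample data are hypothetical, and the approach assumes an ASCII-compatible file in which the delimiter and quote bytes are unambiguous.

```python
import csv
import io


def read_rows_as_bytes(raw, delimiter=";", quotechar='"'):
    """Parse CSV bytes and return each field as bytes again."""
    # latin-1 decodes every possible byte, so this step cannot raise.
    text = raw.decode("latin-1")
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        quotechar=quotechar,
    )
    # Encoding with the same codec restores the original byte values.
    return [[field.encode("latin-1") for field in row] for row in reader]


# Hypothetical sample with a CRLF terminator and a 0xd1 byte.
byte_rows = read_rows_as_bytes(b'id;name\r\n1;"MU\xd1OZ"\r\n')
print(byte_rows)
```

Each field can then be decoded later with whatever encoding turns out to be correct.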
History
Date | User | Action | Args
2019-08-29 22:17:04 | Yhojann Aguilera | set | recipients: + Yhojann Aguilera, vstinner, ezio.melotti
2019-08-29 22:17:04 | Yhojann Aguilera | set | messageid: <1567117024.15.0.215995088673.issue37984@roundup.psfhosted.org>
2019-08-29 22:17:04 | Yhojann Aguilera | link | issue37984 messages
2019-08-29 22:17:03 | Yhojann Aguilera | create |