Message 369869 - Python tracker

Author	remi.lapeyre
Recipients	remi.lapeyre, serhiy.storchaka, sidhant
Date	2020-05-25.09:46:23
Message-id	<1590399983.6.0.367390116823.issue40762@roundup.psfhosted.org>
In-reply-to

Content
> As an example, if I write character "A" as a byte, i.e b'A' in a csv file

But you can't write b'A' to a csv file. What you can do is write `b'a'.decode()` or `b'a'.decode('latin1')` or `b'a'.decode('whatever')`, but the string representation of a byte string depends on the character encoding, and it's not possible to guess it, which is why bytes and strings were separated in Python 3 in the first place.
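
To illustrate why the encoding cannot be guessed, here is a minimal sketch in which the same bytes decode to two different strings. The sample character and the variable names are only illustrative assumptions, not something from the original report:

```python
# The same byte sequence yields different text depending on the encoding used.
data = "é".encode("utf-8")           # two bytes: b'\xc3\xa9'

as_utf8 = data.decode("utf-8")       # the text that was originally encoded
as_latin1 = data.decode("latin-1")   # also succeeds, but gives different text

print(repr(as_utf8))    # 'é'
print(repr(as_latin1))  # 'Ã©'
```

Both calls succeed without any error, so nothing in the bytes themselves tells a library which result is the intended one.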

Since csv can't write bytes, it gets a string representation of the bytes object by calling str(), as it would with any other object.
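
A short sketch of that behaviour, using an in-memory `io.StringIO` buffer in place of a real file (the buffer and the sample row are assumptions made for the example):

```python
import csv
import io

# Write a row containing a bytes object and an int to an in-memory text buffer.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([b"A", 1])

written = buffer.getvalue()
print(repr(written))  # "b'A',1\r\n"

# The bytes cell is exactly what str() returns for it.
print(str(b"A"))      # b'A'
```

The non-string cells are converted with `str()`, so the bytes object ends up in the file as the literal text `b'A'`.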

> which is not what the user would have wanted in majority of the use-cases

If you try to guess the encoding, you will inevitably fail in some cases and the resulting file will be corrupted. The only way forward is for your users to fix their program and decode the byte strings using the correct encoding before giving them to csv.
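
As a rough sketch of that fix on the user side, the helper below decodes any bytes cells with an encoding the caller states explicitly before the row reaches csv. `decode_row` is a hypothetical helper written for this example, not part of the csv module, and `latin-1` is just the encoding assumed for the sample data:

```python
import csv
import io


def decode_row(row, encoding):
    """Return a copy of row with bytes cells decoded using the given encoding.

    Hypothetical helper: the caller must know and pass the right encoding.
    """
    return [
        cell.decode(encoding) if isinstance(cell, bytes) else cell
        for cell in row
    ]


raw_row = ["é".encode("latin-1"), 1]   # first cell is b'\xe9'

buffer = io.StringIO()
csv.writer(buffer).writerow(decode_row(raw_row, "latin-1"))

result = buffer.getvalue()
print(repr(result))  # 'é,1\r\n'
```

The encoding is chosen by the program that produced the bytes, which is the only place where it is actually known.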

History
Date	User	Action	Args
2020-05-25 09:46:23	remi.lapeyre	set	recipients: + remi.lapeyre, serhiy.storchaka, sidhant
2020-05-25 09:46:23	remi.lapeyre	set	messageid: <1590399983.6.0.367390116823.issue40762@roundup.psfhosted.org>
2020-05-25 09:46:23	remi.lapeyre	link	issue40762 messages
2020-05-25 09:46:23	remi.lapeyre	create