
Author sidhant
Recipients remi.lapeyre, serhiy.storchaka, sidhant
Date 2020-05-25.11:57:29
Message-id <1590407850.23.0.373517096169.issue40762@roundup.psfhosted.org>
In-reply-to
Content
Hi Remi,

Currently, code like this:
```
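import csv

# newline="" is what the csv docs recommend for files passed to csv.writer/reader
with open("abc.csv", "w", encoding="utf-8", newline="") as f:
    data = [b'\x41']
    w = csv.writer(f)
    w.writerow(data)  # the bytes are not decoded; str(b'A') is written instead
with open("abc.csv", "r", encoding="utf-8", newline="") as f:
    rows = csv.reader(f)
    for row in rows:
        print(row[0])  # prints b'A'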
```
is able to write the string "b'A'" to a CSV file. You are correct that the ideal approach is indeed to decode the bytes first.

However, if a user does not decode the bytes, the csv module calls str() on the bytes object, as you said. In practice, the resulting b-prefixed string is not easily readable by another program (it first has to strip the b prefix and the surrounding single quotes), and this has turned out to be a pain point in one of the pandas issues I referred to in my first message.
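
To illustrate that pain point, here is a minimal sketch of what a consumer has to do today to recover the intended text from such a field. The helper name `recover_bytes_field` is hypothetical, and the sketch assumes that the field is exactly the repr of a bytes object and that the original encoding is known (UTF-8 here):

```python
import ast


def recover_bytes_field(field, encoding="utf-8"):
    """Turn a field such as "b'A'" back into the text it was meant to hold.

    Hypothetical helper: assumes `field` is the repr of a bytes object
    and that `encoding` matches how the bytes were originally produced.
    """
    value = ast.literal_eval(field)  # safely parses the bytes literal
    if not isinstance(value, bytes):
        raise ValueError(f"not a bytes literal: {field!r}")
    return value.decode(encoding)


print(recover_bytes_field("b'A'"))
```

Every program reading the file would need something like this, and it would still have to guess the encoding, which is why writing the decoded text in the first place seems preferable.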

Also, I am not sure whether you have looked at my PR, but my approach to fixing this problem does NOT involve guessing the encoding. Instead, it simply uses the encoding that the user provided when opening the file object. So if you open the file with `open("abc.csv", "w", encoding="latin1")`, it will try to decode the bytes using "latin1". If decoding with that encoding fails, it raises a UnicodeDecodeError, so there is no silent file corruption. You can refer to the tests and the NEWS.d entry in the PR to confirm this.
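
The behaviour described above can be approximated in user code today. This is only a rough sketch of the idea, not the code from the PR: the helper name `writerow_decoding_bytes` is hypothetical, and it assumes the file object exposes the `encoding` attribute that text files returned by `open()` have.

```python
import csv
import os
import tempfile


def writerow_decoding_bytes(writer, fileobj, row):
    """Decode bytes fields with the file's own encoding, then write the row.

    Hypothetical helper. A field that cannot be decoded raises
    UnicodeDecodeError before anything is written for that row.
    """
    encoding = fileobj.encoding  # taken from open(..., encoding=...)
    decoded = [
        field.decode(encoding) if isinstance(field, bytes) else field
        for field in row
    ]
    writer.writerow(decoded)


path = os.path.join(tempfile.mkdtemp(), "abc.csv")
with open(path, "w", encoding="latin1", newline="") as f:
    w = csv.writer(f)
    writerow_decoding_bytes(w, f, [b"\x41", "plain text"])

with open(path, "r", encoding="latin1", newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

With a stricter encoding such as "ascii", passing a field like `b"\xff"` to the helper raises UnicodeDecodeError instead of writing a b-prefixed string.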