classification
Title: Add a "strict" parameter to csv.writer and csv.DictWriter
Type: enhancement Stage:
Components: Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: corona10, eric.smith, serhiy.storchaka
Priority: normal Keywords:

Created on 2020-05-30 12:31 by eric.smith, last changed 2022-04-11 14:59 by admin.

Messages (10)
msg370380 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-05-30 12:31
Currently, the csv library calls str() on each value it writes. This can lead to surprising behavior; see issue40762 for an example.

On the other hand, the documentation for writing says that the values must be strings or numbers.

The proposed "strict" argument would raise a TypeError if the supplied values are not strings, numbers, or None.

See https://github.com/python/cpython/blob/ba1c2c85b39fbcb31584c20f8a63fb87f9cb9c02/Modules/_csv.c#L1203 for where str() is called.

The documentation should be changed to note that None is allowed. Currently, None results in an empty string. I'm not proposing to change this, just document it as one of the allowed types.

How to check for "value is a number" needs to be decided.
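
For illustration, here is a minimal sketch of the current behavior described above. The Tag class is a hypothetical user type made up for this example; it is neither a string nor a number, yet the writer accepts it and writes whatever str() returns. None comes out as an empty field.

```python
import csv
import io


class Tag:
    """Hypothetical user type: neither a string nor a number."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return f"tag:{self.name}"


buf = io.StringIO()
writer = csv.writer(buf)

# The writer silently calls str() on Tag and turns None into an empty field.
writer.writerow(["spam", 42, None, Tag("x")])

output = buf.getvalue()
print(repr(output))  # 'spam,42,,tag:x\r\n'
```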
msg370382 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-05-30 12:32
For backward compatibility, strict would have to default to False.
msg370393 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-05-30 17:25
I am +1 on strict mode.
But I want to hear other core developers' opinions.
msg370402 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-05-30 20:10
I guess an isinstance check against numbers.Number would be the best way to check if an argument is a number. I'm not sure how convenient that is from C code.
msg370424 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-31 11:23
There is PyNumber_Check(). It is not a direct analog of isinstance(obj, numbers.Number): it checks that the object can be explicitly converted to a real number (int or float). UUID and IPv4Address pass this check.

As a narrow check we can use isinstance(obj, (str, int, float)). It does not accept Fraction, Decimal or NumPy numbers, but that is what modules such as json and plistlib accept.
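
A small sketch of what the narrow check would accept and reject. The helper name is_narrow_value is made up for this example and is not part of the csv module.

```python
from decimal import Decimal
from fractions import Fraction


def is_narrow_value(obj):
    """Narrow check: only str, int and float (and their subclasses) pass."""
    return isinstance(obj, (str, int, float))


samples = ["text", 3, 2.5, Decimal("1.10"), Fraction(1, 3)]
for sample in samples:
    # str, int and float pass; Decimal and Fraction are rejected.
    print(repr(sample), is_narrow_value(sample))
```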
msg370485 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-05-31 15:22
Losing Decimal would be a problem. I use those a lot in CSV files, and I assume others do, too.
msg370486 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-31 15:25
We can check only nb_index and nb_float. It will include Fraction, Decimal and NumPy numbers and exclude UUID and IPv4Address.
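
A rough Python-level approximation of that idea, for illustration only. The real check would test the C-level nb_index and nb_float slots; here that is assumed to correspond to the type defining __index__ or __float__. The helper name looks_numeric is hypothetical.

```python
import ipaddress
import uuid
from decimal import Decimal
from fractions import Fraction


def looks_numeric(obj):
    """Approximate 'has nb_index or nb_float' by looking for the dunder methods."""
    cls = type(obj)
    return hasattr(cls, "__index__") or hasattr(cls, "__float__")


candidates = [
    1,
    1.5,
    Decimal("1.10"),
    Fraction(1, 3),
    uuid.UUID(int=1),                  # defines __int__ only
    ipaddress.IPv4Address("192.0.2.1"),  # defines __int__ only
    "1",
]
for candidate in candidates:
    print(type(candidate).__name__, looks_numeric(candidate))
```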
msg370490 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-05-31 17:17
That seems like a good plan.
msg370498 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-31 19:03
Yes, converting Decimal to float can lose precision, so we cannot require this. PyNumber_Check() is already used for QUOTE_NONNUMERIC, so it would be logical to use it in determining that the object is a number in the csv module.

But now the problem is with determining what is a "string". There is no way to check whether the object is "string-like", because virtually all objects can be converted to a string. The only standard exceptions are bytes and bytearray, for which str() may emit BytesWarning, so this conversion is not reliable. If we restrict it to instances of str or its subclasses, it may break other use cases, for example writing a Path to CSV.
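
To illustrate the existing QUOTE_NONNUMERIC behavior mentioned above, a short sketch: the writer quotes fields it does not consider numeric and leaves numeric ones bare. A Decimal is treated as a number and is written through str(), so no precision is lost.

```python
import csv
import io
from decimal import Decimal

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)

# The string is quoted; the int and the Decimal are written unquoted.
writer.writerow(["a", 1, Decimal("1.5")])

output = buf.getvalue()
print(repr(output))  # '"a",1,1.5\r\n'
```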
msg370613 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-06-02 17:01
I wouldn't have a problem with isinstance(obj, str) for a string check in strict mode. If you want to write something like a Path, convert it to a string yourself. That's exactly the behavior I'd like enforced by strict: only accept numbers and actual strings.
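
A minimal sketch of what the strict behavior discussed in this thread might look like, written as a pure-Python wrapper. This is not an existing csv API: StrictWriter, check_strict and _is_number are hypothetical names, and the number check is the same rough approximation of the nb_index/nb_float idea (type defines __index__ or __float__).

```python
import csv
import io
from decimal import Decimal
from pathlib import Path


def _is_number(obj):
    # Assumption: approximates the nb_index / nb_float slot check.
    cls = type(obj)
    return hasattr(cls, "__index__") or hasattr(cls, "__float__")


def check_strict(value):
    """Return value if it is None, a str, or number-like; else raise TypeError."""
    if value is None or isinstance(value, str) or _is_number(value):
        return value
    raise TypeError(
        f"strict mode: expected str, number or None, got {type(value).__name__}"
    )


class StrictWriter:
    """Hypothetical wrapper that validates every field before writing."""

    def __init__(self, f, *args, **kwargs):
        self._writer = csv.writer(f, *args, **kwargs)

    def writerow(self, row):
        return self._writer.writerow([check_strict(v) for v in row])

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)


buf = io.StringIO()
writer = StrictWriter(buf)
writer.writerow(["a", 1, Decimal("1.50"), None])

try:
    writer.writerow([Path("data.csv")])
except TypeError as exc:
    print("rejected:", exc)

# Converting the Path explicitly is accepted.
writer.writerow([str(Path("data.csv"))])
output = buf.getvalue()
```

With this sketch, a Path has to be passed through str() by the caller, which matches the behavior asked for above.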
History
Date                 User              Action  Args
2022-04-11 14:59:31  admin             set     github: 85002
2020-06-02 17:01:51  eric.smith        set     messages: + msg370613
2020-05-31 19:03:35  serhiy.storchaka  set     messages: + msg370498
2020-05-31 17:17:35  eric.smith        set     messages: + msg370490
2020-05-31 15:25:03  serhiy.storchaka  set     messages: + msg370486
2020-05-31 15:22:18  eric.smith        set     messages: + msg370485
2020-05-31 11:23:51  serhiy.storchaka  set     messages: + msg370424
2020-05-30 20:10:05  eric.smith        set     messages: + msg370402
2020-05-30 17:25:47  corona10          set     nosy: + serhiy.storchaka, corona10; messages: + msg370393
2020-05-30 12:32:47  eric.smith        set     messages: + msg370382
2020-05-30 12:31:33  eric.smith        create