
Author josh.r
Recipients Shane Smith, josh.r, martin.panter
Date 2019-03-04.18:38:45
Message-id <1551724726.41.0.951100146642.issue36172@roundup.psfhosted.org>
In-reply-to
Content
Unless someone disagrees soon, I'm going to close this as documented behavior/not a bug. AFAICT, the only "fixes" available for this are:

1. Changing the default dialect from 'excel' to something else. Problem: Breaks correct code dependent on the excel dialect, but code could explicitly opt back in.

2. Change the 'excel' dialect. Problem: Breaks correct code dependent on the excel dialect, with no obvious way to opt back in.

3. Per #10954, check the file object to ensure it's not translating newlines, and raise an exception if it is. Problem: AFAICT, there is no documented API to check this. The result of calling open, with or without passing newline='', looks identical initially and never changes in write mode; even in read mode, the .newlines attribute only exposes the newlines observed, not whether or not they were translated. Adding such an API to the objects returned by open wouldn't change all other file-like objects, so the change would need to propagate to all other built-in and third-party file APIs. And for some file-like objects, it wouldn't make sense to have this API at all (io.StringIO, being purely in memory, doesn't need to do translation of any kind).

4. (Extreme solution) Add io APIs (or add arguments to APIs) for reading/writing without newline translation (that is, whether or not newline is passed to open, you can read/write without translation), e.g. read(size) becomes read(size, translate_newlines=None) where None indicates default behavior, or we add read_untranslated(size) as an independent API. Problem: Like #3, this requires us to create new, mandatory APIs in the io module that would then need to propagate to all other built-in and third-party file APIs.
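To illustrate the problem with option 3, here is a minimal sketch (the file name demo.csv and the variable names are made up for the example) that opens the same path in write mode with and without newline='' and compares what the two file objects expose through their repr and .newlines attribute. As far as I can tell, nothing in that state distinguishes them:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this comparison.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")

    # Default open(): newline translation is enabled on write.
    with open(path, "w") as default_file:
        default_state = (repr(default_file), default_file.newlines)

    # newline='': translation is disabled, as the csv docs ask for.
    with open(path, "w", newline="") as untranslated_file:
        untranslated_state = (repr(untranslated_file), untranslated_file.newlines)

# Compare the publicly visible state of the two file objects.
print(default_state == untranslated_state)
```

So a check inside csv.writer would have nothing documented to look at, which is why option 3 needs a new API rather than a small patch.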

Point is, the simple solutions (1/2) break correct code, and the complex solutions (3/4) involve major changes to the io module (and all other file-like object producers) and/or the csv module.

Even then, nothing shy of #4 would make broken code just work; anything less just makes it fail loudly. Both #3 and #4 would require cascading changes to every file-like object (both built-in and third-party) to make them work. For the file-like objects that aren't updated, we're stuck choosing between issuing a warning that most folks won't see and then ignoring the problem, or making those file-like objects without the necessary API raise true exceptions (making them unusable until the third-party package is updated).

If a fix is needed, I think my suggestion would be to do one or both of:

1. Emphasize the newline='' warning in the csv.reader/writer/DictReader/DictWriter docs (right now it's just one more unemphasized line in a fairly long wall of text for each)

2. Put a large warning about this at the top of the csv module docs, so people reading the basic module description are exposed to the warning before they even reach the API.

Might help a few folks who are skimming without reading for detail.
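
For reference, a small sketch of what the newline='' warning is guarding against. It assumes the default 'excel' dialect (which terminates rows with '\r\n') and simulates Windows-style write translation on any platform by wrapping an in-memory buffer with io.TextIOWrapper(newline='\r\n'); the helper name write_rows is made up for the example:

```python
import csv
import io

def write_rows(newline):
    """Write two rows with the default 'excel' dialect and return the raw bytes.

    `newline` is passed to io.TextIOWrapper the same way it would be
    passed to open().
    """
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    csv.writer(text).writerows([["a", "b"], ["c", "d"]])
    text.flush()
    return raw.getvalue()

# newline='\r\n' translates every '\n' written into '\r\n', which is
# what the default open() does on Windows.
translated = write_rows("\r\n")

# newline='' disables translation, as the csv docs recommend.
untranslated = write_rows("")

print(translated)    # b'a,b\r\r\nc,d\r\r\n'
print(untranslated)  # b'a,b\r\nc,d\r\n'
```

The doubled '\r' in the first result is the symptom reported here; passing newline='' is the documented way to avoid it.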
History
Date User Action Args
2019-03-04 18:38:46  josh.r  set  recipients: + josh.r, martin.panter, Shane Smith
2019-03-04 18:38:46  josh.r  set  messageid: <1551724726.41.0.951100146642.issue36172@roundup.psfhosted.org>
2019-03-04 18:38:46  josh.r  link  issue36172 messages
2019-03-04 18:38:45  josh.r  create