msg94437 - (view) |
Author: Bob Cannon (zacktu) |
Date: 2009-10-24 22:08 |
I used csv.writer to open a file for writing with comma as separator and
dialect='excel'. I used writerow to write each row into the file. When
I execute under linux, each line is terminated by '\r\n'. When I
execute under windows, each line is terminated by '\r\r\n'. Thus, under
MS Windows, when I read the csv file, there is a blank line between each
significant line. I have dropped csv.writer and now build each line
manually and terminate it with '\n'. When the line is written in
windows, it is terminated by '\r\n'. That's what should happen.
As I see it, writerow with dialect='excel' should only terminate a line
with '\n'. Windows will automatically place a '\r' in front of the '\n'.
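The doubling described here can be reproduced on any platform. The following is a minimal sketch, assuming Python 3, in which newline='\r\n' is passed to open() to imitate the Windows text-mode translation of '\n' to '\r\n'. The function name and the file name demo.csv are illustrative only and do not come from the original report.

```python
import csv
import os
import tempfile


def write_rows_text_mode(rows):
    # newline="\r\n" makes the file object rewrite every "\n" as "\r\n",
    # which is what Windows text mode does by default.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.csv")  # hypothetical file name
        with open(path, "w", newline="\r\n") as f:
            writer = csv.writer(f, delimiter=",", dialect="excel")
            for row in rows:
                # The excel dialect already terminates each row with "\r\n".
                writer.writerow(row)
        # Read the raw bytes back so no further translation happens.
        with open(path, "rb") as f:
            return f.read()


raw = write_rows_text_mode([[1, 2, 3], ["a", "b", "c"]])
print(raw)  # b'1,2,3\r\r\na,b,c\r\r\n'
```

The writer emits '\r\n' and the file layer then expands the '\n' again, which gives the '\r\r\n' seen in the report.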
|
msg94438 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2009-10-24 22:52 |
Your output file should be opened in binary mode. Sounds like you
opened it in text mode.
|
msg94441 - (view) |
Author: Bob Cannon (zacktu) |
Date: 2009-10-24 23:10 |
Probably so. I'm sorry to report this as a bug if it's not. I asked
about this on a Python group on IRC and got no suggestions. Thanks for
taking a look.
|
msg111694 - (view) |
Author: Andreas Balogh (baloan) |
Date: 2010-07-27 11:12 |
I encountered the same problem. It is not obvious that opening the file in binary mode solves it. I suggest adding a hint to the documentation.
|
msg111773 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-07-28 08:07 |
Can you provide me with a concrete example which fails for you?
I don't have ready access to a Windows machine with Python on
it, but I should be able to arrange something at work. Before
spending admin time installing Python, though, I would like to
look at code which fails for you.
|
msg111787 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2010-07-28 11:03 |
Bob, can you give us some code to reproduce the problem, in the form of a unit test or even just a regular function? It will help confirm the bug and fix it.
|
msg111792 - (view) |
Author: Bob Cannon (zacktu) |
Date: 2010-07-28 11:46 |
Eric,
This issue was resolved for me by Skip Montanaro's response less than an
hour after I posted it. I didn't understand why a text file had to be
binary, but I no longer had a problem with extraneous newlines. In looking back
at my message 94441, I think that it was ambiguous and that I should
have made it clear that I no longer had a problem. Perhaps in my
ignorance as a newbie I didn't close the issue properly.
I don't know that I can reproduce the problem any more. I think that I
was writing snippets of code to try to isolate the problem and when I
used Skip's solution I changed the program and deleted the test code.
Please let me know what I can do to help you now.
Bob
|
msg111793 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2010-07-28 12:14 |
If the documentation is not clear enough about requiring binary, it is a doc bug.
(P.S. Please strip unneeded quotes. Thanks)
|
msg111832 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-07-28 17:19 |
I got access to Python 2.6.5 on Windows and ran this simple
example:
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
****************************************************************
Personal firewall software may warn about the connection IDLE
makes to its subprocess using this computer's internal loopback
interface. This connection is not visible on any external
interface and no data is sent to or received from the Internet.
****************************************************************
IDLE 2.6.5
>>> f = open("H:sample.csv", "wb")
>>> import csv
>>> writer = csv.writer(f)
>>> writer.writerow([1,2,3])
>>> writer.writerow(['a', 'b', 'c'])
>>> del writer
>>> f.close()
>>>
I then looked at the CSV file which it generated.
Looked fine to me. Each of the two rows was terminated
by a single CRLF pair.
Then I repeated the "test", opening the file in text
mode:
>>> f = open("H:sample2.csv", "w")
>>> writer = csv.writer(f)
>>> writer.writerow([1,2,3])
>>> writer.writerow(['a', 'b', 'c'])
>>> del writer
>>> f.close()
>>>
That output does indeed terminate each line with
CRCRLF and when viewed in a spreadsheet program
such as OpenOffice Calc (probably Excel as well),
displays a blank line between the 123 row and the
abc row.
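The same check can be made without a spreadsheet by reading the file back in binary mode and counting terminators. This is a small sketch, assuming Python 3; the helper name count_terminators is hypothetical, and the byte strings below simply restate the two outputs described above.

```python
def count_terminators(data):
    """Return (clean CRLF count, doubled CRCRLF count) for raw file bytes."""
    doubled = data.count(b"\r\r\n")
    # Every CRCRLF also contains one CRLF, so subtract those to get the
    # number of correctly terminated lines.
    clean = data.count(b"\r\n") - doubled
    return clean, doubled


# Bytes as produced by the binary-mode session.
print(count_terminators(b"1,2,3\r\na,b,c\r\n"))      # (2, 0)
# Bytes as produced by the text-mode session on Windows.
print(count_terminators(b"1,2,3\r\r\na,b,c\r\r\n"))  # (0, 2)
```

For a real file, the bytes would come from open(path, "rb").read().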
I've removed the "unit test needed" attribute from the
ticket as there is a test_writerows test case in the
Python test suite. Also closing again and marking
invalid. If you still believe there is actually a
problem, feel free to reopen this issue, but also
please send me (skip@pobox.com) a short example and
the erroneous output it produces for you (attach your
two files - don't just embed them in your mail msg).
|
msg111910 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-07-29 11:10 |
> If the documentation is not clear enough about requiring binary, it is
> a doc bug.
The documentation for both csv.reader and csv.writer states (this is from the
Python 2.7 version):
If *csvfile* is a file object, it must be opened with the 'b' flag on
platforms where that makes a difference.
I suppose we could be explicit and mention Windows here, but the wording is
quite clear. There is really no harm in always opening the file in binary
mode, and I do that myself even though I only program on Unix or Mac
platforms where it's safe to open the file in text mode.
This all changed in Python 3. There, the choice of line ending is up to the
programmer, so file objects for use by the csv module are opened with
newline='' and when writing CSV data the writer object takes complete
control of proper line termination according to the programmer's stated
choice of lineterminator.
Skip
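The Python 3 behaviour described above can be sketched as follows. This assumes Python 3 and uses a temporary directory so it leaves nothing behind; the function name and the file name sample.csv are illustrative.

```python
import csv
import os
import tempfile


def write_with_terminator(rows, lineterminator="\r\n"):
    """Write rows with newline='' so only the csv writer decides line endings."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")  # illustrative file name
        # newline="" switches off newline translation in the file object.
        with open(path, "w", newline="") as f:
            writer = csv.writer(f, lineterminator=lineterminator)
            writer.writerows(rows)
        # Return the raw bytes to show exactly what reached the disk.
        with open(path, "rb") as f:
            return f.read()


rows = [[1, 2, 3], ["a", "b", "c"]]
print(write_with_terminator(rows))                       # b'1,2,3\r\na,b,c\r\n'
print(write_with_terminator(rows, lineterminator="\n"))  # b'1,2,3\na,b,c\n'
```

Because the file object no longer rewrites anything, the bytes on disk match the chosen lineterminator on every platform.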
|
msg124580 - (view) |
Author: John Machin (sjmachin) |
Date: 2010-12-24 00:52 |
Please re-open this. The binary/text mode problem still exists with Python 3.X on Windows. Quite simply, there is no option available to the caller to open the output file in binary mode, because the module is throwing str objects at the file. The module's idea of "taking control" in the default case appears to be to write \r\n which is then processed by the Windows runtime and becomes \r\r\n.
Python 3.1.3 (r313:86834, Nov 27 2010, 18:30:53) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> f = open('terminator31.csv', 'w')
>>> row = ['foo', None, 3.14159]
>>> writer = csv.writer(f)
>>> writer.writerow(row)
14
>>> writer.writerow(row)
14
>>> f.close()
>>> open('terminator31.csv', 'rb').read()
b'foo,,3.14159\r\r\nfoo,,3.14159\r\r\n'
>>>
And it's not just a row terminator problem; newlines embedded in fields are likewise expanded to \r\n by the Windows runtime.
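The effect on embedded newlines can be illustrated in memory. This is a sketch, assuming Python 3: io.StringIO accepts the same newline argument as open(), so newline='\r\n' stands in for Windows text mode here. The function name render and the sample row are made up for the illustration.

```python
import csv
import io


def render(row, newline):
    # newline="\r\n" imitates Windows text mode; newline="" disables
    # translation, as the csv documentation recommends.
    buf = io.StringIO(newline=newline)
    csv.writer(buf).writerow(row)
    return buf.getvalue()


row = ["line one\nline two", "x"]
# With translation, the "\n" inside the quoted field is rewritten as well.
print(repr(render(row, newline="\r\n")))
# Without translation, the field content is written unchanged.
print(repr(render(row, newline="")))
```

In the first case the data itself is altered, not just the row terminator, so a reader will not get back the original field.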
|
msg124598 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-12-24 16:38 |
John,
The API for the open() builtin function has changed. You should open
the output file with newline="" instead of using the default. Take a
look at the documentation for open() and csv.reader:
http://docs.python.org/py3k/library/functions.html?highlight=open#open
http://docs.python.org/py3k/library/csv.html?highlight=csv.reader#csv.reader
Note the form of the open() call in the csv.reader example. This one
snuck by me as well. Python 3 underwent a lot of change in the I/O
subsystem. This was one of them. If changing the form of the open()
call doesn't fix the problem, let me know.
Skip
|
msg124678 - (view) |
Author: John Machin (sjmachin) |
Date: 2010-12-26 20:52 |
Skip, I'm WRITING, not reading. Please read the 3.1 documentation for csv.writer. It does NOT mention newline='', and neither does the example. Please fix.
Other problems with the examples: (1) They encourage a bad habit (open inside the call to reader/writer); good practice is to retain the reference to the file handle (preferably with a "with" statement) so that it can be closed properly. (2) delimiter=' ' is very unrealistic.
The documentation for both 2.x and 3.x should be much more explicit about what is needed in open() for csv to work properly and portably:
2.x read: use mode='rb' -- otherwise fail on Windows
2.x write: use mode='wb' -- otherwise fail on Windows
3.x read: use newline='' -- otherwise fail unconditionally(?)
3.x write: use newline='' -- otherwise fail on Windows
The 2.7 documentation says """If csvfile is a file object, it must be opened with the 'b' flag on platforms where that makes a difference""" ... in my experience, people are left asking "what platforms? what difference?"; Windows should be mentioned explicitly.
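The four cases listed above can be folded into one helper. The following is only a sketch: the name open_for_csv and its writing parameter are hypothetical, and the Python 2 branch is shown for illustration and is not exercised when the code runs under Python 3.

```python
import csv
import os
import shutil
import sys
import tempfile


def open_for_csv(path, writing=False):
    """Open path in the mode the csv module expects on the running Python."""
    if sys.version_info[0] >= 3:
        # 3.x: text mode with newline translation switched off.
        return open(path, "w" if writing else "r", newline="")
    # 2.x: binary mode, so the platform never rewrites the line endings.
    return open(path, "wb" if writing else "rb")


tmp = tempfile.mkdtemp()
try:
    path = os.path.join(tmp, "eggs.csv")  # illustrative file name
    with open_for_csv(path, writing=True) as f:
        csv.writer(f).writerow(["spam", "ham"])
    # Inspect the raw bytes to confirm a single CRLF terminator.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    shutil.rmtree(tmp)

print(repr(raw))
```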
|
msg124682 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2010-12-26 22:52 |
OK, I'm reopening this as a doc issue, since currently the Python3 writer docs do not mention newline='', and it is indeed required on Windows. John, would you care to suggest a doc patch?
I agree with Skip that "where it makes a difference" is more precise than specifically mentioning Windows, even if less useful in this context. That is how the 'b' mode is documented in the open documentation. To fix the problem with the CSV docs, the recommendation to use 'b' can simply be made unconditional, as it is for newline='' in python3.
|
msg126593 - (view) |
Author: John Machin (sjmachin) |
Date: 2011-01-20 06:43 |
"docpatch" for 3.x csv docs:
In the csv.writer docs, insert the sentence "If csvfile is a file object, it should be opened with newline=''." immediately after the sentence "csvfile can be any object with a write() method."
In the closely-following example, change the open call from "open('eggs.csv', 'w')" to "open('eggs.csv', 'w', newline='')".
In section 13.1.5 Examples, there are 2 reader cases and 1 writer case that likewise need inserting ", newline=''" in the open call.
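For illustration, the pattern this patch asks the docs to show might look like the sketch below, assuming Python 3. It uses a "with" statement and newline='' for both writing and reading; the sample rows are invented, and the file is created in a temporary directory.

```python
import csv
import os
import tempfile

# The second row contains an embedded newline to show it survives.
rows = [["Spam", "Lovely Spam"], ["two\nlines", "3.14"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "eggs.csv")
    # Writing: newline="" leaves line termination to the csv writer.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    # Reading: newline="" lets the csv reader see the line endings as written.
    with open(path, newline="") as f:
        read_back = list(csv.reader(f))

print(read_back == rows)  # True
```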
|
msg131400 - (view) |
Author: John Machin (sjmachin) |
Date: 2011-03-19 07:55 |
Can somebody please review my "doc patch" submitted 2 months ago?
|
msg131417 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2011-03-19 13:59 |
My apologies. I have it in my sandbox, but a combination of the switch to
Mercurial and lack of round tuits has conspired to keep me from checking it
in.
Skip
|
msg131418 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2011-03-19 14:06 |
Actually, I was thinking of another doc patch for the csv module.
Your changes (or something very like them) made it into the 3.2
release, as you can see here:
http://docs.python.org/py3k/library/csv.html
S
|
msg131443 - (view) |
Author: John Machin (sjmachin) |
Date: 2011-03-19 21:27 |
Skip, The changes that I suggested have NOT been made. Please re-read the doc page you pointed to. The "writer" paragraph does NOT mention that newline='' is required when writing. The "writer" examples do NOT include newline=''. The examples have NOT been enhanced by using a "with" statement and not using space as an example delimiter.
PLEASE RE-OPEN THIS ISSUE.
|
msg131468 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-03-20 02:36 |
New changeset ab27f16f707a by R David Murray in branch 'default':
#7198: add newlines='' to csv.writer docs.
http://hg.python.org/cpython/rev/ab27f16f707a
New changeset 959f666470cc by R David Murray in branch 'default':
Merge #7198 doc fix.
http://hg.python.org/cpython/rev/959f666470cc
New changeset 9d1b1a95bc8f by R David Murray in branch 'default':
Merge #7198 doc fix.
http://hg.python.org/cpython/rev/9d1b1a95bc8f
|
msg131469 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-03-20 02:38 |
Fixed now. Thanks, and sorry for the delay, and the confusion.
|
msg131475 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-03-20 04:02 |
Gah, I messed up the push. Now I have to backport the doc fix :(
|
msg131495 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-03-20 14:30 |
New changeset 9201455f950b by R David Murray in branch '3.1':
#7198: really add newline='' to csv.writer docs.
http://hg.python.org/cpython/rev/9201455f950b
New changeset fa0563f3b7f7 by R David Murray in branch '3.2':
Really merge #7198
http://hg.python.org/cpython/rev/fa0563f3b7f7
New changeset ed0d1e07ce79 by R David Murray in branch 'default':
Dummy merge #7198
http://hg.python.org/cpython/rev/ed0d1e07ce79
|
msg131496 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-03-20 14:31 |
OK, now it's really done (I hope!).
|
msg131498 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2011-03-20 14:55 |
John> Skip, The changes that I suggested have NOT been made. Please
John> re-read the doc page you pointed to. The "writer" paragraph does
John> NOT mention that newline='' is required when writing. The "writer"
John> examples do NOT include newline=''. The examples have NOT been
John> enhanced by using a "with" statement and not using space as an
John> example delimiter.
I copied the statement about using newline= from the reader() doc to the
writer() doc. All the examples I see (I'm looking at the cpython repo -
that is, what will be 3.3) use the with statement and open files using
newline=''. I don't think more changes are necessary. I will consult with
other Python developers about merging these changes to other active
branches. I simply don't understand the new Mercurial workflow well enough
to do it properly.
Skip
|
msg131501 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-03-20 15:40 |
New changeset 88876a264ebe by R David Murray in branch '3.1':
Markup fixes for #7198 patch.
http://hg.python.org/cpython/rev/88876a264ebe
New changeset d0d1235cb66e by R David Murray in branch '3.2':
Merge markup fixes for #7198 patch.
http://hg.python.org/cpython/rev/d0d1235cb66e
New changeset 2a8580f4897c by R David Murray in branch 'default':
Markup fixes for #7198 patch.
http://hg.python.org/cpython/rev/2a8580f4897c
|
|
Date | User | Action | Args |
2022-04-11 14:56:54 | admin | set | github: 51447 |
2011-03-20 15:40:06 | python-dev | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu, python-dev; messages: + msg131501 |
2011-03-20 14:55:27 | skip.montanaro | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu, python-dev; messages: + msg131498 |
2011-03-20 14:31:23 | r.david.murray | set | status: open -> closed; nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu, python-dev; messages: + msg131496 |
2011-03-20 14:30:19 | python-dev | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu, python-dev; messages: + msg131495 |
2011-03-20 04:02:40 | r.david.murray | set | status: closed -> open; assignee: skip.montanaro -> r.david.murray; nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu, python-dev |
2011-03-20 04:02:18 | r.david.murray | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu, python-dev; messages: + msg131475 |
2011-03-20 02:38:59 | r.david.murray | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu, python-dev; messages: + msg131469; resolution: accepted -> fixed; stage: needs patch -> resolved |
2011-03-20 02:36:47 | python-dev | set | nosy: + python-dev; messages: + msg131468 |
2011-03-19 21:27:04 | sjmachin | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu; messages: + msg131443 |
2011-03-19 14:06:04 | skip.montanaro | set | status: open -> closed; messages: + msg131418; resolution: accepted; nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu |
2011-03-19 13:59:18 | skip.montanaro | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu; messages: + msg131417 |
2011-03-19 07:55:31 | sjmachin | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu; messages: + msg131400 |
2011-01-20 06:43:34 | sjmachin | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, r.david.murray, zacktu; messages: + msg126593 |
2010-12-26 22:52:14 | r.david.murray | set | status: closed -> open; components: + Documentation; versions: - Python 3.3; nosy: + r.david.murray; messages: + msg124682; resolution: not a bug -> (no value); stage: needs patch |
2010-12-26 20:52:49 | sjmachin | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, zacktu; messages: + msg124678; versions: + Python 2.7, Python 3.2, Python 3.3 |
2010-12-24 16:38:56 | skip.montanaro | set | nosy: skip.montanaro, sjmachin, baloan, eric.araujo, zacktu; messages: + msg124598 |
2010-12-24 00:52:41 | sjmachin | set | nosy: + sjmachin; messages: + msg124580; versions: + Python 3.1, - Python 2.6 |
2010-07-29 11:10:30 | skip.montanaro | set | messages: + msg111910 |
2010-07-28 17:19:36 | skip.montanaro | set | status: open -> closed; resolution: not a bug; messages: + msg111832; stage: test needed -> (no value) |
2010-07-28 12:14:33 | eric.araujo | set | messages: + msg111793 |
2010-07-28 11:46:53 | zacktu | set | messages: + msg111792 |
2010-07-28 11:03:56 | eric.araujo | set | nosy: + eric.araujo; title: csv.writer -> Extraneous newlines with csv.writer on Windows; messages: + msg111787; stage: test needed |
2010-07-28 08:07:06 | skip.montanaro | set | status: closed -> open; assignee: skip.montanaro; messages: + msg111773 |
2010-07-27 11:12:12 | baloan | set | nosy: + baloan; messages: + msg111694 |
2009-10-24 23:11:18 | zacktu | set | status: open -> closed |
2009-10-24 23:10:41 | zacktu | set | messages: + msg94441 |
2009-10-24 22:52:30 | skip.montanaro | set | nosy: + skip.montanaro; messages: + msg94438 |
2009-10-24 22:08:39 | zacktu | set | messages: + msg94437 |
2009-10-24 22:07:03 | zacktu | create | |