Issue 1767398: test_csv struni fixes + unicode support in _csv



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/45275

classification

Title:	test_csv struni fixes + unicode support in _csv
Type:		Stage:
Components:	None	Versions:	Python 3.0

process

Status:	closed	Resolution:	accepted
Dependencies:		Superseder:
Assigned To:	gvanrossum	Nosy List:	gvanrossum, hupp, skip.montanaro
Priority:	normal	Keywords:	patch

Created on 2007-08-04 00:11 by hupp, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
py3k-struni-csv.patch	hupp, 2007-08-04 00:11

Messages (4)
msg52984	Author: Adam Hupp (hupp)	Date: 2007-08-04 00:11
This patch fixes test_csv.py for the struni branch and modifies _csv.c to support unicode strings.

Changes:

1. The test_csv.py failures caused by bytes/str conflicts have been resolved.
2. Uses of mkstemp have been replaced with TemporaryFile in a 'with' block.
3. The _csv.c module now uses unicode for string handling.

I've uncommented the unicode read tests in test_csv.py, and added tests for writing unicode content and a unicode delimiter. All tests are now passing on my system (linux).
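
The kind of test described above (unicode content and a unicode delimiter, written through a TemporaryFile inside a 'with' block) might look roughly like the following. This is only a sketch against the current Python 3 csv module, not code from the patch: the helper name roundtrip, the sample rows, and the choice of U+039B as delimiter are all assumptions made for illustration.

```python
import csv
from tempfile import TemporaryFile

# Hypothetical sample data containing non-ASCII text.
rows = [["name", "city"], ["Zo\u00eb", "Krak\u00f3w"], ["a\u20acb", "plain"]]


def roundtrip(rows, delimiter=","):
    """Write rows to a temporary file, then read them back."""
    # The temporary file is closed and removed when the 'with' block exits,
    # so no manual cleanup is needed (unlike with mkstemp).
    with TemporaryFile("w+", encoding="utf-8", newline="") as f:
        csv.writer(f, delimiter=delimiter).writerows(rows)
        f.seek(0)
        return list(csv.reader(f, delimiter=delimiter))


# Unicode content with the default delimiter.
comma_result = roundtrip(rows)

# Unicode content with a non-ASCII, single-character delimiter.
unicode_delim_result = roundtrip(rows, delimiter="\u039b")
```

Both calls should return the rows that were written, which is the property such a test would assert.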
msg52985	Author: Skip Montanaro (skip.montanaro)	Date: 2007-08-05 13:07
Adam,

I've spent some time looking at this patch. Bear in mind this is my first foray into Py3k. Still, I'm confused about what's going on here. I'm hoping you can help me understand the changes.

In parse_save_field, you replaced PyString_FromStringAndSize with PyUnicode_FromUnicode, however in get_nullchar_as_None you replaced it with PyUnicode_DecodeASCII.

When I execute the csv tests there are a number of assertion errors related to the default delimiter. The traceback goes something like this:

    FAIL: test_writer_kw_attrs (__main__.Test_Csv)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "Lib/test/test_csv.py", line 88, in test_writer_kw_attrs
        self._test_kw_attrs(csv.writer, StringIO())
      File "Lib/test/test_csv.py", line 75, in _test_kw_attrs
        self.assertEqual(obj.dialect.delimiter, ':')
    AssertionError: s'\x00' != ':'

Any idea how to solve that? It looks to me like some Unicode buffer might be getting interpreted as a char *, but I'm not sure.

Skip
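
For readers without the test suite at hand, the failing assertion boils down to something like the sketch below. It is a reduced illustration written against the current csv module, not the actual body of _test_kw_attrs; the variable names are assumptions.

```python
import csv
from io import StringIO

# Construct a writer with a non-default delimiter, as the test does.
writer = csv.writer(StringIO(), delimiter=":")

# The dialect attached to the writer should report the delimiter
# exactly as it was passed in.
writer_delimiter = writer.dialect.delimiter

# The same property is expected to hold for reader objects.
reader = csv.reader(StringIO(), delimiter=":")
reader_delimiter = reader.dialect.delimiter
```

In the failure quoted above, the value read back from the dialect was '\x00' instead of ':'.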
msg52986	Author: Adam Hupp (hupp)	Date: 2007-08-05 16:39
Skip,

I think the error you're seeing is being caused by a conversion from Py_UNICODE -> char -> unicode through get_nullchar_as_None. That function should look like this:

    static PyObject *
    get_nullchar_as_None(Py_UNICODE c)
    {
        if (c == '\0') {
            Py_INCREF(Py_None);
            return Py_None;
        }
        else
            return PyUnicode_FromUnicode((Py_UNICODE*)&c, 1);
    }

Unfortunately I'm on the road right now so I can't test it. Is there something I need to do with my build to trigger those assertions? I didn't see them.
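
Seen from Python, the "null character means None" convention of this helper can be observed on dialect attributes such as escapechar. The sketch below assumes the behaviour of the current csv module, where an unset escapechar is reported as None and a set one as a one-character str; it does not exercise the C function directly, and the variable names are illustrative.

```python
import csv
from io import StringIO

# No escapechar given: the dialect reports None rather than a NUL character.
default_writer = csv.writer(StringIO())
no_escape = default_writer.dialect.escapechar

# An explicit escapechar is reported back as a one-character str.
escaped_writer = csv.writer(StringIO(), escapechar="\\")
with_escape = escaped_writer.dialect.escapechar
```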
msg52987	Author: Guido van Rossum (gvanrossum)	Date: 2007-08-06 19:33
This looked good enough to submit. I had to clean up the whitespace use in the C code. Next time, please set your tabs to 8 spaces when editing C code. Also try to conform to the surrounding code's use of spaces or tabs (unfortunately this file is inconsistent and sometimes uses spaces, other times tabs; that's worth a separate cleanup).

Committed revision 56777.

History
Date	User	Action	Args
2022-04-11 14:56:25	admin	set	github: 45275
2008-01-06 22:29:45	admin	set	keywords: - py3k versions: + Python 3.0
2007-08-04 00:11:43	hupp	create