classification
Title: Abort in _csv module
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: skrah Nosy List: Arfrever, asvetlov, georg.brandl, larry, pitrou, python-dev, r.david.murray, rogerbinns, serhiy.storchaka, skrah, vstinner
Priority: release blocker Keywords: 3.3regression, patch

Created on 2012-10-05 18:48 by Arfrever, last changed 2013-03-25 16:19 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description Edit
issue16145.diff skrah, 2012-10-06 14:29 review
Messages (28)
msg172110 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2012-10-05 18:48
Commit f2adbb1065eb introduced abort in _csv module in debug builds in Python 3.3, when using APSW shell.

(You need to have SQLite >=3.7.14 installed.)

$ cd /tmp
$ wget http://apsw.googlecode.com/files/apsw-3.7.14-r2.zip
$ unzip apsw-3.7.14-r2.zip
$ cd apsw-3.7.14-r2
$ python3.3dm setup.py build
...
$ PYTHONPATH="$(ls -d build/lib*)" python3.3dm tools/shell.py
SQLite version 3.7.14.1 (APSW 3.7.14-r2)
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .output output_file
sqlite> .mode csv
sqlite> select "a";
python3.3dm: /tmp/cpython/Modules/_csv.c:1111: join_append: Assertion `(((PyASCIIObject*)field)->state.ready)' failed.
Aborted
$ 

In bash shell:
$ PYTHONPATH="$(ls -d build/lib*)" python3.3dm tools/shell.py <<< $'.output output_file\n.mode csv\nselect "a";'
python3.3dm: /tmp/cpython/Modules/_csv.c:1111: join_append: Assertion `(((PyASCIIObject*)field)->state.ready)' failed.
Aborted
$
msg172114 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-10-05 19:00
Well, can you try to enable core dumps and get a complete stack trace?
msg172130 - (view) Author: Roger Binns (rogerbinns) Date: 2012-10-05 20:20
I'm the APSW author.  You do not need SQLite installed - APSW's setup can fetch the current SQLite and use it privately not affecting the rest of the system.

An easier way of testing is:

  python3 setup.py fetch --sqlite --version 3.7.14 build_ext --inplace --force

Create test.sql with these contents

  .output output_file
  .mode csv
  select 3, '4';

Then run:

  env PYTHONPATH=. python3 tools/shell.py
  .read test.sql

The actual code where the problem happens does the following:

- Create a StringIO
- Initialize csv.writer using the StringIO
- Write one row with two values (this is when the crash happens in Python 3.3 on the first write)
- Copy the current contents of the StringIO to the actual output file changing encoding as needed
- Truncate StringIO and seek back to offset zero ready for the next line

The relevant code is in tools/shell.py in the output_csv function.  Writing just that sequence of code above doesn't result in the assertion.  valgrind doesn't show any problems either (using pydebug, without pymalloc and all the freelists set to zero).

A stack trace is here:  https://code.google.com/p/apsw/issues/detail?id=132#c4

If not using a debug/nopymalloc build then you get a crash happening later.
msg172134 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-10-05 20:39
In general APSW does not look ready for the Python 3.3 Unicode
implementation. Example: src/statementcache.c:217

    Py_UNICODE *out;
    PyObject *res=PyUnicode_FromUnicode(NULL, size);


Python 3.3 docs:

"If the buffer is NULL, PyUnicode_READY() must be called once the string content has been filled before using any of the access macros such as PyUnicode_KIND()."


The abort() in the traceback occurs precisely because PyUnicode_READY()
hasn't been called, so I'm closing this as invalid.
msg172136 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-05 20:41
How do you get values to write? Most likely, the problem has nothing to do with csv module, but written values are broken. Possible error in sqlite module, or in any conversion function, or in you APSW C-code. Can you dump this values before sending to writerow?
msg172137 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-05 20:43
Oh, I see the issue has already been solved.
msg172139 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-10-05 20:45
My understanding was that code that calls the public APIs should continue to work.  Is this a new requirement in 3.3, or is it that it has always been a requirement but code could get away without the ready before?  

Either way we need to add an note to the Porting section of What's New.
msg172140 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2012-10-05 20:48
Apparently http://docs.python.org/py3k/whatsnew/3.3.html#pep-393-flexible-string-representation says:
"On the C API side, PEP 393 is fully backward compatible."
msg172141 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-10-05 20:56
I'm going to reopen this at least until we have more information.
msg172143 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-10-05 20:59
R. David Murray <report@bugs.python.org> wrote:
> My understanding was that code that calls the public APIs should continue to
> work. Is this a new requirement in 3.3, or is it that it has always been a
> requirement but code could get away without the ready before?  

It's a new requirement.

> Either way we need to add an note to the Porting section of What's New.

Good point. The main hint I find is here:

http://docs.python.org/py3k/whatsnew/3.3.html#deprecated-functions-and-types-of-the-c-api

But you have to click through to:

http://docs.python.org/py3k/c-api/unicode.html#PyUnicode_FromUnicode

PEP 393 on the other hand has this paragraph:

"Extension modules using the legacy API may inadvertently call PyUnicode_READY,
 by calling some API that requires that the object is ready, and then continue
 accessing the (now invalid) Py_UNICODE pointer. Such code will break with this
 PEP. The code was already flawed in 3.2, as there is was no explicit guarantee
 that the PyUnicode_AS_UNICODE result would stay valid after an API call (due
 to the possibility of string resizing). Modules that face this issue need to
 re-fetch the Py_UNICODE pointer after API calls; doing so will continue to
 work correctly in earlier Python versions."
msg172151 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-10-05 21:31
David is right, the existing (APSW) code should continue to work.
I think we need to add a couple of PyUnicode_READY() to some
functions in Modules/_csv.c.
msg172153 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-10-05 22:26
"I think we need to add a couple of PyUnicode_READY() to some functions in Modules/_csv.c."

Oh yes, the _csv module was in my TODO list. I forgot to fix it :-/
msg172154 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-10-05 22:30
"In general APSW does not look ready for the Python 3.3 Unicode implementation."

I bet that most Python modules implemented in C use the "legacy" (old) Unicode API (Py_UNICODE*). Python 3.3 must call PyUnicode_READY() everywhere (where the new API is used to get attribute of a string, ex: PyUnicode_GET_LENGTH).
msg172161 - (view) Author: Roger Binns (rogerbinns) Date: 2012-10-06 02:18
(APSW author here).  I haven't ported to the new Python 3.3 Unicode handling yet but it is on my todo list.  The PEPs and doc said the C API would remain backwards compatible.  The APSW code supports Python 2.3 onwards.

SQLite APIs support both UTF-8 and UTF-16.  Py_UNICODE_SIZE is used to select between two code paths - if 2 then the UTF-16 APIs are used and if 4 then a UTF-8 conversion is made and those APIs used.  Python 3.3 makes Py_UNICODE_SIZE meaningless and will be quite a bit of work to integrate as a third branch.

It should be noted that at no point is there any dependence of the underlying bytes/pointer living a long time.  They are only used at the point which data needs to be transferred between Python types and SQLite APIs (both ways) and not held outside of that period.
msg172199 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-10-06 14:29
Here's a patch.


Victor, I guess I have a feature request for fusil: It would be nice
if fusil also generated legacy strings (or does it already do so?).
msg172200 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-10-06 14:34
> The PEPs and doc said the C API would remain backwards compatible.

Roger: Yes, you don't need to change anything. I was misled by the
PyUnicode_AsUnicode() docs, thinking that the responsibility to call
PyUnicode_READY() also applies if the generated Unicode Object is
passed to the C-API.
msg172231 - (view) Author: Roger Binns (rogerbinns) Date: 2012-10-06 18:23
Roughly how long will it be before Python 3.3.1 comes out?  This issue means my users will get garbage or crashes, so I'll need to work around it if it will be quite a while till 3.3.1.
msg172233 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-10-06 18:30
3.3.1 will not be too long: maybe 4 weeks from now.
msg172811 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-10-13 16:54
LGTM
msg174478 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-01 21:26
LGTM. But I left some minor comments in Rietveld.
msg174481 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-01 21:54
Hmm.  Is it a bugtracker bug?  I don't know how I added Larry to the nosy list and now I can not remove him.
msg174482 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-11-01 21:57
I don't know.  I was able to remove him.  (I have javascript turned off, don't know if that makes any difference).
msg174483 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2012-11-01 21:58
Probably Versions="Python 3.4" + Priority="release blocker" results in addition of 3.4 Release Manager (Larry Hastings) to nosy list.
msg174485 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-11-01 22:53
Roundup knows I'm the release manager for 3.4?  It's well-informed! ;-)
msg174514 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-02 14:01
New changeset 0b5f00ccf907 by Stefan Krah in branch '3.3':
Issue #16145: Support legacy strings in the _csv module.
http://hg.python.org/cpython/rev/0b5f00ccf907
msg174516 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-11-02 14:04
Andrew and Serhiy, thanks for the reviews. Should be fixed now.
msg185172 - (view) Author: Roger Binns (rogerbinns) Date: 2013-03-25 00:57
So is 3.3.1 with the fix ever going to be released?  Georg did predict mid-November and we are 4 months after that.
msg185206 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-03-25 16:19
It won't be too much longer.
History
Date User Action Args
2013-03-25 16:19:57r.david.murraysetmessages: + msg185206
2013-03-25 00:57:16rogerbinnssetmessages: + msg185172
2012-11-02 14:04:28skrahsetstatus: open -> closed
messages: + msg174516

assignee: skrah
resolution: fixed
stage: needs patch -> resolved
2012-11-02 14:01:43python-devsetnosy: + python-dev
messages: + msg174514
2012-11-01 22:53:09larrysetmessages: + msg174485
2012-11-01 21:58:05Arfreversetmessages: + msg174483
2012-11-01 21:57:53r.david.murraysetmessages: + msg174482
2012-11-01 21:57:21r.david.murraysetnosy: georg.brandl, pitrou, vstinner, larry, rogerbinns, Arfrever, r.david.murray, asvetlov, skrah, serhiy.storchaka
2012-11-01 21:54:19serhiy.storchakasetnosy: georg.brandl, pitrou, vstinner, larry, rogerbinns, Arfrever, r.david.murray, asvetlov, skrah, serhiy.storchaka
messages: + msg174481
2012-11-01 21:51:38serhiy.storchakasetnosy: georg.brandl, pitrou, vstinner, larry, rogerbinns, Arfrever, r.david.murray, asvetlov, skrah, serhiy.storchaka
2012-11-01 21:50:40serhiy.storchakasetnosy: georg.brandl, pitrou, vstinner, larry, rogerbinns, Arfrever, r.david.murray, asvetlov, skrah, serhiy.storchaka
2012-11-01 21:26:22serhiy.storchakasetnosy: + larry
messages: + msg174478
2012-10-13 16:54:23asvetlovsetnosy: + asvetlov
messages: + msg172811
2012-10-06 18:30:07georg.brandlsetnosy: + georg.brandl
messages: + msg172233
2012-10-06 18:27:06Arfreversetpriority: normal -> release blocker
components: + Extension Modules
2012-10-06 18:23:18rogerbinnssetmessages: + msg172231
2012-10-06 14:34:34skrahsetmessages: + msg172200
2012-10-06 14:29:57skrahsetfiles: + issue16145.diff
keywords: + patch
messages: + msg172199
2012-10-06 02:18:24rogerbinnssetmessages: + msg172161
2012-10-05 22:30:57vstinnersetmessages: + msg172154
2012-10-05 22:26:58vstinnersetnosy: + vstinner
messages: + msg172153
2012-10-05 21:31:44skrahsetmessages: + msg172151
2012-10-05 20:59:21skrahsetmessages: + msg172143
2012-10-05 20:56:40r.david.murraysetstatus: closed -> open
type: behavior
messages: + msg172141

resolution: not a bug -> (no value)
stage: resolved -> needs patch
2012-10-05 20:48:33Arfreversetmessages: + msg172140
2012-10-05 20:45:57r.david.murraysetnosy: + r.david.murray
messages: + msg172139
2012-10-05 20:43:03serhiy.storchakasetmessages: + msg172137
2012-10-05 20:41:01serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg172136
2012-10-05 20:39:01skrahsetstatus: open -> closed

nosy: + skrah
messages: + msg172134

resolution: not a bug
stage: resolved
2012-10-05 20:20:54rogerbinnssetmessages: + msg172130
2012-10-05 19:00:12pitrousetmessages: + msg172114
2012-10-05 18:48:17Arfrevercreate