classification
Title: sqlite3 row factory and multiprocessing map
Type: crash
Stage: resolved
Components:
Versions: Python 2.7

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ned.deily, sbt, tokeefe
Priority: normal
Keywords:

Created on 2013-08-19 15:33 by tokeefe, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg195641 - Author: Timothy O'Keefe (tokeefe) Date: 2013-08-19 15:33
Running the following code produces a segfault. The crash goes away if you either a) remove the row factory or b) use the built-in map() instead of multiprocessing.Pool.map.

#!/usr/bin/env python

import sqlite3
import multiprocessing as mp

def main():
    ## --- create a database
    conn = sqlite3.connect(':memory:')
    conn.row_factory = sqlite3.Row
    c = conn.cursor()

    ## --- create example table similar to python docs
    c.execute('''CREATE TABLE stocks (date text, trans text, symbol text, qty real, price real)''')
    c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','GOOG',100,869.29)")
    c.execute("INSERT INTO stocks VALUES ('1992-01-06','SELL','AAPL',20,512.99)")
    c.execute("SELECT * FROM stocks")

    ## --- map fun over cursor iterable (fun does nothing)
    pool = mp.Pool(processes=mp.cpu_count())
    features = pool.map(fun, c)

def fun(row):
    return row

if __name__ == "__main__":
    main()
msg195660 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-08-19 20:24
What platform are you running on? Please run the following script in the same environment where you get the segfault and report the results.

#!/usr/bin/env python

import multiprocessing
import platform
import sqlite3
import sys

print(sys.version)
print(sqlite3.version)
print(sqlite3.sqlite_version)
print(multiprocessing.__version__)
print(multiprocessing.cpu_count())
print(platform.platform())

For what it's worth, I was not able to reproduce this behavior in several different environments. Also, what happens if you add

    pool.close()
    pool.join()

following the pool.map call?
msg195666 - Author: Timothy O'Keefe (tokeefe) Date: 2013-08-19 20:46
Could you change the dummy function to do something other than simply return the row? For example:

#!/usr/bin/env python

import sqlite3
import multiprocessing as mp

def main():
    ## --- create a database
    conn = sqlite3.connect(':memory:')
    conn.row_factory = sqlite3.Row
    c = conn.cursor()

    ## --- create example table similar to python docs
    c.execute('''CREATE TABLE stocks (date text, trans text, symbol text, qty real, price real)''')
    c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','GOOG',100,869.29)")
    c.execute("INSERT INTO stocks VALUES ('1992-01-06','SELL','AAPL',20,512.99)")
    c.execute("SELECT * FROM stocks")

    ## --- map fun over cursor (fun does little to nothing)
    pool = mp.Pool(processes=mp.cpu_count())
    rows = pool.map(fun, c)

def fun(row):
    _ = len(row)
    return row

if __name__ == "__main__":
    main()

In this example the code simply hangs. For whatever reason, I can no longer provoke the segfault.

Regardless, there is something going on. Here are the results from the print statements you asked for:

2.7.2 (default, Oct 11 2012, 20:14:37) 
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)]
2.6.0
3.7.12
0.70a1
4
Darwin-12.4.0-x86_64-i386-64bit

I have also tried this on Ubuntu 12.04:

2.7.3 (default, Apr 10 2013, 06:20:15) 
[GCC 4.6.3]
2.6.0
3.7.9
0.70a1
4
Linux-3.2.0-24-generic-x86_64-with-Ubuntu-12.04-precise
msg195667 - Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-08-19 20:53
Adding the line

    features[0][0]

to the end of main() produces a segfault for me on Linux.

The SQLite FAQ says that

    Under Unix, you should not carry an open SQLite database across a 
    fork() system call into the child process. Problems will result if 
    you do.

-- see http://www.sqlite.org/faq.html.  So assuming you are using Unix, this is probably not a Python bug.

(And anyway, I would not expect the pickling of row objects to work correctly.)
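
One way to sidestep both the fork and the pickling concerns is to read every row in the parent, convert each sqlite3.Row to a plain dict, close the connection, and only then create the pool. The following is a minimal sketch under those assumptions, not a confirmed fix for the original report; the helper name load_rows and the arithmetic in fun are illustrative and do not come from the thread.

```python
import multiprocessing as mp
import sqlite3


def load_rows():
    """Read all rows in the parent process and return them as plain dicts."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    try:
        c = conn.cursor()
        c.execute("CREATE TABLE stocks "
                  "(date text, trans text, symbol text, qty real, price real)")
        c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','GOOG',100,869.29)")
        c.execute("INSERT INTO stocks VALUES ('1992-01-06','SELL','AAPL',20,512.99)")
        c.execute("SELECT * FROM stocks")
        # dict(row) copies the values out of the sqlite3.Row, so the result
        # no longer refers to the cursor or the connection.
        return [dict(row) for row in c.fetchall()]
    finally:
        # Close before any worker is started so no open database is inherited.
        conn.close()


def fun(row):
    # Illustrative per-row work on a plain dict (an assumption, not from the thread).
    return row["symbol"], row["qty"] * row["price"]


def main():
    rows = load_rows()
    pool = mp.Pool(processes=2)
    try:
        return pool.map(fun, rows)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(main())
```

Only plain dicts cross the process boundary here, so nothing tied to the SQLite connection has to be pickled.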
msg195729 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-08-20 23:40
I agree with Richard's comments.  The crash appears to be a result of an unsupported usage of SQLite and one that Python can't really protect you from.
History
Date                 User       Action  Args
2022-04-11 14:57:49  admin      set     github: 62982
2013-08-20 23:40:27  ned.deily  set     status: open -> closed; resolution: not a bug; messages: + msg195729; stage: resolved
2013-08-19 20:53:26  sbt        set     nosy: + sbt; messages: + msg195667
2013-08-19 20:46:08  tokeefe    set     messages: + msg195666
2013-08-19 20:24:31  ned.deily  set     nosy: + ned.deily; messages: + msg195660
2013-08-19 15:33:25  tokeefe    create