Message258264
Eric, Steven, thank you for your feedback so far.
I am using Windows 7 on an Intel i7.
That one particular file of 6.5MB took ~1 minute on my machine.
When I ran that same test on Linux with Python 3.5.1, it took about 3 seconds. I was amazed to see a 20x difference.
Steven suggested that this phenomenon might be specific to Windows, and I agree that is what it looks like. Or is Python doing something in the background?
The Python script is straightforward: a loop reads a line from a CSV file, splits the column values, and saves each value as '<value>' to another file. Basically it builds an SQL statement.
I have had no issues until I added the encapsulating single quotes around the value.
I can reproduce this performance difference at will by alternating which line I comment out, which leads me to believe it cannot be the HDD, AV, or anything else outside the Python script interfering.
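For illustration only, the kind of loop described above might look like the sketch below. The actual script is not shown in this message, so the helper name `quote_values`, the `use_quotes` switch, the comma delimiter, and the in-memory files are all assumptions; the two branches stand in for the two lines that get commented in and out.

```python
import io


def quote_values(lines, use_quotes=True):
    """Yield one comma-joined row per input line (hypothetical helper)."""
    for line in lines:
        values = line.rstrip("\n").split(",")
        if use_quotes:
            # Variant with encapsulating single quotes (the one reported as slow).
            row = ["'{0}'".format(v) for v in values]
        else:
            # Variant without quotes (the one reported as fast).
            row = ["{0}".format(v) for v in values]
        yield ",".join(row)


# In-memory stand-ins for the CSV input and the output file (assumption).
source = io.StringIO("1,abc,2.5\n2,def,3.5\n")
target = io.StringIO()
for row in quote_values(source):
    target.write(row + "\n")
print(target.getvalue(), end="")
```

Swapping `use_quotes` between True and False mimics alternating the commented-out line while leaving the file handling unchanged.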
I repeated the simplified test that I ran earlier on a Linux system, this time on my Windows system.
I don't see anything spectacular.
I am just puzzled that using one statement or the other causes such a huge performance impact somehow.
I will try some more tests and copy your examples.
import time
loopcount = 10000000
# Using string value
s="test 1"
v="test 1"
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "{0}".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End {0}: {1}".format(s,end_ms))
print("Diff {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 1: 1452828394523
# End test 1: 1452828397957
# Diff test 1: 3434 ms
s="test 2"
v="test 2"
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'%s'" % (v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End {0}: {1}".format(s,end_ms))
print("Diff {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 2: 1452828397957
# End test 2: 1452828401233
# Diff test 2: 3276 ms
s="test 3"
v="test 3"
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'{0}'".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End {0}: {1}".format(s,end_ms))
print("Diff {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 3: 1452828401233
# End test 3: 1452828406320
# Diff test 3: 5087 ms
# Using integer value
s="test 4"
v=123456
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "{0}".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End {0}: {1}".format(s,end_ms))
print("Diff {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 4: 1452828406320
# End test 4: 1452828411378
# Diff test 4: 5058 ms
s="test 5"
v=123456
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'%s'" % (v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End {0}: {1}".format(s,end_ms))
print("Diff {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 5: 1452828411378
# End test 5: 1452828415264
# Diff test 5: 3886 ms
s="test 6"
v=123456
start_ms = int(round(time.time() * 1000))
for x in range(loopcount):
    y = "'{0}'".format(v)
end_ms = int(round(time.time() * 1000))
print("Start {0}: {1}".format(s,start_ms))
print("End {0}: {1}".format(s,end_ms))
print("Diff {0}: {1} ms\n\n".format(s,end_ms-start_ms))
# Start test 6: 1452828415264
# End test 6: 1452828421292
# Diff test 6: 6028 ms
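
The six hand-timed loops above could also be expressed more compactly with the standard `timeit` module, which repeats each measurement and makes it easy to take the best run. This is only a sketch: the function name `run_benchmarks`, the reduced iteration count, and the output layout are assumptions, and the `globals` argument to `timeit.Timer` requires Python 3.5 or later.

```python
import timeit

# The three format expressions from the tests above, each evaluated with `v`.
STATEMENTS = [
    '"{0}".format(v)',
    '''"'%s'" % (v)''',
    '''"'{0}'".format(v)''',
]
# One string value and one integer value, as in tests 1-3 and 4-6.
VALUES = ["test", 123456]


def run_benchmarks(number=100000, repeat=3):
    """Return {(statement, value type name): best time in seconds}."""
    results = {}
    for value in VALUES:
        for stmt in STATEMENTS:
            timer = timeit.Timer(stmt, globals={"v": value})
            # Take the minimum over several repeats to reduce noise.
            best = min(timer.repeat(repeat=repeat, number=number))
            results[(stmt, type(value).__name__)] = best
    return results


if __name__ == "__main__":
    for (stmt, kind), seconds in run_benchmarks().items():
        print("{0:<22} {1:<4} {2:.1f} ms".format(stmt, kind, seconds * 1000))
```

Running the same file on Windows and on Linux would give directly comparable per-statement timings without the start and end timestamp bookkeeping.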
Date | User | Action | Args
2016-01-15 03:29:53 | poostenr | set | recipients: + poostenr, eric.smith, steven.daprano, ubehera
2016-01-15 03:29:53 | poostenr | set | messageid: <1452828593.65.0.0816975415113.issue26118@psf.upfronthosting.co.za>
2016-01-15 03:29:53 | poostenr | link | issue26118 messages
2016-01-15 03:29:52 | poostenr | create |