Author cheryl.sabella
Recipients cheryl.sabella, serhiy.storchaka, terry.reedy
Date 2018-02-24.17:30:55
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1519493455.63.0.467229070634.issue32880@psf.upfronthosting.co.za>
In-reply-to
Content
For msg312444 on StringTranslatePseudoMapping:

I ran a comparison of the current class vs defaultdict.

This was my test:
```
with open('/home/cheryl/cpython/Lib/idlelib/pyparse.py') as fd:
    code = fd.read()
copies = 20000
code *= copies
for i in range(10):
    start = time.time()
    trans_code = code.translate(_tran)
    end = time.time()
    print(f'copies: {copies}   translate time: {end - start}')
```

I ran the current `_tran` first and saved the results.
```
_tran = {}
_tran.update((ord(c), ord('(')) for c in "({[")
_tran.update((ord(c), ord(')')) for c in ")}]")
_tran.update((ord(c), ord(c)) for c in "\"'\\\n#")
_tran = StringTranslatePseudoMapping(_tran, default_value=ord('x'))
```
I know time.time isn't perfect, but thought it might be good enough.  The results:
copies: 20000   translate time: 0.8162932395935059
copies: 20000   translate time: 0.610985517501831
copies: 20000   translate time: 0.8164870738983154
copies: 20000   translate time: 0.6125986576080322
copies: 20000   translate time: 0.8143167495727539
copies: 20000   translate time: 0.612929105758667
copies: 20000   translate time: 0.8299245834350586
copies: 20000   translate time: 0.6127865314483643
copies: 20000   translate time: 0.812185525894165
copies: 20000   translate time: 0.6151354312896729


Then I changed it to a defaultdict:
```
_tran = defaultdict(lambda: 'x')
 # _tran = {}
_tran.update((ord(c), ord('(')) for c in "({[")
_tran.update((ord(c), ord(')')) for c in ")}]")
_tran.update((ord(c), ord(c)) for c in "\"'\\\n#")
# _tran = StringTranslatePseudoMapping(_tran, default_value=ord('x'))
```

I compared the results to make sure the defaultdict produced the same output as the mapping, which it did.

The results:
copies: 20000   translate time: 0.8172969818115234
copies: 20000   translate time: 0.6214878559112549
copies: 20000   translate time: 0.8143007755279541
copies: 20000   translate time: 0.6127951145172119
copies: 20000   translate time: 0.8154017925262451
copies: 20000   translate time: 0.6144123077392578
copies: 20000   translate time: 0.8128812313079834
copies: 20000   translate time: 0.6167266368865967
copies: 20000   translate time: 0.8143749237060547
copies: 20000   translate time: 0.6116495132446289


So, the results are simliar, down to the alternating of .61ish and .81ish times.
History
Date User Action Args
2018-02-24 21:21:33terry.reedyunlinkissue32880 messages
2018-02-24 17:30:55cheryl.sabellasetrecipients: + cheryl.sabella, terry.reedy, serhiy.storchaka
2018-02-24 17:30:55cheryl.sabellasetmessageid: <1519493455.63.0.467229070634.issue32880@psf.upfronthosting.co.za>
2018-02-24 17:30:55cheryl.sabellalinkissue32880 messages
2018-02-24 17:30:55cheryl.sabellacreate