Title: Decoding unicode not supported after upd to 2.7.17 [possible pymysql related?]
Components: macOS Versions: Python 2.7
Status: closed Resolution: not a bug
Created on 2019-11-26 22:45 by Sebastian Szwarc

populateTable.txt Sebastian Szwarc, 2019-11-26 22:45
msg357537 - Author: Sebastian Szwarc (Sebastian Szwarc) Date: 2019-11-26 22:45
This is a follow-up to my recent bug report about a segmentation fault.
I installed Python 2.7.17 on macOS Mojave.
Because MySQLdb could not be installed with pip for an unknown reason (an "SSL required" error), I used PyMySQL and changed the import line to `import pymysql as MySQLdb`.

There is no segmentation fault for now (which suggests there may be a bug in the older Python interpreter), but there is a new problem.
The following line worked fine for 2+ years:

colVals = unicode(", ".join(stringList), 'utf-8')

However, now I get this error:
2019-11-26 23:25:55,273 [INFO]: Beginning incremental ingest of epf_video_price (200589 records)
Traceback (most recent call last):
  File "", line 453, in <module>
  File "", line 436, in main
  File "", line 221, in doImport
  File "/Users/sebastian/Documents/test2/", line 111, in ingest
  File "/Users/sebastian/Documents/test2/", line 206, in ingestIncremental
  File "/Users/sebastian/Documents/test2/", line 375, in _populateTable
    colVals = unicode(", ".join(stringList), 'utf-8')
TypeError: decoding Unicode is not supported

So my questions are:
1. Why is decoding Unicode not supported, if it previously was and worked fine?
2. Is this a Python issue, or is PyMySQL enforcing some rule?

For reference, I attached the populateTable function.
msg357539 - Author: Steven D'Aprano (steven.daprano) (Python committer) Date: 2019-11-26 23:33
Hi Sebastian,

It will help if you do some minimal debugging before reporting what you think is a bug. Also, you should report what version you are upgrading from, not just the version you have upgraded to.

It may help you to provide better bug reports if you read this:

"Decoding unicode" does not make sense and never did, and hasn't been supported since at least version 2.4:

    py> unicode(u'a', 'utf-8')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: decoding Unicode is not supported

One *encodes* Unicode to bytes, and *decodes* bytes to Unicode.
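
As a minimal illustration of that direction rule (a sketch, not code from the attached script; the sample string is made up, and it is written so it runs on both Python 2.7 and Python 3):

```python
# Text -> bytes is "encode"; bytes -> text is "decode".
text = u'caf\xe9'                  # a text (Unicode) string: "cafe" with e-acute
data = text.encode('utf-8')        # encode: Unicode text -> byte-string
roundtrip = data.decode('utf-8')   # decode: byte-string -> Unicode text

# The UTF-8 encoding of U+00E9 is the two bytes 0xC3 0xA9.
assert data == b'caf\xc3\xa9'
assert roundtrip == text
```

Calling `unicode(x, 'utf-8')` in Python 2 is the "decode" direction, so it only makes sense when `x` is a byte-string.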

What I believe is happening is that somewhere, somehow, your ``stringList`` variable has a Unicode string object in it, rather than all byte-strings. Calling `', '.join(stringList)` returns a Unicode string if any item in the list is Unicode.
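
If that is the cause, one possible workaround is to decode only the items that are still byte-strings and then join the results as text. The following is a hedged sketch under that assumption: `to_text`, `string_list`, and the sample values are hypothetical names for illustration, not taken from the attached function, and the code is written to run on both Python 2.7 and Python 3.

```python
text_type = type(u'')  # unicode on Python 2, str on Python 3


def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it only if it is a byte-string."""
    if isinstance(value, bytes):   # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value                   # already text: do not decode again


# Hypothetical mixed input: one byte-string and one Unicode string.
string_list = [b'abc', u'd\xe9f']

# Report which items are already Unicode (the kind of minimal debugging
# that would locate the offending value).
already_text = [item for item in string_list if isinstance(item, text_type)]

# Join as text after normalising every item.
col_vals = u', '.join(to_text(item) for item in string_list)
```

Whether the driver returns byte-strings or Unicode strings depends on how the connection is configured, so checking the item types first is worthwhile before changing the ingest code.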

I'm closing this as "Not a bug" as it is not a bug in the language.
