classification
Title: Can't get ConfigParser.write to write unicode strings
Type: behavior Stage:
Components: Extension Modules Versions: Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: lukasz.langa Nosy List: Eugene.Klimov, georg.brandl, lukasz.langa, neo gurb, r.david.murray, rhettinger, the_isz, vstinner
Priority: normal Keywords:

Created on 2011-03-18 17:08 by the_isz, last changed 2016-02-01 21:40 by neo gurb. This issue is now closed.

Files
File name Uploaded Description
test.py the_isz, 2011-03-18 17:08 Script reproducing the problem
Messages (17)
msg131339 - (view) Author: (the_isz) Date: 2011-03-18 17:08
Hey everyone,

I'm having issues writing unicode strings with ConfigParser.write. I don't know
if this is Python's fault or my own, but I couldn't find help on this, either by
googling or by asking on the Python IRC channels.

Attached to this description I'll add an example script reproducing the error
and hope that someone will enlighten me on what I'm doing wrong.

It seems that this only occurs in Python 2; doing the same with Python 3 (omitting
the u before the unicode string) works fine.

Thanks in advance!
msg131341 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-18 17:25
>>> str(u"\u0411")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0411' in position 0: ordinal not in range(128)

So, clearly configparser in 2.x doesn't support unicode.  Now the question is, is this a bug or would adding support be a feature? (If the latter, it can't be fixed in 2.7.)  I'll leave the answer to that question up to Łukasz.
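The traceback above comes from the implicit ASCII encoding that Python 2 applies when `str()` is called on a `unicode` object. A minimal sketch of the same mechanism follows, written for Python 3, where the encoding step has to be spelled out. The helper name `to_ascii_bytes` is hypothetical and only stands in for what Python 2 does behind the scenes.

```python
def to_ascii_bytes(text):
    """Encode text as ASCII, roughly what Python 2's str(u"...") does implicitly."""
    return text.encode("ascii")


# Pure ASCII text encodes without trouble.
ascii_result = to_ascii_bytes("plain")

# U+0411 (Cyrillic capital BE) has no ASCII representation, so encoding fails.
try:
    to_ascii_bytes("\u0411")
    error_name = None
except UnicodeEncodeError as exc:
    error_name = type(exc).__name__

# Choosing an encoding that covers the character works.
utf8_result = "\u0411".encode("utf-8")
```

This suggests the failure is less about ConfigParser itself and more about where the text-to-bytes conversion happens and which codec it uses.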
msg131342 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-03-18 17:32
> Now the question is, is this a bug 
> or would adding support be a feature?

That may be a good question for python-dev.

Since ConfigParser is a very old module,
if there were a pressing need, we probably
would have heard about it before now.
msg131346 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-18 18:09
Well, python3 is probably pushing some people to try to add better unicode support to their python2 versions.  I think it is more a question of "is this an easy fix?" or would it require extensive changes to support unicode properly.  If it is easy (i.e., it is really just a bug in configparser's string handling) I don't see a downside to fixing it.
msg131353 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-03-18 18:36
It's also a question of adding a feature to a point release.  If devs relied on the new feature, they would also have to test for version > 2.7.1.  We usually try to avoid that (after a minor fiasco with booleans many years ago).
msg131362 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-18 20:15
I understand what you are saying, and I thought about that, too; but you could say the same thing about any bug fix that makes code work that didn't work before, yet we don't.  So I guess you are right that it should be discussed on python-dev.
msg131363 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-03-18 21:06
My vote would be that this is a new feature and therefore ineligible for 2.x.
msg133074 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2011-04-05 18:46
In my opinion we should unfortunately close this with WONTFIX because:

* adding a feature in a point release is not an option
* this may be poorly documented but most of the standard library in 2.x assumes bytestrings (and fails with Unicode strings). The same goes for csv, for instance.
* storing encoded data internally allows direct reading from and writing to compatible files
* having Unicode instead requires the possibility to specify an `encoding` parameter to read* and write* methods. I added this parameter in 3.2 exactly for this reason.

So, unless I missed something obvious and there's more discussion, I am closing this issue next week.
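As an illustration of the `encoding` point, here is a small sketch for Python 3.2 or later that reads a UTF-8 encoded INI file through `ConfigParser.read(..., encoding=...)`. The file name `settings.ini`, the section name, and the option name are made up for the example.

```python
import configparser
import os
import tempfile

# Create a UTF-8 encoded INI file by hand (file and option names are illustrative).
path = os.path.join(tempfile.mkdtemp(), "settings.ini")
with open(path, "w", encoding="utf-8") as handle:
    handle.write("[section]\nname = \u0411\n")

parser = configparser.ConfigParser()
# read() takes an explicit encoding, so the result does not depend on the locale.
loaded = parser.read(path, encoding="utf-8")
value = parser.get("section", "name")
```

Note that `write()` takes an already opened file object, so on the writing side the encoding is chosen when the file is opened rather than passed to `write()` itself.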
msg133080 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-04-05 19:25
No, the need for an encoding parameter for read/write makes it unambiguous that it is indeed a feature.
msg133111 - (view) Author: (the_isz) Date: 2011-04-06 06:42
Well, the only thing I can add to this is that the json module (which I ended up
using) supports unicode with no problem. So I think the argument that most of
the standard library in 2.x assumes bytestrings is a bit... shaky.

Other than that, I can follow your reasoning and will just hope that all my
needed external libraries will soon make it to python 3 so I don't have to fight
such inconsistencies anymore :)
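For reference, the json approach mentioned here might look roughly like the sketch below. It is written for Python 3, and the `settings` dictionary is invented for illustration; it is not the data from the attached test.py.

```python
import json

# Hypothetical settings that would otherwise have gone into an INI file.
settings = {"section": {"name": "\u0411"}}

# By default json escapes non-ASCII characters as \uXXXX sequences.
escaped = json.dumps(settings)

# ensure_ascii=False keeps the character itself in the output text.
text = json.dumps(settings, ensure_ascii=False)

# Either form loads back to the same data.
restored = json.loads(text)
restored_from_escaped = json.loads(escaped)
```

Because the default output is pure ASCII, the escaped form sidesteps the encoding question entirely, which may be part of why json handles unicode without trouble.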
msg133117 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2011-04-06 08:53
As another core dev aptly said, most standard library Unicode support is probably accidental. As for `json`, this is one of the "newest" additions to stdlib, introduced in Python 2.6 (released at the same time as Python 3.0). Not the best example if you compare it to libraries that were added in 1.5 or so.
msg133118 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-06 09:36
> I think it is more a question of "is this an easy fix?" 
> or would it require extensive changes to support unicode properly.

First of all, the question is: who would like to develop it. You can vote for an issue, but it doesn't change anything if you don't find a developer to implement your idea :-)

Anyway, try to use Python everywhere in Python 2 is a waste of time. The work has already been done in Python 3, and it's much easier to use Unicode in Python 3, just because everything uses Unicode (exceptions, filenames, file content, modules, etc.).
msg133123 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-06 12:30
> Anyway, try to use Python everywhere in Python 2 is a waste of time.

Oh... I mean "use Unicode in Python 2"
msg134843 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2011-04-30 07:33
the_isz, for serious Unicode support you might try using the configparser 3.2 backport:

http://pypi.python.org/pypi/configparser
msg134847 - (view) Author: (the_isz) Date: 2011-04-30 08:52
Thanks for the hint, Łukasz, but like I stated earlier, I already worked around
this problem by using the json module instead.
msg187829 - (view) Author: Eugene Klimov (Eugene.Klimov) * Date: 2013-04-26 05:51
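A possible workaround:

import codecs
import configparser

cfg = configparser.ConfigParser()
# Open the file through codecs so that text is encoded as UTF-8 on write,
# and use a with block so the file is closed afterwards.
with codecs.open('filename', 'wb+', 'utf-8') as f:
    cfg.write(f)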
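A slightly fuller sketch of the same idea, assuming Python 3 (or the `configparser` backport, which mirrors this API): write through a file opened with an explicit encoding, then read the file back to check the round trip. The section name, option name, and the file name `out.ini` are made up for the example.

```python
import configparser
import io
import os
import tempfile

cfg = configparser.ConfigParser()
cfg["section"] = {"name": "\u0411"}  # illustrative section and option

path = os.path.join(tempfile.mkdtemp(), "out.ini")

# The text-mode file object performs the UTF-8 encoding on write.
with io.open(path, "w", encoding="utf-8") as handle:
    cfg.write(handle)

# Inspect the raw bytes that ended up on disk.
with io.open(path, "rb") as handle:
    raw = handle.read()

# Read the file back with the same encoding.
check = configparser.ConfigParser()
check.read(path, encoding="utf-8")
```

`io.open` with an `encoding` argument plays the same role as `codecs.open` here; this does not change the behavior of the Python 2.7 standard library `ConfigParser` module discussed in this issue.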
msg259345 - (view) Author: neo gurb (neo gurb) Date: 2016-02-01 21:40
In file ConfigParser.py, def write, replace line

    key = " = ".join((key, str(value).replace('\n', '\n\t')))

by

    key = " = ".join((key, str(value).decode('utf-8').replace('\n', '\n\t')))

Tested in Windows7 and Ubuntu 15.04.
History
Date User Action Args
2016-02-01 21:40:54 neo gurb set nosy: + neo gurb; messages: + msg259345
2013-04-26 05:51:33 Eugene.Klimov set nosy: + Eugene.Klimov; messages: + msg187829
2011-04-30 08:52:44 the_isz set messages: + msg134847
2011-04-30 07:33:56 lukasz.langa set messages: + msg134843
2011-04-06 12:30:51 vstinner set messages: + msg133123
2011-04-06 09:36:09 vstinner set nosy: + vstinner; messages: + msg133118
2011-04-06 08:53:50 lukasz.langa set messages: + msg133117
2011-04-06 06:42:02 the_isz set messages: + msg133111
2011-04-05 19:25:07 r.david.murray set status: open -> closed; resolution: wont fix; messages: + msg133080
2011-04-05 18:46:41 lukasz.langa set messages: + msg133074
2011-03-18 21:06:45 georg.brandl set nosy: + georg.brandl; messages: + msg131363
2011-03-18 20:15:30 r.david.murray set nosy: rhettinger, r.david.murray, lukasz.langa, the_isz; messages: + msg131362
2011-03-18 18:36:03 rhettinger set nosy: rhettinger, r.david.murray, lukasz.langa, the_isz; messages: + msg131353
2011-03-18 18:09:11 r.david.murray set nosy: rhettinger, r.david.murray, lukasz.langa, the_isz; messages: + msg131346
2011-03-18 17:32:33 rhettinger set nosy: + rhettinger; messages: + msg131342
2011-03-18 17:25:35 r.david.murray set assignee: lukasz.langa; messages: + msg131341; nosy: + r.david.murray, lukasz.langa
2011-03-18 17:08:03 the_isz create