msg131339 - (view) |
Author: (the_isz) |
Date: 2011-03-18 17:08 |
Hey everyone,
I'm having issues writing unicode strings with ConfigParser.write. I don't know
if this is python's fault or my own but I couldn't find help on this, neither by
googling, nor by asking on the python irc channels.
Attached to this description I'll add an example script reproducing the error
and hope that someone will enlighten me on what I'm doing wrong.
It seems that this only occurs in python2; doing the same with python3 (omitting
the u before the unicode string) works fine.
Thanks in advance!
|
msg131341 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2011-03-18 17:25 |
>>> str(u"\u0411")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0411' in position 0: ordinal not in range(128)
So, clearly configparser in 2.x doesn't support unicode. Now the question is, is this a bug or would adding support be a feature? (If the latter it can't be fixed in 2.7.) I'll leave the answer to that question up to Łukasz.
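The traceback above comes from Python 2 implicitly encoding a unicode object with the ASCII codec when str() is applied to it. The following is a minimal sketch of the same failure written for Python 3, where the encoding step has to be spelled out explicitly. The variable names are illustrative only and do not come from the attached script.

```python
# Python 3 sketch: Python 2's str(u"\u0411") implicitly encodes with ASCII;
# here the same encoding step is written out explicitly.
value = "\u0411"  # CYRILLIC CAPITAL LETTER BE

try:
    value.encode("ascii")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    reason = exc.reason  # the codec's explanation of the failure

# Encoding with a codec that covers the character succeeds.
encoded = value.encode("utf-8")
```

Any code path that calls str() on a non-ASCII unicode value in Python 2 hits the same error, regardless of how the output file was opened.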
|
msg131342 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2011-03-18 17:32 |
> Now the question is, is this a bug
> or would adding support be a feature?
That may be a good question for python-dev.
Since ConfigParser is a very old module,
if there were a pressing need, we probably
would have heard about it before now.
|
msg131346 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2011-03-18 18:09 |
Well, python3 is probably pushing some people to try to add better unicode support to their python2 versions. I think it is more a question of "is this an easy fix?" or would it require extensive changes to support unicode properly. If it is easy (ie: it is really just a bug in configparser's string handling) I don't see a downside to fixing it.
|
msg131353 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2011-03-18 18:36 |
It's also a question of adding a feature to a point release. If developers relied on the new feature, they would also have to test for version > 2.7.1. We usually try to avoid that (after a minor fiasco with booleans many years ago).
|
msg131362 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2011-03-18 20:15 |
I understand what you are saying, and I thought about that, too; but you could say the same thing about any bug fix that makes code work that didn't work before, yet we don't. So I guess you are right that it should be discussed on python-dev.
|
msg131363 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2011-03-18 21:06 |
My vote would be that this is a new feature and therefore ineligible for 2.x.
|
msg133074 - (view) |
Author: Łukasz Langa (lukasz.langa) * |
Date: 2011-04-05 18:46 |
In my opinion we should unfortunately close this with WONTFIX because:
* adding a feature in a point release is not an option
* this may be poorly documented but most of the standard library in 2.x assumes bytestrings (and fails with Unicode strings). The same goes for csv, for instance.
* storing encoded data internally allows reading from and writing to compatible files directly
* having Unicode instead requires the possibility to specify an `encoding` parameter to read* and write* methods. I added this parameter in 3.2 exactly for this reason.
So, unless I missed something obvious and there's more discussion, I am closing this issue next week.
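For readers on Python 3.2 or later, the round trip described in the list above might look like the sketch below. The section name, option name, and file name are hypothetical. In this sketch the encoding for reading is passed to read(), while for writing it is set on the file object handed to write().

```python
import configparser
import os
import tempfile

# Hypothetical section and option names, chosen for illustration.
cfg = configparser.ConfigParser()
cfg["section"] = {"name": "\u0411"}

path = os.path.join(tempfile.mkdtemp(), "example.ini")

# write() takes a text-mode file object; the encoding is chosen when opening it.
with open(path, "w", encoding="utf-8") as fp:
    cfg.write(fp)

# read() accepts an encoding argument (available since Python 3.2).
loaded = configparser.ConfigParser()
loaded.read(path, encoding="utf-8")
value = loaded["section"]["name"]
```

Because the parser works with text throughout, the non-ASCII value comes back unchanged as long as the same encoding is used on both sides.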
|
msg133080 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2011-04-05 19:25 |
No, the need for an encoding parameter for read/write makes it unambiguous that it is indeed a feature.
|
msg133111 - (view) |
Author: (the_isz) |
Date: 2011-04-06 06:42 |
Well, the only thing I can add to this is that the json module (which I ended up
using) supports unicode with no problem. So I think the argument that most of
the standard library in 2.x assumes bytestrings is a bit... shaky.
Other than that, I can follow your reasoning and will just hope that all my
needed external libraries will soon make it to python 3 so I don't have to fight
such inconsistencies anymore :)
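As a rough illustration of the json behaviour mentioned above, here is a small Python 3 sketch. The dictionary contents are made up for the example. By default json.dumps escapes non-ASCII characters, so the serialized text is plain ASCII and can be written to any file without an encoding error.

```python
import json

# Hypothetical data, chosen for illustration.
data = {"name": "\u0411"}

# By default json.dumps escapes non-ASCII characters, so the output is pure ASCII.
escaped = json.dumps(data)

# ensure_ascii=False keeps the character unescaped in the resulting string.
literal = json.dumps(data, ensure_ascii=False)

# Loading the escaped form restores the original character.
roundtrip = json.loads(escaped)
```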
|
msg133117 - (view) |
Author: Łukasz Langa (lukasz.langa) * |
Date: 2011-04-06 08:53 |
As another core dev aptly said, most standard library Unicode support is probably accidental. As for `json`, this is one of the "newest" additions to stdlib, introduced in Python 2.6 (released at the same time as Python 3.0). Not the best example if you compare it to libraries that were added in 1.5 or so.
|
msg133118 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-04-06 09:36 |
> I think it is more a question of "is this an easy fix?"
> or would it require extensive changes to support unicode properly.
First of all, the question is: who would like to develop it. You can vote for an issue, but it doesn't change anything if you don't find a developer to implement your idea :-)
Anyway, try to use Python everywhere in Python 2 is a waste of time. The work has already been done in Python 3, and it's much easier to use Unicode in Python 3, just because everything uses Unicode (exceptions, filenames, file content, modules, etc.).
|
msg133123 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-04-06 12:30 |
> Anyway, try to use Python everywhere in Python 2 is a waste of time.
Oh... I mean "use Unicode in Python 2"
|
msg134843 - (view) |
Author: Łukasz Langa (lukasz.langa) * |
Date: 2011-04-30 07:33 |
the_isz, for serious Unicode support you might try using the configparser 3.2 backport:
http://pypi.python.org/pypi/configparser
|
msg134847 - (view) |
Author: (the_isz) |
Date: 2011-04-30 08:52 |
Thanks for the hint, Łukasz, but like I stated earlier, I already worked around
this problem by using the json module instead.
|
msg187829 - (view) |
Author: Eugene Klimov (Eugene.Klimov) * |
Date: 2013-04-26 05:51 |
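A possible workaround:
import codecs
import configparser

cfg = configparser.ConfigParser()
# codecs.open returns a stream that encodes text as UTF-8 on write;
# the with block closes the file once writing is done.
with codecs.open('filename', 'w', 'utf-8') as f:
    cfg.write(f)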
|
msg259345 - (view) |
Author: neo gurb (neo gurb) |
Date: 2016-02-01 21:40 |
In ConfigParser.py, in the write method, replace the line
key = " = ".join((key, str(value).replace('\n', '\n\t')))
with
key = " = ".join((key, str(value).decode('utf-8').replace('\n', '\n\t')))
Tested in Windows7 and Ubuntu 15.04.
|
|
Date | User | Action | Args
2022-04-11 14:57:15 | admin | set | github: 55806
2016-02-01 21:40:54 | neo gurb | set | nosy: + neo gurb; messages: + msg259345
2013-04-26 05:51:33 | Eugene.Klimov | set | nosy: + Eugene.Klimov; messages: + msg187829
2011-04-30 08:52:44 | the_isz | set | messages: + msg134847
2011-04-30 07:33:56 | lukasz.langa | set | messages: + msg134843
2011-04-06 12:30:51 | vstinner | set | messages: + msg133123
2011-04-06 09:36:09 | vstinner | set | nosy: + vstinner; messages: + msg133118
2011-04-06 08:53:50 | lukasz.langa | set | messages: + msg133117
2011-04-06 06:42:02 | the_isz | set | messages: + msg133111
2011-04-05 19:25:07 | r.david.murray | set | status: open -> closed; resolution: wont fix; messages: + msg133080
2011-04-05 18:46:41 | lukasz.langa | set | messages: + msg133074
2011-03-18 21:06:45 | georg.brandl | set | nosy: + georg.brandl; messages: + msg131363
2011-03-18 20:15:30 | r.david.murray | set | nosy: rhettinger, r.david.murray, lukasz.langa, the_isz; messages: + msg131362
2011-03-18 18:36:03 | rhettinger | set | nosy: rhettinger, r.david.murray, lukasz.langa, the_isz; messages: + msg131353
2011-03-18 18:09:11 | r.david.murray | set | nosy: rhettinger, r.david.murray, lukasz.langa, the_isz; messages: + msg131346
2011-03-18 17:32:33 | rhettinger | set | nosy: + rhettinger; messages: + msg131342
2011-03-18 17:25:35 | r.david.murray | set | assignee: lukasz.langa; messages: + msg131341; nosy: + r.david.murray, lukasz.langa
2011-03-18 17:08:03 | the_isz | create |