classification
Title: logging.basicConfig is missing the encoding parameter
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: janis.slapins, vinay.sajip
Priority: normal Keywords: patch

Created on 2016-03-21 22:17 by janis.slapins, last changed 2017-11-19 18:35 by vinay.sajip. This issue is now closed.

Files
File name Uploaded Description
__init__.py.patch janis.slapins, 2016-03-21 22:17 Minor changes to Python35/Lib/logging/__init__.py
Messages (9)
msg262150 - Author: Jānis Šlapiņš (janis.slapins) * Date: 2016-03-21 22:17
Hi!
Log files are only saved using the system default encoding.
On Windows, this means that the current ANSI code page is used. This may lead to a disaster (exceptions may be thrown due to unmappable characters) if you want to log particular text strings in a foreign language, e.g. those read from a source file in the UTF-8 or UTF-16 encoding, and such strings contain characters not available in your ANSI code page.
I guess this issue does not affect Linux or Mac OSX as they already use the UTF-8 encoding for their system locales.
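
The failure described above can be illustrated with a minimal sketch. It assumes cp1252 as a stand-in for a Western European ANSI code page and uses a made-up sample string; neither comes from the original report.

```python
# Hypothetical sample text; cp1252 stands in for a Windows ANSI code page.
text = "Привет"  # Cyrillic letters, which cp1252 cannot represent


def can_encode(s, encoding):
    """Return True if s can be encoded with the given codec, else False."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


ansi_ok = can_encode(text, "cp1252")  # unmappable characters
utf8_ok = can_encode(text, "utf-8")   # UTF-8 covers all of Unicode
print(ansi_ok, utf8_ok)
```

A log file opened with the first codec would hit the same kind of encoding error when a record containing such text is written.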

Actually, the logging module already has built-in functionality for setting a particular encoding for output files. However, it is not exposed as a parameter of the basicConfig function (in __init__.py).

I added a patch file with suggested amendments.

I have already tested writing logs with those changes applied, and files are now saved in the specified encoding, which differs from the current Windows ANSI code page. For example:
logging.basicConfig(filename=log_path, filemode='w', encoding='utf-8', format='%(message)s', level=logging.INFO)
msg262621 - Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2016-03-29 18:51
But you can open a stream using the encoding you want and pass it as the stream= parameter to basicConfig(). Why does that not work for you?
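
A minimal sketch of the stream= approach suggested here. The temporary log path and the sample message are assumptions made for illustration, and force=True (available from Python 3.8) is only there so the sketch also works when the root logger already has handlers; it can be dropped on older versions.

```python
import logging
import os
import tempfile

# Hypothetical log location, used only for this sketch.
log_path = os.path.join(tempfile.mkdtemp(), "example.log")

# Open the file with an explicit encoding and pass the stream to basicConfig().
log_stream = open(log_path, "w", encoding="utf-8")
logging.basicConfig(
    stream=log_stream,
    format="%(message)s",
    level=logging.INFO,
    force=True,  # Python 3.8+: replace any handlers already on the root logger
)

logging.info("Привет, κόσμε")

# The stream belongs to the caller, so the caller closes it when done.
log_stream.close()

# Read the file back to confirm it was written as UTF-8.
with open(log_path, encoding="utf-8") as f:
    written = f.read()
```

In a real program the stream would normally stay open for the lifetime of the process; it is closed early here only so the file can be read back.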
msg262700 - Author: Jānis Šlapiņš (janis.slapins) * Date: 2016-03-31 18:46
Using the stream or other options requires much more coding (for example, an additional redirection of sys.stdout to a file) instead of just one line with the basicConfig.

In the meantime, I tried using logging.FileHandler instead, where I could specify the encoding as a parameter, and it works as I wanted when used with the basicConfig method.

Anyway, it would be nice to have the encoding parameter among the basicConfig parameters when logging to files.
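
One possible reading of the FileHandler approach mentioned above, as a sketch only. It assumes the handler is passed to basicConfig() through the handlers= keyword (available since Python 3.3); the temporary path and the message are made up, and force=True (Python 3.8+) is there only to make the sketch robust if the root logger is already configured.

```python
import logging
import os
import tempfile

# Hypothetical log location, used only for this sketch.
log_path = os.path.join(tempfile.mkdtemp(), "handler_example.log")

# FileHandler accepts the encoding directly.
file_handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
logging.basicConfig(
    handlers=[file_handler],
    format="%(message)s",
    level=logging.INFO,
    force=True,  # Python 3.8+: replace any handlers already on the root logger
)

logging.info("Jānis Šlapiņš")

# Flush and close the handler's file so it can be read back.
file_handler.close()

with open(log_path, encoding="utf-8") as f:
    logged = f.read()
```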
msg262707 - Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2016-03-31 21:19
> requires much more coding

Much more? How so? It just seems like one open() call and passing the result to basicConfig().
msg262728 - Author: Jānis Šlapiņš (janis.slapins) * Date: 2016-04-01 08:57
Yes, it also works. But then you have also to remember to restore sys.stdout to the initial state at the end. In addition, for non-English languages it would be more appropriate to use codecs.open() instead of just open() in this case.
The complexity of the code grows and increases a danger of "more code, more bugs".

Why use a "detour" and always have to remember that part of a module is not useful to you because of particular drawbacks, when it is possible to implement a small addition in it that does not break anything?
msg262746 - Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2016-04-01 16:03
> you have also to remember to restore sys.stdout

I'm not sure you understand how it works. The value of sys.stdout isn't changed, so why does it need to be restored?

> for non-English languages it would be more appropriate to use codecs.open() instead of just open()

codecs.open() for older versions of Python, perhaps, but in newer Pythons (this issue is marked for Python 3.5), open is io.open which takes an encoding parameter.

basicConfig() is meant for the simplest cases, so you have to draw the line somewhere as to what "basic" means. I don't propose to change where the line is drawn - and AFAIK this is the first time it's come up, so it looks as if the many non-English speaking Python users are managing just fine with basicConfig() as it is ... note that this kind of thing is always a judgement call.

> The complexity of the code grows and increases a danger of "more code, more bugs".

Maybe that's why I'm choosing not to increase the complexity of my code ;-)
msg262762 - Author: Jānis Šlapiņš (janis.slapins) * Date: 2016-04-01 19:47
Many examples on the internet only show the usage of the filename parameter of basicConfig() and almost no one shows how to use the stream. That's why I wanted to use the filename parameter. But now I have tested the other options and they work for me. My case may be very specific, as I need to log words in very different languages, including not only those written in the Latin script but also in Cyrillic and other scripts - Russian, Greek, etc.

Regarding the codecs module and open() - yes, I made a mistake. There is no need for that in Python 3.

About sys.stdout. I understand the redirection in the following way (also shown in the Dive Into Python book):
normal_stdout = sys.stdout
sys.stdout = open(mylogfile, 'w', encoding='utf-8')
logging.basicConfig(level=logging.INFO, stream=sys.stdout)

After that, all the STDOUT goes to mylogfile. In order to send the output to the terminal window again, sys.stdout must be set back to normal:
sys.stdout = normal_stdout
msg262766 - Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2016-04-01 19:52
> and almost no one shows how to use the stream.

Because most examples out there don't care about Unicode, etc.

> I understand the redirection in the following way (also shown in the Dive Into Python book)

There's certainly no need to do that, and that would not be a normal way of using logging. The use of stream= should be clear from the documentation for basicConfig() parameters.
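
A short sketch of why no redirection is needed: the stream handed to basicConfig() is separate from sys.stdout. An in-memory io.StringIO stands in for the opened log file here, which is an assumption made to keep the example self-contained; force=True (Python 3.8+) is optional and only guards against an already configured root logger.

```python
import io
import logging
import sys

stdout_before = sys.stdout

# In-memory text stream standing in for an opened log file (illustration only).
log_stream = io.StringIO()
logging.basicConfig(
    stream=log_stream,
    format="%(message)s",
    level=logging.INFO,
    force=True,  # Python 3.8+: replace any handlers already on the root logger
)

logging.info("goes to the log stream")
print("still goes to the terminal")

# basicConfig() never reassigns sys.stdout, so there is nothing to restore.
stdout_unchanged = sys.stdout is stdout_before
captured = log_stream.getvalue()
```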
msg262829 - Author: Jānis Šlapiņš (janis.slapins) * Date: 2016-04-03 15:44
> that's why I'm choosing not to increase the complexity of my code

I disagree with this classification of my proposal. It is not about increasing complexity (changing algorithms, adding new functionality and so on). It is just about getting the most out of the code with minimal effort.
History
Date User Action Args
2017-11-19 18:35:32 vinay.sajip set status: open -> closed; resolution: not a bug; stage: patch review -> resolved
2016-04-03 15:44:41 janis.slapins set messages: + msg262829
2016-04-01 19:52:30 vinay.sajip set messages: + msg262766
2016-04-01 19:47:31 janis.slapins set messages: + msg262762
2016-04-01 16:03:29 vinay.sajip set messages: + msg262746
2016-04-01 08:57:27 janis.slapins set messages: + msg262728
2016-03-31 21:19:36 vinay.sajip set messages: + msg262707
2016-03-31 18:46:04 janis.slapins set messages: + msg262700
2016-03-29 18:51:44 vinay.sajip set messages: + msg262621
2016-03-22 09:42:33 SilentGhost set nosy: + vinay.sajip; stage: patch review
2016-03-21 22:17:06 janis.slapins create