Author petri
Recipients petri
Date 2017-03-08.09:17:47
Message-id <1488964668.24.0.723245092961.issue29755@psf.upfronthosting.co.za>
In-reply-to
Content
On Debian stable (Python 3.4), with the LANGUAGE environment variable set to "C" or "en_US.UTF-8", the following produces a string:

import gettext

d = gettext.textdomain('apt-listchanges')
print(gettext.lgettext("Informational notes"))

However, when the language is set to, for example, fi_FI.UTF-8, the same code outputs a bytes object. The same apparently happens with some other languages, too.

Why is this? The discrepancy is not documented anywhere, AFAIK. Is this a bug, or intended behavior that depends on some (undocumented) circumstances? Given that both of the above examples specify UTF-8 as the encoding, the type of the return value does not seem to depend directly on the encoding.

The docs say lgettext should merely return the translation in a particular encoding. They do not say that the type of the return value will switch from a string to bytes as well.
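Until this is clarified, calling code can guard against either return type. The following is only a minimal sketch of that idea, not a fix for the underlying behavior: the helper name as_text is hypothetical, and it assumes that a bytes result is encoded in the locale's preferred encoding unless the caller passes another one.

```python
import locale


def as_text(value, encoding=None):
    """Return value as str, decoding it first if it is a bytes object."""
    if isinstance(value, bytes):
        # Assumption: the bytes use the locale's preferred encoding
        # unless the caller names a different one explicitly.
        return value.decode(encoding or locale.getpreferredencoding(False))
    # Already a str (for example, an untranslated message): pass through.
    return value
```

With this, print(as_text(gettext.lgettext("Informational notes"))) would print text in both of the cases described above, provided the encoding assumption holds.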

I saw this originally in the Debian bug tracker and thought the issue merits at least clarification here as well (link to Debian bug below for reference).

(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=818728)

I have no idea whether this happens on Python > 3.4 or on other platforms. I would guess so, but have not had time to confirm.
History
Date User Action Args
2017-03-08 09:17:48 petri set recipients: + petri
2017-03-08 09:17:48 petri set messageid: <1488964668.24.0.723245092961.issue29755@psf.upfronthosting.co.za>
2017-03-08 09:17:48 petri link issue29755 messages
2017-03-08 09:17:47 petri create