classification
Title: unicode.encode docstring says return value can be unicode
Type:
Stage: resolved
Components: Unicode
Versions: Python 2.7

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, lemburg, radiocane, serhiy.storchaka, steven.daprano, vstinner
Priority: normal
Keywords:

Created on 2018-12-20 09:18 by radiocane, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (7)
msg332203 - Author: (radiocane) Date: 2018-12-20 09:18
In Python 2.7.15rc1 the docstring for unicode.encode starts with: "S.encode([encoding[,errors]]) -> string or unicode"

But if this answer https://stackoverflow.com/a/449281/5397695 is correct, then unicode.encode will never return a unicode object. Am I right?
msg332206 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-20 10:12
There is no code that prevents unicode.encode() from returning a result of arbitrary type. It seems all standard codecs return str, but you cannot be sure about custom codecs.
msg332208 - Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-12-20 10:18
Encoding and decoding, in the most general sense, can include unicode -> unicode and bytestring -> bytestring.

I can't see any standard unicode->unicode encodings in Python 2.7

https://docs.python.org/2/library/codecs.html

but we can create one:

py> import codecs
py> class NullCodec(codecs.Codec):  # "do nothing" codec
...     def encode(self, input, errors='strict'):
...         return (input, len(input))
...     def decode(self, input, errors='strict'):
...         return (input, len(input))
...
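py> def getregentry(name):
...     if name != 'null':  # answer only for our codec; None lets other lookups proceed
...         return None
...     return codecs.CodecInfo(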
...         name='null',
...         encode=NullCodec().encode,
...         decode=NullCodec().decode,
...         incrementalencoder=None,
...         incrementaldecoder=None,
...         streamwriter=None,
...         streamreader=None,
...     )
...
py> codecs.register(getregentry)
py> u'unicode text'.encode('null')
u'unicode text'

So the documentation is correct, and the Stack Overflow answer is not.
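
For readers on Python 3, the same experiment can be sketched roughly as follows. This is an illustration only, not code from the thread: the codec name 'nullsketch' and the helper names are made up for this example. It assumes a "do nothing" codec like the one above and shows that codecs.encode() passes the str result through, while the str.encode() method rejects a non-bytes result.

```python
import codecs

def null_encode(input, errors='strict'):
    # "Do nothing" encoder: return the input unchanged plus the length consumed.
    return (input, len(input))

def null_decode(input, errors='strict'):
    # "Do nothing" decoder, mirroring the encoder.
    return (input, len(input))

def search(name):
    # Answer only for our hypothetical codec name; returning None lets
    # the registry fall through to other search functions.
    if name != 'nullsketch':
        return None
    return codecs.CodecInfo(name='nullsketch',
                            encode=null_encode,
                            decode=null_decode)

codecs.register(search)

# The module-level function accepts whatever type the codec returns.
result = codecs.encode('unicode text', 'nullsketch')

# The str method insists on bytes and raises TypeError otherwise.
try:
    'unicode text'.encode('nullsketch')
except TypeError as exc:
    method_error = exc
else:
    method_error = None
```

Restricting the search function to one name keeps the registration from answering lookups for every otherwise unknown encoding.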
msg332209 - Author: (radiocane) Date: 2018-12-20 10:48
Given that:
1) No standard codec returns unicode
2) I consider the "most common scenario" to be the case where a user wants to encode a unicode object using some character encoding and get back a str-like object

I'll keep on finding "-> string or unicode" misleading. I'd rather have the same as str.encode, i.e. "-> object".
Anyway thanks for your time and attention :)
msg332215 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2018-12-20 11:07
You can install any codec you like, and the codec essentially decides
what type to return. However, the unicode methods only
allow strings or unicode to be returned in Python 2.
In Python 3, .encode() only allows bytes.

You can still get the full codec encode/decode functionality
via the codecs.encode()/codecs.decode() functions in Python 3.
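
A minimal Python 3 sketch of that difference, using the standard rot_13 (text to text) and hex (bytes to bytes) codecs. The variable names are illustrative only and are not part of the thread.

```python
import codecs

# codecs.encode() handles codecs that are not text-to-bytes encodings.
text_to_text = codecs.encode('unicode text', 'rot_13')   # str in, str out
bytes_to_bytes = codecs.encode(b'abc', 'hex')            # bytes in, bytes out

# The str method refuses rot_13 because it is not a text encoding.
try:
    'unicode text'.encode('rot_13')
except LookupError as exc:
    refused = exc
else:
    refused = None
```

Here str.encode() fails with LookupError before the codec is ever called, since rot_13 is flagged as a non-text encoding.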

msg332220 - Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-12-20 11:56
> I'll keep on finding "-> string or unicode" misleading.

How is it misleading when it's true?
msg332225 - Author: (radiocane) Date: 2018-12-20 13:06
>> I'll keep on finding "-> string or unicode" misleading.
> How is it misleading when it's true?

[I promise this is the last reply: I won't waste more of your time]

me: How fast does this car go?
docstring: 100 km/h or 300 km/h

me: actually most people use it to do 100 km/h and I don't know how to do 300 km/h
people: no physical law forbids 300 km/h so it's true
people2: if you remove the seats, the lights, the windshield etc and basically end up with a chassis with four wheels and a motor, it can do 300 km/h
History
Date | User | Action | Args
2022-04-11 14:59:09 | admin | set | github: 79725
2018-12-20 13:06:44 | radiocane | set | messages: + msg332225
2018-12-20 11:56:17 | steven.daprano | set | messages: + msg332220
2018-12-20 11:07:14 | lemburg | set | messages: + msg332215
2018-12-20 10:48:56 | radiocane | set | messages: + msg332209
2018-12-20 10:25:30 | serhiy.storchaka | set | status: open -> closed; resolution: not a bug; stage: resolved
2018-12-20 10:18:44 | steven.daprano | set | nosy: + steven.daprano; messages: + msg332208
2018-12-20 10:12:24 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg332206
2018-12-20 09:48:12 | vstinner | set | nosy: + lemburg
2018-12-20 09:18:56 | radiocane | create |