diff -r 5ae830ff6d64 Doc/library/codecs.rst --- a/Doc/library/codecs.rst Mon May 20 22:13:39 2013 -0500 +++ b/Doc/library/codecs.rst Tue May 21 17:15:29 2013 +0300 @@ -1098,88 +1098,112 @@ | utf_8_sig | | all languages | +-----------------+--------------------------------+--------------------------------+ -A number of codecs are specific to Python, so their codec names have no meaning -outside Python. Some of them don't convert from Unicode strings to byte strings, -but instead use the property of the Python codecs machinery that any bijective -function with one argument can be considered as an encoding. +Python Specific Encodings +------------------------- -For the codecs listed below, the result in the "encoding" direction is always a -byte string. The result of the "decoding" direction is listed as operand type in -the table. +A number of predefined codecs are specific to Python, so their codec names have +no meaning outside Python. These are listed in the tables below based on the +expected input and output types (note that while text encodings are the most +common use case for codecs, the underlying codec infrastructure supports +arbitrary data transforms rather than just text encodings). For asymmetric +codecs, the stated purpose describes the encoding direction. -.. tabularcolumns:: |l|p{0.3\linewidth}|l|p{0.3\linewidth}| +The following codecs provide unicode-to-str encoding [#encoding-note]_ and +str-to-unicode decoding [#decoding-note]_, similar to the Unicode text +encodings. -+--------------------+---------------------------+----------------+---------------------------+ -| Codec | Aliases | Operand type | Purpose | -+====================+===========================+================+===========================+ -| base64_codec | base64, base-64 | byte string | Convert operand to MIME | -| | | | base64 (the result always | -| | | | includes a trailing | -| | | | ``'\n'``) | -+--------------------+---------------------------+----------------+---------------------------+ -| bz2_codec | bz2 | byte string | Compress the operand | -| | | | using bz2 | -+--------------------+---------------------------+----------------+---------------------------+ -| hex_codec | hex | byte string | Convert operand to | -| | | | hexadecimal | -| | | | representation, with two | -| | | | digits per byte | -+--------------------+---------------------------+----------------+---------------------------+ -| idna | | Unicode string | Implements :rfc:`3490`, | -| | | | see also | -| | | | :mod:`encodings.idna` | -+--------------------+---------------------------+----------------+---------------------------+ -| mbcs | dbcs | Unicode string | Windows only: Encode | -| | | | operand according to the | -| | | | ANSI codepage (CP_ACP) | -+--------------------+---------------------------+----------------+---------------------------+ -| palmos | | Unicode string | Encoding of PalmOS 3.5 | -+--------------------+---------------------------+----------------+---------------------------+ -| punycode | | Unicode string | Implements :rfc:`3492` | -+--------------------+---------------------------+----------------+---------------------------+ -| quopri_codec | quopri, quoted-printable, | byte string | Convert operand to MIME | -| | quotedprintable | | quoted printable | -+--------------------+---------------------------+----------------+---------------------------+ -| raw_unicode_escape | | Unicode string | Produce a string that is | -| | | | suitable as raw Unicode | -| | | | literal in Python source | -| | | | code | -+--------------------+---------------------------+----------------+---------------------------+ -| rot_13 | rot13 | Unicode string | Returns the Caesar-cypher | -| | | | encryption of the operand | -+--------------------+---------------------------+----------------+---------------------------+ -| string_escape | | byte string | Produce a string that is | -| | | | suitable as string | -| | | | literal in Python source | -| | | | code | -+--------------------+---------------------------+----------------+---------------------------+ -| undefined | | any | Raise an exception for | -| | | | all conversions. Can be | -| | | | used as the system | -| | | | encoding if no automatic | -| | | | :term:`coercion` between | -| | | | byte and Unicode strings | -| | | | is desired. | -+--------------------+---------------------------+----------------+---------------------------+ -| unicode_escape | | Unicode string | Produce a string that is | -| | | | suitable as Unicode | -| | | | literal in Python source | -| | | | code | -+--------------------+---------------------------+----------------+---------------------------+ -| unicode_internal | | Unicode string | Return the internal | -| | | | representation of the | -| | | | operand | -+--------------------+---------------------------+----------------+---------------------------+ -| uu_codec | uu | byte string | Convert the operand using | -| | | | uuencode | -+--------------------+---------------------------+----------------+---------------------------+ -| zlib_codec | zip, zlib | byte string | Compress the operand | -| | | | using gzip | -+--------------------+---------------------------+----------------+---------------------------+ +.. tabularcolumns:: |l|L|L| + ++--------------------+---------------------------+---------------------------+ +| Codec | Aliases | Purpose | ++====================+===========================+===========================+ +| idna | | Implements :rfc:`3490`, | +| | | see also | +| | | :mod:`encodings.idna` | ++--------------------+---------------------------+---------------------------+ +| mbcs | dbcs | Windows only: Encode | +| | | operand according to the | +| | | ANSI codepage (CP_ACP) | ++--------------------+---------------------------+---------------------------+ +| palmos | | Encoding of PalmOS 3.5 | ++--------------------+---------------------------+---------------------------+ +| punycode | | Implements :rfc:`3492` | ++--------------------+---------------------------+---------------------------+ +| raw_unicode_escape | | Produce a string that is | +| | | suitable as raw Unicode | +| | | literal in Python source | +| | | code | ++--------------------+---------------------------+---------------------------+ +| rot_13 | rot13 | Returns the Caesar-cypher | +| | | encryption of the operand | ++--------------------+---------------------------+---------------------------+ +| undefined | | Raise an exception for | +| | | all conversions. Can be | +| | | used as the system | +| | | encoding if no automatic | +| | | :term:`coercion` between | +| | | byte and Unicode strings | +| | | is desired. | ++--------------------+---------------------------+---------------------------+ +| unicode_escape | | Produce a string that is | +| | | suitable as Unicode | +| | | literal in Python source | +| | | code | ++--------------------+---------------------------+---------------------------+ +| unicode_internal | | Return the internal | +| | | representation of the | +| | | operand | ++--------------------+---------------------------+---------------------------+ .. versionadded:: 2.3 The ``idna`` and ``punycode`` encodings. +The following codecs provide str-to-str encoding and decoding +[#decoding-note]_. + +.. tabularcolumns:: |l|L|L|L| + ++--------------------+---------------------------+---------------------------+------------------------------+ +| Codec | Aliases | Purpose | Encoder/decoder | ++====================+===========================+===========================+==============================+ +| base64_codec | base64, base-64 | Convert operand to MIME | :meth:`base64.b64encode`, | +| | | base64 (the result always | :meth:`base64.b64decode` | +| | | includes a trailing | | +| | | ``'\n'``) | | ++--------------------+---------------------------+---------------------------+------------------------------+ +| bz2_codec | bz2 | Compress the operand | :meth:`bz2.compress`, | +| | | using bz2 | :meth:`bz2.decompress` | ++--------------------+---------------------------+---------------------------+------------------------------+ +| hex_codec | hex | Convert operand to | :meth:`base64.b16encode`, | +| | | hexadecimal | :meth:`base64.b16decode` | +| | | representation, with two | | +| | | digits per byte | | ++--------------------+---------------------------+---------------------------+------------------------------+ +| quopri_codec | quopri, quoted-printable, | Convert operand to MIME | :meth:`quopri.encodestring`, | +| | quotedprintable | quoted printable | :meth:`quopri.decodestring` | ++--------------------+---------------------------+---------------------------+------------------------------+ +| string_escape | | Produce a string that is | | +| | | suitable as string | | +| | | literal in Python source | | +| | | code | | ++--------------------+---------------------------+---------------------------+------------------------------+ +| uu_codec | uu | Convert the operand using | :meth:`uu.encode`, | +| | | uuencode | :meth:`uu.decode` | ++--------------------+---------------------------+---------------------------+------------------------------+ +| zlib_codec | zip, zlib | Compress the operand | :meth:`zlib.compress`, | +| | | using gzip | :meth:`zlib.decompress` | ++--------------------+---------------------------+---------------------------+------------------------------+ + +.. [#encoding-note] str objects are also accepted as input in place of unicode + objects. They are implicitly converted to unicode by decoding them using + the default encoding. If this conversion fails, it may lead to encoding + operations raising :exc:`UnicodeDecodeError`. + +.. [#decoding-note] unicode objects are also accepted as input in place of str + objects. They are implicitly converted to str by encoding them using the + default encoding. If this conversion fails, it may lead to decoding + operations raising :exc:`UnicodeEncodeError`. + :mod:`encodings.idna` --- Internationalized Domain Names in Applications ------------------------------------------------------------------------