diff --git a/Doc/c-api/buffer.rst b/Doc/c-api/buffer.rst
--- a/Doc/c-api/buffer.rst
+++ b/Doc/c-api/buffer.rst
@@ -1,5 +1,10 @@
 .. highlightlang:: c

+.. index::
+   single: buffer protocol
+   single: buffer interface; (see buffer protocol)
+   single: buffer object; (see buffer protocol)
+
 .. _bufferobjects:

 Buffer Protocol
@@ -10,9 +15,6 @@
 .. sectionauthor:: Stefan Krah


-.. index::
-   single: buffer interface
-
 Certain objects available in Python wrap access to an underlying memory
 array or *buffer*. Such objects include the built-in :class:`bytes` and
 :class:`bytearray`, and some extension types like :class:`array.array`.
@@ -24,8 +26,8 @@
 then desirable, in some situations, to access that buffer directly and
 without intermediate copying.

-Python provides such a facility at the C level in the form of the *buffer
-protocol*. This protocol has two sides:
+Python provides such a facility at the C level in the form of the :ref:`buffer
+protocol `. This protocol has two sides:

 .. index:: single: PyBufferProcs

diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst
--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -1214,34 +1214,41 @@

 .. _func-str:
 .. function:: str(object='')
-              str(object[, encoding[, errors]])
+              str(bytes, encoding[, errors='strict'])
+              str(bytes, errors[, encoding='utf-8'])

    Return a :ref:`string ` version of an object, using one of the
-   following modes:
+   following two modes:

-   If *encoding* and/or *errors* are given, :func:`str` will decode the
-   *object* which can either be a byte string or a character buffer using
-   the codec for *encoding*. The *encoding* parameter is a string giving
-   the name of an encoding; if the encoding is not known, :exc:`LookupError`
-   is raised. Error handling is done according to *errors*; this specifies the
-   treatment of characters which are invalid in the input encoding. If
-   *errors* is ``'strict'`` (the default), a :exc:`ValueError` is raised on
-   errors, while a value of ``'ignore'`` causes errors to be silently ignored,
-   and a value of ``'replace'`` causes the official Unicode replacement character,
-   U+FFFD, to be used to replace input characters which cannot be decoded.
-   See also the :mod:`codecs` module.
+   If no arguments are given, :func:`str` returns the empty string. If only
+   *object* is given, ``str(object)`` returns a nicely printable
+   representation of *object*. For :ref:`string ` objects, this is
+   the string itself. This behavior differs from :func:`repr` in that the
+   return value of ``str(object)`` is not usually meant for passing to
+   :func:`eval`. Rather, its goal is to return a printable string. One can
+   control the return value for *object* by defining the :meth:`__str__`
+   method of *object*.

-   When only *object* is given, this returns its nicely printable representation.
-   For strings, this is the string itself. The difference with ``repr(object)``
-   is that ``str(object)`` does not always attempt to return a string that is
-   acceptable to :func:`eval`; its goal is to return a printable string.
-   With no arguments, this returns the empty string.
+   If *encoding* or *errors* is given, *object* should be a :class:`bytes` or
+   :class:`bytearray` object, or more generally any object that supports the
+   :ref:`buffer protocol `. If *object* is a :class:`bytes`
+   (or :class:`bytearray`) object, then :func:`str` calls
+   :meth:`bytes.decode(encoding, errors) ` on the object
+   and returns the value. Otherwise, the bytes object underlying the buffer
+   object is obtained before calling :meth:`bytes.decode() `.
+   See :ref:`binaryseq` and :ref:`bufferobjects` for information on buffer
+   objects.

-   Objects can specify what ``str(object)`` returns by defining a :meth:`__str__`
-   special method.
+   Passing a :func:`bytes ` object to :func:`str` without the *encoding*
+   or *errors* arguments uses the first mode (see also the :option:`-b`
+   command-line option to Python). For example::

-   For more information on strings and string methods, see the :ref:`textseq`
-   section. To output formatted strings, see the :ref:`string-formatting`
+      >>> str(b'Zoot!')
+      "b'Zoot!'"
+
+   The :class:`str` object is the string class. For more information on
+   strings and string methods, see the :ref:`textseq` and :ref:`string-methods`
+   sections. To output formatted strings, see the :ref:`string-formatting`
    section. In addition, see the :ref:`stringservices` section.


diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -2064,6 +2064,9 @@
    longer replaced by ``%g`` conversions.


+.. index::
+   single: buffer protocol; Binary Sequence Types
+
 .. _binaryseq:

 Binary Sequence Types --- :class:`bytes`, :class:`bytearray`, :class:`memoryview`
@@ -2077,8 +2080,8 @@

 The core built-in types for manipulating binary data are :class:`bytes` and
 :class:`bytearray`. They are supported by :class:`memoryview` which uses
-the buffer protocol to access the memory of other binary objects without
-needing to make a copy.
+the :ref:`buffer protocol ` to access the memory of other
+binary objects without needing to make a copy.

 The :mod:`array` module supports efficient storage of basic data types like
 32-bit integers and IEEE754 double-precision floating values.
diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py
--- a/Lib/test/test_builtin.py
+++ b/Lib/test/test_builtin.py
@@ -1286,6 +1286,7 @@
         self.assertRaises(TypeError, setattr, sys, 1, 'spam')
         self.assertRaises(TypeError, setattr)

+    # test_str(): see test_unicode.py and test_bytes.py for str() tests.

     def test_sum(self):
         self.assertEqual(sum([]), 0)
diff --git a/Lib/test/test_unicode.py b/Lib/test/test_unicode.py
--- a/Lib/test/test_unicode.py
+++ b/Lib/test/test_unicode.py
@@ -1155,6 +1154,12 @@
         self.assertRaises(TypeError, str, 42, 42, 42)

+    def test_constructor_args(self):
+        """Test various combinations of positional and keyword arguments."""
+        self.assertEqual(str(), "")
+        # The errors argument without encoding triggers decode mode.
+        self.assertEqual(str(b'foo', errors="strict"), "foo")
+
     def test_codecs_utf7(self):
         utfTests = [
             ('A\u2262\u0391.', b'A+ImIDkQ.'),    # RFC2152 example