diff -r 96c1de5acbd3 -r 0a49f6382467 Doc/library/binary.rst --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/Doc/library/binary.rst Sun Jan 29 16:55:40 2012 +1000 @@ -0,0 +1,23 @@ +.. _binaryservices: + +******************** +Binary Data Services +******************** + +The modules described in this chapter provide some basic services +for manipulation of binary data. Other operations on binary data, specifically +in relation to file formats and network protocols, are described in the +relevant sections. + +Some libraries described under :ref:`textservices` also work with either +ASCII-compatible binary formats (for example, :mod:`re`) or all binary data +(for example, :mod:`difflib`). + +In addition, see the documentation for Python's built-in binary data types in +:ref:`binaryseq`. + +.. toctree:: + + struct.rst + codecs.rst + diff -r 96c1de5acbd3 -r 0a49f6382467 Doc/library/index.rst --- a/Doc/library/index.rst Fri Jan 27 10:53:35 2012 +0100 +++ b/Doc/library/index.rst Sun Jan 29 16:55:40 2012 +1000 @@ -46,7 +46,8 @@ stdtypes.rst exceptions.rst - strings.rst + text.rst + binary.rst datatypes.rst numeric.rst functional.rst diff -r 96c1de5acbd3 -r 0a49f6382467 Doc/library/stdtypes.rst --- a/Doc/library/stdtypes.rst Fri Jan 27 10:53:35 2012 +0100 +++ b/Doc/library/stdtypes.rst Sun Jan 29 16:55:40 2012 +1000 @@ -672,7 +672,7 @@ To clarify the above rules, here's some example Python code, -equivalent to the builtin hash, for computing the hash of a rational +equivalent to the built-in hash, for computing the hash of a rational number, :class:`float`, or :class:`complex`:: @@ -799,109 +799,77 @@ .. 
_typesseq: -Sequence Types --- :class:`str`, :class:`bytes`, :class:`bytearray`, :class:`list`, :class:`tuple`, :class:`range` -================================================================================================================== - -There are six sequence types: strings, byte sequences (:class:`bytes` objects), -byte arrays (:class:`bytearray` objects), lists, tuples, and range objects. For -other containers see the built in :class:`dict` and :class:`set` classes, and -the :mod:`collections` module. - - -.. index:: - object: sequence - object: string - object: bytes - object: bytearray - object: tuple - object: list - object: range - -Strings contain Unicode characters. Their literals are written in single or -double quotes: ``'xyzzy'``, ``"frobozz"``. See :ref:`strings` for more about -string literals. In addition to the functionality described here, there are -also string-specific methods described in the :ref:`string-methods` section. - -Bytes and bytearray objects contain single bytes -- the former is immutable -while the latter is a mutable sequence. Bytes objects can be constructed the -constructor, :func:`bytes`, and from literals; use a ``b`` prefix with normal -string syntax: ``b'xyzzy'``. To construct byte arrays, use the -:func:`bytearray` function. - -While string objects are sequences of characters (represented by strings of -length 1), bytes and bytearray objects are sequences of *integers* (between 0 -and 255), representing the ASCII value of single bytes. That means that for -a bytes or bytearray object *b*, ``b[0]`` will be an integer, while -``b[0:1]`` will be a bytes or bytearray object of length 1. The -representation of bytes objects uses the literal format (``b'...'``) since it -is generally more useful than e.g. ``bytes([50, 19, 100])``. You can always -convert a bytes object into a list of integers using ``list(b)``. 
- -Also, while in previous Python versions, byte strings and Unicode strings -could be exchanged for each other rather freely (barring encoding issues), -strings and bytes are now completely separate concepts. There's no implicit -en-/decoding if you pass an object of the wrong type. A string always -compares unequal to a bytes or bytearray object. - -Lists are constructed with square brackets, separating items with commas: ``[a, -b, c]``. Tuples are constructed by the comma operator (not within square -brackets), with or without enclosing parentheses, but an empty tuple must have -the enclosing parentheses, such as ``a, b, c`` or ``()``. A single item tuple -must have a trailing comma, such as ``(d,)``. - -Objects of type range are created using the :func:`range` function. They don't -support concatenation or repetition, and using :func:`min` or :func:`max` on -them is inefficient. - -Most sequence types support the following operations. The ``in`` and ``not in`` -operations have the same priorities as the comparison operations. The ``+`` and -``*`` operations have the same priority as the corresponding numeric operations. -[3]_ Additional methods are provided for :ref:`typesseq-mutable`. +Sequence Types --- :class:`list`, :class:`tuple`, :class:`range` +================================================================ + +There are three basic sequence types: lists, tuples, and range objects. +Additional sequence types tailored for processing of +:ref:`binary data <binaryseq>` and :ref:`text strings <textseq>` are +described in dedicated sections. + + +.. _typesseq-common: + +Common Sequence Operations +-------------------------- + +.. index:: object: sequence + +The operations in the following table are supported by most sequence types, +both mutable and immutable. The :class:`collections.Sequence` ABC is +provided to make it easier to correctly implement these operations on +custom sequence types. 
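As a rough illustration of the ABC mentioned in the added paragraph, the sketch below defines a small read-only sequence and lets the ABC supply the remaining common operations. ``Countdown`` is a hypothetical class invented for this example, and the import uses ``collections.abc.Sequence``, which is where current Python versions expose the ABC that the patch text calls ``collections.Sequence``.

```python
from collections.abc import Sequence  # the patch text refers to collections.Sequence


class Countdown(Sequence):
    """Hypothetical read-only sequence holding n, n-1, ..., 1."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if isinstance(i, slice):
            # Keep the sketch small: slices are returned as plain lists.
            return [self[k] for k in range(*i.indices(self._n))]
        if i < 0:
            i += self._n
        if not 0 <= i < self._n:
            raise IndexError("Countdown index out of range")
        return self._n - i


c = Countdown(5)
print(list(c))            # iteration is provided by the ABC
print(3 in c)             # so is containment testing
print(c.index(2))         # and index() / count()
print(list(reversed(c)))  # and reversed()
```

Only ``__len__`` and ``__getitem__`` are written by hand here; ``in``, iteration, ``index``, ``count`` and ``reversed`` come from the ABC's mixin methods.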
This table lists the sequence operations sorted in ascending priority (operations in the same box have the same priority). In the table, *s* and *t* -are sequences of the same type; *n*, *i*, *j* and *k* are integers. - -+------------------+--------------------------------+----------+ -| Operation | Result | Notes | -+==================+================================+==========+ -| ``x in s`` | ``True`` if an item of *s* is | \(1) | -| | equal to *x*, else ``False`` | | -+------------------+--------------------------------+----------+ -| ``x not in s`` | ``False`` if an item of *s* is | \(1) | -| | equal to *x*, else ``True`` | | -+------------------+--------------------------------+----------+ -| ``s + t`` | the concatenation of *s* and | \(6) | -| | *t* | | -+------------------+--------------------------------+----------+ -| ``s * n, n * s`` | *n* shallow copies of *s* | \(2) | -| | concatenated | | -+------------------+--------------------------------+----------+ -| ``s[i]`` | *i*\ th item of *s*, origin 0 | \(3) | -+------------------+--------------------------------+----------+ -| ``s[i:j]`` | slice of *s* from *i* to *j* | (3)(4) | -+------------------+--------------------------------+----------+ -| ``s[i:j:k]`` | slice of *s* from *i* to *j* | (3)(5) | -| | with step *k* | | -+------------------+--------------------------------+----------+ -| ``len(s)`` | length of *s* | | -+------------------+--------------------------------+----------+ -| ``min(s)`` | smallest item of *s* | | -+------------------+--------------------------------+----------+ -| ``max(s)`` | largest item of *s* | | -+------------------+--------------------------------+----------+ -| ``s.index(i)`` | index of the first occurence | | -| | of *i* in *s* | | -+------------------+--------------------------------+----------+ -| ``s.count(i)`` | total number of occurences of | | -| | *i* in *s* | | -+------------------+--------------------------------+----------+ - -Sequence types also support 
comparisons. In particular, tuples and lists are -compared lexicographically by comparing corresponding elements. This means that -to compare equal, every element must compare equal and the two sequences must be -of the same type and have the same length. (For full details see -:ref:`comparisons` in the language reference.) +are sequences of the same type, *n*, *i*, *j* and *k* are integers and *x* is +an arbitrary object that meets any type and value restrictions imposed by *s*. + +The ``in`` and ``not in`` operations have the same priorities as the +comparison operations. The ``+`` (concatenation) and ``*`` (repetition) +operations have the same priority as the corresponding numeric operations. + ++--------------------------+--------------------------------+----------+ +| Operation | Result | Notes | ++==========================+================================+==========+ +| ``x in s`` | ``True`` if an item of *s* is | \(1) | +| | equal to *x*, else ``False`` | | ++--------------------------+--------------------------------+----------+ +| ``x not in s`` | ``False`` if an item of *s* is | \(1) | +| | equal to *x*, else ``True`` | | ++--------------------------+--------------------------------+----------+ +| ``s + t`` | the concatenation of *s* and | (6)(7) | +| | *t* | | ++--------------------------+--------------------------------+----------+ +| ``s * n, n * s`` | *n* shallow copies of *s* | (2)(7) | +| | concatenated | | ++--------------------------+--------------------------------+----------+ +| ``s[i]`` | *i*\ th item of *s*, origin 0 | \(3) | ++--------------------------+--------------------------------+----------+ +| ``s[i:j]`` | slice of *s* from *i* to *j* | (3)(4) | ++--------------------------+--------------------------------+----------+ +| ``s[i:j:k]`` | slice of *s* from *i* to *j* | (3)(5) | +| | with step *k* | | ++--------------------------+--------------------------------+----------+ +| ``len(s)`` | length of *s* | | 
++--------------------------+--------------------------------+----------+ +| ``min(s)`` | smallest item of *s* | | ++--------------------------+--------------------------------+----------+ +| ``max(s)`` | largest item of *s* | | ++--------------------------+--------------------------------+----------+ +| ``s.index(x[, i[, j]])`` | index of the first occurrence | \(8) | +| | of *x* in *s* (at or after | | +| | index *i* and before index *j*)| | ++--------------------------+--------------------------------+----------+ +| ``s.count(x)`` | total number of occurrences of | | +| | *x* in *s* | | ++--------------------------+--------------------------------+----------+ + +Sequences of the same type also support comparisons. In particular, tuples +and lists are compared lexicographically by comparing corresponding elements. +This means that to compare equal, every element must compare equal and the +two sequences must be of the same type and have the same length. (For full +details see :ref:`comparisons` in the language reference.) .. index:: triple: operations on; sequence; types @@ -918,14 +886,19 @@ Notes: (1) - When *s* is a string object, the ``in`` and ``not in`` operations act like a - substring test. + While the ``in`` and ``not in`` operations are used only for simple + containment testing in the general case, some specialised sequences + (such as :class:`str`, :class:`bytes` and :class:`bytearray`) also use + them for subsequence testing:: + + >>> "gg" in "eggs" + True (2) Values of *n* less than ``0`` are treated as ``0`` (which yields an empty sequence of the same type as *s*). Note also that the copies are shallow; nested structures are not copied. This often haunts new Python programmers; - consider: + consider:: >>> lists = [[]] * 3 >>> lists @@ -937,7 +910,7 @@ What has happened is that ``[[]]`` is a one-element list containing an empty list, so all three elements of ``[[]] * 3`` are (pointers to) this single empty list. 
Modifying any of the elements of ``lists`` modifies this single list. - You can create a list of different lists this way: + You can create a list of different lists this way:: >>> lists = [[] for i in range(3)] >>> lists[0].append(3) @@ -968,18 +941,325 @@ If *k* is ``None``, it is treated like ``1``. (6) - Concatenating immutable strings always results in a new object. This means - that building up a string by repeated concatenation will have a quadratic - runtime cost in the total string length. To get a linear runtime cost, - you must switch to one of the alternatives below: + Concatenating immutable sequences always results in a new object. This + means that building up a sequence by repeated concatenation will have a + quadratic runtime cost in the total sequence length. To get a linear + runtime cost, you must switch to one of the alternatives below: * if concatenating :class:`str` objects, you can build a list and use - :meth:`str.join` at the end; + :meth:`str.join` at the end or else write to an :class:`io.StringIO` + instance and retrieve its value when complete; * if concatenating :class:`bytes` objects, you can similarly use - :meth:`bytes.join`, or you can do in-place concatenation with a - :class:`bytearray` object. :class:`bytearray` objects are mutable and - have an efficient overallocation mechanism. + :meth:`bytes.join` or :class:`io.BytesIO`, or you can do in-place + concatenation with a :class:`bytearray` object. :class:`bytearray` + objects are mutable and have an efficient overallocation mechanism. + + * if concatenating :class:`tuple` objects, extend a :class:`list` instead. + + * for other types, investigate the relevant class documentation + +(7) + Some sequence types (such as :class:`range`) only support item sequences + that follow specific patterns, and hence don't support sequence + concatenation or repetition. + +(8) + ``index`` raises :exc:`ValueError` when *x* is not found in *s*. 
+ When supported, the additional arguments to the index method allow + efficient searching of subsections of the sequence. Passing the extra + arguments is roughly equivalent to using ``s[i:j].index(x)``, only + without copying any data and with the returned index being relative to + the start of the sequence rather than the start of the slice. + + +.. _typesseq-immutable: + +Immutable Sequence Types +------------------------ + +.. index:: + triple: immutable; sequence; types + object: tuple + +The only operation that immutable sequence types generally implement that is +not also implemented by mutable sequence types is support for the :func:`hash` +built-in. + +This support allows immutable sequences, such as :class:`tuple` instances, to +be used as :class:`dict` keys and stored in :class:`set` and :class:`frozenset` +instances. + + +.. _typesseq-mutable: + +Mutable Sequence Types +---------------------- + +.. index:: + triple: mutable; sequence; types + object: list + object: bytearray + +Mutable sequences, such as :class:`list`, support additional operations that +allow in-place modification of the object. Custom mutable sequence types are +generally expected to support these operations (although leaving out +``sort()`` is a common exception). + +The operations in the following table are defined on mutable sequence types. +The :class:`collections.MutableSequence` ABC is provided to make it easier to +correctly implement these operations on custom sequence types. + +In the table *s* is an instance of a mutable sequence type, *t* is any +iterable object and *x* is an arbitrary object that meets any type +and value restrictions imposed by *s* (for example, :class:`bytearray` only +accepts integers that meet the value restriction ``0 <= x <= 255``). + + +.. 
index:: + triple: operations on; sequence; types + triple: operations on; list; type + pair: subscript; assignment + pair: slice; assignment + statement: del + single: append() (sequence method) + single: extend() (sequence method) + single: count() (sequence method) + single: index() (sequence method) + single: insert() (sequence method) + single: pop() (sequence method) + single: remove() (sequence method) + single: reverse() (sequence method) + single: sort() (sequence method) + ++------------------------------+--------------------------------+---------------------+ +| Operation | Result | Notes | ++==============================+================================+=====================+ +| ``s[i] = x`` | item *i* of *s* is replaced by | | +| | *x* | | ++------------------------------+--------------------------------+---------------------+ +| ``s[i:j] = t`` | slice of *s* from *i* to *j* | | +| | is replaced by the contents of | | +| | the iterable *t* | | ++------------------------------+--------------------------------+---------------------+ +| ``del s[i:j]`` | same as ``s[i:j] = []`` | | ++------------------------------+--------------------------------+---------------------+ +| ``s[i:j:k] = t`` | the elements of ``s[i:j:k]`` | \(1) | +| | are replaced by those of *t* | | ++------------------------------+--------------------------------+---------------------+ +| ``del s[i:j:k]`` | removes the elements of | | +| | ``s[i:j:k]`` from the list | | ++------------------------------+--------------------------------+---------------------+ +| ``s.append(x)`` | same as ``s[len(s):len(s)] = | | +| | [x]`` | | ++------------------------------+--------------------------------+---------------------+ +| ``s.clear()`` | remove all items from ``s`` | \(5) | +| | (same as ``del s[:]``) | | ++------------------------------+--------------------------------+---------------------+ +| ``s.copy()`` | return a shallow copy of ``s`` | \(5) | +| | (same as ``s[:]``) | | 
++------------------------------+--------------------------------+---------------------+ +| ``s.extend(t)`` | same as ``s[len(s):len(s)] = | | +| | t`` | | ++------------------------------+--------------------------------+---------------------+ +| ``s.insert(i, x)`` | same as ``s[i:i] = [x]`` | | ++------------------------------+--------------------------------+---------------------+ +| ``s.pop([i])`` | same as ``x = s[i]; del s[i]; | \(2) | +| | return x`` | | ++------------------------------+--------------------------------+---------------------+ +| ``s.remove(x)`` | same as ``del s[s.index(x)]`` | \(3) | ++------------------------------+--------------------------------+---------------------+ +| ``s.reverse()`` | reverses the items of *s* in | \(4) | +| | place | | ++------------------------------+--------------------------------+---------------------+ + + +Notes: + +(1) + *t* must have the same length as the slice it is replacing. + +(2) + The optional argument *i* defaults to ``-1``, so that by default the last + item is removed and returned. + +(3) + ``remove`` raises :exc:`ValueError` when *x* is not found in *s*. + +(4) + The :meth:`reverse` method modifies the sequence in place for economy of + space when reversing a large sequence. To remind users that it operates by + side effect, it does not return the reversed sequence. + +(5) + :meth:`clear` and :meth:`!copy` are included for consistency with the + interfaces of mutable containers that don't support slicing operations + (such as :class:`dict` and :class:`set`) + + .. versionadded:: 3.3 + :meth:`clear` and :meth:`!copy` methods. + + +.. _typesseq-list: + +Lists +----- + +.. index:: object: list + +Lists are mutable sequences, typically used to store collections of +homogeneous items (where the precise degree of similarity will vary by +application). 
+ +Lists may be constructed in several ways: + +* Using a pair of square brackets to denote the empty list: ``[]`` +* Using square brackets, separating items with commas: ``[a]``, ``[a, b, c]`` +* Using a list comprehension: ``[x for x in iterable]`` +* Using the :func:`list` built-in: ``list()`` or ``list(iterable)`` + +Many other operations also produce lists, including the :func:`sorted` built-in. + +Lists implement all of the :ref:`common <typesseq-common>` and +:ref:`mutable <typesseq-mutable>` sequence operations. Lists also provide the +following additional method: + +.. method:: list.sort(*, key=None, reverse=None) + + This method sorts the list in place, using only ``<`` comparisons + between items. Exceptions are not suppressed - if any comparison operations + fail, the entire sort operation will fail (and the list will likely be left + in a partially modified state). + + *key* specifies a function of one argument that is used to extract a + comparison key from each list element (for example, ``key=str.lower``). + The key corresponding to each item in the list is calculated once and + then used for the entire sorting process. The default value of ``None`` + means that list items are sorted directly without calculating a separate + key value. + + The :func:`functools.cmp_to_key` utility is available to convert a 2.x + style *cmp* function to a *key* function. + + *reverse* is a boolean value. If set to ``True``, then the list elements + are sorted as if each comparison were reversed. + + This method modifies the sequence in place for economy of space when + sorting a large sequence. To remind users that it operates by side + effect, it does not return the sorted sequence (use :func:`sorted` to + explicitly request a new sorted list instance). + + The :meth:`sort` method is guaranteed to be stable. 
A sort is stable if it + guarantees not to change the relative order of elements that compare equal + --- this is helpful for sorting in multiple passes (for example, sort by + department, then by salary grade). + + .. impl-detail:: + + While a list is being sorted, the effect of attempting to mutate, or even + inspect, the list is undefined. The C implementation of Python makes the + list appear empty for the duration, and raises :exc:`ValueError` if it can + detect that the list has been mutated during a sort. + + +.. _typesseq-tuple: + +Tuples +------ + +.. index:: object: tuple + +Tuples are immutable sequences, typically used to store collections of +heterogeneous data (such as the 2-tuples produced by the :func:`enumerate` +built-in). Tuples are also used for cases where an immutable sequence of +homogeneous data is needed (such as allowing storage in a :class:`set` or +:class:`dict` instance). + +Tuples may be constructed in a number of ways: + +* Using a pair of parentheses to denote the empty tuple: ``()`` +* Using a trailing comma for a singleton tuple: ``a,`` or ``(a,)`` +* Separating items with commas: ``a, b, c`` or ``(a, b, c)`` +* Using the :func:`tuple` built-in: ``tuple()`` or ``tuple(iterable)`` + +Note that the parentheses are optional (except in the empty tuple case, or +when needed to avoid syntactic ambiguity). It is actually the comma which +makes a tuple, not the parentheses. + +Tuples implement all of the :ref:`common <typesseq-common>` sequence +operations. + + +.. _typesseq-range: + +Ranges +------ + +.. index:: object: range + +The :class:`range` type represents an immutable sequence of numbers and is +commonly used for looping a specific number of times. Instances are created +using the :func:`range` built-in. 
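A short sketch of how a range instance behaves as a sequence may help here. It assumes the ``start``, ``stop`` and ``step`` attributes that range objects expose in current Python versions; the specific numbers are arbitrary.

```python
import sys

r = range(10, 40, 5)  # start=10, stop=40, step=5

# For valid non-negative indices, each item is start + step*i.
items = [r.start + r.step * i for i in range(len(r))]
print(items == list(r))  # True

# Negative indices and slices give the results the equivalent tuple would.
print(r[-1])             # 35
print(list(r[1:4]))      # [15, 20, 25]

# Only start, stop and step are stored, so the object size should not grow
# with the length of the range (the exact figure is implementation specific).
print(sys.getsizeof(range(10)), sys.getsizeof(range(10**9)))
```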
+ +For positive indices with results between the defined ``start`` and ``stop`` +values, integers within the range are determined by the formula: +``r[i] = start + step*i`` + +For negative indices and slicing operations, a range instance determines the +appropriate result for the corresponding tuple and returns either the +appropriate integer (for negative indices) or an appropriate range object +(for slicing operations). + +The advantage of the :class:`range` type over a regular :class:`list` or +:class:`tuple` is that a :class:`range` object will always take the same +(small) amount of memory, no matter the size of the range it represents (as it +only stores the ``start``, ``stop`` and ``step`` values, calculating individual +items and subranges as needed). + +Ranges implement all of the :ref:`common <typesseq-common>` sequence operations +except concatenation and repetition (due to the fact that range objects can +only represent sequences that follow a strict pattern and repetition and +concatenation will usually violate that pattern). + + +.. _textseq: + +Text Sequence Type --- :class:`str` +=================================== + +.. index:: + object: string + object: bytes + object: bytearray + object: io.StringIO + + +Textual data in Python is handled with :class:`str` objects, which are +immutable sequences of Unicode code points. String literals are +written in a variety of ways: + +* Single quotes: ``'allows embedded "double" quotes'`` +* Double quotes: ``"allows embedded 'single' quotes"`` +* Triple quoted: ``'''Three single quotes'''``, ``"""Three double quotes"""`` + +Triple quoted strings may span multiple lines - all associated whitespace will +be included in the string literal. + +String literals that are part of a single expression and have only whitespace +between them will be implicitly converted to a single string literal. 
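The literal forms and the implicit concatenation rule described above can be tried out with a few lines. This is only an illustrative sketch; the variable names are arbitrary.

```python
single = 'allows embedded "double" quotes'
double = "allows embedded 'single' quotes"
triple = """line one
line two"""

# Adjacent literals separated only by whitespace become one literal.
joined = ("spam " "eggs")

print(joined == "spam eggs")  # True
print("\n" in triple)         # True: the line break is part of the string
print(single[0] == single[0:1])  # True: indexing yields a length-1 string
```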
+ +See :ref:`strings` for more about the various forms of string literal, +including supported escape sequences, and the ``r`` ("raw") prefix that +disables most escape sequence processing. + +Strings may also be created from other objects with the :func:`str` built-in. + +Since there is no separate "character" type, indexing a string produces +strings of length 1. That is, for a non-empty string *s*, ``s[0] == s[0:1]``. + +There is also no mutable string type, but :meth:`str.join` or +:class:`io.StringIO` can be used to efficiently construct strings from +multiple fragments. .. _string-methods: @@ -987,14 +1267,23 @@ String Methods -------------- -.. index:: pair: string; methods - -String objects support the methods listed below. - -In addition, Python's strings support the sequence type methods described in the -:ref:`typesseq` section. To output formatted strings, see the -:ref:`string-formatting` section. Also, see the :mod:`re` module for string -functions based on regular expressions. +.. index:: + pair: string; methods + module: re + +Strings implement all of the :ref:`common <typesseq-common>` sequence +operations, along with the additional methods described below. + +Strings also support two styles of string formatting, one providing a large +degree of flexibility and customization (see :meth:`str.format`, +:ref:`formatstrings` and :ref:`string-formatting`) and the other based on C +``printf`` style formatting that handles a narrower range of types and is +slightly harder to use correctly, but is often faster for the cases it can +handle (:ref:`old-string-formatting`). + +The :ref:`textservices` section of the standard library covers a number of +other modules that provide various text related utilities (including regular +expression support in the :mod:`re` module). .. method:: str.capitalize() @@ -1449,8 +1738,8 @@ .. 
_old-string-formatting: -Old String Formatting Operations --------------------------------- +``printf``-style String Formatting +---------------------------------- .. index:: single: formatting, string (%) @@ -1462,18 +1751,19 @@ single: % formatting single: % interpolation -.. XXX is the note enough? - .. note:: - The formatting operations described here are obsolete and may go away in future - versions of Python. Use the new :ref:`string-formatting` in new code. + The formatting operations described here exhibit a variety of quirks that + lead to a number of common errors (such as failing to display tuples and + dictionaries correctly). Using the newer :meth:`str.format` interface + helps avoid these errors, and also provides a generally more powerful, + flexible and extensible approach to formatting text. String objects have one unique built-in operation: the ``%`` operator (modulo). This is also known as the string *formatting* or *interpolation* operator. Given ``format % values`` (where *format* is a string), ``%`` conversion specifications in *format* are replaced with zero or more elements of *values*. -The effect is similar to the using :c:func:`sprintf` in the C language. +The effect is similar to using the :c:func:`sprintf` in the C language. If *format* requires a single argument, *values* may be a single non-tuple object. [5]_ Otherwise, *values* must be a tuple with exactly the number of @@ -1631,228 +1921,172 @@ ``%f`` conversions for numbers whose absolute value is over 1e50 are no longer replaced by ``%g`` conversions. + +.. _binaryseq: + +Binary Sequence Types --- :class:`bytes`, :class:`bytearray`, :class:`memoryview` +================================================================================= + .. index:: - module: string - module: re - -Additional string operations are defined in standard modules :mod:`string` and -:mod:`re`. - - -.. _typesseq-range: - -Range Type ----------- - -.. 
index:: object: range - -The :class:`range` type is an immutable sequence which is commonly used for -looping. The advantage of the :class:`range` type is that an :class:`range` -object will always take the same amount of memory, no matter the size of the -range it represents. - -Range objects have relatively little behavior: they support indexing, contains, -iteration, the :func:`len` function, and the following methods: - -.. method:: range.count(x) - - Return the number of *i*'s for which ``s[i] == x``. - - .. versionadded:: 3.2 - -.. method:: range.index(x) - - Return the smallest *i* such that ``s[i] == x``. Raises - :exc:`ValueError` when *x* is not in the range. - - .. versionadded:: 3.2 - -.. _typesseq-mutable: - -Mutable Sequence Types ----------------------- - -.. index:: - triple: mutable; sequence; types - object: list + object: bytes object: bytearray - -List and bytearray objects support additional operations that allow in-place -modification of the object. Other mutable sequence types (when added to the -language) should also support these operations. Strings and tuples are -immutable sequence types: such objects cannot be modified once created. The -following operations are defined on mutable sequence types (where *x* is an -arbitrary object). - -Note that while lists allow their items to be of any type, bytearray object -"items" are all integers in the range 0 <= x < 256. - -.. 
index:: - triple: operations on; sequence; types - triple: operations on; list; type - pair: subscript; assignment - pair: slice; assignment - statement: del - single: append() (sequence method) - single: extend() (sequence method) - single: count() (sequence method) - single: clear() (sequence method) - single: copy() (sequence method) - single: index() (sequence method) - single: insert() (sequence method) - single: pop() (sequence method) - single: remove() (sequence method) - single: reverse() (sequence method) - single: sort() (sequence method) - -+------------------------------+--------------------------------+---------------------+ -| Operation | Result | Notes | -+==============================+================================+=====================+ -| ``s[i] = x`` | item *i* of *s* is replaced by | | -| | *x* | | -+------------------------------+--------------------------------+---------------------+ -| ``s[i:j] = t`` | slice of *s* from *i* to *j* | | -| | is replaced by the contents of | | -| | the iterable *t* | | -+------------------------------+--------------------------------+---------------------+ -| ``del s[i:j]`` | same as ``s[i:j] = []`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s[i:j:k] = t`` | the elements of ``s[i:j:k]`` | \(1) | -| | are replaced by those of *t* | | -+------------------------------+--------------------------------+---------------------+ -| ``del s[i:j:k]`` | removes the elements of | | -| | ``s[i:j:k]`` from the list | | -+------------------------------+--------------------------------+---------------------+ -| ``s.append(x)`` | same as ``s[len(s):len(s)] = | | -| | [x]`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.extend(x)`` | same as ``s[len(s):len(s)] = | \(2) | -| | x`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.clear()`` | remove all items from ``s`` | | 
-| | | | -+------------------------------+--------------------------------+---------------------+ -| ``s.copy()`` | return a shallow copy of ``s`` | | -| | | | -+------------------------------+--------------------------------+---------------------+ -| ``s.count(x)`` | return number of *i*'s for | | -| | which ``s[i] == x`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.index(x[, i[, j]])`` | return smallest *k* such that | \(3) | -| | ``s[k] == x`` and ``i <= k < | | -| | j`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.insert(i, x)`` | same as ``s[i:i] = [x]`` | \(4) | -+------------------------------+--------------------------------+---------------------+ -| ``s.pop([i])`` | same as ``x = s[i]; del s[i]; | \(5) | -| | return x`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.remove(x)`` | same as ``del s[s.index(x)]`` | \(3) | -+------------------------------+--------------------------------+---------------------+ -| ``s.reverse()`` | reverses the items of *s* in | \(6) | -| | place | | -+------------------------------+--------------------------------+---------------------+ -| ``s.sort([key[, reverse]])`` | sort the items of *s* in place | (6), (7), (8) | -+------------------------------+--------------------------------+---------------------+ - - -Notes: - -(1) - *t* must have the same length as the slice it is replacing. - -(2) - *x* can be any iterable object. - -(3) - Raises :exc:`ValueError` when *x* is not found in *s*. When a negative index is - passed as the second or third parameter to the :meth:`index` method, the sequence - length is added, as for slice indices. If it is still negative, it is truncated - to zero, as for slice indices. - -(4) - When a negative index is passed as the first parameter to the :meth:`insert` - method, the sequence length is added, as for slice indices. 
If it is still - negative, it is truncated to zero, as for slice indices. - -(5) - The optional argument *i* defaults to ``-1``, so that by default the last - item is removed and returned. - -(6) - The :meth:`sort` and :meth:`reverse` methods modify the sequence in place for - economy of space when sorting or reversing a large sequence. To remind you - that they operate by side effect, they don't return the sorted or reversed - sequence. - -(7) - The :meth:`sort` method takes optional arguments for controlling the - comparisons. Each must be specified as a keyword argument. - - *key* specifies a function of one argument that is used to extract a comparison - key from each list element: ``key=str.lower``. The default value is ``None``. - Use :func:`functools.cmp_to_key` to convert an - old-style *cmp* function to a *key* function. - - - *reverse* is a boolean value. If set to ``True``, then the list elements are - sorted as if each comparison were reversed. - - The :meth:`sort` method is guaranteed to be stable. A - sort is stable if it guarantees not to change the relative order of elements - that compare equal --- this is helpful for sorting in multiple passes (for - example, sort by department, then by salary grade). - - .. impl-detail:: - - While a list is being sorted, the effect of attempting to mutate, or even - inspect, the list is undefined. The C implementation of Python makes the - list appear empty for the duration, and raises :exc:`ValueError` if it can - detect that the list has been mutated during a sort. - -(8) - :meth:`sort` is not supported by :class:`bytearray` objects. - - .. versionadded:: 3.3 - :meth:`clear` and :meth:`!copy` methods. + object: memoryview + module: array + +The core built-in types for manipulating binary data are :class:`bytes` and +:class:`bytearray`. They are supported by :class:`memoryview` which uses +the buffer protocol to access the memory of other binary objects without +needing to make a copy. 
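A minimal sketch of the copy-free access described above, assuming a `bytearray` as the underlying buffer (the variable names are illustrative only, not part of the patch):

```python
# A mutable buffer and a memoryview that shares its memory.
buffer = bytearray(b"hello world")
view = memoryview(buffer)

# Slicing a memoryview yields another view; no bytes are copied.
window = view[6:]
window[0] = ord("W")  # writes through to the underlying bytearray

# An explicit bytes() call is what actually makes a copy.
copied = bytes(view[:5])
buffer[0] = ord("J")  # later changes do not affect the copy

print(buffer)   # the bytearray reflects both writes
print(copied)   # the copy still holds the original first five bytes
```

The view sees later changes to `buffer`, while `copied` is an independent snapshot.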
+ +The :mod:`array` module supports efficient storage of basic data types like +32-bit integers and IEEE 754 double-precision floating point values. + +.. _typebytes: + +Bytes +----- + +.. index:: object: bytes + +Bytes objects are immutable sequences of single bytes. Since many major +binary protocols are based on the ASCII text encoding, bytes objects offer +several methods that are only valid when working with ASCII compatible +data and are closely related to string objects in a variety of other ways. + +Firstly, the syntax for bytes literals is largely the same as that for string +literals, except that a ``b`` prefix is added: + +* Single quotes: ``b'still allows embedded "double" quotes'`` +* Double quotes: ``b"still allows embedded 'single' quotes"`` +* Triple quoted: ``b'''3 single quotes'''``, ``b"""3 double quotes"""`` + +Only ASCII characters are permitted in bytes literals (regardless of the +declared source code encoding). Any binary values over 127 must be entered +into bytes literals using the appropriate escape sequence. + +As with string literals, bytes literals may also use an ``r`` prefix to disable +processing of escape sequences. See :ref:`strings` for more about the various +forms of bytes literal, including supported escape sequences. + +While bytes literals and representations are based on ASCII text, bytes +objects actually behave like immutable sequences of integers, with each +value in the sequence restricted such that ``0 <= x < 256`` (attempts to +violate this restriction will trigger :exc:`ValueError`). This is done +deliberately to emphasise that while many binary formats include ASCII based +elements and can be usefully manipulated with some text-oriented algorithms, +this is not generally the case for arbitrary binary data (blindly applying +text processing algorithms to binary data formats that are not ASCII +compatible will usually lead to data corruption).
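The literal rules and the integer restriction above can be illustrated with a short sketch (the sample values and variable names are assumptions chosen for illustration):

```python
# Non-ASCII data must be written with escape sequences in a bytes literal.
data = b"caf\xc3\xa9"          # the UTF-8 encoding of "café"
length = len(data)             # five bytes, not four characters
first = data[0]                # indexing yields an integer

# The r prefix disables escape processing, so this is four separate bytes.
raw = rb"\x41"
raw_length = len(raw)

# Values outside 0 <= x < 256 are rejected.
try:
    bytes([256])
except ValueError:
    out_of_range_rejected = True
else:
    out_of_range_rejected = False

# Bytes objects are immutable, so item assignment fails.
try:
    data[0] = 67
except TypeError:
    assignment_rejected = True
else:
    assignment_rejected = False
```

Decoding `data` with `data.decode("utf-8")` is the explicit step needed to get text back.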
+ +In addition to the literal forms, bytes objects can be created in a number of +other ways: + +* A zero-filled bytes object of a specified length: ``bytes(10)`` +* From an iterable of integers: ``bytes(range(20))`` +* Copying existing binary data via the buffer protocol: ``bytes(obj)`` + +Since bytes objects are sequences of integers, for a bytes object *b*, +``b[0]`` will be an integer, while ``b[0:1]`` will be a bytes object of +length 1. (This contrasts with text strings, where both indexing and +slicing will produce a string of length 1.) + +The representation of bytes objects uses the literal format (``b'...'``) +since it is often more useful than e.g. ``bytes([46, 46, 46])``. You can +always convert a bytes object into a list of integers using ``list(b)``. + +Note for Python 2.x users: In the Python 2.x series, a variety of implicit +conversions between 8-bit strings (the closest thing 2.x offers to a built-in +binary data type) and Unicode strings were permitted. This was a backwards +compatibility workaround to account for the fact that Python originally only +supported 8-bit text, and Unicode text was a later addition. In Python 3.x, +those implicit conversions are gone: conversions between 8-bit binary data +and Unicode text must be explicit, and bytes and string objects will always +compare unequal. + + +.. _typebytearray: + +Bytearray Objects +----------------- + +.. index:: object: bytearray + +:class:`bytearray` objects are a mutable counterpart to :class:`bytes`
There is no dedicated literal syntax for bytearray objects, instead +they are always created by calling the constructor: + +* Creating an empty instance: ``bytearray()`` +* Creating a zero-filled instance with a given length: ``bytearray(10)`` +* From an iterable of integers: ``bytearray(range(20))`` +* Copying existing binary data via the buffer protocol: ``bytearray(b'Hi!)`` + +As bytearray objects are mutable, they support the +:ref:`mutable ` sequence operations in addition to the +common bytes and bytearray operations described in :ref:`bytes-methods`. .. _bytes-methods: -Bytes and Byte Array Methods ----------------------------- +Bytes and Bytearray Operations +------------------------------ .. index:: pair: bytes; methods pair: bytearray; methods -Bytes and bytearray objects, being "strings of bytes", have all methods found on -strings, with the exception of :func:`encode`, :func:`format` and -:func:`isidentifier`, which do not make sense with these types. For converting -the objects to strings, they have a :func:`decode` method. - -Wherever one of these methods needs to interpret the bytes as characters -(e.g. the :func:`is...` methods), the ASCII character set is assumed. - -.. versionadded:: 3.3 - The functions :func:`count`, :func:`find`, :func:`index`, - :func:`rfind` and :func:`rindex` have additional semantics compared to - the corresponding string functions: They also accept an integer in - range 0 to 255 (a byte) as their first argument. +Both bytes and bytearray objects support the :ref:`common ` +sequence operations. They interoperate not just with operands of the same +type, but with any object that supports the +:ref:`buffer protocol `. Due to this flexibility, they can be +freely mixed in operations without causing errors. However, the return type +of the result may depend on the order of operands. 
+ +Due to the common use of ASCII text as the basis for binary protocols, bytes +and bytearray objects provide almost all methods found on text strings, with +the exceptions of: + +* :meth:`str.encode` (which converts text strings to bytes objects) +* :meth:`str.format` and :meth:`str.format_map` (which are used to format + text for display to users) +* :meth:`str.isidentifier`, :meth:`str.isnumeric`, :meth:`str.isdecimal`, + :meth:`str.isprintable` (which are used to check various properties of + text strings which are not typically applicable to binary protocols). + +All other string methods are supported, although sometimes with slight +differences in functionality and semantics (as described below). .. note:: The methods on bytes and bytearray objects don't accept strings as their arguments, just as the methods on strings don't accept bytes as their - arguments. For example, you have to write :: + arguments. For example, you have to write:: a = "abc" b = a.replace("a", "f") - and :: + and:: a = b"abc" b = a.replace(b"a", b"f") +Whenever a bytes or bytearray method needs to interpret the bytes as +characters (e.g. the :meth:`is...` methods, :meth:`split`, :meth:`strip`), +the ASCII character set is assumed (text strings use Unicode semantics). + +.. note:: + Using these ASCII based methods to manipulate binary data that is not + stored in an ASCII based format may lead to data corruption. + +The search operations (:keyword:`in`, :meth:`count`, :meth:`find`, +:meth:`index`, :meth:`rfind` and :meth:`rindex`) all accept both integers +in the range 0 to 255 as well as bytes and bytearray sequences. + +.. versionchanged:: 3.3 + All of the search methods accept an integer in range 0 to 255 (a byte) as + their first argument, not just containment testing. + + +Each bytes and bytearray instance provides a :meth:`decode` convenience +method that is the inverse of :meth:`str.encode`: ..
method:: bytes.decode(encoding="utf-8", errors="strict") bytearray.decode(encoding="utf-8", errors="strict") @@ -1868,8 +2102,10 @@ .. versionchanged:: 3.1 Added support for keyword arguments. - -The bytes and bytearray types have an additional class method: +Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal +numbers are a commonly used format for describing binary data. Accordingly, +the bytes and bytearray types have an additional class method to read data in +that format: .. classmethod:: bytes.fromhex(string) bytearray.fromhex(string) @@ -1878,8 +2114,8 @@ decoding the given string object. The string must contain two hexadecimal digits per byte, spaces are ignored. - >>> bytes.fromhex('f0 f1f2 ') - b'\xf0\xf1\xf2' + >>> bytes.fromhex('2Ef0 F1f2 ') + b'.\xf0\xf1\xf2' The maketrans and translate methods differ in semantics from the versions @@ -1913,467 +2149,10 @@ .. versionadded:: 3.1 -.. _types-set: - -Set Types --- :class:`set`, :class:`frozenset` -============================================== - -.. index:: object: set - -A :dfn:`set` object is an unordered collection of distinct :term:`hashable` objects. -Common uses include membership testing, removing duplicates from a sequence, and -computing mathematical operations such as intersection, union, difference, and -symmetric difference. -(For other containers see the built in :class:`dict`, :class:`list`, -and :class:`tuple` classes, and the :mod:`collections` module.) - -Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in -set``. Being an unordered collection, sets do not record element position or -order of insertion. Accordingly, sets do not support indexing, slicing, or -other sequence-like behavior. - -There are currently two built-in set types, :class:`set` and :class:`frozenset`. -The :class:`set` type is mutable --- the contents can be changed using methods -like :meth:`add` and :meth:`remove`. 
Since it is mutable, it has no hash value -and cannot be used as either a dictionary key or as an element of another set. -The :class:`frozenset` type is immutable and :term:`hashable` --- its contents cannot be -altered after it is created; it can therefore be used as a dictionary key or as -an element of another set. - -Non-empty sets (not frozensets) can be created by placing a comma-separated list -of elements within braces, for example: ``{'jack', 'sjoerd'}``, in addition to the -:class:`set` constructor. - -The constructors for both classes work the same: - -.. class:: set([iterable]) - frozenset([iterable]) - - Return a new set or frozenset object whose elements are taken from - *iterable*. The elements of a set must be hashable. To represent sets of - sets, the inner sets must be :class:`frozenset` objects. If *iterable* is - not specified, a new empty set is returned. - - Instances of :class:`set` and :class:`frozenset` provide the following - operations: - - .. describe:: len(s) - - Return the cardinality of set *s*. - - .. describe:: x in s - - Test *x* for membership in *s*. - - .. describe:: x not in s - - Test *x* for non-membership in *s*. - - .. method:: isdisjoint(other) - - Return True if the set has no elements in common with *other*. Sets are - disjoint if and only if their intersection is the empty set. - - .. method:: issubset(other) - set <= other - - Test whether every element in the set is in *other*. - - .. method:: set < other - - Test whether the set is a true subset of *other*, that is, - ``set <= other and set != other``. - - .. method:: issuperset(other) - set >= other - - Test whether every element in *other* is in the set. - - .. method:: set > other - - Test whether the set is a true superset of *other*, that is, ``set >= - other and set != other``. - - .. method:: union(other, ...) - set | other | ... - - Return a new set with elements from the set and all others. - - .. method:: intersection(other, ...) - set & other & ... 
- - Return a new set with elements common to the set and all others. - - .. method:: difference(other, ...) - set - other - ... - - Return a new set with elements in the set that are not in the others. - - .. method:: symmetric_difference(other) - set ^ other - - Return a new set with elements in either the set or *other* but not both. - - .. method:: copy() - - Return a new set with a shallow copy of *s*. - - - Note, the non-operator versions of :meth:`union`, :meth:`intersection`, - :meth:`difference`, and :meth:`symmetric_difference`, :meth:`issubset`, and - :meth:`issuperset` methods will accept any iterable as an argument. In - contrast, their operator based counterparts require their arguments to be - sets. This precludes error-prone constructions like ``set('abc') & 'cbs'`` - in favor of the more readable ``set('abc').intersection('cbs')``. - - Both :class:`set` and :class:`frozenset` support set to set comparisons. Two - sets are equal if and only if every element of each set is contained in the - other (each is a subset of the other). A set is less than another set if and - only if the first set is a proper subset of the second set (is a subset, but - is not equal). A set is greater than another set if and only if the first set - is a proper superset of the second set (is a superset, but is not equal). - - Instances of :class:`set` are compared to instances of :class:`frozenset` - based on their members. For example, ``set('abc') == frozenset('abc')`` - returns ``True`` and so does ``set('abc') in set([frozenset('abc')])``. - - The subset and equality comparisons do not generalize to a complete ordering - function. For example, any two disjoint sets are not equal and are not - subsets of each other, so *all* of the following return ``False``: ``a<b``, - ``a==b``, or ``a>b``. - - Since sets only define partial ordering (subset relationships), the output of - the :meth:`list.sort` method is undefined for lists of sets.
- - Set elements, like dictionary keys, must be :term:`hashable`. - - Binary operations that mix :class:`set` instances with :class:`frozenset` - return the type of the first operand. For example: ``frozenset('ab') | - set('bc')`` returns an instance of :class:`frozenset`. - - The following table lists operations available for :class:`set` that do not - apply to immutable instances of :class:`frozenset`: - - .. method:: update(other, ...) - set |= other | ... - - Update the set, adding elements from all others. - - .. method:: intersection_update(other, ...) - set &= other & ... - - Update the set, keeping only elements found in it and all others. - - .. method:: difference_update(other, ...) - set -= other | ... - - Update the set, removing elements found in others. - - .. method:: symmetric_difference_update(other) - set ^= other - - Update the set, keeping only elements found in either set, but not in both. - - .. method:: add(elem) - - Add element *elem* to the set. - - .. method:: remove(elem) - - Remove element *elem* from the set. Raises :exc:`KeyError` if *elem* is - not contained in the set. - - .. method:: discard(elem) - - Remove element *elem* from the set if it is present. - - .. method:: pop() - - Remove and return an arbitrary element from the set. Raises - :exc:`KeyError` if the set is empty. - - .. method:: clear() - - Remove all elements from the set. - - - Note, the non-operator versions of the :meth:`update`, - :meth:`intersection_update`, :meth:`difference_update`, and - :meth:`symmetric_difference_update` methods will accept any iterable as an - argument. - - Note, the *elem* argument to the :meth:`__contains__`, :meth:`remove`, and - :meth:`discard` methods may be a set. To support searching for an equivalent - frozenset, the *elem* set is temporarily mutated during the search and then - restored. During the search, the *elem* set should not be read or mutated - since it does not have a meaningful value. - - -.. 
_typesmapping: - -Mapping Types --- :class:`dict` -=============================== - -.. index:: - object: mapping - object: dictionary - triple: operations on; mapping; types - triple: operations on; dictionary; type - statement: del - builtin: len - -A :dfn:`mapping` object maps :term:`hashable` values to arbitrary objects. -Mappings are mutable objects. There is currently only one standard mapping -type, the :dfn:`dictionary`. (For other containers see the built in -:class:`list`, :class:`set`, and :class:`tuple` classes, and the -:mod:`collections` module.) - -A dictionary's keys are *almost* arbitrary values. Values that are not -:term:`hashable`, that is, values containing lists, dictionaries or other -mutable types (that are compared by value rather than by object identity) may -not be used as keys. Numeric types used for keys obey the normal rules for -numeric comparison: if two numbers compare equal (such as ``1`` and ``1.0``) -then they can be used interchangeably to index the same dictionary entry. (Note -however, that since computers store floating-point numbers as approximations it -is usually unwise to use them as dictionary keys.) - -Dictionaries can be created by placing a comma-separated list of ``key: value`` -pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098: -'jack', 4127: 'sjoerd'}``, or by the :class:`dict` constructor. - -.. class:: dict([arg]) - - Return a new dictionary initialized from an optional positional argument or - from a set of keyword arguments. If no arguments are given, return a new - empty dictionary. If the positional argument *arg* is a mapping object, - return a dictionary mapping the same keys to the same values as does the - mapping object. Otherwise the positional argument must be a sequence, a - container that supports iteration, or an iterator object. The elements of - the argument must each also be of one of those kinds, and each must in turn - contain exactly two objects. 
The first is used as a key in the new - dictionary, and the second as the key's value. If a given key is seen more - than once, the last value associated with it is retained in the new - dictionary. - - If keyword arguments are given, the keywords themselves with their associated - values are added as items to the dictionary. If a key is specified both in - the positional argument and as a keyword argument, the value associated with - the keyword is retained in the dictionary. For example, these all return a - dictionary equal to ``{"one": 1, "two": 2}``: - - * ``dict(one=1, two=2)`` - * ``dict({'one': 1, 'two': 2})`` - * ``dict(zip(('one', 'two'), (1, 2)))`` - * ``dict([['two', 2], ['one', 1]])`` - - The first example only works for keys that are valid Python identifiers; the - others work with any valid keys. - - - These are the operations that dictionaries support (and therefore, custom - mapping types should support too): - - .. describe:: len(d) - - Return the number of items in the dictionary *d*. - - .. describe:: d[key] - - Return the item of *d* with key *key*. Raises a :exc:`KeyError` if *key* is - not in the map. - - If a subclass of dict defines a method :meth:`__missing__`, if the key *key* - is not present, the ``d[key]`` operation calls that method with the key *key* - as argument. The ``d[key]`` operation then returns or raises whatever is - returned or raised by the ``__missing__(key)`` call if the key is not - present. No other operations or methods invoke :meth:`__missing__`. If - :meth:`__missing__` is not defined, :exc:`KeyError` is raised. - :meth:`__missing__` must be a method; it cannot be an instance variable:: - - >>> class Counter(dict): - ... def __missing__(self, key): - ... return 0 - >>> c = Counter() - >>> c['red'] - 0 - >>> c['red'] += 1 - >>> c['red'] - 1 - - See :class:`collections.Counter` for a complete implementation including - other methods helpful for accumulating and managing tallies. - - .. 
describe:: d[key] = value - - Set ``d[key]`` to *value*. - - .. describe:: del d[key] - - Remove ``d[key]`` from *d*. Raises a :exc:`KeyError` if *key* is not in the - map. - - .. describe:: key in d - - Return ``True`` if *d* has a key *key*, else ``False``. - - .. describe:: key not in d - - Equivalent to ``not key in d``. - - .. describe:: iter(d) - - Return an iterator over the keys of the dictionary. This is a shortcut - for ``iter(d.keys())``. - - .. method:: clear() - - Remove all items from the dictionary. - - .. method:: copy() - - Return a shallow copy of the dictionary. - - .. classmethod:: fromkeys(seq[, value]) - - Create a new dictionary with keys from *seq* and values set to *value*. - - :meth:`fromkeys` is a class method that returns a new dictionary. *value* - defaults to ``None``. - - .. method:: get(key[, default]) - - Return the value for *key* if *key* is in the dictionary, else *default*. - If *default* is not given, it defaults to ``None``, so that this method - never raises a :exc:`KeyError`. - - .. method:: items() - - Return a new view of the dictionary's items (``(key, value)`` pairs). See - below for documentation of view objects. - - .. method:: keys() - - Return a new view of the dictionary's keys. See below for documentation of - view objects. - - .. method:: pop(key[, default]) - - If *key* is in the dictionary, remove it and return its value, else return - *default*. If *default* is not given and *key* is not in the dictionary, - a :exc:`KeyError` is raised. - - .. method:: popitem() - - Remove and return an arbitrary ``(key, value)`` pair from the dictionary. - - :meth:`popitem` is useful to destructively iterate over a dictionary, as - often used in set algorithms. If the dictionary is empty, calling - :meth:`popitem` raises a :exc:`KeyError`. - - .. method:: setdefault(key[, default]) - - If *key* is in the dictionary, return its value. If not, insert *key* - with a value of *default* and return *default*. 
*default* defaults to - ``None``. - - .. method:: update([other]) - - Update the dictionary with the key/value pairs from *other*, overwriting - existing keys. Return ``None``. - - :meth:`update` accepts either another dictionary object or an iterable of - key/value pairs (as tuples or other iterables of length two). If keyword - arguments are specified, the dictionary is then updated with those - key/value pairs: ``d.update(red=1, blue=2)``. - - .. method:: values() - - Return a new view of the dictionary's values. See below for documentation of - view objects. - - -.. _dict-views: - -Dictionary view objects ------------------------ - -The objects returned by :meth:`dict.keys`, :meth:`dict.values` and -:meth:`dict.items` are *view objects*. They provide a dynamic view on the -dictionary's entries, which means that when the dictionary changes, the view -reflects these changes. - -Dictionary views can be iterated over to yield their respective data, and -support membership tests: - -.. describe:: len(dictview) - - Return the number of entries in the dictionary. - -.. describe:: iter(dictview) - - Return an iterator over the keys, values or items (represented as tuples of - ``(key, value)``) in the dictionary. - - Keys and values are iterated over in an arbitrary order which is non-random, - varies across Python implementations, and depends on the dictionary's history - of insertions and deletions. If keys, values and items views are iterated - over with no intervening modifications to the dictionary, the order of items - will directly correspond. This allows the creation of ``(value, key)`` pairs - using :func:`zip`: ``pairs = zip(d.values(), d.keys())``. Another way to - create the same list is ``pairs = [(v, k) for (k, v) in d.items()]``. - - Iterating views while adding or deleting entries in the dictionary may raise - a :exc:`RuntimeError` or fail to iterate over all entries. - -.. 
describe:: x in dictview - - Return ``True`` if *x* is in the underlying dictionary's keys, values or - items (in the latter case, *x* should be a ``(key, value)`` tuple). - - -Keys views are set-like since their entries are unique and hashable. If all -values are hashable, so that ``(key, value)`` pairs are unique and hashable, -then the items view is also set-like. (Values views are not treated as set-like -since the entries are generally not unique.) For set-like views, all of the -operations defined for the abstract base class :class:`collections.Set` are -available (for example, ``==``, ``<``, or ``^``). - -An example of dictionary view usage:: - - >>> dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500} - >>> keys = dishes.keys() - >>> values = dishes.values() - - >>> # iteration - >>> n = 0 - >>> for val in values: - ... n += val - >>> print(n) - 504 - - >>> # keys and values are iterated over in the same order - >>> list(keys) - ['eggs', 'bacon', 'sausage', 'spam'] - >>> list(values) - [2, 1, 1, 500] - - >>> # view objects are dynamic and reflect dict changes - >>> del dishes['eggs'] - >>> del dishes['sausage'] - >>> list(keys) - ['spam', 'bacon'] - - >>> # set operations - >>> keys & {'eggs', 'bacon', 'salad'} - {'bacon'} - >>> keys ^ {'sausage', 'juice'} - {'juice', 'sausage', 'bacon', 'spam'} - - .. _typememoryview: -memoryview type -=============== +Memory Views +------------ :class:`memoryview` objects allow Python code to access the internal data of an object that supports the :ref:`buffer protocol ` without @@ -2536,6 +2315,463 @@ .. memoryview.suboffsets isn't documented because it only seems useful for C +.. _types-set: + +Set Types --- :class:`set`, :class:`frozenset` +============================================== + +.. index:: object: set + +A :dfn:`set` object is an unordered collection of distinct :term:`hashable` objects. 
+Common uses include membership testing, removing duplicates from a sequence, and +computing mathematical operations such as intersection, union, difference, and +symmetric difference. +(For other containers see the built in :class:`dict`, :class:`list`, +and :class:`tuple` classes, and the :mod:`collections` module.) + +Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in +set``. Being an unordered collection, sets do not record element position or +order of insertion. Accordingly, sets do not support indexing, slicing, or +other sequence-like behavior. + +There are currently two built-in set types, :class:`set` and :class:`frozenset`. +The :class:`set` type is mutable --- the contents can be changed using methods +like :meth:`add` and :meth:`remove`. Since it is mutable, it has no hash value +and cannot be used as either a dictionary key or as an element of another set. +The :class:`frozenset` type is immutable and :term:`hashable` --- its contents cannot be +altered after it is created; it can therefore be used as a dictionary key or as +an element of another set. + +Non-empty sets (not frozensets) can be created by placing a comma-separated list +of elements within braces, for example: ``{'jack', 'sjoerd'}``, in addition to the +:class:`set` constructor. + +The constructors for both classes work the same: + +.. class:: set([iterable]) + frozenset([iterable]) + + Return a new set or frozenset object whose elements are taken from + *iterable*. The elements of a set must be hashable. To represent sets of + sets, the inner sets must be :class:`frozenset` objects. If *iterable* is + not specified, a new empty set is returned. + + Instances of :class:`set` and :class:`frozenset` provide the following + operations: + + .. describe:: len(s) + + Return the cardinality of set *s*. + + .. describe:: x in s + + Test *x* for membership in *s*. + + .. describe:: x not in s + + Test *x* for non-membership in *s*. + + .. 
method:: isdisjoint(other) + + Return True if the set has no elements in common with *other*. Sets are + disjoint if and only if their intersection is the empty set. + + .. method:: issubset(other) + set <= other + + Test whether every element in the set is in *other*. + + .. method:: set < other + + Test whether the set is a true subset of *other*, that is, + ``set <= other and set != other``. + + .. method:: issuperset(other) + set >= other + + Test whether every element in *other* is in the set. + + .. method:: set > other + + Test whether the set is a true superset of *other*, that is, ``set >= + other and set != other``. + + .. method:: union(other, ...) + set | other | ... + + Return a new set with elements from the set and all others. + + .. method:: intersection(other, ...) + set & other & ... + + Return a new set with elements common to the set and all others. + + .. method:: difference(other, ...) + set - other - ... + + Return a new set with elements in the set that are not in the others. + + .. method:: symmetric_difference(other) + set ^ other + + Return a new set with elements in either the set or *other* but not both. + + .. method:: copy() + + Return a new set with a shallow copy of *s*. + + + Note, the non-operator versions of :meth:`union`, :meth:`intersection`, + :meth:`difference`, and :meth:`symmetric_difference`, :meth:`issubset`, and + :meth:`issuperset` methods will accept any iterable as an argument. In + contrast, their operator based counterparts require their arguments to be + sets. This precludes error-prone constructions like ``set('abc') & 'cbs'`` + in favor of the more readable ``set('abc').intersection('cbs')``. + + Both :class:`set` and :class:`frozenset` support set to set comparisons. Two + sets are equal if and only if every element of each set is contained in the + other (each is a subset of the other). 
A set is less than another set if and + only if the first set is a proper subset of the second set (is a subset, but + is not equal). A set is greater than another set if and only if the first set + is a proper superset of the second set (is a superset, but is not equal). + + Instances of :class:`set` are compared to instances of :class:`frozenset` + based on their members. For example, ``set('abc') == frozenset('abc')`` + returns ``True`` and so does ``set('abc') in set([frozenset('abc')])``. + + The subset and equality comparisons do not generalize to a complete ordering + function. For example, any two disjoint sets are not equal and are not + subsets of each other, so *all* of the following return ``False``: ``a<b``, + ``a==b``, or ``a>b``. + + Since sets only define partial ordering (subset relationships), the output of + the :meth:`list.sort` method is undefined for lists of sets. + + Set elements, like dictionary keys, must be :term:`hashable`. + + Binary operations that mix :class:`set` instances with :class:`frozenset` + return the type of the first operand. For example: ``frozenset('ab') | + set('bc')`` returns an instance of :class:`frozenset`. + + The following table lists operations available for :class:`set` that do not + apply to immutable instances of :class:`frozenset`: + + .. method:: update(other, ...) + set |= other | ... + + Update the set, adding elements from all others. + + .. method:: intersection_update(other, ...) + set &= other & ... + + Update the set, keeping only elements found in it and all others. + + .. method:: difference_update(other, ...) + set -= other | ... + + Update the set, removing elements found in others. + + .. method:: symmetric_difference_update(other) + set ^= other + + Update the set, keeping only elements found in either set, but not in both. + + .. method:: add(elem) + + Add element *elem* to the set. + + .. method:: remove(elem) + + Remove element *elem* from the set. Raises :exc:`KeyError` if *elem* is + not contained in the set. + + ..
method:: discard(elem) + + Remove element *elem* from the set if it is present. + + .. method:: pop() + + Remove and return an arbitrary element from the set. Raises + :exc:`KeyError` if the set is empty. + + .. method:: clear() + + Remove all elements from the set. + + + Note, the non-operator versions of the :meth:`update`, + :meth:`intersection_update`, :meth:`difference_update`, and + :meth:`symmetric_difference_update` methods will accept any iterable as an + argument. + + Note, the *elem* argument to the :meth:`__contains__`, :meth:`remove`, and + :meth:`discard` methods may be a set. To support searching for an equivalent + frozenset, the *elem* set is temporarily mutated during the search and then + restored. During the search, the *elem* set should not be read or mutated + since it does not have a meaningful value. + + +.. _typesmapping: + +Mapping Types --- :class:`dict` +=============================== + +.. index:: + object: mapping + object: dictionary + triple: operations on; mapping; types + triple: operations on; dictionary; type + statement: del + builtin: len + +A :dfn:`mapping` object maps :term:`hashable` values to arbitrary objects. +Mappings are mutable objects. There is currently only one standard mapping +type, the :dfn:`dictionary`. (For other containers see the built in +:class:`list`, :class:`set`, and :class:`tuple` classes, and the +:mod:`collections` module.) + +A dictionary's keys are *almost* arbitrary values. Values that are not +:term:`hashable`, that is, values containing lists, dictionaries or other +mutable types (that are compared by value rather than by object identity) may +not be used as keys. Numeric types used for keys obey the normal rules for +numeric comparison: if two numbers compare equal (such as ``1`` and ``1.0``) +then they can be used interchangeably to index the same dictionary entry. 
(Note +however, that since computers store floating-point numbers as approximations it +is usually unwise to use them as dictionary keys.) + +Dictionaries can be created by placing a comma-separated list of ``key: value`` +pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098: +'jack', 4127: 'sjoerd'}``, or by the :class:`dict` constructor. + +.. class:: dict([arg]) + + Return a new dictionary initialized from an optional positional argument or + from a set of keyword arguments. If no arguments are given, return a new + empty dictionary. If the positional argument *arg* is a mapping object, + return a dictionary mapping the same keys to the same values as does the + mapping object. Otherwise the positional argument must be a sequence, a + container that supports iteration, or an iterator object. The elements of + the argument must each also be of one of those kinds, and each must in turn + contain exactly two objects. The first is used as a key in the new + dictionary, and the second as the key's value. If a given key is seen more + than once, the last value associated with it is retained in the new + dictionary. + + If keyword arguments are given, the keywords themselves with their associated + values are added as items to the dictionary. If a key is specified both in + the positional argument and as a keyword argument, the value associated with + the keyword is retained in the dictionary. For example, these all return a + dictionary equal to ``{"one": 1, "two": 2}``: + + * ``dict(one=1, two=2)`` + * ``dict({'one': 1, 'two': 2})`` + * ``dict(zip(('one', 'two'), (1, 2)))`` + * ``dict([['two', 2], ['one', 1]])`` + + The first example only works for keys that are valid Python identifiers; the + others work with any valid keys. + + + These are the operations that dictionaries support (and therefore, custom + mapping types should support too): + + .. describe:: len(d) + + Return the number of items in the dictionary *d*. + + .. 
describe:: d[key] + + Return the item of *d* with key *key*. Raises a :exc:`KeyError` if *key* is + not in the map. + + If a subclass of dict defines a method :meth:`__missing__`, if the key *key* + is not present, the ``d[key]`` operation calls that method with the key *key* + as argument. The ``d[key]`` operation then returns or raises whatever is + returned or raised by the ``__missing__(key)`` call if the key is not + present. No other operations or methods invoke :meth:`__missing__`. If + :meth:`__missing__` is not defined, :exc:`KeyError` is raised. + :meth:`__missing__` must be a method; it cannot be an instance variable:: + + >>> class Counter(dict): + ... def __missing__(self, key): + ... return 0 + >>> c = Counter() + >>> c['red'] + 0 + >>> c['red'] += 1 + >>> c['red'] + 1 + + See :class:`collections.Counter` for a complete implementation including + other methods helpful for accumulating and managing tallies. + + .. describe:: d[key] = value + + Set ``d[key]`` to *value*. + + .. describe:: del d[key] + + Remove ``d[key]`` from *d*. Raises a :exc:`KeyError` if *key* is not in the + map. + + .. describe:: key in d + + Return ``True`` if *d* has a key *key*, else ``False``. + + .. describe:: key not in d + + Equivalent to ``not key in d``. + + .. describe:: iter(d) + + Return an iterator over the keys of the dictionary. This is a shortcut + for ``iter(d.keys())``. + + .. method:: clear() + + Remove all items from the dictionary. + + .. method:: copy() + + Return a shallow copy of the dictionary. + + .. classmethod:: fromkeys(seq[, value]) + + Create a new dictionary with keys from *seq* and values set to *value*. + + :meth:`fromkeys` is a class method that returns a new dictionary. *value* + defaults to ``None``. + + .. method:: get(key[, default]) + + Return the value for *key* if *key* is in the dictionary, else *default*. + If *default* is not given, it defaults to ``None``, so that this method + never raises a :exc:`KeyError`. + + .. 
method:: items() + + Return a new view of the dictionary's items (``(key, value)`` pairs). See + below for documentation of view objects. + + .. method:: keys() + + Return a new view of the dictionary's keys. See below for documentation of + view objects. + + .. method:: pop(key[, default]) + + If *key* is in the dictionary, remove it and return its value, else return + *default*. If *default* is not given and *key* is not in the dictionary, + a :exc:`KeyError` is raised. + + .. method:: popitem() + + Remove and return an arbitrary ``(key, value)`` pair from the dictionary. + + :meth:`popitem` is useful to destructively iterate over a dictionary, as + often used in set algorithms. If the dictionary is empty, calling + :meth:`popitem` raises a :exc:`KeyError`. + + .. method:: setdefault(key[, default]) + + If *key* is in the dictionary, return its value. If not, insert *key* + with a value of *default* and return *default*. *default* defaults to + ``None``. + + .. method:: update([other]) + + Update the dictionary with the key/value pairs from *other*, overwriting + existing keys. Return ``None``. + + :meth:`update` accepts either another dictionary object or an iterable of + key/value pairs (as tuples or other iterables of length two). If keyword + arguments are specified, the dictionary is then updated with those + key/value pairs: ``d.update(red=1, blue=2)``. + + .. method:: values() + + Return a new view of the dictionary's values. See below for documentation of + view objects. + + +.. _dict-views: + +Dictionary view objects +----------------------- + +The objects returned by :meth:`dict.keys`, :meth:`dict.values` and +:meth:`dict.items` are *view objects*. They provide a dynamic view on the +dictionary's entries, which means that when the dictionary changes, the view +reflects these changes. + +Dictionary views can be iterated over to yield their respective data, and +support membership tests: + +.. 
describe:: len(dictview) + + Return the number of entries in the dictionary. + +.. describe:: iter(dictview) + + Return an iterator over the keys, values or items (represented as tuples of + ``(key, value)``) in the dictionary. + + Keys and values are iterated over in an arbitrary order which is non-random, + varies across Python implementations, and depends on the dictionary's history + of insertions and deletions. If keys, values and items views are iterated + over with no intervening modifications to the dictionary, the order of items + will directly correspond. This allows the creation of ``(value, key)`` pairs + using :func:`zip`: ``pairs = zip(d.values(), d.keys())``. Another way to + create the same list is ``pairs = [(v, k) for (k, v) in d.items()]``. + + Iterating views while adding or deleting entries in the dictionary may raise + a :exc:`RuntimeError` or fail to iterate over all entries. + +.. describe:: x in dictview + + Return ``True`` if *x* is in the underlying dictionary's keys, values or + items (in the latter case, *x* should be a ``(key, value)`` tuple). + + +Keys views are set-like since their entries are unique and hashable. If all +values are hashable, so that ``(key, value)`` pairs are unique and hashable, +then the items view is also set-like. (Values views are not treated as set-like +since the entries are generally not unique.) For set-like views, all of the +operations defined for the abstract base class :class:`collections.Set` are +available (for example, ``==``, ``<``, or ``^``). + +An example of dictionary view usage:: + + >>> dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500} + >>> keys = dishes.keys() + >>> values = dishes.values() + + >>> # iteration + >>> n = 0 + >>> for val in values: + ... 
n += val + >>> print(n) + 504 + + >>> # keys and values are iterated over in the same order + >>> list(keys) + ['eggs', 'bacon', 'sausage', 'spam'] + >>> list(values) + [2, 1, 1, 500] + + >>> # view objects are dynamic and reflect dict changes + >>> del dishes['eggs'] + >>> del dishes['sausage'] + >>> list(keys) + ['spam', 'bacon'] + + >>> # set operations + >>> keys & {'eggs', 'bacon', 'salad'} + {'bacon'} + >>> keys ^ {'sausage', 'juice'} + {'juice', 'sausage', 'bacon', 'spam'} + + .. _typecontextmanager: Context Manager Types diff -r 96c1de5acbd3 -r 0a49f6382467 Doc/library/strings.rst --- a/Doc/library/strings.rst Fri Jan 27 10:53:35 2012 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,27 +0,0 @@ -.. _stringservices: - -*************** -String Services -*************** - -The modules described in this chapter provide a wide range of string -manipulation operations. - -In addition, Python's built-in string classes support the sequence type methods -described in the :ref:`typesseq` section, and also the string-specific methods -described in the :ref:`string-methods` section. To output formatted strings, -see the :ref:`string-formatting` section. Also, see the :mod:`re` module for -string functions based on regular expressions. - - -.. toctree:: - - string.rst - re.rst - struct.rst - difflib.rst - textwrap.rst - codecs.rst - unicodedata.rst - stringprep.rst - diff -r 96c1de5acbd3 -r 0a49f6382467 Doc/library/text.rst --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/Doc/library/text.rst Sun Jan 29 16:55:40 2012 +1000 @@ -0,0 +1,24 @@ +.. _stringservices: +.. _textservices: + +************************ +Text Processing Services +************************ + +The modules described in this chapter provide a wide range of string +manipulation operations and other text processing services. + +The :mod:`codecs` module described under :ref:`binaryservices` is also +highly relevant to text processing. 
In addition, see the documentation for +Python's built-in string type in :ref:`textseq`. + + +.. toctree:: + + string.rst + re.rst + difflib.rst + textwrap.rst + unicodedata.rst + stringprep.rst +