diff --git a/Doc/glossary.rst b/Doc/glossary.rst
--- a/Doc/glossary.rst
+++ b/Doc/glossary.rst
@@ -78,6 +78,13 @@ Glossary
       Benevolent Dictator For Life, a.k.a. `Guido van Rossum
       `_, Python's creator.
 
+   binary file
+      A :term:`file object` able to read and write
+      :term:`bytes-like objects `.
+
+      .. seealso::
+         A :term:`text file` reads and writes :class:`str` objects.
+
    bytes-like object
       An object that supports the :ref:`bufferobjects`, like :class:`bytes`,
       :class:`bytearray` or :class:`memoryview`.  Bytes-like objects can
@@ -225,10 +232,11 @@ Glossary
       etc.).  File objects are also called :dfn:`file-like objects` or
       :dfn:`streams`.
 
-      There are actually three categories of file objects: raw binary files,
-      buffered binary files and text files.  Their interfaces are defined in the
-      :mod:`io` module.  The canonical way to create a file object is by using
-      the :func:`open` function.
+      There are actually three categories of file objects: raw
+      :term:`binary files `, buffered
+      :term:`binary files ` and :term:`text files `.
+      Their interfaces are defined in the :mod:`io` module.  The canonical
+      way to create a file object is by using the :func:`open` function.
 
    file-like object
       A synonym for :term:`file object`.
@@ -780,6 +788,14 @@ Glossary
       :meth:`~collections.somenamedtuple._asdict`. Examples of struct sequences
       include :data:`sys.float_info` and the return value of :func:`os.stat`.
 
+   text file
+      A :term:`file object` able to read and write :class:`str` objects.
+      Often, a text file actually accesses a byte-oriented datastream
+      and handles the text encoding automatically.
+
+      .. seealso::
+         A :term:`binary file` reads and writes :class:`bytes` objects.
+
    triple-quoted string
       A string which is bound by three instances of either a quotation mark
       (") or an apostrophe (').  While they don't provide any functionality
diff --git a/Doc/tutorial/inputoutput.rst b/Doc/tutorial/inputoutput.rst
--- a/Doc/tutorial/inputoutput.rst
+++ b/Doc/tutorial/inputoutput.rst
@@ -377,47 +377,64 @@ File objects have some additional method
 Reference for a complete guide to file objects.
 
 
-.. _tut-pickle:
+.. _tut-json:
 
-The :mod:`pickle` Module
-------------------------
+Saving structured data with :mod:`json`
+---------------------------------------
 
-.. index:: module: pickle
+.. index:: module: json
 
-Strings can easily be written to and read from a file. Numbers take a bit more
+Strings can easily be written to and read from a file.  Numbers take a bit more
 effort, since the :meth:`read` method only returns strings, which will have to
 be passed to a function like :func:`int`, which takes a string like ``'123'``
-and returns its numeric value 123.  However, when you want to save more complex
-data types like lists, dictionaries, or class instances, things get a lot more
-complicated.
+and returns its numeric value 123.  When you want to save more complex data
+types like nested lists and dictionaries, parsing and serializing by hand
+becomes complicated.
 
-Rather than have users be constantly writing and debugging code to save
-complicated data types, Python provides a standard module called :mod:`pickle`.
-This is an amazing module that can take almost any Python object (even some
-forms of Python code!), and convert it to a string representation; this process
-is called :dfn:`pickling`.  Reconstructing the object from the string
-representation is called :dfn:`unpickling`.  Between pickling and unpickling,
-the string representing the object may have been stored in a file or data, or
+Rather than having users constantly writing and debugging code to save
+complicated data types to files, Python allows you to use the popular data
+interchange format called `JSON (JavaScript Object Notation)
+`_.  The standard module called :mod:`json` can take Python
+data hierarchies, and convert them to string representations; this process is
+called :dfn:`serializing`.  Reconstructing the data from the string representation
+is called :dfn:`deserializing`.  Between serializing and deserializing, the
+string representing the object may have been stored in a file or data, or
 sent over a network connection to some distant machine.
 
-If you have an object ``x``, and a file object ``f`` that's been opened for
-writing, the simplest way to pickle the object takes only one line of code::
+.. note::
+   The JSON format is commonly used by modern applications to allow for data
+   exchange.  Many programmers are already familiar with it, which makes
+   it a good choice for interoperability.
 
-   pickle.dump(x, f)
+If you have an object ``x``, you can view its JSON string representation with a
+simple line of code::
 
-To unpickle the object again, if ``f`` is a file object which has been opened
-for reading::
+   >>> json.dumps([1, 'simple', 'list'])
+   '[1, "simple", "list"]'
 
-   x = pickle.load(f)
+Another variant of the :func:`~json.dumps` function, called :func:`~json.dump`,
+simply serializes the object to a :term:`text file`.  So if ``f`` is a
+:term:`text file` object opened for writing, we can do this::
 
-(There are other variants of this, used when pickling many objects or when you
-don't want to write the pickled data to a file; consult the complete
-documentation for :mod:`pickle` in the Python Library Reference.)
+   json.dump(x, f)
 
-:mod:`pickle` is the standard way to make Python objects which can be stored and
-reused by other programs or by a future invocation of the same program; the
-technical term for this is a :dfn:`persistent` object.  Because :mod:`pickle` is
-so widely used, many authors who write Python extensions take care to ensure
-that new data types such as matrices can be properly pickled and unpickled.
+To decode the object again, if ``f`` is a :term:`text file` object which has
+been opened for reading::
 
+   x = json.load(f)
 
+This simple serialization technique can handle lists and dictionaries, but
+serializing arbitrary class instances in JSON requires a bit of extra effort.
+The reference for the :mod:`json` module contains an explanation of this.
+
+.. seealso::
+
+   :mod:`pickle` - the pickle module
+
+   Contrary to :ref:`JSON `, *pickle* is a protocol which allows
+   the serialization of arbitrarily complex Python objects.  As such, it is
+   specific to Python and cannot be used to communicate with applications
+   written in other languages.  It is also insecure by default:
+   deserializing pickle data coming from an untrusted source can execute
+   arbitrary code, if the data was crafted by a skilled attacker.
+
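Reviewer's illustration, not part of the patch: the round trip described in the new tutorial text can be sketched as follows. This is a minimal example under stated assumptions. ``io.StringIO`` stands in for a text file opened with :func:`open`, and the names ``data``, ``f`` and ``restored`` are hypothetical, chosen only for this sketch.

```python
import io
import json

# Hypothetical sample data: nested lists and dictionaries, the kind of
# structure the tutorial text says json handles without extra effort.
data = {"name": "example", "values": [1, "simple", "list"], "nested": {"ok": True}}

# io.StringIO is an in-memory text file; a file from open(path, "w")
# would be used the same way.
f = io.StringIO()
json.dump(data, f)          # serialize the object into the text file

f.seek(0)                   # rewind so the same object can be read back
restored = json.load(f)     # deserialize from the text file

# json.dumps returns the string representation instead of writing it.
as_text = json.dumps([1, "simple", "list"])
print(as_text)
print(restored == data)
```

The comparison at the end only holds for data made of JSON-representable types (dictionaries with string keys, lists, strings, numbers, booleans and ``None``); tuples, for instance, come back as lists.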