symbian-qemu-0.9.1-12/python-2.6.1/Doc/library/pickle.rst
changeset 1 2fb8b9db1c86
0:ffa851df0825 1:2fb8b9db1c86
       
     1 :mod:`pickle` --- Python object serialization
       
     2 =============================================
       
     3 
       
     4 .. index::
       
     5    single: persistence
       
     6    pair: persistent; objects
       
     7    pair: serializing; objects
       
     8    pair: marshalling; objects
       
     9    pair: flattening; objects
       
    10    pair: pickling; objects
       
    11 
       
    12 .. module:: pickle
       
    13    :synopsis: Convert Python objects to streams of bytes and back.
       
    14 .. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>
       
    15 .. sectionauthor:: Barry Warsaw <barry@zope.com>
       
    16 
       
    17 The :mod:`pickle` module implements a fundamental, but powerful algorithm for
       
    18 serializing and de-serializing a Python object structure.  "Pickling" is the
       
    19 process whereby a Python object hierarchy is converted into a byte stream, and
       
    20 "unpickling" is the inverse operation, whereby a byte stream is converted back
       
    21 into an object hierarchy.  Pickling (and unpickling) is alternatively known as
       
    22 "serialization", "marshalling," [#]_ or "flattening"; however, to avoid
       
    23 confusion, the terms used here are "pickling" and "unpickling".
       
    24 
       
    25 This documentation describes both the :mod:`pickle` module and the
       
    26 :mod:`cPickle` module.
       
    27 
       
    28 
       
    29 Relationship to other Python modules
       
    30 ------------------------------------
       
    31 
       
    32 The :mod:`pickle` module has an optimized cousin called the :mod:`cPickle`
       
    33 module.  As its name implies, :mod:`cPickle` is written in C, so it can be up to
       
    34 1000 times faster than :mod:`pickle`.  However, it does not support subclassing
       
    35 of the :func:`Pickler` and :func:`Unpickler` classes, because in :mod:`cPickle`
       
    36 these are functions, not classes.  Most applications have no need for this
       
    37 functionality, and can benefit from the improved performance of :mod:`cPickle`.
       
    38 Other than that, the interfaces of the two modules are nearly identical; the
       
    39 common interface is described in this manual and differences are pointed out
       
    40 where necessary.  In the following discussions, we use the term "pickle" to
       
    41 collectively describe the :mod:`pickle` and :mod:`cPickle` modules.
       
    42 
       
    43 The data streams the two modules produce are guaranteed to be interchangeable.
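Because the two modules share an interface, a program can try the C
implementation first and fall back to the pure-Python one.  The following is
only a sketch of that idiom: the ``except ImportError`` branch is an assumption
that covers interpreters which do not ship :mod:`cPickle`, and ``record`` is a
made-up value.

```python
# Prefer the C implementation when it exists; otherwise use the
# pure-Python module.  Both are bound to the same local name.
try:
    import cPickle as pickle
except ImportError:
    import pickle

record = {'name': 'spam', 'counts': [1, 2, 3]}  # illustrative data only

# The protocol is passed explicitly so the result does not depend on
# the interpreter's default.
payload = pickle.dumps(record, 2)
restored = pickle.loads(payload)
```

The rest of the program then uses ``pickle.dumps`` and ``pickle.loads`` without
caring which implementation was imported.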
       
    44 
       
    45 Python has a more primitive serialization module called :mod:`marshal`, but in
       
    46 general :mod:`pickle` should always be the preferred way to serialize Python
       
    47 objects.  :mod:`marshal` exists primarily to support Python's :file:`.pyc`
       
    48 files.
       
    49 
       
    50 The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:
       
    51 
       
    52 * The :mod:`pickle` module keeps track of the objects it has already serialized,
       
    53   so that later references to the same object won't be serialized again.
       
    54   :mod:`marshal` doesn't do this.
       
    55 
       
    56   This has implications both for recursive objects and object sharing.  Recursive
       
    57   objects are objects that contain references to themselves.  These are not
       
    58   handled by marshal, and in fact, attempting to marshal recursive objects will
       
    59   crash your Python interpreter.  Object sharing happens when there are multiple
       
    60   references to the same object in different places in the object hierarchy being
       
    61   serialized.  :mod:`pickle` stores such objects only once, and ensures that all
       
    62   other references point to the master copy.  Shared objects remain shared, which
       
    63   can be very important for mutable objects.
       
    64 
       
    65 * :mod:`marshal` cannot be used to serialize user-defined classes and their
       
    66   instances.  :mod:`pickle` can save and restore class instances transparently;
       
    67   however, the class definition must be importable and live in the same module as
       
    68   when the object was stored.
       
    69 
       
    70 * The :mod:`marshal` serialization format is not guaranteed to be portable
       
    71   across Python versions.  Because its primary job in life is to support
       
    72   :file:`.pyc` files, the Python implementers reserve the right to change the
       
    73   serialization format in non-backwards compatible ways should the need arise.
       
    74   The :mod:`pickle` serialization format is guaranteed to be backwards compatible
       
    75   across Python releases.
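The first point in the list above, object sharing and recursion, can be
checked with a short experiment.  This is a minimal sketch; the variable names
are invented for illustration.

```python
import pickle

# One mutable list referenced from two places in the same hierarchy.
shared = ['mutable', 'list']
container = {'first': shared, 'second': shared}

# A recursive object: the list contains a reference to itself.
recursive = [1, 2]
recursive.append(recursive)

restored = pickle.loads(pickle.dumps(container, 0))
restored_recursive = pickle.loads(pickle.dumps(recursive, 0))

# Sharing survives the round trip: both keys refer to one list, so a
# change made through one reference is visible through the other.
restored['first'].append('changed')
still_shared = restored['first'] is restored['second']

# The self-reference survives as well.
points_to_itself = restored_recursive[2] is restored_recursive
```

Both ``still_shared`` and ``points_to_itself`` are expected to be true, while
``restored['first']`` is a new list, distinct from ``shared``.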
       
    76 
       
    77 .. warning::
       
    78 
       
    79    The :mod:`pickle` module is not intended to be secure against erroneous or
       
    80    maliciously constructed data.  Never unpickle data received from an untrusted or
       
    81    unauthenticated source.
       
    82 
       
    83 Note that serialization is a more primitive notion than persistence; although
       
    84 :mod:`pickle` reads and writes file objects, it does not handle the issue of
       
    85 naming persistent objects, nor the (even more complicated) issue of concurrent
       
    86 access to persistent objects.  The :mod:`pickle` module can transform a complex
       
    87 object into a byte stream and it can transform the byte stream into an object
       
    88 with the same internal structure.  Perhaps the most obvious thing to do with
       
    89 these byte streams is to write them onto a file, but it is also conceivable to
       
    90 send them across a network or store them in a database.  The module
       
    91 :mod:`shelve` provides a simple interface to pickle and unpickle objects on
       
    92 DBM-style database files.
       
    93 
       
    94 
       
    95 Data stream format
       
    96 ------------------
       
    97 
       
    98 .. index::
       
    99    single: XDR
       
   100    single: External Data Representation
       
   101 
       
   102 The data format used by :mod:`pickle` is Python-specific.  This has the
       
   103 advantage that there are no restrictions imposed by external standards such as
       
   104 XDR (which can't represent pointer sharing); however, it means that non-Python
       
   105 programs may not be able to reconstruct pickled Python objects.
       
   106 
       
   107 By default, the :mod:`pickle` data format uses a printable ASCII representation.
       
   108 This is slightly more voluminous than a binary representation.  The big
       
   109 advantage of using printable ASCII (and of some other characteristics of
       
   110 :mod:`pickle`'s representation) is that for debugging or recovery purposes it is
       
   111 possible for a human to read the pickled file with a standard text editor.
       
   112 
       
   113 There are currently 3 different protocols which can be used for pickling.
       
   114 
       
   115 * Protocol version 0 is the original ASCII protocol and is backwards compatible
       
   116   with earlier versions of Python.
       
   117 
       
   118 * Protocol version 1 is the old binary format which is also compatible with
       
   119   earlier versions of Python.
       
   120 
       
   121 * Protocol version 2 was introduced in Python 2.3.  It provides much more
       
   122   efficient pickling of :term:`new-style class`\es.
       
   123 
       
   124 Refer to :pep:`307` for more information.
       
   125 
       
   126 If a *protocol* is not specified, protocol 0 is used. If *protocol* is specified
       
   127 as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol version
       
   128 available will be used.
       
   129 
       
   130 .. versionchanged:: 2.3
       
   131    Introduced the *protocol* parameter.
       
   132 
       
   133 A binary format, which is slightly more efficient, can be chosen by specifying a
       
   134 *protocol* version >= 1.
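As a rough illustration of the protocol choices described above, the sketch
below pickles the same value three ways.  The protocol is always passed
explicitly, since the default may not be the same on every interpreter; the
sample data is made up.

```python
import pickle

data = {'spam': [1, 2, 3], 'eggs': ('a', 'b')}  # illustrative data only

ascii_pickle = pickle.dumps(data, 0)     # original ASCII protocol
binary_pickle = pickle.dumps(data, 2)    # binary protocol 2
highest_pickle = pickle.dumps(data, -1)  # negative: highest available

# bytearray() yields integers regardless of the string type returned
# by dumps(), so the check below works on the raw byte values.
is_ascii = all(byte < 128 for byte in bytearray(ascii_pickle))

# No protocol argument is needed when loading; the format is detected.
results = [pickle.loads(p)
           for p in (ascii_pickle, binary_pickle, highest_pickle)]
```

A negative value and :const:`HIGHEST_PROTOCOL` select the same format, so
``pickle.dumps(data, pickle.HIGHEST_PROTOCOL)`` should produce the same stream
as ``highest_pickle``.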
       
   135 
       
   136 
       
   137 Usage
       
   138 -----
       
   139 
       
   140 To serialize an object hierarchy, you first create a pickler, then you call the
       
   141 pickler's :meth:`dump` method.  To de-serialize a data stream, you first create
       
   142 an unpickler, then you call the unpickler's :meth:`load` method.  The
       
   143 :mod:`pickle` module provides the following constant:
       
   144 
       
   145 
       
   146 .. data:: HIGHEST_PROTOCOL
       
   147 
       
   148    The highest protocol version available.  This value can be passed as a
       
   149    *protocol* value.
       
   150 
       
   151    .. versionadded:: 2.3
       
   152 
       
   153 .. note::
       
   154 
       
   155    Be sure to always open pickle files created with protocols >= 1 in binary mode.
       
   156    For the old ASCII-based pickle protocol 0 you can use either text mode or binary
       
   157    mode as long as you stay consistent.
       
   158 
       
   159    A pickle file written with protocol 0 in binary mode will contain lone linefeeds
       
   160    as line terminators and therefore will look "funny" when viewed in Notepad or
       
   161    other editors which do not support this format.
       
   162 
       
   163 The :mod:`pickle` module provides the following functions to make the pickling
       
   164 process more convenient:
       
   165 
       
   166 
       
   167 .. function:: dump(obj, file[, protocol])
       
   168 
       
   169    Write a pickled representation of *obj* to the open file object *file*.  This is
       
   170    equivalent to ``Pickler(file, protocol).dump(obj)``.
       
   171 
       
   172    If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
       
   173    specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol
       
   174    version will be used.
       
   175 
       
   176    .. versionchanged:: 2.3
       
   177       Introduced the *protocol* parameter.
       
   178 
       
   179    *file* must have a :meth:`write` method that accepts a single string argument.
       
   180    It can thus be a file object opened for writing, a :mod:`StringIO` object, or
       
   181    any other custom object that meets this interface.
       
   182 
       
   183 
       
   184 .. function:: load(file)
       
   185 
       
   186    Read a string from the open file object *file* and interpret it as a pickle data
       
   187    stream, reconstructing and returning the original object hierarchy.  This is
       
   188    equivalent to ``Unpickler(file).load()``.
       
   189 
       
   190    *file* must have two methods, a :meth:`read` method that takes an integer
       
   191    argument, and a :meth:`readline` method that requires no arguments.  Both
       
   192    methods should return a string.  Thus *file* can be a file object opened for
       
   193    reading, a :mod:`StringIO` object, or any other custom object that meets this
       
   194    interface.
       
   195 
       
   196    This function automatically determines whether the data stream was written in
       
   197    binary mode or not.
       
   198 
       
   199 
       
   200 .. function:: dumps(obj[, protocol])
       
   201 
       
   202    Return the pickled representation of the object as a string, instead of writing
       
   203    it to a file.
       
   204 
       
   205    If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
       
   206    specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol
       
   207    version will be used.
       
   208 
       
   209    .. versionchanged:: 2.3
       
   210       The *protocol* parameter was added.
       
   211 
       
   212 
       
   213 .. function:: loads(string)
       
   214 
       
   215    Read a pickled object hierarchy from a string.  Characters in the string past
       
   216    the pickled object's representation are ignored.
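The four convenience functions can be exercised as follows.  This is a sketch
under a few assumptions: ``io.BytesIO`` stands in for a file opened in binary
mode, and the names ``inventory`` and ``stream`` are invented for the example.

```python
import io
import pickle

inventory = {'apples': 3, 'pears': [1, 2]}  # illustrative data only

# dump() needs an object with a write() method; load() needs read()
# and readline().  An in-memory binary stream provides all three.
stream = io.BytesIO()
pickle.dump(inventory, stream, 2)

stream.seek(0)                      # rewind before reading back
from_file = pickle.load(stream)

# dumps() and loads() do the same without a file object.
as_string = pickle.dumps(inventory, 2)
from_string = pickle.loads(as_string)

# Anything after the end of the pickled representation is ignored.
with_trailing = pickle.loads(as_string + b'trailing junk')
```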
       
   217 
       
   218 The :mod:`pickle` module also defines three exceptions:
       
   219 
       
   220 
       
   221 .. exception:: PickleError
       
   222 
       
   223    A common base class for the other exceptions defined below.  This inherits from
       
   224    :exc:`Exception`.
       
   225 
       
   226 
       
   227 .. exception:: PicklingError
       
   228 
       
   229    This exception is raised when an unpicklable object is passed to the
       
   230    :meth:`dump` method.
       
   231 
       
   232 
       
   233 .. exception:: UnpicklingError
       
   234 
       
   235    This exception is raised when there is a problem unpickling an object. Note that
       
   236    other exceptions may also be raised during unpickling, including (but not
       
   237    necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
       
   238    :exc:`ImportError`, and :exc:`IndexError`.
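Since unpickling can fail with several exception types, callers that handle
damaged input usually catch more than :exc:`UnpicklingError`.  The sketch below
simulates a truncated stream by dropping the final opcode; which exception is
raised for a given kind of damage is not specified here, so the handler lists
all the types named above.

```python
import pickle

payload = pickle.dumps([1, 2, 3], 2)

# Drop the last byte (the end-of-pickle marker) to simulate a stream
# that was cut short.
truncated = payload[:-1]

caught = None
try:
    pickle.loads(truncated)
except (pickle.UnpicklingError, AttributeError, EOFError,
        ImportError, IndexError) as exc:
    # Any of these may signal a damaged or incomplete pickle.
    caught = exc
```

Both :exc:`PicklingError` and :exc:`UnpicklingError` derive from
:exc:`PickleError`, so ``except pickle.PickleError`` catches either of them,
though not the other exception types listed.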
       
   239 
       
   240 The :mod:`pickle` module also exports two callables [#]_, :class:`Pickler` and
       
   241 :class:`Unpickler`:
       
   242 
       
   243 
       
   244 .. class:: Pickler(file[, protocol])
       
   245 
       
   246    This takes a file-like object to which it will write a pickle data stream.
       
   247 
       
   248    If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
       
   249    specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
       
   250    protocol version will be used.
       
   251 
       
   252    .. versionchanged:: 2.3
       
   253       Introduced the *protocol* parameter.
       
   254 
       
   255    *file* must have a :meth:`write` method that accepts a single string argument.
       
   256    It can thus be an open file object, a :mod:`StringIO` object, or any other
       
   257    custom object that meets this interface.
       
   258 
       
   259    :class:`Pickler` objects define one (or two) public methods:
       
   260 
       
   261 
       
   262    .. method:: dump(obj)
       
   263 
       
   264       Write a pickled representation of *obj* to the open file object given in the
       
   265       constructor.  Either the binary or ASCII format will be used, depending on the
       
   266       value of the *protocol* argument passed to the constructor.
       
   267 
       
   268 
       
   269    .. method:: clear_memo()
       
   270 
       
   271       Clears the pickler's "memo".  The memo is the data structure that remembers
       
   272       which objects the pickler has already seen, so that shared or recursive objects
       
   273       are pickled by reference and not by value.  This method is useful when re-using
       
   274       picklers.
       
   275 
       
   276       .. note::
       
   277 
       
   278          Prior to Python 2.3, :meth:`clear_memo` was only available on the picklers
       
   279          created by :mod:`cPickle`.  In the :mod:`pickle` module, picklers have an
       
   280          instance variable called :attr:`memo` which is a Python dictionary.  So to clear
       
   281          the memo for a :mod:`pickle` module pickler, you could do the following::
       
   282 
       
   283             mypickler.memo.clear()
       
   284 
       
   285          Code that does not need to support older versions of Python should simply use
       
   286          :meth:`clear_memo`.
       
   287 
       
   288 It is possible to make multiple calls to the :meth:`dump` method of the same
       
   289 :class:`Pickler` instance.  These must then be matched to the same number of
       
   290 calls to the :meth:`load` method of the corresponding :class:`Unpickler`
       
   291 instance.  If the same object is pickled by multiple :meth:`dump` calls, the
       
   292 :meth:`load` calls will all yield references to the same object. [#]_
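The effect of the memo on repeated :meth:`dump` calls, and of
:meth:`clear_memo`, can be observed with one pickler and one unpickler sharing
a stream.  This is a minimal sketch; ``io.BytesIO`` is used as an assumed
stand-in for a binary file, and ``settings`` is an invented value.

```python
import io
import pickle

stream = io.BytesIO()
pickler = pickle.Pickler(stream, 2)

settings = {'debug': True}  # illustrative data only
pickler.dump(settings)      # full representation
pickler.dump(settings)      # same object again: written as a memo reference
pickler.clear_memo()
pickler.dump(settings)      # memo forgotten: full representation again

# Three dump() calls are matched by three load() calls on one unpickler.
stream.seek(0)
unpickler = pickle.Unpickler(stream)
first = unpickler.load()
second = unpickler.load()
third = unpickler.load()
```

Here ``first`` and ``second`` should be the same object, while ``third`` is an
equal but separate copy because the memo was cleared before it was written.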
       
   293 
       
   294 :class:`Unpickler` objects are defined as:
       
   295 
       
   296 
       
   297 .. class:: Unpickler(file)
       
   298 
       
   299    This takes a file-like object from which it will read a pickle data stream.
       
   300    This class automatically determines whether the data stream was written in
       
   301    binary mode or not, so it does not need a flag as in the :class:`Pickler`
       
   302    factory.
       
   303 
       
   304    *file* must have two methods, a :meth:`read` method that takes an integer
       
   305    argument, and a :meth:`readline` method that requires no arguments.  Both
       
   306    methods should return a string.  Thus *file* can be a file object opened for
       
   307    reading, a :mod:`StringIO` object, or any other custom object that meets this
       
   308    interface.
       
   309 
       
   310    :class:`Unpickler` objects have one (or two) public methods:
       
   311 
       
   312 
       
   313    .. method:: load()
       
   314 
       
   315       Read a pickled object representation from the open file object given in
       
   316       the constructor, and return the reconstituted object hierarchy specified
       
   317       therein.
       
   318 
       
   319       This method automatically determines whether the data stream was written
       
   320       in binary mode or not.
       
   321 
       
   322 
       
   323    .. method:: noload()
       
   324 
       
   325       This is just like :meth:`load` except that it doesn't actually create any
       
   326       objects.  This is useful primarily for finding what's called "persistent
       
   327       ids" that may be referenced in a pickle data stream.  See section
       
   328       :ref:`pickle-protocol` below for more details.
       
   329 
       
   330       **Note:** the :meth:`noload` method is currently only available on
       
   331       :class:`Unpickler` objects created with the :mod:`cPickle` module.
       
   332       :mod:`pickle` module :class:`Unpickler`\ s do not have the :meth:`noload`
       
   333       method.
       
   334 
       
   335 
       
   336 What can be pickled and unpickled?
       
   337 ----------------------------------
       
   338 
       
   339 The following types can be pickled:
       
   340 
       
   341 * ``None``, ``True``, and ``False``
       
   342 
       
   343 * integers, long integers, floating point numbers, complex numbers
       
   344 
       
   345 * normal and Unicode strings
       
   346 
       
   347 * tuples, lists, sets, and dictionaries containing only picklable objects
       
   348 
       
   349 * functions defined at the top level of a module
       
   350 
       
   351 * built-in functions defined at the top level of a module
       
   352 
       
   353 * classes that are defined at the top level of a module
       
   354 
       
   355 * instances of such classes whose :attr:`__dict__` or the result of calling
       
   356   :meth:`__getstate__` is picklable (see section :ref:`pickle-protocol` for details)
       
   357 
       
   358 Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
       
   359 exception; when this happens, an unspecified number of bytes may have already
       
   360 been written to the underlying file. Trying to pickle a highly recursive data
       
   361 structure may exceed the maximum recursion depth; a :exc:`RuntimeError` will be
       
   362 raised in this case. You can carefully raise this limit with
       
   363 :func:`sys.setrecursionlimit`.
       
   364 
       
   365 Note that functions (built-in and user-defined) are pickled by "fully qualified"
       
   366 name reference, not by value.  This means that only the function name is
       
   367 pickled, along with the name of the module the function is defined in.  Neither the
       
   368 function's code, nor any of its function attributes are pickled.  Thus the
       
   369 defining module must be importable in the unpickling environment, and the module
       
   370 must contain the named object, otherwise an exception will be raised. [#]_
       
   371 
       
   372 Similarly, classes are pickled by named reference, so the same restrictions in
       
   373 the unpickling environment apply.  Note that none of the class's code or data is
       
   374 pickled, so in the following example the class attribute ``attr`` is not
       
   375 restored in the unpickling environment::
       
   376 
       
   377    class Foo:
       
   378        attr = 'a class attr'
       
   379 
       
   380    picklestring = pickle.dumps(Foo)
       
   381 
       
   382 These restrictions are why picklable functions and classes must be defined in
       
   383 the top level of a module.
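Pickling by name reference can be seen by pickling functions that already
live at the top level of an importable module.  The sketch below uses ``len``
and ``os.path.join`` purely as convenient examples.

```python
import os.path
import pickle

# A built-in function and a module-level library function.
len_pickle = pickle.dumps(len, 0)
join_pickle = pickle.dumps(os.path.join, 0)

# The stream records the function's name rather than its code.
name_is_stored = b'join' in join_pickle

# Unpickling imports the defining module and looks the name up again,
# so the result is the very same function object.
same_len = pickle.loads(len_pickle) is len
same_join = pickle.loads(join_pickle) is os.path.join
```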
       
   384 
       
   385 Similarly, when class instances are pickled, their class's code and data are not
       
   386 pickled along with them.  Only the instance data are pickled.  This is done on
       
   387 purpose, so you can fix bugs in a class or add methods to the class and still
       
   388 load objects that were created with an earlier version of the class.  If you
       
   389 plan to have long-lived objects that will see many versions of a class, it may
       
   390 be worthwhile to put a version number in the objects so that suitable
       
   391 conversions can be made by the class's :meth:`__setstate__` method.
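One way to carry a version number, sketched under stated assumptions: the
``Account`` class, its ``VERSION`` attribute, the ``'version'`` key and the
version 1 layout are all hypothetical and only illustrate the idea.

```python
import pickle

class Account(object):
    """Hypothetical class whose stored layout changed between versions."""

    VERSION = 2

    def __init__(self, owner, balance_cents):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        # Store the instance data together with a version marker.
        state = self.__dict__.copy()
        state['version'] = self.VERSION
        return state

    def __setstate__(self, state):
        # State without a marker is treated as version 1.
        version = state.pop('version', 1)
        if version < 2:
            # Assumed old layout: a float 'balance' in whole units.
            state['balance_cents'] = int(round(state.pop('balance') * 100))
        self.__dict__.update(state)

current = pickle.loads(pickle.dumps(Account('ann', 1250), 2))

# Simulate restoring state that was saved by the older layout.
old = Account.__new__(Account)
old.__setstate__({'owner': 'bob', 'balance': 12.5})
```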
       
   392 
       
   393 
       
   394 .. _pickle-protocol:
       
   395 
       
   396 The pickle protocol
       
   397 -------------------
       
   398 
       
   399 .. currentmodule:: None
       
   400 
       
   401 This section describes the "pickling protocol" that defines the interface
       
   402 between the pickler/unpickler and the objects that are being serialized.  This
       
   403 protocol provides a standard way for you to define, customize, and control how
       
   404 your objects are serialized and de-serialized.  The description in this section
       
   405 doesn't cover specific customizations that you can employ to make the unpickling
       
   406 environment slightly safer from untrusted pickle data streams; see section
       
   407 :ref:`pickle-sub` for more details.
       
   408 
       
   409 
       
   410 .. _pickle-inst:
       
   411 
       
   412 Pickling and unpickling normal class instances
       
   413 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       
   414 
       
   415 .. method:: object.__getinitargs__()
       
   416    
       
   417    When a pickled class instance is unpickled, its :meth:`__init__` method is
       
   418    normally *not* invoked.  If it is desirable that the :meth:`__init__` method
       
   419    be called on unpickling, an old-style class can define a method
       
   420    :meth:`__getinitargs__`, which should return a *tuple* containing the
       
   421    arguments to be passed to the class constructor (:meth:`__init__` for
       
   422    example).  The :meth:`__getinitargs__` method is called at pickle time; the
       
   423    tuple it returns is incorporated in the pickle for the instance.
       
   424 
       
   425 .. method:: object.__getnewargs__()
       
   426 
       
   427    New-style types can provide a :meth:`__getnewargs__` method that is used for
       
   428    protocol 2.  Implementing this method is needed if the type establishes some
       
   429    internal invariants when the instance is created, or if the memory allocation
       
   430    is affected by the values passed to the :meth:`__new__` method for the type
       
   431    (as it is for tuples and strings).  Instances of a :term:`new-style class`
       
   432    ``C`` are created using ::
       
   433     
       
   434       obj = C.__new__(C, *args)
       
   435     
       
   436    where *args* is the result of calling :meth:`__getnewargs__` on the original
       
   437    object; if there is no :meth:`__getnewargs__`, an empty tuple is assumed.
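A small sketch of :meth:`__getnewargs__` for a type whose contents are fixed
in :meth:`__new__`.  The ``Point`` class is hypothetical.

```python
import pickle

class Point(tuple):
    """Hypothetical immutable type; its values are set in __new__."""

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    def __getnewargs__(self):
        # The arguments Point.__new__ needs to rebuild this instance.
        return (self[0], self[1])

original = Point(3, 4)

# __getnewargs__ is used with protocol 2, so request it explicitly.
restored = pickle.loads(pickle.dumps(original, 2))
```

Without :meth:`__getnewargs__`, unpickling would call ``Point.__new__(Point)``
with no further arguments, which this class does not accept.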
       
   438 
       
   439 .. method:: object.__getstate__()
       
   440    
       
   441    Classes can further influence how their instances are pickled; if the class
       
   442    defines the method :meth:`__getstate__`, it is called and the return state is
       
   443    pickled as the contents for the instance, instead of the contents of the
       
   444    instance's dictionary.  If there is no :meth:`__getstate__` method, the
       
   445    instance's :attr:`__dict__` is pickled.
       
   446 
       
   447 .. method:: object.__setstate__(state)
       
   448    
       
   449    Upon unpickling, if the class also defines the method :meth:`__setstate__`,
       
   450    it is called with the unpickled state. [#]_ If there is no
       
   451    :meth:`__setstate__` method, the pickled state must be a dictionary and its
       
   452    items are assigned to the new instance's dictionary.  If a class defines both
       
   453    :meth:`__getstate__` and :meth:`__setstate__`, the state object needn't be a
       
   454    dictionary and these methods can do what they want. [#]_
       
   455     
       
   456    .. warning::
       
   457     
       
   458       For :term:`new-style class`\es, if :meth:`__getstate__` returns a false
       
   459       value, the :meth:`__setstate__` method will not be called.
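The following sketch shows :meth:`__getstate__` and :meth:`__setstate__`
working together to leave derived data out of the pickle, and also that
:meth:`__init__` is not called on unpickling.  The ``LineCounter`` class and
the ``init_calls`` list are hypothetical.

```python
import pickle

init_calls = []  # records every __init__ call, for demonstration only

class LineCounter(object):
    """Hypothetical class with a cache that is not worth pickling."""

    def __init__(self, lines):
        init_calls.append(1)
        self.lines = list(lines)
        self.cache = {}

    def __getstate__(self):
        # Pickle a copy of the instance dictionary without the cache.
        state = self.__dict__.copy()
        del state['cache']
        return state

    def __setstate__(self, state):
        # Restore the saved attributes and start with an empty cache.
        self.__dict__.update(state)
        self.cache = {}

original = LineCounter(['a', 'b'])
original.cache['total'] = 2

restored = pickle.loads(pickle.dumps(original, 2))
```

After the round trip ``restored.lines`` matches the original, ``restored.cache``
is empty, and ``init_calls`` still has a single entry.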
       
   460 
       
   461 
       
   462 Pickling and unpickling extension types
       
   463 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       
   464 
       
   465 .. method:: object.__reduce__()
       
   466    
       
   467    When the :class:`Pickler` encounters an object of a type it knows nothing
       
   468    about --- such as an extension type --- it looks in two places for a hint of
       
   469    how to pickle it.  One alternative is for the object to implement a
       
   470    :meth:`__reduce__` method.  If provided, at pickling time :meth:`__reduce__`
       
   471    will be called with no arguments, and it must return either a string or a
       
   472    tuple.
       
   473 
       
   474    If a string is returned, it names a global variable whose contents are
       
   475    pickled as normal.  The string returned by :meth:`__reduce__` should be the
       
   476    object's local name relative to its module; the pickle module searches the
       
   477    module namespace to determine the object's module.
       
   478 
       
   479    When a tuple is returned, it must be between two and five elements long.
       
   480    Optional elements can either be omitted, or ``None`` can be provided as their
       
   481    value.  The contents of this tuple are pickled as normal and used to
       
   482    reconstruct the object at unpickling time.  The semantics of each element
       
   483    are:
       
   484 
       
   485    * A callable object that will be called to create the initial version of the
       
   486      object.  The next element of the tuple will provide arguments for this
       
   487      callable, and later elements provide additional state information that will
       
   488      subsequently be used to fully reconstruct the pickled data.
       
   489 
       
   490      In the unpickling environment this object must be either a class, a
       
   491      callable registered as a "safe constructor" (see below), or it must have an
       
   492      attribute :attr:`__safe_for_unpickling__` with a true value. Otherwise, an
       
   493      :exc:`UnpicklingError` will be raised in the unpickling environment.  Note
       
   494      that as usual, the callable itself is pickled by name.
       
   495 
       
   496    * A tuple of arguments for the callable object.
       
   497 
       
   498      .. versionchanged:: 2.5
       
   499         Formerly, this argument could also be ``None``.
       
   500 
       
   501    * Optionally, the object's state, which will be passed to the object's
       
   502      :meth:`__setstate__` method as described in section :ref:`pickle-inst`.  If
       
   503      the object has no :meth:`__setstate__` method, then, as above, the value
       
   504      must be a dictionary and it will be added to the object's :attr:`__dict__`.
       
   505 
       
   506    * Optionally, an iterator (and not a sequence) yielding successive list
       
   507      items.  These list items will be pickled, and appended to the object using
       
   508      either ``obj.append(item)`` or ``obj.extend(list_of_items)``.  This is
       
   509      primarily used for list subclasses, but may be used by other classes as
       
   510      long as they have :meth:`append` and :meth:`extend` methods with the
       
   511      appropriate signature.  (Whether :meth:`append` or :meth:`extend` is used
       
   512      depends on which pickle protocol version is used as well as the number of
       
   513      items to append, so both must be supported.)
       
   514 
       
   515    * Optionally, an iterator (not a sequence) yielding successive dictionary
       
   516      items, which should be tuples of the form ``(key, value)``.  These items
       
   517      will be pickled and stored to the object using ``obj[key] = value``. This
       
   518      is primarily used for dictionary subclasses, but may be used by other
       
   519      classes as long as they implement :meth:`__setitem__`.
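A minimal sketch of a :meth:`__reduce__` method that returns the first three
tuple elements described above: a callable, its arguments, and a state
dictionary.  The ``Interval`` class is hypothetical.

```python
import pickle

class Interval(object):
    """Hypothetical class that describes its own reconstruction."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.label = None

    def __reduce__(self):
        # (callable, argument tuple, state).  The class has no
        # __setstate__, so the state must be a dictionary; it is merged
        # into the new instance's __dict__.
        return (Interval, (self.low, self.high), {'label': self.label})

original = Interval(1, 5)
original.label = 'working hours'

restored = pickle.loads(pickle.dumps(original, 2))
```

On unpickling, ``Interval(1, 5)`` is called first and the state dictionary is
applied afterwards, so ``restored.label`` ends up as ``'working hours'``.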
       
   520 
       
   521 .. method:: object.__reduce_ex__(protocol) 
       
   522 
       
   523    It is sometimes useful to know the protocol version when implementing
       
   524    :meth:`__reduce__`.  This can be done by implementing a method named
       
   525    :meth:`__reduce_ex__` instead of :meth:`__reduce__`. :meth:`__reduce_ex__`,
       
   526    when it exists, is called in preference over :meth:`__reduce__` (you may
       
   527    still provide :meth:`__reduce__` for backwards compatibility).  The
       
   528    :meth:`__reduce_ex__` method will be called with a single integer argument,
       
   529    the protocol version.
       
   530 
       
   531    The :class:`object` class implements both :meth:`__reduce__` and
       
   532    :meth:`__reduce_ex__`; however, if a subclass overrides :meth:`__reduce__`
       
   533    but not :meth:`__reduce_ex__`, the :meth:`__reduce_ex__` implementation
       
   534    detects this and calls :meth:`__reduce__`.
       
   535 
       
   536 An alternative to implementing a :meth:`__reduce__` method on the object to be
       
   537 pickled is to register the callable with the :mod:`copy_reg` module.  This
       
   538 module provides a way for programs to register "reduction functions" and
       
   539 constructors for user-defined types.   Reduction functions have the same
       
   540 semantics and interface as the :meth:`__reduce__` method described above, except
       
   541 that they are called with a single argument, the object to be pickled.
       
   542 
       
   543 The registered constructor is deemed a "safe constructor" for purposes of
       
   544 unpickling as described above.
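Registering a reduction function might look like the sketch below.  The
``Celsius`` class, ``reduce_celsius`` and the ``calls`` list are hypothetical,
and the fallback import of ``copyreg`` is an assumption for interpreters where
the module goes by that name.

```python
import pickle

try:
    import copy_reg
except ImportError:
    import copyreg as copy_reg  # assumed alternative module name

class Celsius(object):
    """Hypothetical class with no pickling methods of its own."""

    def __init__(self, degrees):
        self.degrees = degrees

calls = []  # records reducer invocations, for demonstration only

def reduce_celsius(obj):
    # Same return convention as __reduce__, but the object to be
    # pickled arrives as the single argument.
    calls.append(obj)
    return (Celsius, (obj.degrees,))

# Associate the reduction function with the type.
copy_reg.pickle(Celsius, reduce_celsius)

restored = pickle.loads(pickle.dumps(Celsius(21.5), 2))
```

This keeps the pickling logic outside the class, which can help when the class
itself cannot be modified.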
       
   545 
       
   546 
       
   547 Pickling and unpickling external objects
       
   548 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       
   549 
       
   550 .. index::
       
   551    single: persistent_id (pickle protocol)
       
   552    single: persistent_load (pickle protocol)
       
   553 
       
   554 For the benefit of object persistence, the :mod:`pickle` module supports the
       
   555 notion of a reference to an object outside the pickled data stream.  Such
       
   556 objects are referenced by a "persistent id", which is just an arbitrary string
       
   557 of printable ASCII characters. The resolution of such names is not defined by
       
   558 the :mod:`pickle` module; it will delegate this resolution to user-defined
       
   559 functions on the pickler and unpickler. [#]_
       
   560 
       
   561 To define external persistent id resolution, you need to set the
       
   562 :attr:`persistent_id` attribute of the pickler object and the
       
   563 :attr:`persistent_load` attribute of the unpickler object.
       
   564 
       
   565 To pickle objects that have an external persistent id, the pickler must have a
       
   566 custom :func:`persistent_id` method that takes an object as an argument and
       
   567 returns either ``None`` or the persistent id for that object.  When ``None`` is
       
   568 returned, the pickler simply pickles the object as normal.  When a persistent id
       
   569 string is returned, the pickler will pickle that string, along with a marker so
       
   570 that the unpickler will recognize the string as a persistent id.
       
   571 
       
   572 To unpickle external objects, the unpickler must have a custom
       
   573 :func:`persistent_load` function that takes a persistent id string and returns
       
   574 the referenced object.
       
   575 
       
   576 Here's a silly example that *might* shed more light::
       
   577 
       
   578    import pickle
       
   579    from cStringIO import StringIO
       
   580 
       
   581    src = StringIO()
       
   582    p = pickle.Pickler(src)
       
   583 
       
   584    def persistent_id(obj):
       
   585        if hasattr(obj, 'x'):
       
   586            return 'the value %d' % obj.x
       
   587        else:
       
   588            return None
       
   589 
       
   590    p.persistent_id = persistent_id
       
   591 
       
   592    class Integer:
       
   593        def __init__(self, x):
       
   594            self.x = x
       
   595        def __str__(self):
       
   596            return 'My name is integer %d' % self.x
       
   597 
       
   598    i = Integer(7)
       
   599    print i
       
   600    p.dump(i)
       
   601 
       
   602    datastream = src.getvalue()
       
   603    print repr(datastream)
       
   604    dst = StringIO(datastream)
       
   605 
       
   606    up = pickle.Unpickler(dst)
       
   607 
       
   608    class FancyInteger(Integer):
       
   609        def __str__(self):
       
   610            return 'I am the integer %d' % self.x
       
   611 
       
   612    def persistent_load(persid):
       
   613        if persid.startswith('the value '):
       
   614            value = int(persid.split()[2])
       
   615            return FancyInteger(value)
       
   616        else:
       
   617            raise pickle.UnpicklingError('Invalid persistent id')
       
   618 
       
   619    up.persistent_load = persistent_load
       
   620 
       
   621    j = up.load()
       
   622    print j
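
With the pure-Python :mod:`pickle` module, the same hooks can also be supplied by subclassing.  The sketch below is one possible arrangement, not the only one: ``REGISTRY``, ``RegistryPickler`` and ``RegistryUnpickler`` are hypothetical names, and the dictionary stands in for whatever external store holds the persistent objects.

```python
import io
import pickle

# Hypothetical external store: these objects live outside the pickle.
REGISTRY = {"user:1": {"name": "alice"}, "user:2": {"name": "bob"}}


class RegistryPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return the key for objects held in the store, None otherwise.
        for key, value in REGISTRY.items():
            if obj is value:
                return key
        return None


class RegistryUnpickler(pickle.Unpickler):
    def persistent_load(self, persid):
        # Resolve a persistent id back to the object in the store.
        try:
            return REGISTRY[persid]
        except KeyError:
            raise pickle.UnpicklingError("Invalid persistent id: %r" % (persid,))


buf = io.BytesIO()
RegistryPickler(buf, 2).dump([REGISTRY["user:1"], "plain data"])
buf.seek(0)
restored = RegistryUnpickler(buf).load()

# The stored object is referenced, not copied.
print(restored[0] is REGISTRY["user:1"])
```

Only the key ``"user:1"`` is written to the data stream, so the unpickled list refers to the very same dictionary object that the store holds.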
       
   623 
       
   624 In the :mod:`cPickle` module, the unpickler's :attr:`persistent_load` attribute
       
   625 can also be set to a Python list, in which case, when the unpickler reaches a
       
   626 persistent id, the persistent id string will simply be appended to this list.
       
   627 This functionality exists so that a pickle data stream can be "sniffed" for
       
   628 object references without actually instantiating all the objects in a pickle.
       
   629 [#]_  Setting :attr:`persistent_load` to a list is usually used in conjunction
       
   630 with the :meth:`noload` method on the Unpickler.
       
   631 
       
   632 .. BAW: Both pickle and cPickle support something called inst_persistent_id()
       
   633    which appears to give unknown types a second shot at producing a persistent
       
   634    id.  Since Jim Fulton can't remember why it was added or what it's for, I'm
       
   635    leaving it undocumented.
       
   636 
       
   637 
       
   638 .. _pickle-sub:
       
   639 
       
   640 Subclassing Unpicklers
       
   641 ----------------------
       
   642 
       
   643 .. index::
       
   644    single: load_global() (pickle protocol)
       
   645    single: find_global() (pickle protocol)
       
   646 
       
   647 By default, unpickling will import any class that it finds in the pickle data.
       
   648 You can control exactly what gets unpickled and what gets called by customizing
       
   649 your unpickler.  Unfortunately, exactly how you do this is different depending
       
   650 on whether you're using :mod:`pickle` or :mod:`cPickle`. [#]_
       
   651 
       
   652 In the :mod:`pickle` module, you need to derive a subclass from
       
   653 :class:`Unpickler`, overriding the :meth:`load_global` method.
       
   654 :meth:`load_global` should read two lines from the pickle data stream where the
       
   655 first line will be the name of the module containing the class and the second line
       
   656 will be the name of the instance's class.  It then looks up the class, possibly
       
   657 importing the module and digging out the attribute, then it appends what it
       
   658 finds to the unpickler's stack.  Later on, this class will be assigned to the
       
   659 :attr:`__class__` attribute of an empty class, as a way of magically creating an
       
   660 instance without calling its class's :meth:`__init__`. Your job (should you
       
   661 choose to accept it) would be to have :meth:`load_global` push onto the
       
   662 unpickler's stack a known safe version of any class you deem safe to unpickle.
       
   663 It is up to you to produce such a class.  Or you could raise an error if you
       
   664 want to disallow all unpickling of instances.  If this sounds like a hack,
       
   665 you're right.  Refer to the source code to make this work.
       
   666 
       
   667 Things are a little cleaner with :mod:`cPickle`, but not by much. To control
       
   668 what gets unpickled, you can set the unpickler's :attr:`find_global` attribute
       
   669 to a function or ``None``.  If it is ``None`` then any attempts to unpickle
       
   670 instances will raise an :exc:`UnpicklingError`.  If it is a function, then it
       
   671 should accept a module name and a class name, and return the corresponding class
       
   672 object.  It is responsible for looking up the class and performing any necessary
       
   673 imports, and it may raise an error to prevent instances of the class from being
       
   674 unpickled.
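
The restriction described above can be sketched for the pure-Python :mod:`pickle` module as follows.  This is an illustration under stated assumptions rather than the exact mechanism this section documents: instead of overriding :meth:`load_global` itself, it overrides :meth:`find_class`, the lookup helper that :meth:`load_global` calls in the pure-Python implementation.  ``SAFE_GLOBALS``, ``RestrictedUnpickler`` and ``restricted_loads`` are hypothetical names, and the approach does not apply to :mod:`cPickle`, where :attr:`find_global` plays this role.

```python
import io
import pickle
from decimal import Decimal
from fractions import Fraction

# Hypothetical whitelist of (module, name) pairs considered safe.
SAFE_GLOBALS = frozenset([("decimal", "Decimal")])


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse every global that is not explicitly whitelisted.
        if (module, name) not in SAFE_GLOBALS:
            raise pickle.UnpicklingError(
                "global '%s.%s' is forbidden" % (module, name))
        return pickle.Unpickler.find_class(self, module, name)


def restricted_loads(data):
    """Like pickle.loads(), but only whitelisted classes are allowed."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps(Decimal("1.5"), 2))

try:
    restricted_loads(pickle.dumps(Fraction(1, 3), 2))
except pickle.UnpicklingError:
    blocked = True
else:
    blocked = False
```

A whitelist is used rather than a blacklist because any class that is importable can otherwise be named in the data stream.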
       
   675 
       
   676 The moral of the story is that you should be really careful about the source of
       
   677 the strings your application unpickles.
       
   678 
       
   679 
       
   680 .. _pickle-example:
       
   681 
       
   682 Example
       
   683 -------
       
   684 
       
   685 For the simplest code, use the :func:`dump` and :func:`load` functions.  Note
       
   686 that a self-referencing list is pickled and restored correctly. ::
       
   687 
       
   688    import pickle
       
   689 
       
   690    data1 = {'a': [1, 2.0, 3, 4+6j],
       
   691             'b': ('string', u'Unicode string'),
       
   692             'c': None}
       
   693 
       
   694    selfref_list = [1, 2, 3]
       
   695    selfref_list.append(selfref_list)
       
   696 
       
   697    output = open('data.pkl', 'wb')
       
   698 
       
   699    # Pickle dictionary using protocol 0.
       
   700    pickle.dump(data1, output)
       
   701 
       
   702    # Pickle the list using the highest protocol available.
       
   703    pickle.dump(selfref_list, output, -1)
       
   704 
       
   705    output.close()
       
   706 
       
   707 The following example reads the resulting pickled data.  When reading a
       
   708 pickle-containing file, you should open the file in binary mode because you
       
   709 can't be sure if the ASCII or binary format was used. ::
       
   710 
       
   711    import pprint, pickle
       
   712 
       
   713    pkl_file = open('data.pkl', 'rb')
       
   714 
       
   715    data1 = pickle.load(pkl_file)
       
   716    pprint.pprint(data1)
       
   717 
       
   718    data2 = pickle.load(pkl_file)
       
   719    pprint.pprint(data2)
       
   720 
       
   721    pkl_file.close()
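
The claim about the self-referencing list can be checked without touching the file system.  The following sketch uses :func:`dumps` and :func:`loads` in place of the file-based functions; the variable name ``restored`` is chosen here for illustration.

```python
import pickle

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

# Round-trip through an in-memory pickle, highest protocol available.
restored = pickle.loads(pickle.dumps(selfref_list, -1))

# The cycle is preserved: the last element is the restored list itself,
# not the original list and not a further copy.
print(restored[:3])
print(restored[3] is restored)
print(restored is selfref_list)
```

The restored list is a new object, but its internal reference points back at itself just as it did in the original.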
       
   722 
       
   723 Here's a larger example that shows how to modify pickling behavior for a class.
       
   724 The :class:`TextReader` class opens a text file, and returns the line number and
       
   725 line contents each time its :meth:`readline` method is called. If a
       
   726 :class:`TextReader` instance is pickled, all attributes *except* the file object
       
   727 member are saved. When the instance is unpickled, the file is reopened, and
       
   728 reading resumes from the last location. The :meth:`__setstate__` and
       
   729 :meth:`__getstate__` methods are used to implement this behavior. ::
       
   730 
       
   731    #!/usr/local/bin/python
       
   732 
       
   733    class TextReader:
       
   734        """Print and number lines in a text file."""
       
   735        def __init__(self, file):
       
   736            self.file = file
       
   737            self.fh = open(file)
       
   738            self.lineno = 0
       
   739 
       
   740        def readline(self):
       
   741            self.lineno = self.lineno + 1
       
   742            line = self.fh.readline()
       
   743            if not line:
       
   744                return None
       
   745            if line.endswith("\n"):
       
   746                line = line[:-1]
       
   747            return "%d: %s" % (self.lineno, line)
       
   748 
       
   749        def __getstate__(self):
       
   750            odict = self.__dict__.copy() # copy the dict since we change it
       
   751            del odict['fh']              # remove filehandle entry
       
   752            return odict
       
   753 
       
   754        def __setstate__(self, dict):
       
   755            fh = open(dict['file'])      # reopen file
       
   756            count = dict['lineno']       # read from file...
       
   757            while count:                 # until line count is restored
       
   758                fh.readline()
       
   759                count = count - 1
       
   760            self.__dict__.update(dict)   # update attributes
       
   761            self.fh = fh                 # save the file object
       
   762 
       
   763 A sample usage might be something like this::
       
   764 
       
   765    >>> import TextReader
       
   766    >>> obj = TextReader.TextReader("TextReader.py")
       
   767    >>> obj.readline()
       
   768    '1: #!/usr/local/bin/python'
       
   769    >>> obj.readline()
       
   770    '2: '
       
   771    >>> obj.readline()
       
   772    '3: class TextReader:'
       
   773    >>> import pickle
       
   774    >>> pickle.dump(obj, open('save.p', 'wb'))
       
   775 
       
   776 If you want to see that :mod:`pickle` works across Python processes, start
       
   777 another Python session before continuing.  What follows can happen from either
       
   778 the same process or a new process. ::
       
   779 
       
   780    >>> import pickle
       
   781    >>> reader = pickle.load(open('save.p', 'rb'))
       
   782    >>> reader.readline()
       
   783    '4:     """Print and number lines in a text file."""'
       
   784 
       
   785 
       
   786 .. seealso::
       
   787 
       
   788    Module :mod:`copy_reg`
       
   789       Pickle interface constructor registration for extension types.
       
   790 
       
   791    Module :mod:`shelve`
       
   792       Indexed databases of objects; uses :mod:`pickle`.
       
   793 
       
   794    Module :mod:`copy`
       
   795       Shallow and deep object copying.
       
   796 
       
   797    Module :mod:`marshal`
       
   798       High-performance serialization of built-in types.
       
   799 
       
   800 
       
   801 :mod:`cPickle` --- A faster :mod:`pickle`
       
   802 =========================================
       
   803 
       
   804 .. module:: cPickle
       
   805    :synopsis: Faster version of pickle, but not subclassable.
       
   806 .. moduleauthor:: Jim Fulton <jim@zope.com>
       
   807 .. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
       
   808 
       
   809 
       
   810 .. index:: module: pickle
       
   811 
       
   812 The :mod:`cPickle` module supports serialization and de-serialization of Python
       
   813 objects, providing an interface and functionality nearly identical to the
       
   814 :mod:`pickle` module.  There are several differences, the most important being
       
   815 performance and subclassability.
       
   816 
       
   817 First, :mod:`cPickle` can be up to 1000 times faster than :mod:`pickle` because
       
   818 the former is implemented in C.  Second, in the :mod:`cPickle` module the
       
   819 callables :func:`Pickler` and :func:`Unpickler` are functions, not classes.
       
   820 This means that you cannot use them to derive custom pickling and unpickling
       
   821 subclasses.  Most applications have no need for this functionality and should
       
   822 benefit from the greatly improved performance of the :mod:`cPickle` module.
       
   823 
       
   824 The data stream format used by :mod:`pickle` and :mod:`cPickle` is
       
   825 identical, so it is possible to use :mod:`pickle` and :mod:`cPickle`
       
   826 interchangeably with existing pickles. [#]_
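
One common way to take advantage of this is to import the C implementation when it is available and fall back to the pure-Python module otherwise.  The sketch below assumes nothing beyond the two module names; ``pickle_impl`` and ``data`` are illustrative names.

```python
try:
    import cPickle as pickle_impl   # C implementation, when available
except ImportError:
    import pickle as pickle_impl    # pure-Python fallback

import pickle

data = {"a": [1, 2.0, 3], "b": ("string", None)}

# A pickle written by one implementation can be read by the other.
via_impl = pickle.loads(pickle_impl.dumps(data, 2))
via_pickle = pickle_impl.loads(pickle.dumps(data, 2))
print(via_impl == data and via_pickle == data)
```

Code that needs to subclass :class:`Pickler` or :class:`Unpickler` should import :mod:`pickle` directly, since the fallback idiom hides which implementation is in use.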
       
   827 
       
   828 There are additional minor differences in API between :mod:`cPickle` and
       
   829 :mod:`pickle`; however, for most applications they are interchangeable.  More
       
   830 documentation is provided in the :mod:`pickle` module documentation, which
       
   831 includes a list of the documented differences.
       
   832 
       
   833 .. rubric:: Footnotes
       
   834 
       
   835 .. [#] Don't confuse this with the :mod:`marshal` module.
       
   836 
       
   837 .. [#] In the :mod:`pickle` module these callables are classes, which you could
       
   838    subclass to customize the behavior.  However, in the :mod:`cPickle` module these
       
   839    callables are factory functions and so cannot be subclassed.  One common reason
       
   840    to subclass is to control what objects can actually be unpickled.  See section
       
   841    :ref:`pickle-sub` for more details.
       
   842 
       
   843 .. [#] *Warning*: this is intended for pickling multiple objects without intervening
       
   844    modifications to the objects or their parts.  If you modify an object and then
       
   845    pickle it again using the same :class:`Pickler` instance, the object is not
       
   846    pickled again --- a reference to it is pickled and the :class:`Unpickler` will
       
   847    return the old value, not the modified one. There are two problems here: (1)
       
   848    detecting changes, and (2) marshalling a minimal set of changes.  Garbage
       
   849    Collection may also become a problem here.
       
   850 
       
   851 .. [#] The exception raised will likely be an :exc:`ImportError` or an
       
   852    :exc:`AttributeError` but it could be something else.
       
   853 
       
   854 .. [#] These methods can also be used to implement copying class instances.
       
   855 
       
   856 .. [#] This protocol is also used by the shallow and deep copying operations defined in
       
   857    the :mod:`copy` module.
       
   858 
       
   859 .. [#] The actual mechanism for associating these user-defined functions is slightly
       
   860    different for :mod:`pickle` and :mod:`cPickle`.  The description given here
       
   861    works the same for both implementations.  Users of the :mod:`pickle` module
       
   862    could also use subclassing to effect the same results, overriding the
       
   863    :meth:`persistent_id` and :meth:`persistent_load` methods in the derived
       
   864    classes.
       
   865 
       
   866 .. [#] We'll leave you with the image of Guido and Jim sitting around sniffing pickles
       
   867    in their living rooms.
       
   868 
       
   869 .. [#] A word of caution: the mechanisms described here use internal attributes and
       
   870    methods, which are subject to change in future versions of Python.  We intend to
       
   871    someday provide a common interface for controlling this behavior, which will
       
   872    work in either :mod:`pickle` or :mod:`cPickle`.
       
   873 
       
   874 .. [#] Since the pickle data format is actually a tiny stack-oriented programming
       
   875    language, and some freedom is taken in the encodings of certain objects, it is
       
   876    possible that the two modules produce different data streams for the same input
       
   877    objects.  However it is guaranteed that they will always be able to read each
       
   878    other's data streams.
       
   879