From the Python documentation:
The cPickle.java module implements a basic but powerful algorithm
for ``pickling'' (a.k.a. serializing, marshalling or flattening) nearly
arbitrary Python objects. This is the act of converting objects to a
stream of bytes (and back: ``unpickling'').
This is a more primitive notion than
persistency -- although cPickle.java reads and writes file
objects, it does not handle the issue of naming persistent objects, nor
the (even more complicated) area of concurrent access to persistent
objects. The cPickle.java module can transform a complex object
into a byte stream and it can transform the byte stream into an object
with the same internal structure. The most obvious thing to do with these
byte streams is to write them onto a file, but it is also conceivable
to send them across a network or store them in a database. The module
shelve provides a simple interface to pickle and unpickle
objects on ``dbm''-style database files.
Note: cPickle.java has the same interface as the
standard module pickle, except that Pickler and
Unpickler are factory functions, not classes (so they cannot be
used as base classes for inheritance).
The original cPickle.c version has the same limitation.
Unlike the built-in module marshal, cPickle.java handles
the following correctly:
- recursive objects (objects containing references to themselves)
- object sharing (references to the same object in different places)
- user-defined classes and their instances
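The first two points can be checked with a short round trip. This is a minimal sketch that assumes the standard pickle module of a current CPython (Python 3) as a stand-in for the interface described here; the variable names are illustrative only.

```python
import pickle

# A recursive object: the list contains a reference to itself.
recursive = [1, 2]
recursive.append(recursive)

# Object sharing: the same list is reachable under two keys.
shared = ["spam"]
container = {"first": shared, "second": shared}

recursive_copy = pickle.loads(pickle.dumps(recursive))
container_copy = pickle.loads(pickle.dumps(container))

# The copy refers to itself, not to the original list.
assert recursive_copy[2] is recursive_copy
# Both keys still point at one shared list object.
assert container_copy["first"] is container_copy["second"]
```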
The data format used by cPickle.java is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as XDR (which can't represent pointer sharing); however
it means that non-Python programs may not be able to reconstruct
pickled Python objects.
By default, the cPickle.java data format uses a printable ASCII
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable ASCII (and of
some other characteristics of cPickle.java's representation) is
that for debugging or recovery purposes it is possible for a human to read
the pickled file with a standard text editor.
A binary format, which is slightly more efficient, can be chosen by
specifying a nonzero (true) value for the bin argument to the
Pickler constructor or the dump() and dumps()
functions. The binary format is not the default because of backwards
compatibility with the Python 1.4 pickle module. In a future version,
the default may change to binary.
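The difference between the two formats can be seen by pickling one object both ways. The sketch below assumes the standard pickle module of a current CPython (Python 3), which selects the format with a protocol argument rather than bin; protocol 0 stands in for the text format and protocol 1 for the binary format.

```python
import pickle

record = {"name": "spam", "values": [1, 2, 3]}

# Assumption: protocol=0 corresponds to bin absent or zero (text format),
# protocol=1 corresponds to a nonzero bin (binary format).
text_pickle = pickle.dumps(record, protocol=0)
binary_pickle = pickle.dumps(record, protocol=1)

# The text format decodes as ASCII, so it can be read in a text editor.
print(text_pickle.decode("ascii"))
print("text size:", len(text_pickle), "binary size:", len(binary_pickle))

# Both formats reconstruct the same object.
assert pickle.loads(text_pickle) == record
assert pickle.loads(binary_pickle) == record
```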
The cPickle.java module doesn't handle code objects.
For the benefit of persistency modules written using cPickle.java,
it supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a name, which is an
arbitrary string of printable ASCII characters. The resolution of
such names is not defined by the cPickle.java module -- the
persistent object module will have to implement a method
persistent_load(). To write references to persistent objects,
the persistent module must define a method persistent_id() which
returns either None or the persistent ID of the object.
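A minimal sketch of the two hooks follows. It assumes the standard pickle module of a current CPython (Python 3), where Pickler and Unpickler are classes that can be subclassed; with the factory functions described in the note above, that may not be possible. EXTERNAL_OBJECTS, ExternalRef, RefPickler and RefUnpickler are hypothetical names invented for the illustration.

```python
import io
import pickle

# Hypothetical store of objects that live outside the pickle stream.
EXTERNAL_OBJECTS = {"config-1": {"debug": True}}


class ExternalRef:
    """Hypothetical marker for an object kept in EXTERNAL_OBJECTS."""

    def __init__(self, name):
        self.name = name


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a printable ASCII name for external objects,
        # or None to have the object pickled normally.
        if isinstance(obj, ExternalRef):
            return obj.name
        return None


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolving the name is entirely up to the persistence layer.
        return EXTERNAL_OBJECTS[pid]


buffer = io.BytesIO()
RefPickler(buffer, protocol=0).dump(["local data", ExternalRef("config-1")])

buffer.seek(0)
restored = RefUnpickler(buffer).load()

# The reference was resolved to the external object itself.
assert restored[1] is EXTERNAL_OBJECTS["config-1"]
```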
There are some restrictions on the pickling of class instances.
First of all, the class must be defined at the top level in a module.
Furthermore, all its instance variables must be picklable.
When a pickled class instance is unpickled, its __init__() method
is normally not invoked. Note: This is a deviation
from previous versions of this module; the change was introduced in
Python 1.5b2. The reason for the change is that in many cases it is
desirable to have a constructor that requires arguments; it is a
(minor) nuisance to have to provide a __getinitargs__() method.
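That __init__() is skipped on unpickling can be observed with a counter. The sketch assumes the standard pickle module of a current CPython (Python 3); the Connection class and its init_calls counter are hypothetical and exist only for the demonstration.

```python
import pickle


class Connection:
    init_calls = 0  # class-level counter, for illustration only

    def __init__(self, host):
        # The constructor requires an argument, the case the text describes.
        Connection.init_calls += 1
        self.host = host


original = Connection("example.org")
restored = pickle.loads(pickle.dumps(original))

# The instance data came back, but __init__() ran only once,
# for the original object.
assert restored.host == "example.org"
assert Connection.init_calls == 1
```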
If it is desirable that the __init__() method be called on
unpickling, a class can define a method __getinitargs__(),
which should return a tuple containing the arguments to be
passed to the class constructor (__init__()). This method is
called at pickle time; the tuple it returns is incorporated in the
pickle for the instance.
Classes can further influence how their instances are pickled -- if the
class defines the method __getstate__(), it is called and the
return state is pickled as the contents for the instance, and if the class
defines the method __setstate__(), it is called with the
unpickled state. (Note that these methods can also be used to
implement copying class instances.) If there is no
__getstate__() method, the instance's __dict__ is
pickled. If there is no __setstate__() method, the pickled
object must be a dictionary and its items are assigned to the new
instance's dictionary. (If a class defines both __getstate__()
and __setstate__(), the state object needn't be a dictionary
-- these methods can do what they want.) This protocol is also used
by the shallow and deep copying operations defined in the copy
module.
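The state protocol can be sketched with a class that leaves derived data out of its pickle and rebuilds it on loading. This assumes the standard pickle and copy modules of a current CPython (Python 3); the Document class is hypothetical. Because both methods are defined, the state is a tuple rather than a dictionary.

```python
import copy
import pickle


class Document:
    def __init__(self, text):
        self.text = text
        self.word_cache = text.split()  # derived data, cheap to rebuild

    def __getstate__(self):
        # Only the text is pickled; the cache is left out.
        return (self.text,)

    def __setstate__(self, state):
        # Called with whatever __getstate__() returned.
        (self.text,) = state
        self.word_cache = self.text.split()


original = Document("spam and eggs")
restored = pickle.loads(pickle.dumps(original))
clone = copy.copy(original)  # the copy module uses the same two methods

assert restored.word_cache == ["spam", "and", "eggs"]
assert clone.text == original.text
```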
Note that when class instances are pickled, their class's code and
data are not pickled along with them. Only the instance data are
pickled. This is done on purpose, so you can fix bugs in a class or
add methods and still load objects that were created with an earlier
version of the class. If you plan to have long-lived objects that
will see many versions of a class, it may be worthwhile to put a version
number in the objects so that suitable conversions can be made by the
class's __setstate__() method.
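One way to carry such a version number is to add it to the state in __getstate__() and convert old layouts in __setstate__(). The sketch assumes the standard pickle module of a current CPython (Python 3). The Account class, the "_version" key and the version 1 layout (a float "balance" field) are all hypothetical.

```python
import pickle


class Account:
    STATE_VERSION = 2

    def __init__(self, owner, balance_cents):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        # Store the instance data plus a version marker.
        state = dict(self.__dict__)
        state["_version"] = self.STATE_VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("_version", 1)
        if version < 2:
            # Hypothetical version 1 kept a float "balance" in whole units.
            state["balance_cents"] = int(round(state.pop("balance") * 100))
        self.__dict__.update(state)


# A current object survives a round trip unchanged.
current = pickle.loads(pickle.dumps(Account("guido", 1250)))

# Simulate state written by the hypothetical version 1 of the class.
upgraded = Account.__new__(Account)
upgraded.__setstate__({"owner": "guido", "balance": 12.5})

assert current.balance_cents == upgraded.balance_cents == 1250
```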
When a class itself is pickled, only its name is pickled -- the class
definition is not pickled, but re-imported by the unpickling process.
Therefore, the restriction that the class must be defined at the top
level in a module applies to pickled classes as well.
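Pickling a class by name can be observed on a library class. The sketch assumes the standard pickle module of a current CPython (Python 3) and uses fractions.Fraction only because it is importable everywhere; protocol 0 is chosen so that the names are visible in the output.

```python
import pickle
from fractions import Fraction

data = pickle.dumps(Fraction, protocol=0)
print(data)

# The pickle holds the module and class names, not the class definition.
assert b"fractions" in data
assert b"Fraction" in data

# Unpickling re-imports the module and returns the very same class object.
assert pickle.loads(data) is Fraction
```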
The interface can be summarized as follows.
To pickle an object x onto a file f, open for writing:
    p = pickle.Pickler(f)
    p.dump(x)
A shorthand for this is:
    pickle.dump(x, f)
To unpickle an object x from a file f, open for reading:
    u = pickle.Unpickler(f)
    x = u.load()
A shorthand is:
    x = pickle.load(f)
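Put together, a complete round trip through a file might look like the sketch below. It assumes the standard pickle module of a current CPython (Python 3), where the file has to be opened in binary mode; the temporary path is created only for the example.

```python
import os
import pickle
import tempfile

x = {"numbers": [1, 2, 3], "label": "spam"}
path = os.path.join(tempfile.mkdtemp(), "x.pickle")

# Pickle x onto a file opened for writing.
with open(path, "wb") as f:
    p = pickle.Pickler(f)
    p.dump(x)  # shorthand: pickle.dump(x, f)

# Unpickle it from the same file opened for reading.
with open(path, "rb") as f:
    u = pickle.Unpickler(f)
    restored = u.load()  # shorthand: pickle.load(f)

assert restored == x
```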
The Pickler class only calls the method f.write() with a
string argument. The Unpickler calls the methods
f.read() (with an integer argument) and f.readline()
(without argument), both returning a string. It is explicitly allowed to
pass non-file objects here, as long as they have the right methods.
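As an illustration, the two hypothetical classes below provide only the methods named above and are still accepted. The sketch assumes the standard pickle module of a current CPython (Python 3), where write() receives and read() and readline() return bytes rather than strings.

```python
import pickle


class ListWriter:
    """Hypothetical sink that offers nothing but write()."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


class BytesReader:
    """Hypothetical source that offers only read() and readline()."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def read(self, size):
        chunk = self._data[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

    def readline(self):
        end = self._data.find(b"\n", self._pos)
        end = len(self._data) if end < 0 else end + 1
        chunk = self._data[self._pos:end]
        self._pos = end
        return chunk


sink = ListWriter()
pickle.Pickler(sink, protocol=0).dump(["spam", 42])

source = BytesReader(b"".join(sink.chunks))
restored = pickle.Unpickler(source).load()

assert restored == ["spam", 42]
```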
The constructor for the Pickler class has an optional second
argument, bin. If this is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient,
but backwards compatible) text pickle format is used. The
Unpickler class does not have an argument to distinguish
between binary and text pickle formats; it accepts either format.
The following types can be pickled:
- None
- integers, long integers, floating point numbers
- strings
- tuples, lists and dictionaries containing only picklable objects
- classes that are defined at the top level in a module
- instances of such classes whose __dict__ or
__setstate__() is picklable
Attempts to pickle unpicklable objects will raise the
PicklingError exception; when this happens, an unspecified
number of bytes may have been written to the file.
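A failed dump can be handled as in the sketch below, which assumes the standard pickle module of a current CPython (Python 3). A lambda is used as the unpicklable object because it has no importable name. Which exception is raised can depend on the object and the Python version, so the example catches a small family of exceptions rather than PicklingError alone.

```python
import io
import pickle

buffer = io.BytesIO()
unpicklable = ["fine", lambda: None]  # the lambda cannot be pickled

try:
    pickle.dump(unpicklable, buffer)
    failed = False
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    failed = True
    print("could not pickle:", exc)

# Part of the pickle may already have reached the file; the amount is
# unspecified, so the contents should be discarded after a failure.
print("bytes written before the failure:", len(buffer.getvalue()))
```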
It is possible to make multiple calls to the dump() method of
the same Pickler instance. These must then be matched to the
same number of calls to the load() method of the
corresponding Unpickler instance. If the same object is
pickled by multiple dump() calls, the load() calls will all
yield references to the same object. Warning: this is intended
for pickling multiple objects without intervening modifications to the
objects or their parts. If you modify an object and then pickle it
again using the same Pickler instance, the object is not
pickled again -- a reference to it is pickled and the
Unpickler will return the old value, not the modified one.
(There are two problems here: (a) detecting changes, and (b)
marshalling a minimal set of changes. I have no answers. Garbage
Collection may also become a problem here.)
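The warning can be reproduced in a few lines. The sketch assumes the standard pickle module of a current CPython (Python 3) and an in-memory buffer in place of a file.

```python
import io
import pickle

buffer = io.BytesIO()
p = pickle.Pickler(buffer)

items = [1, 2]
p.dump(items)

items.append(3)  # modify the object between the two dumps
p.dump(items)    # only a reference to the first pickle is written

buffer.seek(0)
u = pickle.Unpickler(buffer)
first = u.load()
second = u.load()

# Both loads yield the same object, holding the old value.
assert second is first
assert first == [1, 2]
```

To pickle the modified value, a fresh Pickler (or a plain pickle.dump() call) has to be used for the second dump.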
Apart from the Pickler and Unpickler classes, the
module defines the following functions, and an exception:
- dump(object, file[, bin])
  Write a pickled representation of object to the open file object
  file. This is equivalent to "Pickler(file, bin).dump(object)".
  If the optional bin argument is present and nonzero, the binary
  pickle format is used; if it is zero or absent, the (less efficient)
  text pickle format is used.
- load(file)
  Read a pickled object from the open file object file. This is
  equivalent to "Unpickler(file).load()".
- dumps(object[, bin])
  Return the pickled representation of the object as a string, instead
  of writing it to a file. If the optional bin argument is
  present and nonzero, the binary pickle format is used; if it is zero
  or absent, the (less efficient) text pickle format is used.
- loads(string)
  Read a pickled object from a string instead of a file. Characters in
  the string past the pickled object's representation are ignored.
- PicklingError
  This exception is raised when an unpicklable object is passed to
  Pickler.dump().
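The string-based functions can be tried without any file. This sketch assumes the standard pickle module of a current CPython (Python 3), where dumps() returns a bytes object rather than a string.

```python
import pickle

obj = ("spam", 42, [1.5, None])

data = pickle.dumps(obj)
assert pickle.loads(data) == obj

# Anything after the end of the pickled representation is ignored.
assert pickle.loads(data + b"trailing bytes") == obj
```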
For the complete documentation on the pickle module, please see the
"Python Library Reference".
The module is based on both the original pickle.py and the cPickle.c
version; all mistakes and errors are my own.
author: Finn Bock, bckfnn@pipmail.dknet.dk
version: cPickle.java,v 1.30 1999/05/15 17:40:12 fb Exp