apptools.persistence.state_pickler module

This module provides code that allows one to pickle the state of a Python object to a dictionary.

The motivation for this is simple. The standard Python pickler/unpickler is best used to pickle simple objects and does not work too well for complex code. Specifically, there are two major problems: (1) the pickle file format is not easy to edit with a text editor, and (2) when a pickle is unpickled, it creates all the necessary objects and sets the state of these objects.

Issue (2) might not appear to be a problem. However, determining the entire ‘state’ of an application often requires knowing the state of many objects that are of no concern to the user. The user would ideally like to pickle just what they think is relevant. Given that the user is not going to save the entire state of the application, pickle is insufficient, since the state is no longer completely known (or worth knowing). The default Unpickler recreates the objects, and the typical implementation of __setstate__ simply updates the object’s __dict__ attribute. This is inadequate because the pickled information is taken out of the real context in which it was saved.

The StatePickler pickles the ‘state’ of an object into a large dictionary. This pickled data may be easily unpickled and modified in the interpreter or edited with a text editor (pprint.saferepr is a friend). The second problem is also eliminated: when this state is unpickled using StateUnpickler, what you get is a special dictionary (a State instance) that allows one to navigate the state just like the original object. It is up to the user to create any new objects and set their states using this information. This allows for a lot of flexibility while allowing one to save and set the state of (almost) any Python object.

The StateSetter class helps set the state of a known instance. When setting the state of an instance, it checks whether the instance has a __set_pure_state__ method, which in turn calls StateSetter.set appropriately.

Additionally, there is support for versioning. A class’s version is obtained from its __version__ class attribute. This version, along with the versions of the class’s bases, is embedded into the metadata of the state and stored. By using version_registry.py, a user may register a handler for a particular class and module. When the state of an object is set using StateSetter.set (or the set_state function), these handlers are called in reverse order of their MRO. This gives each handler an opportunity to upgrade the state depending on its version. Builtin classes are not scanned for versions. If a class has no version, it is assumed to be -1 by default.
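
The upgrade mechanism described above can be sketched without the library. The registry below is a toy stand-in, not the version_registry API: the names register, update and upgrade_a are hypothetical, and the sketch assumes the version list in the metadata is already stored in the order in which handlers should run.

```python
# Toy registry mapping (class_name, module) to an upgrade handler.
# These names are illustrative assumptions, not the version_registry API.
handlers = {}


def register(class_name, module, handler):
    """Remember an upgrade handler for the given class and module."""
    handlers[(class_name, module)] = handler


def update(state):
    """Run the registered handlers over the state, modifying it in place."""
    for key, version in state["__metadata__"]["version"]:
        handler = handlers.get(key)
        if handler is not None:
            handler(state, version)


def upgrade_a(state, version):
    # A version of -1 means the class had no __version__ when it was saved.
    if version < 1:
        state["attribute"] = state.pop("attr")


register("A", "__main__", upgrade_a)

old_state = {
    "attr": 1,
    "__metadata__": {"version": [(("A", "__main__"), -1)]},
}
update(old_state)
```

Here the handler renames an attribute saved by an older, unversioned class, so the state can be applied to the current class.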

Example:

>>> class A:
...    def __init__(self):
...        self.a = 'a'
...
>>> a = A()
>>> a.a = 100
>>> from apptools.persistence import state_pickler
>>> s = state_pickler.dumps(a)               # Dump the state of `a`.
>>> state = state_pickler.loads_state(s)     # Get the state back.
>>> b = state_pickler.create_instance(state) # Create the object.
>>> state_pickler.set_state(b, state)        # Set the object's state.
>>> assert b.a == 100

Features

  • The output is a plain old dictionary, so it is easy to parse, edit, etc.

  • Handles references to avoid duplication.

  • Gzips Numeric arrays when dumping them.

  • Support for versioning.

Caveats

  • Does not pickle many kinds of objects, including code objects and functions.

  • The output is a pure dictionary and does not contain instances, so using it as is in __setstate__ will not work. Instead, define a __set_pure_state__ method and use the StateSetter class or the set_state function provided by this module.
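
As a rough illustration of the last caveat, a __set_pure_state__ method receives plain data rather than live instances. The Counter class below is hypothetical, and the sketch copies values by hand so that it runs without the library; with this module one would typically delegate to set_state or StateSetter instead.

```python
class Counter:
    """Hypothetical class used only to illustrate the hook."""

    def __init__(self):
        self.count = 0
        self.label = "counter"

    def __set_pure_state__(self, state):
        # `state` holds plain data only, so copy the known values explicitly.
        # With the real module, this is where set_state would usually be used.
        for name in ("count", "label"):
            if name in state:
                setattr(self, name, state[name])


c = Counter()
c.__set_pure_state__({"count": 5, "label": "restored"})
```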

Notes

Browsing the code from XMarshaL and pickle.py proved useful for ideas. None of the code is taken from there though.

class apptools.persistence.state_pickler.State(**kw)

Bases: dict

Used to encapsulate the state of an instance in a very convenient form. The ‘__metadata__’ attribute/key is a dictionary that has class specific details like the class name, module name etc.
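
The attribute/key duality can be pictured with a small dict subclass. This is only a sketch of the idea under the assumption that keys are exposed as attributes; AttrDict is a hypothetical name and not the actual State implementation.

```python
class AttrDict(dict):
    """Hypothetical stand-in for State: a dict whose keys read as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


state = AttrDict(attribute=1)
state["__metadata__"] = {"class_name": "A", "module": "__main__"}

value = state.attribute                     # same as state["attribute"]
class_name = state.__metadata__["class_name"]
```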

class apptools.persistence.state_pickler.StateDict(**kw)

Bases: dict

Used to encapsulate a dictionary stored in a State instance. The has_instance attribute specifies if the dict has an instance embedded in it.

class apptools.persistence.state_pickler.StateList(seq=None)

Bases: list

Used to encapsulate a list stored in a State instance. The has_instance attribute specifies if the list has an instance embedded in it.

class apptools.persistence.state_pickler.StatePickler

Bases: object

Pickles the state of an object into a dictionary. The dictionary is itself either saved as a pickled file (dump) or pickled string (dumps). Alternatively, the dump_state method will return the dictionary that is pickled.

The format of the state dict is quite straightforward. Basic types (bool, int, long, float, complex, None, string) are represented as they are. Everything else is stored as a dictionary containing metadata about the object’s type etc., with the actual object under the ‘data’ key. For example:

>>> p = StatePickler()
>>> p.dump_state(1)
1
>>> l = [1,2.0, None, [1,2,3]]
>>> p.dump_state(l)
{'data': [1, 2.0, None, {'data': [1, 2, 3], 'type': 'list', 'id': 1}],
 'id': 0,
 'type': 'list'}
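
The layout shown above can be reproduced with a toy encoder. This is a minimal sketch, not StatePickler itself: toy_dump_state is a hypothetical name, it handles only basic types, lists and tuples, and it does not track shared references.

```python
BASIC_TYPES = (bool, int, float, complex, str, type(None))


def toy_dump_state(value, counter=None):
    """Encode basic types as-is and sequences as {'type', 'id', 'data'} dicts."""
    counter = [0] if counter is None else counter
    if isinstance(value, BASIC_TYPES):
        return value
    if isinstance(value, (list, tuple)):
        # Ids are handed out in the order containers are first visited.
        obj_id = counter[0]
        counter[0] += 1
        items = [toy_dump_state(item, counter) for item in value]
        data = items if isinstance(value, list) else tuple(items)
        return {"type": type(value).__name__, "id": obj_id, "data": data}
    raise TypeError("toy encoder does not handle %r" % type(value))


encoded = toy_dump_state([1, 2.0, None, [1, 2, 3]])
```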

Classes are also represented similarly. The state in this case is obtained from the __getstate__ method or from the __dict__. Here is an example:

>>> class A:
...     __version__ = 1  # State version
...     def __init__(self):
...         self.attribute = 1
...
>>> a = A()
>>> p = StatePickler()
>>> p.dump_state(a)
{'class_name': 'A',
 'data': {'data': {'attribute': 1}, 'type': 'dict', 'id': 2},
 'id': 0,
 'initargs': {'data': (), 'type': 'tuple', 'id': 1},
 'module': '__main__',
 'type': 'instance',
 'version': [(('A', '__main__'), 1)]}

When pickling data, references are taken care of. Numeric arrays can be pickled and are stored as a gzipped base64 encoded string.
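
Storing binary data as a gzipped, base64-encoded string can be sketched with the standard library alone. The helpers pack_bytes and unpack_bytes are hypothetical names; the exact encoding the module uses for arrays is not shown here and may differ.

```python
import base64
import gzip


def pack_bytes(raw):
    """Gzip raw bytes and return them as ASCII base64 text."""
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def unpack_bytes(text):
    """Reverse pack_bytes: decode base64, then gunzip."""
    return gzip.decompress(base64.b64decode(text))


raw = bytes(range(10)) * 100     # stand-in for an array's raw buffer
packed = pack_bytes(raw)         # plain text, safe to embed in a state dict
restored = unpack_bytes(packed)
```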

dump(value, file)

Pickles the state of the object (value) into the passed file.

dump_state(value)

Returns a dictionary or a basic type representing the complete state of the object (value).

This value is pickled by the dump and dumps methods.

dumps(value)

Pickles the state of the object (value) and returns a string.

exception apptools.persistence.state_pickler.StatePicklerError

Bases: Exception

class apptools.persistence.state_pickler.StateSetter

Bases: object

This is a convenience class that helps a user set the attributes of an object given its saved state. For instances it checks to see if a __set_pure_state__ method exists and calls that when it sets the state.

set(obj, state, ignore=None, first=None, last=None)

Sets the state of the object.

This is meant to simplify loading the state of an object from its __setstate__ method, using the dictionary describing its state. Note that before the state is set, the registered handlers for the particular class are called to upgrade the state to the latest version.

Parameters
  • obj – The object whose state is to be set. If this is None (default) then the object is created.

  • state – The dictionary representing the state of the object.

  • ignore – Attributes named in this list are ignored and their state is not set (this excludes the ones specified in first and last). If one specifies a ‘*’, then all attributes are ignored except the ones specified in first and last.

  • first – Attributes named in this list are set first (in order), before any other attributes are set.

  • last – Attributes named in this list are set last (in order), after all other attributes are set.
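
How ignore, first and last interact can be sketched as a function that computes the order in which attributes would be set. This is an illustration of the rules listed above, not the StateSetter code; attribute_order is a hypothetical helper, and skipping names that are absent from the state is an assumption.

```python
def attribute_order(state_keys, ignore=None, first=None, last=None):
    """Return the attribute names in the order they would be set."""
    ignore = list(ignore or [])
    # Names in first/last are always set, even if they also appear in ignore.
    first = [k for k in (first or []) if k in state_keys]
    last = [k for k in (last or []) if k in state_keys]
    if "*" in ignore:
        middle = []  # everything except first/last is ignored
    else:
        skip = set(ignore) | set(first) | set(last)
        middle = [k for k in state_keys if k not in skip]
    return first + middle + last


keys = ["a", "b", "c", "d"]
order = attribute_order(keys, ignore=["b"], first=["d"], last=["a"])
only_first = attribute_order(keys, ignore=["*"], first=["d"])
```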

exception apptools.persistence.state_pickler.StateSetterError

Bases: Exception

class apptools.persistence.state_pickler.StateTuple(seq=None)

Bases: tuple

Used to encapsulate a tuple stored in a State instance. The has_instance attribute specifies if the tuple has an instance embedded in it.

class apptools.persistence.state_pickler.StateUnpickler

Bases: object

Unpickles the state of an object saved using StatePickler.

Please note that, unlike the standard Unpickler, no instances of any user class are created. The data for the state is obtained from the file or string, reference objects are set up to refer to the same state value, and this state is returned, usually in the form of a dictionary. For example:

>>> class A:
...     def __init__(self):
...         self.attribute = 1
...
>>> a = A()
>>> p = StatePickler()
>>> s = p.dumps(a)
>>> up = StateUnpickler()
>>> state = up.loads_state(s)
>>> state.__class__.__name__
'State'
>>> state.attribute
1
>>> state.__metadata__
{'class_name': 'A',
 'has_instance': True,
 'id': 0,
 'initargs': (),
 'module': '__main__',
 'type': 'instance',
 'version': [(('A', '__main__'), -1)]}

Note that the state is actually a State instance and is navigable just like the original object. The details of the instance are stored in the __metadata__ attribute. This is highly convenient since it is possible for someone to view and modify the state very easily.

load_state(file)

Returns the state of an object loaded from the pickled data in the given file.

loads_state(string)

Returns the state of an object loaded from the pickled data in the given string.

exception apptools.persistence.state_pickler.StateUnpicklerError

Bases: Exception

apptools.persistence.state_pickler.create_instance(state)

Create an instance from the state if possible.

apptools.persistence.state_pickler.dump(value, file)

Pickles the state of the object (value) into the passed file (or file name).

apptools.persistence.state_pickler.dumps(value)

Pickles the state of the object (value) and returns a string.

apptools.persistence.state_pickler.get_state(obj)

Returns the state of the object (usually as a dictionary). The returned state may be used directly to set the state of the object via set_state.

apptools.persistence.state_pickler.gunzip_string(data)

Given a gzipped string (data) this unzips the string and returns it.

apptools.persistence.state_pickler.gzip_string(data)

Given a string (data) this gzips the string and returns it.

apptools.persistence.state_pickler.load_state(file)

Returns the state of an object loaded from the pickled data in the given file (or file name).

apptools.persistence.state_pickler.loads_state(string)

Returns the state of an object loaded from the pickled data in the given string.

apptools.persistence.state_pickler.set_state(obj, state, ignore=None, first=None, last=None)

Sets the state of the object.

This is meant to simplify loading the state of an object from its __setstate__ method, using the dictionary describing its state. Note that before the state is set, the registered handlers for the particular class are called to upgrade the state to the latest version.

Parameters
  • obj – The object whose state is to be set. If this is None (default) then the object is created.

  • state – The dictionary representing the state of the object.

  • ignore – Attributes named in this list are ignored and their state is not set (this excludes the ones specified in first and last). If one specifies a ‘*’, then all attributes are ignored except the ones specified in first and last.

  • first – Attributes named in this list are set first (in order), before any other attributes are set.

  • last – Attributes named in this list are set last (in order), after all other attributes are set.

apptools.persistence.state_pickler.update_state(state)

Given the state of an object, this updates the state to the latest version using the handlers given in the version registry. The state is modified in-place.