Descriptors, Properties, and Metaprogramming

While reading through Google's AppEngine code for the datastore API, I came across a new (to me) technique for building classes with properties. The idea is that you can create a class for a Person and attach properties like so:

class Person(db.Model):
    firstname = db.StringProperty()
    lastname = db.StringProperty()

Those properties can do a lot. For example, you can mark them as required, and you can validate their values by passing in a custom validator. When it's time to store a Person object in the datastore, the API iterates over the properties, makes sure they're valid, and finally converts each value to one suitable for storage. I've been using similar techniques for a few different projects at Based on Content, Inc.
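To give a feel for the "required" and "validator" idea, here's a rough, self-contained sketch in Python 3. It is not the App Engine implementation: ValidatedProperty, not_blank, and the two dicts standing in for a model are all names I made up for illustration.

```python
class ValidatedProperty:
    """Hypothetical property with 'required' and 'validator' options."""

    def __init__(self, required=False, validator=None):
        self.required = required
        self.validator = validator

    def validate(self, name, value):
        if self.required and value is None:
            raise ValueError('%s is required' % name)
        if value is not None and self.validator is not None:
            self.validator(value)  # the validator is expected to raise on bad input
        return value


def not_blank(value):
    """Hypothetical custom validator: reject whitespace-only strings."""
    if not value.strip():
        raise ValueError('value must not be blank')


# Stand-ins for a model's declared properties and one instance's values.
declared = {
    'firstname': ValidatedProperty(required=True, validator=not_blank),
    'lastname': ValidatedProperty(),
}
values = {'firstname': 'David', 'lastname': None}

# Check every property before "storing" the object.
checked = {name: prop.validate(name, values.get(name))
           for name, prop in declared.items()}
```

A missing firstname or a whitespace-only one would raise ValueError here, while lastname is optional and passes through as None.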

For one of these projects, I wanted to build classes with properties that could be automatically turned into XML elements. The object itself would be the container XML element for its properties. Basically, create a class like this:

class Person(XMLElement):
    firstname = StringProperty()
    lastname = StringProperty()

and have it generate XML like this:

<person>
    <firstname>David</firstname>
    <lastname>Reynolds</lastname>
</person>

To accomplish this, I use a metaclass and descriptors. For a refresher on descriptors in Python, check out the How-To Guide for Descriptors.

First off, I need to implement my property class. It's going to be very basic and not have validators or anything that makes it more useful. This is just a simplified example.

class Property(object):

    def __init__(self, name=None, default=None):
        self.name = name
        self.default = default

    def _configure(self, property_name):
        if self.name is None:
            self.name = property_name

    def __get__(self, instance, klass=None):
        if instance is None:
            return self
        try:
            return getattr(instance, self._attr_name())
        except AttributeError:
            return None

    def __set__(self, instance, value):
        setattr(instance, self._attr_name(), value)

    def _attr_name(self):
        return '_' + self.name

    def default_value(self):
        return self.default

    def get_value(self, instance):
        return self.__get__(instance, instance.__class__)

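The Property class is a "descriptor" because it implements the __get__ and __set__ methods. Python calls those methods whenever the attribute is read or assigned on an instance, and the value itself is stored on the instance under an underscore-prefixed name (_firstname, for example).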
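If the descriptor protocol feels abstract, here's a minimal sketch of it in action. It's written for Python 3, and TracedProperty and Demo are hypothetical names used only for this illustration. The descriptor keeps a log of calls so you can see that plain attribute access is routed through __set__ and __get__.

```python
class TracedProperty:
    """Hypothetical descriptor that records each access, for illustration only."""

    def __init__(self, name):
        self.name = name
        self.calls = []  # log of descriptor invocations

    def __get__(self, instance, klass=None):
        if instance is None:
            # Accessed on the class itself (Demo.title): return the descriptor.
            return self
        self.calls.append('get')
        return instance.__dict__.get('_' + self.name)

    def __set__(self, instance, value):
        self.calls.append('set')
        instance.__dict__['_' + self.name] = value


class Demo:
    title = TracedProperty('title')


d = Demo()
d.title = 'hello'   # routed to TracedProperty.__set__
value = d.title     # routed to TracedProperty.__get__
```

After this runs, value is 'hello', the log on Demo.title.calls reads ['set', 'get'], and the instance stores the data under '_title'.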

The next piece is the metaclass for XML elements. It's going to do a couple of things: initialize and configure the properties contained in an XML element class.

It looks like this:

class MetaXMLElement(type):

    def __init__(cls, name, bases, dct):
        super(MetaXMLElement, cls).__init__(name, bases, dct)
        _initialize_properties(cls, name, bases, dct)
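A quick aside on when a metaclass's __init__ actually runs: once per class, at the moment the class statement finishes, and not when instances are created. The sketch below shows that with a hypothetical RecordingMeta. Note that it uses the Python 3 spelling (metaclass=...) rather than the Python 2 __metaclass__ attribute used elsewhere in this article.

```python
created = []  # names of classes seen by the metaclass, in creation order


class RecordingMeta(type):
    """Hypothetical metaclass that just records each class it builds."""

    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        created.append(name)


class Base(metaclass=RecordingMeta):  # Python 3 spelling of __metaclass__
    pass


class Child(Base):  # inherits the metaclass from Base
    pass


instance = Child()  # creating an instance does not run the metaclass again
```

That per-class hook is what makes a metaclass a convenient place to collect declared properties.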

The bulk of the work in this technique is done by _initialize_properties, which I haven't defined yet. It gives the class a _properties dict and adds every property defined on the class, or inherited from its base classes, to that dict. It also guards against duplicate properties: you don't want two attributes with the same name, which can happen through multiple inheritance. The function is almost identical to the one in google.appengine.ext.db:

def _initialize_properties(cls, name, bases, dct):
    cls._properties = {}
    defined = set()

    # add properties from base classes to cls._properties
    # and make sure there are no duplicates
    for base in bases:
        if hasattr(base, '_properties'):
            property_keys = base._properties.keys()
            dupes = defined.intersection(property_keys)
            if dupes:
                raise Exception(
                    "Duplicate properties in base class %s already defined: %s" %
                    (base.__name__, list(dupes)))
            defined.update(property_keys)
            cls._properties.update(base._properties)

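    # register the properties declared directly on this class
    # (attr_name avoids shadowing the "name" argument)
    for attr_name, attr in dct.iteritems():
        if isinstance(attr, Property):
            if attr_name in defined:
                raise Exception("Duplicate property: %s" % attr_name)
            defined.add(attr_name)
            cls._properties[attr_name] = attr
            attr._configure(attr_name)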
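To see the duplicate check fire, here's a trimmed-down sketch written for Python 3 (dct.items() and metaclass=... instead of the Python 2 forms). The Property class is cut down to the bare minimum, and DuplicatePropertyError, Meta, Named, Labeled, and Both are hypothetical names for this example only.

```python
class Property:
    def __init__(self, name=None):
        self.name = name

    def _configure(self, property_name):
        if self.name is None:
            self.name = property_name


class DuplicatePropertyError(Exception):
    """Hypothetical exception type; the code above raises a plain Exception."""


def _initialize_properties(cls, name, bases, dct):
    cls._properties = {}
    defined = set()
    # Collect inherited properties, rejecting names seen in an earlier base.
    for base in bases:
        base_props = getattr(base, '_properties', {})
        dupes = defined.intersection(base_props)
        if dupes:
            raise DuplicatePropertyError(sorted(dupes))
        defined.update(base_props)
        cls._properties.update(base_props)
    # Register properties declared directly on this class.
    for attr_name, attr in dct.items():
        if isinstance(attr, Property):
            if attr_name in defined:
                raise DuplicatePropertyError([attr_name])
            defined.add(attr_name)
            cls._properties[attr_name] = attr
            attr._configure(attr_name)


class Meta(type):
    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        _initialize_properties(cls, name, bases, dct)


class Base(metaclass=Meta):
    pass


class Named(Base):
    name = Property()


class Labeled(Base):
    name = Property()


try:
    class Both(Named, Labeled):  # both bases define "name"
        pass
    error = None
except DuplicatePropertyError as exc:
    error = exc
```

The class statement for Both fails while the class is being defined, so the clash shows up at import time rather than when the XML is generated. One side effect of this design is that a subclass can't redefine a property it inherits.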

Okay, so what does the base XMLElement class look like? The whole reason for having an XMLElement base class is so we can inherit core functionality and don't have to write so much specific code for the individual XML elements we actually use.

All of the XML elements we define from here on out will inherit from XMLElement.

from xml.dom import minidom

class XMLElement(object):

    __metaclass__ = MetaXMLElement
    __node_name__ = None

    def __init__(self, **kwargs):
        for prop in self._properties.values():
            if prop.name in kwargs:
                value = kwargs.pop(prop.name)
            else:
                value = prop.default_value()
            prop.__set__(self, value)

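    def _get_node(self):
        # build a fresh container element on every access
        self._node = minidom.Element(self.__node_name__)
        for name, attr in self._properties.iteritems():
            value = attr.get_value(self)
            # falsy values (None, '', 0) are left out of the XML
            if value:
                # TextNode is a helper that isn't shown in this article
                self._node.appendChild(TextNode(name, value))
        return self._node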
    node = property(_get_node)

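__node_name__ is the string used as the name of the containing node for an XMLElement's properties. The __metaclass__ attribute tells Python to build XMLElement, and every class that inherits from it, with MetaXMLElement, which is how the metaclass gets access to each class's properties for configuration.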
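The TextNode helper used in _get_node isn't listed in this article. A plausible version, assuming it simply wraps a value in an element named after the property, might look like the sketch below. The shared _DOC factory document is my own assumption, and the code is written for Python 3.

```python
from xml.dom import minidom

# Shared factory document; an assumption of this sketch, not part of the article.
_DOC = minidom.Document()


def TextNode(name, value):
    """Build <name>value</name> (assumed behaviour of the article's helper)."""
    element = _DOC.createElement(name)
    element.appendChild(_DOC.createTextNode(str(value)))
    return element


container = _DOC.createElement('person')
container.appendChild(TextNode('firstname', 'david'))
xml = container.toxml()
```

Here xml ends up as the string <person><firstname>david</firstname></person>.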

With all of that defined, we can finally use it to create XML elements in Python.

Here's an example:

class Person(XMLElement):

    firstname = Property()
    lastname = Property()

    __node_name__ = 'person'

p = Person(firstname='david', lastname='reynolds')
print p.node.toxml()

The output of p.node.toxml() on my machine is:

<person><lastname>reynolds</lastname><firstname>david</firstname></person>

Notice how the lastname property is output first? That's because _properties is a plain dict, and dict keys have no guaranteed order in Python 2. This generally isn't a problem when you're just sending XML over the wire (SOAP, for example), but if it bothers you, you can give each property a number from a static counter when it's created, then sort the properties from low to high by that number before generating the XML.
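The counter idea can be sketched roughly like this, in Python 3. The _creation_counter and creation_order names are my own, and the hand-built _properties dict stands in for what the metaclass would normally produce.

```python
import itertools


class Property:
    # Class-level counter shared by every Property instance (an assumption
    # of this sketch; the article only describes the idea).
    _creation_counter = itertools.count()

    def __init__(self, name=None):
        self.name = name
        self.creation_order = next(Property._creation_counter)


class Person:
    firstname = Property('firstname')
    lastname = Property('lastname')
    # Stand-in for the metaclass-built dict, deliberately listed out of order.
    _properties = {'lastname': lastname, 'firstname': firstname}


# Sort low-to-high by counter value to recover definition order.
ordered = sorted(Person._properties.values(),
                 key=lambda prop: prop.creation_order)
names = [prop.name for prop in ordered]
```

Here names comes out as ['firstname', 'lastname'] regardless of how the dict is ordered. On Python 3.7 and later, dicts preserve insertion order and class bodies are recorded top to bottom, so the counter matters mostly for older interpreters.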

All the code for this article is available for download.

Tagged with: python, meta programming, xml