Posts Tagged With “meta programming”

Descriptors, Properties, and Metaprogramming

While reading through Google's AppEngine code for the datastore API, I came across a new (to me) technique for building classes with properties. The idea is that you can create a class for a Person and attach properties like so:

class Person(db.Model):
    firstname = db.StringProperty()
    lastname = db.StringProperty()

There's a lot you can do with those properties. For example, you can mark them as required, and you can validate their values by passing in a custom validator. When it's time to store a Person object in the datastore, the model iterates over the properties, makes sure they're valid, and finally converts each value to one suitable for storage. I've been using similar techniques for a few different projects at Based on Content, Inc.

For one of these projects, I wanted to build classes with properties that could be automatically turned into XML elements. The object itself would be the container XML element for its properties. Basically, create a class like this:

class Person(XMLElement):
    firstname = StringProperty()
    lastname = StringProperty()

and have it generate XML like this:

<person>
    <firstname>David</firstname>
    <lastname>Reynolds</lastname>
</person>

To accomplish this, I make use of some metaprogramming and descriptors. For a refresher on descriptors in Python, check out How-To Guide for Descriptors.

First off, I need to implement my property class. It's going to be very basic and not have validators or anything that makes it more useful. This is just a simplified example.

class Property(object):

    def __init__(self, name=None, default=None):
        self.name = name
        self.default = default

    def _configure(self, property_name):
        if self.name is None:
            self.name = property_name

    def __get__(self, instance, klass=None):
        if instance is None:
            return self
        try:
            return getattr(instance, self._attr_name())
        except AttributeError:
            return None

    def __set__(self, instance, value):
        setattr(instance, self._attr_name(), value)

    def _attr_name(self):
        return '_' + self.name

    def default_value(self):
        return self.default

    def get_value(self, instance):
        return self.__get__(instance, instance.__class__)

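The Property class is called a "descriptor" because it implements the __get__ and __set__ methods. Python calls these methods whenever the attribute is read or assigned on an instance, which is what lets each property store its value under a private name such as _firstname.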
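
Here's a small sketch of that behavior, assuming Python 3 syntax (the code in this post targets Python 2). The Property class is a trimmed copy of the one above, and the Demo class and its title attribute are made up for illustration. The name is passed by hand because no metaclass has configured it yet.

```python
class Property:
    """Trimmed copy of the Property descriptor above (Python 3 syntax)."""

    def __init__(self, name=None, default=None):
        self.name = name
        self.default = default

    def __get__(self, instance, klass=None):
        if instance is None:
            return self  # accessed on the class: hand back the descriptor
        return getattr(instance, '_' + self.name, None)

    def __set__(self, instance, value):
        setattr(instance, '_' + self.name, value)


class Demo:
    # Hypothetical class; the name is given explicitly for this demo.
    title = Property(name='title')


d = Demo()
class_level = Demo.title   # the Property object itself
before = d.title           # None: nothing stored on the instance yet
d.title = 'hello'          # routed through Property.__set__
after = d.title            # read back through Property.__get__
stored = d.__dict__        # the value lives under the private name

print(class_level, before, after, stored)
```

The value ends up in the instance's own __dict__ under '_title', so every instance keeps its own data even though the Property object is shared by the whole class.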

The next piece is the metaclass for XML elements. It has one job with two parts: initialize and configure the properties contained in an XML element class.

It looks like this:

class MetaXMLElement(type):

    def __init__(cls, name, bases, dct):
        super(MetaXMLElement, cls).__init__(name, bases, dct)
        _initialize_properties(cls, name, bases, dct)

The bulk of the work in this technique is done by _initialize_properties, which I haven't defined yet. It goes through a class that contains properties and creates a _properties dict on that class. Every property defined in the class, or inherited from its base classes, is then added to that dict. It also guards against duplicate properties: you don't want two properties with the same name, which can happen through multiple inheritance. The _initialize_properties function is almost identical to the one in google.appengine.ext.db.

def _initialize_properties(cls, name, bases, dct):
    cls._properties = {}
    defined = set()

    # add properties from base classes to cls._properties
    # and make sure there are no duplicates
    for base in bases:
        if hasattr(base, '_properties'):
            property_keys = base._properties.keys()
            dupes = defined.intersection(property_keys)
            if dupes:
                raise Exception(
                    "Duplicate properties in base class %s already defined: %s" %
                    (base.__name__, list(dupes)))
            defined.update(property_keys)
            cls._properties.update(base._properties)

    for name, attr in dct.iteritems():
        if isinstance(attr, Property):
            if name in defined:
                raise Exception("Duplicate property: %s" % name)
            defined.add(name)
            cls._properties[name] = attr
            attr._configure(name)
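
To see the duplicate check fire, here's a rough sketch, again assuming Python 3 syntax. The logic is folded into the metaclass's __init__ rather than a separate function, and the names MetaElement, DuplicatePropertyError, Base, Named, Labeled, and Both are all invented for this example (the code above raises a plain Exception).

```python
class Property:
    def __init__(self, name=None):
        self.name = name

    def _configure(self, property_name):
        if self.name is None:
            self.name = property_name


class DuplicatePropertyError(Exception):
    """Hypothetical exception type; the original raises a plain Exception."""


class MetaElement(type):
    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        cls._properties = {}
        defined = set()
        # Collect inherited properties, refusing names seen in an earlier base.
        for base in bases:
            inherited = getattr(base, '_properties', {})
            dupes = defined.intersection(inherited)
            if dupes:
                raise DuplicatePropertyError(
                    "Duplicate properties in base class %s already defined: %s"
                    % (base.__name__, sorted(dupes)))
            defined.update(inherited)
            cls._properties.update(inherited)
        # Then register and configure the properties declared on this class.
        for attr_name, attr in dct.items():
            if isinstance(attr, Property):
                if attr_name in defined:
                    raise DuplicatePropertyError(
                        "Duplicate property: %s" % attr_name)
                defined.add(attr_name)
                cls._properties[attr_name] = attr
                attr._configure(attr_name)


class Base(metaclass=MetaElement):
    pass


class Named(Base):
    name = Property()


class Labeled(Base):
    name = Property()


try:
    class Both(Named, Labeled):  # both bases define "name"
        pass
except DuplicatePropertyError as exc:
    error = str(exc)
else:
    error = None

print(error)
```

One thing to keep in mind with this style of check: it also fires for diamond inheritance, where two bases inherit the same property from a common ancestor, because each base carries its own copy of that entry in _properties.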

Okay, so what does the base XMLElement class look like? The whole reason for having an XMLElement base class is so we can inherit core functionality and don't have to write so much specific code for the individual XML elements that we actually use.

All of the XML elements we define from here on out will inherit from XMLElement.

from xml.dom import minidom

class XMLElement(object):

    __metaclass__ = MetaXMLElement
    __node_name__ = None

    def __init__(self, **kwargs):
        for prop in self._properties.values():
            if prop.name in kwargs:
                value = kwargs.pop(prop.name)
            else:
                value = prop.default_value()
            prop.__set__(self, value)

    def _get_node(self):
        self._node = minidom.Element(self.__node_name__)
        for name, attr in self._properties.iteritems():
            value = attr.get_value(self)
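            if value:
                # TextNode is a helper that isn't shown in this post; it is
                # expected to return an element named `name` wrapping `value`.
                self._node.appendChild(TextNode(name, value))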
        return self._node
    node = property(_get_node)

__node_name__ is a string used as the name of the node that contains an XMLElement's properties. Setting __metaclass__ makes Python create XMLElement and each of its subclasses through MetaXMLElement, which is what gives the metaclass the chance to collect and configure their properties.
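
For anyone on Python 3, the __metaclass__ attribute is ignored and the metaclass is passed as a keyword in the class statement instead. Here's a self-contained sketch of the same idea under that assumption. The text_node function is a hypothetical stand-in for the TextNode helper, which isn't shown in this post, and the Book class and its values are made up. The sketch also skips only None values rather than every falsy value, and it leaves out the duplicate check to stay short.

```python
from xml.dom import minidom

_DOC = minidom.Document()  # used only as a factory for nodes


def text_node(name, value):
    """Hypothetical stand-in for TextNode: builds <name>value</name>."""
    element = _DOC.createElement(name)
    element.appendChild(_DOC.createTextNode(str(value)))
    return element


class Property:
    def __init__(self, name=None, default=None):
        self.name = name
        self.default = default

    def __get__(self, instance, klass=None):
        if instance is None:
            return self
        return getattr(instance, '_' + self.name, self.default)

    def __set__(self, instance, value):
        setattr(instance, '_' + self.name, value)


class MetaXMLElement(type):
    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        cls._properties = {}
        for base in bases:
            cls._properties.update(getattr(base, '_properties', {}))
        for attr_name, attr in dct.items():
            if isinstance(attr, Property):
                if attr.name is None:
                    attr.name = attr_name
                cls._properties[attr_name] = attr


class XMLElement(metaclass=MetaXMLElement):  # Python 3 metaclass syntax
    __node_name__ = None

    def __init__(self, **kwargs):
        for prop in self._properties.values():
            prop.__set__(self, kwargs.get(prop.name, prop.default))

    @property
    def node(self):
        node = _DOC.createElement(self.__node_name__)
        for name, prop in self._properties.items():
            value = prop.__get__(self, type(self))
            if value is not None:
                node.appendChild(text_node(name, value))
        return node


class Book(XMLElement):  # hypothetical element, just for this sketch
    __node_name__ = 'book'
    title = Property()
    author = Property()


xml = Book(title='Dune', author='Frank Herbert').node.toxml()
print(xml)
```

On Python 3.7 and later the child elements come out in definition order, because both the class namespace and regular dicts keep insertion order.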

With all of that code in place, we're free to actually use it and create XML elements in Python.

Here's an example:

class Person(XMLElement):

    firstname = Property()
    lastname = Property()

    __node_name__ = 'person'

p = Person(firstname='david', lastname='reynolds')
print p.node.toxml()

The output of p.node.toxml() on my machine is:

<person><lastname>reynolds</lastname><firstname>david</firstname></person>

Notice how the lastname property is output first? That's because _properties is a dict, and dict keys are unordered. This generally isn't a problem when you're just sending XML over the wire (SOAP, for example), but if it bothers you, you can give each property a number from a static counter that increments every time a property is created. You can then sort the properties from low to high by that number before generating the XML.
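
The counter idea can be sketched like this, assuming Python 3 syntax. The creation_order attribute and the ordered_properties helper are names invented for this example, and the dict is filled in the "wrong" order on purpose to stand in for an unordered mapping.

```python
import itertools


class Property:
    _counter = itertools.count()  # shared by every Property instance

    def __init__(self, name=None):
        self.name = name
        # Each new property takes the next number from the shared counter.
        self.creation_order = next(Property._counter)


def ordered_properties(properties):
    """Return (name, property) pairs sorted by creation order."""
    return sorted(properties.items(),
                  key=lambda item: item[1].creation_order)


firstname = Property('firstname')
lastname = Property('lastname')

# Simulate an unordered mapping by inserting in the "wrong" order.
scrambled = {'lastname': lastname, 'firstname': firstname}
names = [name for name, _ in ordered_properties(scrambled)]
print(names)
```

Iterating over ordered_properties(self._properties) in _get_node instead of the raw dict would then emit the elements in the order they were declared. On Python 3.7 and later this matters less, since dicts already keep insertion order.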

All the code for this article is available for download.

Tagged with: python, meta programming, xml

Python Meta Programming: Update

I thought I was being really clever with the meta programming from the previous post but I ran into some roadblocks when using similar techniques in live code. I pretty much scrapped the meta programming stuff for now until I can do some more research and develop something solid. At least the problems showed themselves early on before I really became attached to the code. Here's to failing early and often! ;)

Tagged with: python, meta programming

Python Meta Programming

For an ORM (Object Relational Mapper) I'm working on, I was trying to figure out how to make it connect to a database without manually calling any functions. Using MySQLdb, you connect to a database by calling MySQLdb.connect(). I wanted this to happen automatically so I'm not always calling MySQLdb.connect() or the equivalent from my own ORM.

The technique I used was to initialize the class when it's first defined: a metaclass calls a __classinit__ method on each new class as soon as the class is created, and that method sets up the database connection. The metaclass I used is pretty straightforward:

class MetaRecord(type):
    def __new__(cls, name, bases, dct):
        klass = type.__new__(cls, name, bases, dct)
        klass.__classinit__.im_func(klass)
        return klass

The class above is used as the __metaclass__ of another class which has a __classinit__ function that sets up the database connection.

import MySQLdb

class MySQLRecord(dict):
    __metaclass__ = MetaRecord
    __CONN = None

    def __classinit__(cls):
        if not cls.__CONN:
            cls.__CONN = MySQLdb.connect(host='localhost', user='foo', passwd='password', db='some_database')

    # ... Rest of the code goes here.

So now you can create your model classes that inherit from MySQLRecord and it will create a database connection if one doesn't already exist.

class User(MySQLRecord):
    pass
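
Here's a rough Python 3 sketch of the same pattern that runs without a database. FakeConnection is a made-up stand-in for whatever MySQLdb.connect() returns, and the Record name and the single-underscore _conn attribute are choices made for this example. Python 3 has no im_func, so __classinit__ is written as a classmethod here.

```python
class FakeConnection:
    """Hypothetical stand-in for the object MySQLdb.connect() returns."""
    opened = 0  # counts how many "connections" were created

    def __init__(self):
        FakeConnection.opened += 1


class MetaRecord(type):
    def __new__(mcls, name, bases, dct):
        klass = super().__new__(mcls, name, bases, dct)
        klass.__classinit__()  # runs once per class, at definition time
        return klass


class Record(dict, metaclass=MetaRecord):
    _conn = None

    @classmethod
    def __classinit__(cls):
        # Subclasses see the inherited connection and skip this branch.
        if cls._conn is None:
            cls._conn = FakeConnection()


class User(Record):
    pass


print(FakeConnection.opened, User._conn is Record._conn)
```

The connection is created when Record itself is defined, and User reuses it through normal attribute lookup. One consequence worth knowing about: the connection opens as soon as the module containing the class is imported, before any record is ever instantiated.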

I highly encourage anyone interested in Python metaprogramming to rip apart these techniques and others to get a better feel for what's going on under the hood. A good place to start would be to set up your own metaclasses, print out the various variables, and follow what happens in what order. I'll be doing follow-up posts on Python metaprogramming that will clear up any fuzzy details.
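
As a starting point for that kind of experiment, here's a minimal tracing metaclass, assuming Python 3 syntax. The Tracing and Widget names and the events list are invented for this sketch.

```python
events = []


class Tracing(type):
    def __new__(mcls, name, bases, dct):
        events.append('__new__ ' + name)
        return super().__new__(mcls, name, bases, dct)

    def __init__(cls, name, bases, dct):
        events.append('__init__ ' + name)
        super().__init__(name, bases, dct)

    def __call__(cls, *args, **kwargs):
        # Runs when the finished class is called to build an instance.
        events.append('__call__ ' + cls.__name__)
        return super().__call__(*args, **kwargs)


class Widget(metaclass=Tracing):
    # The class body executes before the metaclass sees the class.
    events.append('class body Widget')


w = Widget()

for event in events:
    print(event)
```

Running it shows the class body executing first, then the metaclass's __new__ and __init__ when the class statement finishes, and finally __call__ when an instance is created.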

Tagged with: python, meta programming