Monday, June 13, 2011

Python Classes and XML Nodes

I know...I just keep doing stuff that others have already done. But I thought it was interesting, and that's all that mattered; never mind that using the end product would end up being more work than just writing XML by hand (except for the--not so?--rare case where you repeat the same element many times over), my idea was brilliant! ;-)

First thing's first, the code:

from random import randint
from collections import OrderedDict as orddict

class Node(object):
    def __init__(self, tag, type=None, key=None, text=None, **kwargs):
        """Base class for other parts of an (x)html document

            tag  : The tag that will be used for this node

            type : The node type (currently only None or 'textnode' are used

            key  : The key that is used in the OrderedDict; if unspecified,
                 will try to get the value from the attributes value (a dict)
                 in kwargs. If id is not present in attributes, a random number
                 will be generated instead.

            text : The text of a 'textnode' type Node. Otherwise unused.

            **kwargs : A dict containing one of these keys:
                'attributes' : attributes which will be added to the tag (dict)
                all others   : currently unused

        Node Types:
            None : The default; nothing special.
            'textnode' : only contains text, may not have any children
        self.children = orddict()
        self.tag = tag
        self.type = type

        if key != None:
            self.key = key
            self.key = randint(1, 2**32)

        if type != 'textnode':
            for arg in kwargs:
                if arg == 'attributes':
                    self.attributes = kwargs[arg]
                #elif arg ==
            if text is not None:
                self.text = text
                self.text = ""

    def create_child(self, tag, type=None, key=None, **kwargs):
        """Append a Node to self.children with the given...stuff"""
        if self.type == 'textnode':
            raise NodeTypeError('Node type "%s" cannot contain children' %
            element = Node(tag, type, key, **kwargs)
            self.children[element.key] = element
            return element.key
            # isinstance(element.key, (int, long))

    def append_child(self, node):
        if isinstance(node, Node):
            while node.key in self.children.keys():
                #print "'%s' already exists, trying" % (node.key),
                node.key = randint(1, 2**32)
                print node.key
            self.children[node.key] = node
            #print 'Success!'

    def get_tree(self, debug=False, depth=0):
            Return a string representing a Node tree
            with this Node as the tree root
        if self.type == 'textnode':
            node = self.text + "\n"
            node = '<%s' % (self.tag)
            if hasattr(self, 'attributes'):
                for attrib in self.attributes.iterkeys():
                    if self.attributes[attrib] is not None:
                        node += ' %s="%s"' % (attrib, str(self.attributes[attrib]))
                        node += ' %s="%s"' % (attrib)
            if len(self.children) > 0:
                node += '>\n'
                for child in self.children.iterkeys():
                    node += self.children[child].get_tree(debug=debug, depth=depth + 1)
                node += '%s\n' % (self.tag)
                node += ' />\n'

        if debug == True and depth == 0:
            print node
        return node

class NodeTypeError(Exception):
    """Raised when an attempt to create a child on a parent
    who doesn't support children occurs."""

Yes, it's not that pretty, but it does the job. Ideally, you might build some framework around this, or maybe you want to take it and make chunks of a page which will be reused in different areas of a page. Regardless of what you want to do, you likely can do it with this. The flexibility of Python classes (which I am just now discovering for myself) is quite useful: At first glance, you might think this class is rather useless--that couldn't be more wrong; throw in a few properties (or modify the get_tree method to accept arbitrary kwargs) and intuitive templating is just around the corner.

The breakdown

I think it's decently documented, but in case you're still finding it hard to follow, here's how it goes:

  • You create an instance of Node (say, html)
    • This is your root node of either a node tree or tree fragment
  • You create children by one of two methods:
    • The create_child method, which creates another Node instance and appends it to the parent Node's children property
    • Append a Node instance, which may or may not have children, to a parent Node's children property with the parent's append_child method (preferred) or manually
  • Profit!!!

Other Thoughts

I imagine someone has already done this (or something very similar); I would be very surprised to learn otherwise. However, I figure it couldn't hurt to put this out there anyway, for those who may not have thought of this, but could do a much better job than I. Feel free to tastefully mention other projects with similar intent in a comment, or criticize my poor understanding of the Python language (well thought out corrections to common mistakes are appreciated).
As for the're free to do anything you like to it (meaning you may copy it, paste it, butcher it, turn it into some horrible monster, create a masterpiece, and/or call it your own). I only ask (not require) that you mention me where ever you see fit. Thanks!