Wednesday, 26 November 2008

Tree'd by XML

The standard Python distribution comes with a plethora of XML support modules: the usual SAX and DOM parsers, a lightweight minidom, wrappers for the Expat parser and something called ElementTree. While Expat seems to get a lot of attention, I recently tried the ElementTree way of doing things and was very pleasantly surprised.

ElementTree treats an XML document as a hierarchy of containers, which behave in much the same way as lists or dictionaries. Every node in the document is represented as an Element: it acts like a list of its sub-elements and carries a dictionary of its attributes.

So, for example, let's declare a root with some sub-nodes:

from xml.etree.ElementTree import Element, SubElement

# Root element with two child elements
book = Element('Book')
title = SubElement(book, 'Name')
SubElement(book, 'Title')
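
To see the list-like behaviour for yourself, here is a minimal sketch. It assumes the standard library's xml.etree.ElementTree module (the import line is my assumption, since the snippet above doesn't show one) and simply pokes at the same tree:

```python
from xml.etree.ElementTree import Element, SubElement

book = Element('Book')
title = SubElement(book, 'Name')
SubElement(book, 'Title')

# An Element can be measured, indexed and iterated like a list of its children.
child_count = len(book)
first_child = book[0]
child_tags = [child.tag for child in book]

print(child_count)
print(first_child.tag)
print(child_tags)
```

Nothing here touches a parser: the tree is just nested Python objects until you decide to serialize it.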

To turn this into XML text, just call tostring(book), or wrap the root in an ElementTree and write it to a file with ElementTree(book).write(filename).
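
Both routes can be sketched as follows, assuming Python 3 and the standard xml.etree.ElementTree module. The in-memory buffer is only a stand-in for a real file so the example leaves nothing on disk; write() also accepts a filename.

```python
import io
from xml.etree.ElementTree import Element, SubElement, ElementTree, tostring

book = Element('Book')
SubElement(book, 'Title')

# tostring() returns bytes by default in Python 3;
# encoding='unicode' asks for a str instead.
xml_text = tostring(book, encoding='unicode')
print(xml_text)

# ElementTree wraps a root element; write() takes a filename or a file object.
buffer = io.BytesIO()
ElementTree(book).write(buffer, encoding='utf-8', xml_declaration=True)
written = buffer.getvalue()
print(written.decode('utf-8'))
```

Note that the ElementTree wrapper needs the root element passed in; a bare ElementTree() has nothing to write.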

Want to add text to an element/node? Just use title.text = 'Bookishness'. Adding attributes is just as simple: title.attrib['binding'] = 'hardback'. To read an attribute back, ask the element you set it on: title.get('binding').
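
Here is a small sketch of text and attribute handling, again assuming the standard xml.etree.ElementTree module. The 'n/a' fallback value is just an illustration of get()'s optional default.

```python
from xml.etree.ElementTree import Element, SubElement

book = Element('Book')
title = SubElement(book, 'Name')

title.text = 'Bookishness'
title.attrib['binding'] = 'hardback'   # attrib is an ordinary dict

# get() looks only at the element it is called on, never at parents or children.
on_title = title.get('binding')
on_book = book.get('binding')          # None: the attribute lives on title
fallback = title.get('missing', 'n/a') # optional default, like dict.get

print(on_title, on_book, fallback)
```

title.set('binding', 'hardback') does the same job as assigning through attrib, if you prefer a method call.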

Now, to get all Pythonic on ElementTree, here's how to create a (sub)node and set its attributes and text, all in one line:

SubElement(book, 'Author', sex='Female', nationality='British').text = 'JK Rowling'

More concise than a Harry Potter doorstop.
