There’s always the XQuery approach ( example in xmlsh) xquery -q ‘ distinct-values( //node-name(.) ) ‘ ---------------------------------------- David A. Lee From: Uche Ogbuji [mailto:uche@ogbuji.net] Oh what the heck. I might as well offer a quick and dirty Amara [1] recipe. import sys import amara from amara.lib.util import element_subtree_iter from amara.xpath.util import abspath doc = amara.parse(sys.argv[1]) top_prefixes = dict(doc.xml_select('*')[0].xml_namespaces) for e in element_subtree_iter(doc): print abspath(e, prefixes=top_prefixes) attrs = dict(e.xml_attributes) if attrs: print '\t', attrs So for example: python element_census.py http://www.retards.org/library/technology/computers/programming/xml/example-xbel.xml | head /xbel {(None, u'version'): u'1.0'} /xbel/title /xbel/folder {(None, u'folded'): u'yes'} /xbel/folder[2] {(None, u'folded'): u'yes'} /xbel/folder[2]/title /xbel/folder[2]/bookmark {(None, u'href'): u'http://www.sciam.com/1999/0599issue/0599bosak.html'} Pretty easy to tweak or format and details of report, etc. In the example above I pass along a URL. You can also pass along a file name, if you like.
|