Or, if you want interactive analysis (fewer than 10,000 files) from within a GUI, you could use the following technique with the SketchPath tool:
1. Drag the top-level directory of your document collection into the GUI
2. Copy the following expression (and a variation for attribute names) into the XPath 2.0 editor and then press 'save' to add it to the saved expression list:
distinct-values(for $n in //* return local-name($n))
3. Browse the collection, represented as a flattened file list view (or use the directory tree view)
4. Each time you select a new document, the saved expression is re-evaluated, and the number of distinct names is shown in a grid that lists all saved expressions and their resolved value/sequence-length for the current document
5. To expand this to show the full and sortable list of element names, you would then double-click on the saved expression.
6. For a survey, another common technique is to use the dual-viewer: you 'freeze' a file in one view whilst browsing the remaining files with the other, for immediate comparison against a 'standard'. Differences in the XPath results between the two are highlighted.
7. You could expand on the above by using the 'except' XPath operator to filter out those element names that appear in a sequence of names common to all DocBook documents. In SketchPath, a previously saved expression can be used as a variable within a subsequent one.
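
As a rough sketch of the attribute-name variation from step 2 and the filtering idea from step 7, the expressions might look like the ones below. Each is a separate expression, to be entered on its own. The names 'para', 'title' and 'section' are only an illustrative stand-in for whatever list of common names you settle on, and the list is written out literally here rather than taken from a saved expression. One thing to bear in mind is that in XPath 2.0 'except' combines sequences of nodes, not strings, so it has to be applied to the elements before their names are taken; to filter the name strings directly, a predicate does the job.

```xpath
(: Step 2 variation: distinct attribute names instead of element names :)
distinct-values(for $n in //@* return local-name($n))

(: Step 7, filtering the name strings with a predicate.
   The three names are an assumed, illustrative list only. :)
distinct-values(for $n in //* return local-name($n))
    [not(. = ('para', 'title', 'section'))]

(: Step 7, using 'except' on the element nodes before taking their names :)
distinct-values(
    for $n in (//* except //*[local-name() = ('para', 'title', 'section')])
    return local-name($n))
```

The last two should give the same names; the predicate form is shorter, while the 'except' form keeps you working with nodes in case you want to inspect the unusual elements themselves rather than just their names.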
Phil Fearon