Re: [xml-dev] element census tool?

Or, if you want interactive analysis (fewer than 10,000 files) from within a GUI, you could use the following technique with the SketchPath tool:
1. Drag the top directory of your document collection into the GUI
2. Copy the following expression (and a variation for attribute names) into the XPath 2.0 editor and then press 'save' to add it to the saved expression list:
       distinct-values(for $n in //* return local-name($n))
3. Browse the collection, represented as a flattened file list view (or use the directory tree view)
4. Each time you select a new document, the saved expression is re-evaluated, and the number of distinct names is shown in a grid that lists all saved expressions and their resolved value/sequence-length for the current document
5. To expand this to show the full and sortable list of element names, you would then double-click on the saved expression.
6. For a survey, a common technique is to use the dual viewer: you 'freeze' a file in one view whilst browsing the remaining files in the other, for immediate comparison against a 'standard'. Differences in the XPath results between the two are highlighted.
7. You could expand on the above by using the 'except' XPath operator to filter out element names from a sequence of names common to all DocBook documents. In SketchPath, a previously saved expression can be used as a variable within a subsequent one.
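
If you would rather script the survey than browse it, the same census can be roughed out in Python with only the standard library. This is a sketch under assumptions, not anything SketchPath provides: the function names (local_name, census, names_beyond), the sample document and the baseline set of 'common' names are all made up for illustration. It counts element and attribute local names in one document, then lists the names that fall outside the baseline, which is roughly what steps 2 and 7 do interactively.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix, like XPath's local-name()."""
    return tag.rsplit("}", 1)[-1]


def census(xml_text):
    """Return (element_counts, attribute_counts) for one XML document."""
    elements, attributes = Counter(), Counter()
    # iter() walks the root element and every descendant element.
    for node in ET.fromstring(xml_text).iter():
        elements[local_name(node.tag)] += 1
        for attr in node.attrib:
            attributes[local_name(attr)] += 1
    return elements, attributes


def names_beyond(counts, baseline):
    """Sorted names that occur in counts but not in the baseline set."""
    return sorted(set(counts) - set(baseline))


# Hypothetical sample document, for illustration only.
sample = """<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Census</title>
  <para>One <emphasis role="bold">bold</emphasis> word.</para>
  <para>Two.</para>
</article>"""

elements, attributes = census(sample)
print(sorted(elements))          # distinct element names
print(elements.most_common())    # frequency count per element name
print(sorted(attributes))        # distinct attribute names
print(names_beyond(elements, {"article", "title", "para"}))
```

To cover a whole collection, you could call census() on each file's contents and add the resulting Counter objects together, since Counter supports addition.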
Phil Fearon

On Wed, Feb 2, 2011 at 5:12 PM, Simon St.Laurent <simonstl@simonstl.com> wrote:
I know I could write this myself, but suspect someone else has already done a better job of it - and I just don't know the right Google keywords to summon it.

I'm looking for a tool that I can feed an XML document, and it will tell me which element names were used in the document.

Attributes used on those elements would be a bonus, as would a frequency count for usage, but mostly I'm just trying to survey a collection of (DocBook) documents quickly.

All suggestions welcome.

Simon St.Laurent


XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
