OASIS Mailing List Archives

 


   Using Tidy for XML correction

  • From: Linda van den Brink <lvdbrink@baan.nl>
  • To: "Xml-Dev (E-mail)" <xml-dev@xml.org>
  • Date: Thu, 05 Oct 2000 15:00:12 +0200

Hi all, 

I'm hoping there's some Tidy expertise on this list. I've been trying to use
this tool with its -xml option to clean up my XML files, so far without
success.

I'm dealing with a very large number of help topics, which a program exports
from our application and dumps as an equally large number of XML files.
Unfortunately, the program that converts the plain text to XML is not
perfect: on occasion it creates malformed XML files. It is unlikely that
this program will start to output 100% well-formed XML soon, and we have to
do a new export of the help topics on a regular basis, so manual correction
of the errors in the XML files is not really an option.
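
For a sense of how many exported files are affected, a minimal sketch like the one below could list the ones that fail to parse. It only detects errors and does not repair them. It assumes the export lands in a single directory of *.xml files; the function name find_malformed is made up for illustration, and Python's standard-library parser stands in for whatever XML parser the XSLT step uses.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed(directory):
    """Return (path, error message) pairs for each *.xml file that fails to parse."""
    problems = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            # ET.parse raises ParseError on the first well-formedness error.
            ET.parse(str(path))
        except ET.ParseError as err:
            problems.append((path, str(err)))
    return problems
```

Calling find_malformed on the export directory after each export gives a list of files to look at, with the line and column at which the parser gave up.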

We need well-formed XML because I'm using XSLT to create HTML files for HTML
Help, our output format.

I thought I'd see whether Tidy could correct the well-formedness errors for
me, but it seems that it can't. The exported files contain things like:
<p>
<list>
<listitem>
<courier>
Some text
</courier>
</p>
To which Tidy replies: "Unexpected </p> in <listitem>. The document has
errors that must be fixed before using HTML Tidy to generate a tidied up
version."

Another example: 
<p>Some text <link>this is a link<
/link></p>
To which Tidy replies: "Unexpected </p> in <link>" (the rest is the same as
above)
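
To show that the two fragments above really are malformed, independent of Tidy, here is a small sketch that feeds them to Python's standard-library XML parser. The helper name parse_error is hypothetical. In the first fragment <list> and <listitem> are never closed. In the second, the line break between "<" and "/link>" means the end tag is not recognized.

```python
import xml.etree.ElementTree as ET

# The two fragments from the message, reproduced as they appear.
FIRST = """<p>
<list>
<listitem>
<courier>
Some text
</courier>
</p>"""

SECOND = """<p>Some text <link>this is a link<
/link></p>"""


def parse_error(fragment):
    """Return the parser's error message, or None if the fragment is well-formed."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError as err:
        return str(err)
    return None


for name, fragment in (("first", FIRST), ("second", SECOND)):
    print(name, "->", parse_error(fragment))
```

Both calls report an error, so any strict XML parser, including the one behind an XSLT processor, would reject these files until the markup is repaired.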

Is there something I can do to make Tidy work for me? Or is it just not
suited for this job? Are there other tools or approaches that are? 

Thanks, 
Linda van den Brink





 
