So I would be very interested in hearing (directly or through XML-DEV)
from XML-DEVers (especially lurkers) what kinds of non-DTD
constraint-checking they have had to implement, whether to help
data entry, QC, QA or unit testing.
In screen-scraping some HTML pages - xml-ified with Tidy first - I found
intermittent HTML errors, such as a "p" element sometimes appearing inside
an "a" although it was supposed to be the other way around. I had to develop
a series of validity tests and still write the stylesheet very defensively.
This particular example is a DTD validity error, but the same page also has
a certain structure of links pointing to indexes pointing to details. Often
the structure of these blocks is messed up, making it hard to locate the
right information to extract, although the appearance in the browser is
normal. These are non-DTD errors.
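
To make the two kinds of test concrete, here is a minimal sketch in Python
of checks along these lines. It is not the actual test suite, which was
written around a stylesheet. The function names, the sample markup, and the
idea that links use in-page "#name" targets are all assumptions made for
illustration; it also assumes the Tidy output carries no XML namespace
(with XHTML output the tag names would need the namespace prefix).

```python
import xml.etree.ElementTree as ET


def find_misnested(root, outer="a", inner="p"):
    """Return every <inner> element found anywhere inside an <outer>.

    Hypothetical helper: catches the DTD-level error of a "p" inside an "a".
    """
    hits = []
    for outer_el in root.iter(outer):
        for inner_el in outer_el.iter(inner):
            hits.append(inner_el)
    return hits


def find_dangling_links(root):
    """Return in-page hrefs ("#name") that have no matching id/name target.

    Hypothetical helper: a structural (non-DTD) check that every link in the
    index/detail chain actually leads somewhere.
    """
    targets = set()
    for el in root.iter():
        for attr in ("id", "name"):
            value = el.get(attr)
            if value:
                targets.add(value)

    dangling = []
    for link in root.iter("a"):
        href = link.get("href", "")
        if href.startswith("#") and href[1:] not in targets:
            dangling.append(href)
    return dangling


# Invented sample: one misnested "p", and one link whose target is missing.
sample = """<html><body>
<a href="#idx1"><p>Index one</p></a>
<p><a name="idx1"/><a href="#detail1">Detail one</a></p>
<p><a href="#detail2">Detail two</a></p>
<p><a name="detail1"/>Text</p>
</body></html>"""

root = ET.fromstring(sample)
misnested = find_misnested(root)
dangling = find_dangling_links(root)
print(len(misnested), "misnested element(s); dangling links:", dangling)
```

The first check could equally be left to a validating parser; the second
cannot be expressed in a DTD at all, since nothing in the page declares
these targets as IDs and the required chain of index and detail blocks is
a convention of the page rather than part of its grammar.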