My point was not that the example heuristic is any good, nor that a
well-formedness constraint doesn't help, but that well-formedness is
neither sufficient nor necessary. Of course it helps, but even if lots
of (or all) unrecoverable errors are eliminated thereby, surely it is a
business issue how widely to define recoverable and whether the
recoverable errors are worth fixing locally. In some cases it may well
be appropriate to say that the input should only be accepted as-is or
not at all. In other cases it may be acceptable to do a best-effort fix.
Take this example:
<ul>
<li>
<b>This is very important information that must be passed on!</b>
</ul>
Suppose I received it over a TCP channel, so I regard it as effectively
certain that nothing was lost from the MIDDLE. If I also know that it
was previously an HTML document and was supposed to have been converted,
then I can be pretty sure that all I have to do is insert </li> between
</b> and </ul> to achieve well-formedness and correct the conversion.
(Indeed, this is exactly what the converter should have done, but
perhaps he/she/it is no longer available to be sent an error message.)
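As a rough illustration of that kind of best-effort fix, here is a minimal Python sketch. It is not anyone's actual converter: the function name `close_unclosed` and the regex-based tag scan are my own assumptions, and it only makes sense under the error model described above (tags were dropped by a sloppy conversion, and nothing else is missing). When a close tag arrives while other elements are still open inside it, the sketch closes those elements first.

```python
import re
import xml.etree.ElementTree as ET

# Simplistic tag matcher (an assumption of this sketch): it ignores
# comments, CDATA sections, processing instructions and '>' inside
# attribute values.
TAG = re.compile(r"<(/?)([A-Za-z][\w:.-]*)[^>]*?(/?)>")

def close_unclosed(fragment: str) -> str:
    """Insert missing close tags so that elements nest properly."""
    out = []      # pieces of the repaired document
    stack = []    # names of elements that are currently open
    pos = 0
    for m in TAG.finditer(fragment):
        out.append(fragment[pos:m.start()])   # text before this tag
        pos = m.end()
        closing, name, self_closing = m.groups()
        if self_closing:
            out.append(m.group(0))            # <x/> needs no bookkeeping
        elif not closing:
            stack.append(name)                # open tag
            out.append(m.group(0))
        elif name in stack:
            # Close everything still open inside the element being closed.
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")
            stack.pop()
            out.append(m.group(0))
        # else: a close tag with no matching open tag is dropped
        # (a policy choice made for this sketch).
    out.append(fragment[pos:])
    while stack:                              # close whatever is left open
        out.append(f"</{stack.pop()}>")
    return "".join(out)

fragment = (
    "<ul>\n<li>\n"
    "<b>This is very important information that must be passed on!</b>\n"
    "</ul>\n"
)
repaired = close_unclosed(fragment)
root = ET.fromstring(repaired)   # raises ParseError if still not well-formed
print(repaired)
```

For the example above, this inserts </li> immediately before </ul>, and the result parses. Whether applying such a fix is acceptable is still the business decision discussed earlier; the code only encodes one particular assumption about what went wrong.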
On the other hand, if I don't trust the source enough to rule out a
second bullet point that was also omitted, then why would
well-formedness make any difference? A perfectly well-formed document
could just as easily be missing that second bullet point.
It's like in channel theory: you have to have an error model to be able
to do meaningful calculations. Truncation is one thing, bit-corruption
is another, human-typed markup is another, broken software is another.