Hmm .. I guess what I'm saying is this - suppose that you have an input sequence that looks like this:
which you're implying could conceivably be valid input.
Because we know the underlying semantics, the processor would be able to parse that as:
However, without those known semantics, there are ambiguities in the input - it could be interpreted as
<lukvi>Line 1 <lukvi> Line 2 <lukvi> Line 3</lukvi></lukvi></lukvi>
which may have very different interpretations based upon structure (I've deliberately scrambled the words to highlight the issue). If that were an instance of a known schema, that is the kind of ambiguity I'm referring to. There may be specific parsing rules in HTML5, but I daresay that anyone writing the initial instance I gave above probably wouldn't be well versed in the specification.
I think the difference in interpretation here is that the HTML5 focus is on tolerating ambiguity (which is what supporting multiple parsing rules amounts to) and treating precision as a fault, while the XML focus is on accepting the extra precision if it reduces ambiguity. That's one of the reasons I get antsy when I hear people claim that HTML can replace XML. HTML+ARIA might have that additional precision, but it comes at the cost of requiring two languages plus coding to accomplish what can be done in one with XML.
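To make the contrast concrete, here is a rough sketch using only the Python standard library. The UNCLOSED string is a made-up input of my own for illustration (it is not the instance from earlier in this post), and the helper names are hypothetical. Note also that html.parser is just a lenient tokenizer, not an implementation of the HTML5 parsing algorithm, so it only stands in for "a tolerant processor" here.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Assumption: an illustrative input with three opening tags and no closing tags.
UNCLOSED = "<lukvi>Line 1 <lukvi>Line 2 <lukvi>Line 3"
# The fully nested reading, written out with explicit closing tags.
NESTED = "<lukvi>Line 1 <lukvi> Line 2 <lukvi> Line 3</lukvi></lukvi></lukvi>"


def xml_accepts(text):
    """Return True if a strict XML parser accepts the text as well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def depth(elem):
    """Nesting depth of an element tree (a lone element has depth 1)."""
    return 1 + max((depth(child) for child in elem), default=0)


class TagCounter(HTMLParser):
    """Lenient tokenizer that only counts start and end tags."""

    def __init__(self):
        super().__init__()
        self.starts = 0
        self.ends = 0

    def handle_starttag(self, tag, attrs):
        self.starts += 1

    def handle_endtag(self, tag):
        self.ends += 1


def html_tag_counts(text):
    """Feed text to the lenient tokenizer and report (start tags, end tags)."""
    counter = TagCounter()
    counter.feed(text)
    counter.close()
    return counter.starts, counter.ends
```

With these, xml_accepts(UNCLOSED) is False while xml_accepts(NESTED) is True, and depth(ET.fromstring(NESTED)) is 3, so the XML side either rejects the input or commits to exactly one structure. The lenient tokenizer takes UNCLOSED without complaint and reports three start tags and no end tags, which leaves the question of structure to whatever rules the consumer applies next.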
Lockheed / US National Archives ERA Project