On 11/07/2013 09:16 AM, Michael Kay wrote:
> (Apologies for messing up the subject line....)
>
>> Some time later on, my brain fully exploded when I realised that there is no clean layered separation possible (at least that I could envisage) that would still give you all the features of full-on SGML. Some time later on, I concluded that this is inevitably true in any powerful text processing system because semantics - real semantics - is one great big hermeneutic circle.
>
> XML could have achieved far better separation if it had chosen. The intertwingling of the physical and logical layers caused by the rule that elements must be well-balanced within entities is a quite unnecessary constraint on implementation modularity; the interactions between entity expansion and syntactic parsing within DTDs are even worse. I don't think it's at all true that these complications are inevitable.

Michael,

I absolutely agree that the amount of entanglement present in XML, thanks to its SGML roots, could be a lot less, and that this would certainly simplify life for XML parsers.

However, I don't think entanglement can be completely removed if you try to cover all the bases, the way full-on SGML does. I think Joel Spolsky's leaky abstractions rule will always apply at some level:

http://www.joelonsoftware.com/articles/LeakyAbstractions.html

I'd love to be wrong :-)

In concrete terms, here is what I am trying to say. Imagine a full-on SGML parse decomposed into phases and piped together:

    p1 | p2 | ... | pn

To achieve all the "features"[1] of SGML, some back-flow of information down through the phases is required.

Sean

[1] Some of the features of full-on SGML are seen as "bugs" by some people, with good reason.
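
A toy sketch of the kind of back-flow meant above, in Python. It is not a real SGML parser: the names `Lexer`, `parse`, `CDATA_ELEMENTS` and `raw_until` are made up for illustration, and the behaviour is only loosely modelled on SGML's CDATA declared content, where markup is not recognised inside the element until its end tag. The point is that the later phase (which knows the declarations) has to tell the earlier phase (the lexer) how to scan.

```python
import re

# Hypothetical stand-in for DTD knowledge held by the *later* phase:
# elements whose content is declared CDATA, so "<" inside them is data.
CDATA_ELEMENTS = {"script"}

TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")


class Lexer:
    """Phase p1: splits text into start tags, end tags and character data."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.raw_until = None  # set by a later phase: this is the back-flow

    def next_token(self):
        if self.raw_until is not None:
            # Mode switched from downstream: only the given end tag ends data.
            end = self.text.find(self.raw_until, self.pos)
            if end == -1:
                end = len(self.text)
            data, self.pos, self.raw_until = self.text[self.pos:end], end, None
            if data:
                return ("data", data)
        if self.pos >= len(self.text):
            return None
        m = TAG.match(self.text, self.pos)
        if m:
            self.pos = m.end()
            return ("end" if m.group(1) else "start", m.group(2))
        # Character data runs up to the next "<" (or to the end of input).
        nxt = self.text.find("<", self.pos + 1)
        if nxt == -1:
            nxt = len(self.text)
        data, self.pos = self.text[self.pos:nxt], nxt
        return ("data", data)


def parse(text, feedback=True):
    """Phase p2: consumes tokens and, if allowed, steers the lexer."""
    lexer = Lexer(text)
    tokens = []
    while (tok := lexer.next_token()) is not None:
        tokens.append(tok)
        kind, value = tok
        if feedback and kind == "start" and value in CDATA_ELEMENTS:
            # Information flowing back down the pipe, from p2 to p1.
            lexer.raw_until = f"</{value}>"
    return tokens


SAMPLE = "<p>hi</p><script>if (a<b) x = '<p>';</script>"
```

With `feedback=True` the script body comes out as a single data token. With `feedback=False` the pipeline is a clean one-way `p1 | p2`, and the lexer reports a spurious `<p>` start tag from inside the script content.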