A few of my favorite excerpts from recent xml-dev posts

Michael Kay on why attributes (not elements) are used to name and identify:

Making an identifier an attribute rather than a child element is a convention. One can justify it on the basis that attributes are less powerful than elements (they can't repeat, they have no order, and their values are simple strings) and there's some kind of logic in using a construct that has all the power you need and no more. It also makes the identifier stand out for the human reader.

http://lists.xml.org/archives/xml-dev/202202/msg00120.html
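
The three limits Kay lists for attributes (no repetition, no order, plain string values) can be seen with Python's standard-library parser. This is only a sketch: the `book`, `id`, `lang`, and `title` names are made up for illustration and do not come from the post.

```python
import xml.etree.ElementTree as ET

# Identifier as an attribute: a single string value per attribute name.
as_attribute = ET.fromstring('<book id="b42"><title>XSLT</title></book>')
book_id = as_attribute.get("id")

# Identifier as a child element: the same name may repeat.
as_element = ET.fromstring(
    "<book><id>b42</id><id>b43</id><title>XSLT</title></book>"
)
child_ids = [e.text for e in as_element.findall("id")]

# A repeated attribute is not well-formed XML, so the parser rejects it.
try:
    ET.fromstring('<book id="b42" id="b43"/>')
    duplicate_attribute_accepted = True
except ET.ParseError:
    duplicate_attribute_accepted = False

# Attribute order carries no meaning: both spellings give the same mapping.
first = ET.fromstring('<book id="b42" lang="en"/>')
second = ET.fromstring('<book lang="en" id="b42"/>')
same_attributes = first.attrib == second.attrib
```

An identifier needs none of the extra power that elements offer, which is the "all the power you need and no more" argument in the quote.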

Rick Jelliffe on the “tell” of an expert:

We expect Joe Public to be vague about technical terms (calling every cold a "flu" for example) but experts who do it are, what, pushing some dismissive/co-optive agenda? Indeed, it is the "tell" of an expert that they err on the side of precision for technical terminology.

http://lists.xml.org/archives/xml-dev/202202/msg00115.html

Roger Costello: Is XML eternal or ephemeral?

Many algorithms are eternal knowledge. For example, 100 years from now the quicksort algorithm will be as relevant then as it is today and as it was 50 years ago.

Many data structures are eternal knowledge. For example, 100 years from now the stack, queue, and tree data structures will be as relevant then as they are today and as they were 50 years ago.

Algorithms and data structures are knowledge that doesn't expire.

Conversely, "the unavoidable ephemeral knowledge one accumulates during their career comes in many forms" such as "a vogue framework or programming paradigm that falls out of favor in a couple of years."

Where is XML in the spectrum of eternal to ephemeral? Are there aspects of XML that are more on the eternal side of the spectrum and aspects of XML that are more on the ephemeral side of the spectrum? Perhaps the tree structure of XML documents is an aspect of XML that is eternal. Perhaps the DOCTYPE is an aspect of XML that is ephemeral. What do you think?

Is XML eternal or ephemeral?

http://lists.xml.org/archives/xml-dev/202202/msg00097.html

Liam Quin on where XML is strongest:

Where XML is and has always been the strongest is in the interface between the human-readable and the machine-processable.

http://lists.xml.org/archives/xml-dev/202202/msg00071.html

Simon St. Laurent on markup humility:

I hope that someday the SGML-to-XML cycle will begin again, that we'll sort through the piles we've hoarded with a different eye to produce something smaller and more useful. Getting to that, though, likely means that we have to want to do less, not more.

http://lists.xml.org/archives/xml-dev/202202/msg00063.html

What does it mean to say that <foo></foo> and <foo/> are equivalent? Michael Kay weighs in on “equivalence”:

Equivalence is not defined in the XML recommendation.

In the context of the wider XML eco-system, one can say that two XML documents are equivalent if they have the same infoset.

http://lists.xml.org/archives/xml-dev/202202/msg00008.html
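
One rough way to try this out is to parse both spellings and compare the results. The sketch below uses Python's `xml.etree.ElementTree`, whose tree is only an approximation of the XML Information Set, so treat it as an illustration rather than a formal infoset comparison. The `same_tree` helper is hypothetical, written for this example.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Compare two elements by name, attributes, text, and children."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


long_form = "<foo></foo>"
short_form = "<foo/>"

# Both spellings parse to an element named "foo" with no content.
trees_match = same_tree(ET.fromstring(long_form), ET.fromstring(short_form))

# Canonical XML (C14N 2.0, Python 3.8+) serialises both the same way.
canonical_match = ET.canonicalize(long_form) == ET.canonicalize(short_form)

# An element containing a single space is a different document.
space_differs = not same_tree(ET.fromstring("<foo> </foo>"), ET.fromstring(short_form))
```

Under these assumptions the two forms are indistinguishable once parsed, which matches the "same infoset" reading of equivalence.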

Michael Kay on balancing terseness and verboseness:

There's a balance. If we wrote numbers in base 100 we would need half as many characters to write them, but we would have to remember and recognise 100 different digit symbols. The extra cognitive load of remembering 100 digit symbols isn't worth the extra conciseness.

Regular expressions are too concise in my view - contrary to what Whitehead is advocating, you have to think too hard in order to understand them. A concise notation is great if it makes patterns stand out visually, but not otherwise.

But COBOL and XSLT 1.0 are not concise enough. They are hard to read because they take too much space so you have to do a lot of scrolling. Again, the patterns don't stand out.

I suspect those who argued that terseness was not important for XML were actually arguing that human readability is more important than message size. That's certainly true, however, conciseness can help human readability. And indeed, the design of namespaces (using prefixes in place of URIs) shows that the value of terseness was recognised.

I also suspect that the reason the matter came up for debate was the more specific question of whether element names should be repeated in the end tag. To be honest, I'm still unsure in my own mind about that decision -- there are arguments both ways. JSON, like LISP, suffers from the problems you get when you have a long string of closing delimiters like "]]}]}]". Indeed, we see this in XPath. Perhaps there's something to be said for the alternative approach of representing hierarchy through indentation.

And I've always felt that editors could do a better job of making hierarchy stand out visually - for example using colouring that varies in intensity or hue depending on the nesting depth. With visual cues like that, we could achieve the readability without worrying so much about the notation.

http://lists.xml.org/archives/xml-dev/202201/msg00093.html
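
The base-100 remark at the start of that excerpt is easy to check with a little arithmetic. The sketch below counts the digits needed to write a few arbitrarily chosen integers in base 10 and in base 100. The `digit_count` helper and the sample values are hypothetical, chosen for illustration.

```python
def digit_count(n: int, base: int) -> int:
    """Return how many digits the non-negative integer n needs in `base`."""
    if n == 0:
        return 1
    count = 0
    while n > 0:
        n //= base
        count += 1
    return count


samples = [7, 1234, 99_999_999, 10**20 - 1]

# Pair each number's base-10 length with its base-100 length.
counts = [(digit_count(n, 10), digit_count(n, 100)) for n in samples]

# One base-100 digit covers two base-10 digits, rounding up for odd lengths.
halved = all(b100 == (b10 + 1) // 2 for b10, b100 in counts)
```

The saving is a factor of two at best, paid for with a tenfold increase in the number of symbols to learn, which is the trade-off the quote describes.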

Michael Kay on the evolution of technologies/standards:

I think with most technologies/standards we tend to see

(a) an era of experimentation, where lots of people invent new things

(b) followed by an era of consolidation, where a small number of winners emerge

The number of winners after phase (b) is highly variable. With database technology, it got down to one (SQL). Similarly with the networking stack (TCP-IP), and the web (HTTP / HTML / CSS / JS), and character sets (Unicode). With operating systems it got down essentially to two (Unix / Windows). In most of these cases there were epic contests before winners emerged. With procedural programming languages it never got down to less than half a dozen, though Java looked like a promising convergence point for a while. Some other technologies, like 4GLs and NoSQL databases, withered on the vine primarily because they failed to achieve this convergence.

The forces that determine whether convergence happens and how long it takes are essentially a battle between the desire for innovation and the desire for interoperability. (The pressure for innovation comes primarily from vendors who want a USP rather than from users who want new features, of course).

Once consolidation is achieved, things tend to remain stable for a very long time: it becomes very hard for anyone to break the consensus. If a breakaway does occur, it's most likely to emerge from a niche, or from some technological disruption (voice over IP, say). If anyone ever breaks the 50-year-old duopoly of Windows and Unix, it will probably be some operating system designed for a niche market.

So where does XML fit in this picture? Until 1998 there was a very long period of experimentation; there were some standardisation candidates (SGML, ASN.1, EDI) but by and large, the scene was highly fragmented. The convergence on a single standard was unusually sudden, and I've always been puzzled as to what exactly were the industry dynamics that led to such an explosion of rapid adoption: one of them undoubtedly was the low cost of implementation and adoption. But I think that the very rapidity of this adoption also meant that the consolidation phase was an unstable equilibrium: because it was the only game in town, people embraced it a bit too eagerly for things it was never designed to do. Unlike databases, operating systems, and character sets, it's an area where the benefits of doing something different can exceed the costs, so it was easy for a breakaway niche such as JSON to emerge.

When a breakaway occurs, it's usually a technology that's simpler, less capable, but better suited to the needs of people who need something simple. The old guard maintaining the consensus thus tend to dismiss it as kids' stuff. This ignores the lesson that progress is often achieved by stripping out the debris of complexity that accumulates over time, and starting afresh with a clean sheet of paper.

http://lists.xml.org/archives/xml-dev/202201/msg00033.html

Tim Bray: XML moved Unicode into the mainstream:

I've come to think that, in the long-distance rear-view, one of XML's biggest legacies was moving Unicode from a fringe thing to a place where there was a huge contingent of developers who'd been forced to think about it.

http://lists.xml.org/archives/xml-dev/202201/msg00017.html

Liam Quin: XML is for data that endures:

JSON is great for ad-hoc formats designed by a developer for one particular use. JSON is program-centric, whereas XML is document-centric. JSON is primarily for data that appears, is transmitted, and vanishes in a fleeting moment. XML is primarily for data that endures.

http://lists.xml.org/archives/xml-dev/202201/msg00002.html