Re: [xml-dev] What does it mean to say that XML was over-engineered?

On Wed, 2021-09-15 at 16:35 +0100, Peter Flynn wrote:
> . I am not
> convinced that the campaign for "simplification" of XML is a viable
> candidate for our attention.
> 

Nor i.  MicroXML quietly died as far as i can tell.

SGML got layering wrong - it was a product of its time - so that
although it met the needs of some groups, it wasn't able to meet the
needs of some others - hence XML.

We can each make lists of features we don't want. Notations, public
identifiers, mixed content, case sensitivity, quotes round attribute
values, entities, DTDs, schemas of any kind, anchovies, cdata sections,
the list goes on. And then others can make a list of features essential
to them, and it turns out to be the same as the ones "we" don't want.

Probably notations and cdata sections would top my list, mostly for
security reasons, but as soon as you remove any feature you break what
i call the XML Promise, that any XML file can be processed (in some
way) by any XML software.

MicroXML shared with XML 1.1 a design problem that's not easily
fixable: there are XML 1.0 documents that change meaning when processed
with an XML 1.1 processor, or that even become not-well-formed,
because of changes to C0 and C1 control characters. Similarly, MicroXML
doesn't do attribute value normalization, so different attribute
values will be reported. At an API level that makes both XML 1.1 and
MicroXML a non-starter.
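
A minimal sketch of what attribute value normalization means at the API
level, assuming Python's standard-library xml.etree.ElementTree (backed
by expat, an XML 1.0 parser). The element name "a" and attribute "x" are
made up for illustration. A processor that skipped normalization would
hand the application the raw newline and tab in the first case instead.

```python
import xml.etree.ElementTree as ET

# A literal newline and a literal tab inside an attribute value.
# An XML 1.0 processor replaces each of them with a single space.
literal_ws = '<a x="one\ntwo\tthree"/>'
normalized = ET.fromstring(literal_ws).get("x")

# The same newline written as a character reference is not normalized:
# the application sees a real newline character.
char_ref = '<a x="one&#10;two"/>'
kept = ET.fromstring(char_ref).get("x")

print(repr(normalized))
print(repr(kept))
```

The same bytes on disk therefore yield a different attribute value
depending on whether the processor normalizes, which is the
API-level incompatibility described above.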

But if you agree every existing document that does not use the features
you dropped must have the same meaning, and you can define "meaning"
:), then you now have to provide alternate mechanisms for the features
you dropped.

So, you drop DTDs, and internal subsets, and now you need a
replacement for internal text entities like eacute, that's
translatable (&éague; instead of &eacute;) and &productName;...
so you end up with more layers, and because they're new, likely they are
overly complex for most people, and the cycle continues.
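
To make the dependency concrete, here is a small sketch, again assuming
Python's xml.etree.ElementTree. The document, the replacement text
"ExampleProduct", and the variable names are hypothetical. With an
internal subset the entity references expand; with the subset dropped
and nothing put in its place, the same body is rejected.

```python
import xml.etree.ElementTree as ET

# Internal subset declaring two internal text entities.
WITH_SUBSET = (
    '<!DOCTYPE doc [\n'
    '  <!ENTITY eacute "&#233;">\n'
    '  <!ENTITY productName "ExampleProduct">\n'
    ']>\n'
    '<doc>&productName; caf&eacute;</doc>'
)

# The same body with the internal subset removed.
WITHOUT_SUBSET = '<doc>&productName; caf&eacute;</doc>'

# The parser expands both entity references while parsing.
expanded = ET.fromstring(WITH_SUBSET).text

# Without the declarations, the references are undefined entities.
try:
    ET.fromstring(WITHOUT_SUBSET)
    undefined_rejected = False
except ET.ParseError:
    undefined_rejected = True

print(expanded)
print(undefined_rejected)
```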

The truth is that almost all specs have obscure features not widely
used. Sometimes an "obscure" feature becomes widely used unexpectedly
(like passive ftp mode when Web browsers started supporting ftp) and
people rush to implement it. Sometimes problems with a feature mean it
gets used decreasingly: SI and SO in ASCII, which switch to an alternate
character set mid-stream and so require parsing text files from the
start, have largely fallen away in favour of Unicode, and XML NDATA
entities have largely been replaced by URI-valued attributes.

We could not have cut any more from SGML and still had support from the
SGML world. Every feature was debated.

We've got what we've got - let's agree to cherish it for its strengths
and make good use of it.

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org


