Re: [xml-dev] Semantics and the Web: An Awkward History
- From: "Simon St.Laurent" <simonstl@simonstl.com>
- To: xml-dev@lists.xml.org
- Date: Wed, 15 Sep 2021 06:59:53 -0400
I wrote:
I don't share your opinion that separating structure and presentation is laughable. I do it daily.
On 9/14/2021 6:25 PM, Marcus Reichardt replied:
Let me try to explain this latter point: I'm guessing you're into
academic publishing, right?
Commercial publishing, now videos instead of books, but -
If so, then HTML can serve you well for casual use with hierarchical
text, headings, limited forms of quotes/cites, definition lists for
nav, and so on. After all, this was HTML's original purpose at CERN 30
years ago, with almost all tags/elements except anchor elements coming
from even older practices originating from the roots of SGML.
Sure - that is why the H1 element is the 'hero' of the paper, and why I
chose the documents I did in the slides.
For some reason you're assuming I limit myself to HTML vocabulary and
structure, when we're talking on an XML mailing list...
But for nearly every other use of text, HTML the markup vocabulary obviously isn't a good fit. Think chat logs, digital image brochures/modern web sites, reveal-type web sites, feeds, medical reports, interviews, presentations such as yours, dialog workflows, or mailing list threads for that matter ;)
None of those are actually hard to mark up. That doesn't mean we do it,
and we certainly disagree about how to do it, but taking the documents
and building a structure? Not hard. I can even do it in HTML if I
want, with either fancy web components or annoying but functional
div/class structures. I could represent presentation XSL-FO style if I
wanted, but I likely wouldn't want to do that.
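As a rough illustration of "taking the documents and building a structure", here is a minimal Python sketch that wraps a chat log in markup. The "time nick: text" line layout and the chat/msg element names are invented for this example and are not taken from any real chat format or vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical chat lines in a made-up "time nick: text" layout.
raw_log = """\
18:25 marcus: HTML isn't a good fit for chat logs
18:31 simon: Building a structure for them isn't hard
"""

def chat_to_xml(text):
    """Wrap each 'time nick: text' line in an invented <msg> element."""
    root = ET.Element("chat")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        time, rest = line.split(" ", 1)
        nick, body = rest.split(": ", 1)
        msg = ET.SubElement(root, "msg", {"time": time, "from": nick})
        msg.text = body
    return root

chat = chat_to_xml(raw_log)
print(ET.tostring(chat, encoding="unicode"))
```

The same shape could be written as div/class structures in HTML; only the names would change.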
Harder cases exist. The classic one for the SML work was Unix logs,
simply because requiring an end tag created problems. (Chat logs might
have the same problem in certain fast-flowing contexts.) Cases with
overlap are tricky, and a favorite topic for Balisage, but again, there
aren't really things that fit better than markup that can also be shared
beyond a single program state.
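One reading of the log problem, sketched in Python under the assumption that the log is a single XML document with a root element: while entries are still being appended, the root's end tag is missing, so a conforming parser rejects the file. The log and entry names are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# A log still being appended to: root start tag present, no end tag yet.
open_log = "<log><entry>boot</entry><entry>login</entry>"
# The same log once the writer has finally closed the root element.
closed_log = open_log + "</log>"

print(is_well_formed(open_log))    # False
print(is_well_formed(closed_log))  # True
```

A writer that appends forever never reaches the second state, which is roughly why logs made an awkward fit.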
Papers get published in proceedings, in journals, as author copies, as preprints, as extended abstracts - for this kind of workflow, the separation of structure and presentation that we've come to accept may work well. Yet the more typical approach for creating artistic text including ads, word marks, tag lines, poetry, propaganda, posters, flyers, and math doesn't consider separation of, e.g., typography and layout from content.
Since you guessed I was into academic publishing and based an argument
on it, and since you already brought up SPARQL, I'll guess that you're a
programmer who sees documents as the result of things accomplished
elsewhere. Why would you want to separate structure and presentation
when, for your use cases, the document is just presentation?
Fitness of a text format for a given task is all about capturing the intent of text close to how it's envisioned by the author, rather than arbitrary separation of structure and presentation. This is one of the basic, explicit tenets of SGML - that you make up your own DOCTYPE for your project at hand, because that structure is the least volatile thing. Yet W3C sat on a fossilized version of HTML4 for such a long time, being busy with trivial meta things and a NIH mindset in anticipation of new general-purpose vocabularies, yet not bringing anything new to SGML, such that everything else had to cater for it - not least the push towards responsive starting with the iPhone launch in 2007.
The history of HTML when compared to SGML and XML is funny in this
regard. HTML seized on the Annex E vocabulary - just as many others had
since GML - because it already existed and did enough. Having the author
create vocabularies has, sadly, never been plausible to most programmers
or managers. That, of course, put HTML into the "everyone must agree or
this negotiation fails" bucket, and early rounds of failure on that,
mostly at and around the IETF, led directly to the creation of the W3C.
I suspect that much of the early enthusiasm the W3C had for XML came
from exhaustion over those perpetual negotiations of a single
vocabulary. Let people sort it out themselves!
Similarly, the early XHTML efforts went along that path. XHTML 1.0 was
a light syntax conversion, with advice about what worked and didn't work
in browsers. It didn't do much, but it really couldn't have - setting
the stage for extensibility took some work. XHTML 1.1, with the
amazing/horrifying/ingenious DTD work that allowed for the addition of
modules, definitely also pointed to the "basic, explicit tenets of SGML".
None of that seemed fossilized to me, and the XHTML 2.0 work that
followed definitely didn't seem intent on treating HTML as a fossil.
Yes, HTML 4.01 itself sat quietly in the corner, but that was because
the XHTML work was supposed to be giving it the SGML features you say
you want.
HTML5 was a short burst of new markup followed by nothing at all in that
space. HTML pretty much fossilized when the WHATWG added a pile of
not-markup features to HTML, and then focused almost all their energies
on everything other than markup. They ejected XForms and left us with
the same terrible form choices we've had since the earliest days of HTML.
Personally, I follow a different path. I separate structure and
presentation daily (okay, probably more weekly lately) because I have no
qualms at all about creating new markup when it's convenient for me. No
DTDs, no schemas, no committees - just chucking tags that make sense to
me into the flow of plain text or HTML. I don't tend to share those
documents with the world, and it's probably 5% of the markup I create
total, but there's no reason to stay locked up. CSS can even style my
random markup (in an HTML or XML context) if I think it's worth some
declarations.
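A minimal Python sketch of that habit, assuming the text is kept well-formed: made-up tags are dropped into HTML-ish markup with no DTD or schema, and a stock XML parser can still pull them back out. The person, topic, and todo names and the due attribute are invented for this example.

```python
import xml.etree.ElementTree as ET

# Invented tags mixed into HTML-ish text; no DTD or schema is declared.
note = """<div>
<p>Talk to <person>Marcus</person> about <topic>SGML DOCTYPEs</topic>.</p>
<p>Draft the <todo due="soon">presentation</todo>.</p>
</div>"""

doc = ET.fromstring(note)

# The parser only checks well-formedness, so made-up names are queryable.
people = [el.text for el in doc.iter("person")]
todos = [(el.get("due"), el.text) for el in doc.iter("todo")]
print(people)
print(todos)
```

The same element names could serve as type selectors in a stylesheet, which is all the CSS side needs.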
Eventually I'll have a presentation on that, too.
Thanks,
Simon