RE: The remarkable similarities between XSLT and Flex/Lex


From: Roger L Costello <costello@mitre.org>
To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Date: Sun, 26 Jun 2022 13:01:41 +0000

Hi Folks,

The message below was sent to me privately, and I was given permission to share it with the list.

Regarding XSLT's similarity to Lex:

It's not really about a single tool like Lex.

Before XML there was SGML, which XML was supposed to "simplify". SGML
included a schema language (DTD), which defines the hierarchical structure of a
document using regular expressions over elements. There was also a strange
unnecessary constraint on these expressions called "ambiguity", which
*everybody* who wrote SGML software needed to understand, and so the idea of
applying formal language techniques to SGML was inevitable.
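
To make "regular expressions over elements" concrete, here is a minimal sketch in Python. The content model (title, para+, note?) and the element names are hypothetical, chosen only for illustration, and the hand translation into a Python regex is an assumption about one convenient encoding, not how any SGML parser actually works.

```python
import re

# Hypothetical DTD declaration: <!ELEMENT section (title, para+, note?)>
# Hand-translated into a regex over child element names, each name
# followed by a single space so that names cannot run together.
SECTION_MODEL = re.compile(r"(title )(para )+(note )?")


def children_match(child_names):
    """Return True if the sequence of child element names fits the model."""
    encoded = "".join(name + " " for name in child_names)
    return SECTION_MODEL.fullmatch(encoded) is not None


if __name__ == "__main__":
    print(children_match(["title", "para", "para"]))  # valid sequence
    print(children_match(["para", "title"]))          # wrong order
```

The "ambiguity" constraint concerns models such as ((a, b) | (a, c)), where a parser that has just seen "a" cannot tell which branch it is in without looking ahead. A general regex engine like the one above copes with such models, which is one way to see why the constraint can look unnecessary.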

Long before XSLT, there were a variety of attempts to define languages that
would allow users to specify an automatic translation from SGML into printed
form. Many of these languages were context-free grammars at their core, with
translation rules as actions. This is called "syntax-directed translation"
and was a well-known concept long before that.
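
As a reminder of what syntax-directed translation looks like in its textbook form, the sketch below attaches one output action to each grammar rule and translates infix expressions to postfix. The grammar, the function name to_postfix, and the token set are all assumptions made for illustration; nothing here is taken from an SGML tool.

```python
import re

OPERATORS = {"+", "*", "(", ")"}


def to_postfix(text):
    """Translate infix input such as 'a+b*c' to postfix, one action per rule."""
    tokens = re.findall(r"[A-Za-z0-9]+|[+*()]", text)
    pos = 0
    out = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():  # expr -> term ('+' term)*
        term()
        while peek() == "+":
            advance()
            term()
            out.append("+")  # action: emit '+' after both operands

    def term():  # term -> factor ('*' factor)*
        factor()
        while peek() == "*":
            advance()
            factor()
            out.append("*")  # action: emit '*' after both operands

    def factor():  # factor -> NAME | '(' expr ')'
        tok = peek()
        if tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif tok is None or tok in OPERATORS:
            raise SyntaxError("expected an operand")
        else:
            out.append(advance())  # action: emit the operand itself

    expr()
    if peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return " ".join(out)


if __name__ == "__main__":
    print(to_postfix("a+b*c"))
```

The translation is driven entirely by the shape of the grammar: the output is produced by the actions as each rule completes, which is the same idea the SGML formatting languages applied to documents.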

With SGML, though, the problem of syntax-directed translation is different
from what it is in other contexts, and more difficult in many ways. The
basic structures in the input are very easy to parse -- elements are delimited,
after all -- but the input was semantically marked-up text and the output
was a published document that had to follow all the ambiguously defined
stylistic rules that people use when they actually do typography. This meant
that complicated grammars over *element trees* instead of linear text, and
lots of other ideas, needed to be applied. Lots of companies put a lot of
work into it.

So by the time XSLT came around, everyone on the committee was already familiar
with a lot of this history from SGML processing, which was based on a lot of
work rooted in the same formal language theory that goes into lexers and
parsers, and that is why some of XSLT looks a lot like Lex.
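
The family resemblance can be sketched in a few lines of Python: both tools are tables of pattern/action rules, one matched against linear text and the other against nodes of an element tree. Everything below is a hypothetical illustration. The rule tables, the element names doc/title/para, and the functions lex and transform are invented for this sketch, and the matching is simplified (first matching rule wins, whereas real Lex prefers the longest match, and real XSLT patterns are far richer than a tag name).

```python
import re
import xml.etree.ElementTree as ET

# Lex-style: ordered (pattern, action) rules applied to linear text.
LEX_RULES = [
    (re.compile(r"\d+"), lambda m: f"NUM({m.group()})"),
    (re.compile(r"[A-Za-z]+"), lambda m: f"WORD({m.group()})"),
    (re.compile(r"\s+"), lambda m: None),  # action: skip whitespace
]


def lex(text):
    """Apply the first rule that matches at each position of the input."""
    pos, out = 0, []
    while pos < len(text):
        for pattern, action in LEX_RULES:
            m = pattern.match(text, pos)
            if m:
                result = action(m)
                if result is not None:
                    out.append(result)
                pos = m.end()
                break
        else:
            raise ValueError(f"no rule matches at position {pos}")
    return out


# XSLT-style: (pattern, action) rules applied to nodes of an element tree.
TEMPLATES = {
    "title": lambda node: (node.text or "").upper() + "\n",
    "para": lambda node: (node.text or "") + "\n",
}


def apply_templates(node):
    """Run the rule for this element, or fall back to visiting its children."""
    rule = TEMPLATES.get(node.tag)
    if rule is None:
        # Simplified stand-in for a default rule: recurse into children.
        return "".join(apply_templates(child) for child in node)
    return rule(node)


def transform(xml_text):
    return apply_templates(ET.fromstring(xml_text))


if __name__ == "__main__":
    print(lex("abc 42"))
    print(transform("<doc><title>Hi</title><para>One</para></doc>"))
```

In both halves the program is a rule table plus a small driver loop; what changes is only whether the driver walks characters or tree nodes.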

Unfortunately, XSLT kind of sucks. When the standard was written, the problem
itself had not been solved by industry in a really acceptable way (and
it still hasn't been!), and the W3C committee fell into the trap of trying to
innovate instead of codifying best practice.
