Re: [xml-dev] Please stop writing specifications that cannot be parsed/processed by software
- From: "G. Ken Holman" <gkholman@CraneSoftwrights.com>
- To: Dimitre Novatchev <dnovatchev@gmail.com>,Michael Kay <mike@saxonica.com>
- Date: Sun, 04 Jun 2023 08:01:57 -0400
At 2023-06-03 18:52 -0700, Dimitre Novatchev wrote:
> For example, the XPath function library is defined in an XML document
> that contains all the function signatures in a custom vocabulary
> reflecting the object model for XPath functions, and that data is
> extremely useful; it can be used for example to create the data used
> by a type-checker. I'm sure there are cases where an XML format can be
> standardised across a wide range of specifications (for example, a
> format for defining BNF grammars) but I'm sure that highly specialised
> custom formats also have a role to play.
>
> Of course in our own community we're very prepared to eat our own
> dogfood in this way. Getting people to use a similar approach when
> they're writing safety standards for industrial washing machines is a
> different kettle of fish. Those guys just click on the word processor
> icon and start typing.
To this day I have often wondered where to
find the XML Schema for this type of document. Or is it a secret?
"XML Schema? I don't need no stinking XML Schema!"
Isn't that the big draw of XML over SGML?
Mind you, I was only a user of the W3C documents
expressing the XML standards in XML, so I didn't
have to worry about being constrained to create that content.
But certainly writing the stylesheets that
converted the specifications of XSLT and XSL-FO
and their content models into my own book's
document model (which *did* have its own DTD,
since I was creating content) didn't rely on
having a document model for the specifications.
Empirical examination of content was all I needed:
<proto role="example" name="eg:if-empty" return-type="xs:anyAtomicType*"
returnEmptyOk="no" isSpecial="yes" returnSeq="no" returnVaries="no"
isSchema="no" isDatatype="no" isOp="no">
<arg name="node" type="node()" emptyOk="yes"/>
<arg name="value" type="xs:anyAtomicType"/>
</proto>
<prod num="64" id="prod-xpath-ElementTest">
<lhs>ElementTest</lhs>
<rhs>"element" "(" (<nt xmlns:xlink="http://www.w3.org/1999/xlink"
def="prod-xpath-ElementNameOrWildcard"
xlink:type="simple">ElementNameOrWildcard</nt> ("," <nt
xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xpath-TypeName"
xlink:type="simple">TypeName</nt> "?"?)?)? ")"</rhs>
</prod>
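As a rough illustration of how far empirical examination alone can go, here is a minimal Python sketch that reads the two fragments above with no schema or DTD at hand. The function names are hypothetical, and the rule that emptyOk="yes" maps to a "?" occurrence indicator is an assumption inferred from the markup, not something the fragments themselves state.

```python
import xml.etree.ElementTree as ET

# The two fragments quoted above, copied as-is (whitespace aside).
PROTO_XML = '''
<proto role="example" name="eg:if-empty" return-type="xs:anyAtomicType*"
       returnEmptyOk="no" isSpecial="yes" returnSeq="no" returnVaries="no"
       isSchema="no" isDatatype="no" isOp="no">
  <arg name="node" type="node()" emptyOk="yes"/>
  <arg name="value" type="xs:anyAtomicType"/>
</proto>
'''

PROD_XML = '''
<prod num="64" id="prod-xpath-ElementTest">
<lhs>ElementTest</lhs>
<rhs>"element" "(" (<nt xmlns:xlink="http://www.w3.org/1999/xlink"
def="prod-xpath-ElementNameOrWildcard"
xlink:type="simple">ElementNameOrWildcard</nt> ("," <nt
xmlns:xlink="http://www.w3.org/1999/xlink" def="prod-xpath-TypeName"
xlink:type="simple">TypeName</nt> "?"?)?)? ")"</rhs>
</prod>
'''


def signature_from_proto(proto):
    """Render a <proto> element as a one-line function signature."""
    params = []
    for arg in proto.findall("arg"):
        # Assumption: emptyOk="yes" means the argument may be the empty sequence.
        occurrence = "?" if arg.get("emptyOk") == "yes" else ""
        params.append(f"${arg.get('name')} as {arg.get('type')}{occurrence}")
    return f"{proto.get('name')}({', '.join(params)}) as {proto.get('return-type')}"


def referenced_nonterminals(prod):
    """List the nonterminals a <prod> refers to through its <nt> children."""
    return [nt.text for nt in prod.iter("nt")]


proto = ET.fromstring(PROTO_XML)
prod = ET.fromstring(PROD_XML)

print(signature_from_proto(proto))
# eg:if-empty($node as node()?, $value as xs:anyAtomicType) as xs:anyAtomicType*
print(prod.findtext("lhs"), "->", referenced_nonterminals(prod))
# ElementTest -> ['ElementNameOrWildcard', 'TypeName']
```

Nothing here depends on a published document model; the element and attribute names come straight from looking at the markup.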
A reference to this and a few examples will be greatly appreciated.
I never felt the need to go looking for a
document model when I could correlate what I saw
in the markup with what I saw on the formatted results published in HTML.
I acknowledge that the NISO-STS document model
doesn't (yet!?) have models for BNF or other
formal specification grammars, unless one
shoehorns such using generic named-content
semantic constructs. But that is because it is
leveraging JATS, which itself doesn't have such.
For me, using such "hi-tech" language in order
to specify what you want to say and be
understood has always seemed an unwanted and
unnecessary obstacle in the
specification-creation process -- one that
stifles the author and diverts him away
from the main topic.
I envy GitHub authors who only have to use MD,
and can easily produce stunning documents.
"Stunning" to the eye, I agree. And that is what
one gets with NISO-STS off-the-shelf: stunning to the eye.
But you get out what you put in, and the XML
specification writers put in a helluva lot of
effort into marking up the documents they were
responsible for, and users like myself could
leverage such quickly and effectively without the need for a document model.
And so with NISO-STS where some of my clients are
leveraging the semantic constructs for
requirements markup that is then handled
downstream well after the publication process.
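To make the downstream handling concrete, here is a minimal Python sketch of harvesting requirements from generic named-content markup. The sample content and the content-type values ("requirement-shall", "requirement-should") are hypothetical; they are not taken from any SDO's actual tagging guidelines, and real NISO-STS documents will be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the content-type values are invented for illustration.
SAMPLE = '''
<sec>
  <p><named-content content-type="requirement-shall">The door shall
  lock before the drum spins.</named-content> Some commentary follows.
  <named-content content-type="requirement-should">The display should
  show the remaining time.</named-content></p>
</sec>
'''


def harvest_requirements(root, prefix="requirement-"):
    """Return (kind, text) pairs for every tagged requirement under root."""
    found = []
    for el in root.iter("named-content"):
        kind = el.get("content-type", "")
        if kind.startswith(prefix):
            # Collapse line breaks and runs of spaces inside the statement.
            text = " ".join("".join(el.itertext()).split())
            found.append((kind[len(prefix):], text))
    return found


requirements = harvest_requirements(ET.fromstring(SAMPLE))
for kind, text in requirements:
    print(f"{kind:7} {text}")
```

Because the semantics are in the markup, the harvesting step never has to guess from the prose which sentences are requirements.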
If someone needs so much strict structure, please use ChatGPT or iXML --
(there are people working on it... stay tuned)
I hope this is considered helpful.
. . . . . Ken
but please, behind the scenes, where these do belong.
Thanks,
Dimitre
On Thu, May 25, 2023 at 5:03 PM Michael Kay <mike@saxonica.com> wrote:
>no-one has to invent something new to get what you are asking for
But if you're prepared to invent something new
then you can probably do better...
For example, the XPath function library is
defined in an XML document that contains all the
function signatures in a custom vocabulary
reflecting the object model for XPath functions,
and that data is extremely useful; it can be
used for example to create the data used by a
type-checker. I'm sure there are cases where an
XML format can be standardised across a wide
range of specifications (for example, a format
for defining BNF grammars) but I'm sure that
highly specialised custom formats also have a role to play.
Of course in our own community we're very
prepared to eat our own dogfood in this way.
Getting people to use a similar approach when
they're writing safety standards for industrial
washing machines is a different kettle of fish.
Those guys just click on the word processor icon and start typing.
Michael Kay
Saxonica
> On 26 May 2023, at 00:20, G. Ken Holman
<gkholman@CraneSoftwrights.com> wrote:
>
> Roger, already standards from ISO and CEN are
being published in NISO STS XML:
>
> https://www.niso-sts.org/
>
> And there are some SDOs (Standards Development Organizations) that
> are building requirements into their STS XML so they can be harvested
> downstream after publishing by requirements management software
> tracking, for example, "may", "shall", "should", etc.:
>
> https://www.ncbi.nlm.nih.gov/books/NBK556169/#holman-semantics2
>
> I commend that paper I wrote regarding the
identification of semantics (say, of requirements) in standards content.
>
> I've co-founded a company in Ireland that is
servicing the standards development community
of SDOs with software that is publishing these
richly-encoded XML documents into PDF, HTML, and DOCX:
>
> https://RealtaOnline.com
>
> Moreover, SDOs are looking to us to enrich
their XML and we are experimenting with AI in this regard. Exciting stuff.
>
> I'm delivering a presentation at JATS-Con
2023 you may wish to attend to learn more about
how Réalta Online is using standards such as
XSLT and XSL-FO to enrich and publish standards
with fidelity across output products:
>
> https://jats.nlm.nih.gov/jats-con/2023/schedule2023a.html#1-1145
>
> So I think all that is needed is an awareness
campaign to make standards writers and SDOs
aware that the technology exists already. We
don't have to wait to be able to do what it is
you are asking. It just has to be done with the tools at hand.
>
> And not just for ISO and CEN standards.
Hundreds of SDOs exist out there publishing
thousands of standards documents. Please spread
the word about NISO STS XML and the leverage
they can get by adopting something that exists
... no-one has to invent something new to get what you are asking for.
>
> I hope this is helpful.
>
> . . . . . . . Ken
>
> At 2023-05-25 19:57 +0000, Roger L Costello wrote:
>> Dear Specification Writer,
>>
>> Please stop writing specifications that
cannot be parsed/processed by software. Please
stop formatting your specifications as Word and
PDF. Instead, use a format that is amenable to
machine processing. The XML format is ideal. We
want to analyze your specifications. We don't
want to spend dozens of hours screen-scraping your Word/PDF documents.
>>
>> If you simply must persist in writing
Word/PDF documents, then please write in a
consistent way so that we can screen-scrape
without having to write special case code. To
illustrate, in one of your specifications you
provide a bunch of tables with data; each table
has many rows. In some tables you reference a
note. Here's a row with a note reference:
>>
>> 119 Approach Route (1) Note 1 5.7
>>
>> Here's another row with a note reference:
>>
>> 52 SID Ident (1) (Note 1) 5.78
>>
>> Why did you embed Note 1 within parentheses
in the second case but not the first? That's an
example of not being consistent. Such
inconsistencies make it difficult to do
screen-scraping. Please be consistent. If at
all possible, write a parser to parse the data
that you embed in your specification. This will
immediately inform you of any inconsistencies.
>>
>> Thank you,
>> From the people who must read, understand, and analyze your specifications
>>
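Roger's suggestion to write a parser for the embedded data can be sketched in Python as follows. This is a minimal sketch under assumptions: the field names (num, name, count, note, ref) are hypothetical guesses at what the columns mean, and the pattern is fitted only to the two sample rows quoted above.

```python
import re

# One pattern for a table row; it accepts both note styles so that the
# inconsistency can be reported instead of silently failing to parse.
ROW = re.compile(
    r"^(?P<num>\d+)\s+"                    # leading row number
    r"(?P<name>.+?)\s+"                    # field name, e.g. "Approach Route"
    r"\((?P<count>\d+)\)\s+"               # parenthesized number
    r"(?P<note>\(Note \d+\)|Note \d+)\s+"  # note reference, either style
    r"(?P<ref>\d+(?:\.\d+)*)$"             # trailing dotted number
)


def note_styles(rows):
    """Group row numbers by the note-reference style each row uses."""
    styles = {}
    for line in rows:
        match = ROW.match(line.strip())
        if match is None:
            raise ValueError(f"unparseable row: {line!r}")
        style = "parenthesized" if match["note"].startswith("(") else "bare"
        styles.setdefault(style, []).append(match["num"])
    return styles


rows = [
    "119 Approach Route (1) Note 1 5.7",
    "52 SID Ident (1) (Note 1) 5.78",
]
styles = note_styles(rows)
if len(styles) > 1:
    print("inconsistent note references:", styles)
# inconsistent note references: {'bare': ['119'], 'parenthesized': ['52']}
```

Run over every table in a draft, a check like this flags the mismatch before publication rather than leaving it for readers to discover.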
--
Contact info, blog, articles, etc. http://www.CraneSoftwrights.com/x/ |
Check our site for free XML, XSLT, XSL-FO and UBL developer resources |
Streaming hands-on XSLT/XPath 2 training class @US$125 (5 hours free) |
Essays (UBL, XML, etc.) http://www.linkedin.com/today/author/gkholman |