Re: [xml-dev] Developing open business information exchange documents

From: "G. Ken Holman" <gkholman@CraneSoftwrights.com>
To: u123724 <u123724@gmail.com>
Date: Thu, 23 Feb 2017 10:48:10 -0500

At 2017-02-23 15:01 +0100, u123724 wrote:

Thanks for this interesting read.

I appreciate the opportunity to discuss it. I'm proud of the work of our committee.

As I understand, CTS (not 100% sure about terminology here) models a
component model using UML, and then defines bindings to XML, JSON, EDI
etc.

Not quite, no. CCTS models a component model. Full stop. That model describes structures of aggregates (called ABIEs for Aggregate Business Information Entities). Each ABIE includes a set of branch components (called ASBIEs for Association Business Information Entities, each one defined by an ABIE) and leaf atomic components (called BBIEs for Basic Business Information Entities). Each leaf is described by one of 20 available Core Component Types (CCT). Each CCT has defined metadata.

CCTS itself ends right there: a hierarchical structure of business information entities as the semantic model of the information in the document being transmitted.
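
As an illustration only, the hierarchy described above can be sketched as three small Python classes. Nothing here comes from the CCTS specification itself: the class layout, the `leaf_paths` helper and the sample component names are assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BBIE:
    """Leaf component: a Basic Business Information Entity."""
    name: str
    core_component_type: str  # e.g. "Text", "Identifier", "Amount"


@dataclass
class ASBIE:
    """Branch component: an Association BIE, defined by a target ABIE."""
    name: str
    target: "ABIE"


@dataclass
class ABIE:
    """Aggregate: a set of leaf (BBIE) and branch (ASBIE) components."""
    name: str
    bbies: List[BBIE] = field(default_factory=list)
    asbies: List[ASBIE] = field(default_factory=list)


def leaf_paths(abie, prefix=None):
    """List every leaf reachable from an aggregate as a slash-separated path."""
    prefix = prefix or abie.name
    paths = [f"{prefix}/{b.name}" for b in abie.bbies]
    for a in abie.asbies:
        paths.extend(leaf_paths(a.target, f"{prefix}/{a.name}"))
    return paths


# Hypothetical two-level model, for illustration only.
address = ABIE("Address", bbies=[BBIE("StreetName", "Text"),
                                 BBIE("PostalZone", "Text")])
party = ABIE("Party",
             bbies=[BBIE("WebsiteURI", "Identifier")],
             asbies=[ASBIE("PostalAddress", address)])

for path in leaf_paths(party):
    print(path)
```

The sketch deliberately stops at structure: no syntax (XML, JSON or otherwise) appears anywhere in it, which mirrors where CCTS itself stops.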

CCTS has nothing to do with UML, but some people find the UML graphical representation of this hierarchical structure convenient to read, and so the UBL committee published a UML alternative representation to the normative CCTS here:

http://docs.oasis-open.org/ubl/UBL-2.1-UML/v1.0/UBL-2.1-UML-v1.0.html

CCTS says nothing about syntax. About 15 years ago the UBL committee created the concept of "Naming and Design Rules (NDR)", and since then a number of different groups have latched on to this concept of synthesizing syntactic constraints from a model. NIEM has NDRs. CCTS created their own NDRs. The UBL TC genericized its view of naming and design rules and the most recent version is out for public review with the addition of JSON schema synthesis rules:

http://docs.oasis-open.org/ubl/Business-Document-NDR/v1.1/Business-Document-NDR-v1.1.html

We also published the UBL-specific application of these generic rules:

http://docs.oasis-open.org/ubl/UBL-NDR/v3.0/UBL-NDR-v3.0.html

Accordingly, we have recently published the JSON schemas for UBL 2.1 for public review as an alternative syntax to the normative XSD here:

http://docs.oasis-open.org/ubl/UBL-2.1-JSON/v1.0/UBL-2.1-JSON-v1.0.html

So ... the model is separate from the syntax. CCTS is standalone as a modeling tool. A user community chooses which NDRs to use to go from the model to the syntax. We believe our OASIS Business Document Naming and Design Rules (BDNDR) addresses real-world deployment issues better than do other NDRs. A user community then chooses which syntaxes they wish to make normative.

I've depicted the relationship of NDRs here in the BDNDR:

http://docs.oasis-open.org/ubl/Business-Document-NDR/v1.1/Business-Document-NDR-v1.1.html#F-NAMING-AND-DESIGN-RULES-IN-AN-OPEN-EDI-APPLICATION

The spec document
(http://www.unece.org/fileadmin/DAM/cefact/codesfortrade/CCTS/CCTS_V2-01_Final.pdf)
goes right into an example where postal addresses are being modeled.

(noting that in the CCTS document that is but an example of method, not a prescription for an address model)

But having been involved in a project for modeling international
postal/logistical addresses, I found this to be an area where a
component model is impractical to use.

That's because postal addresses, as written on an envelope, carry a
huge legacy of special notations, concepts and nuances that can only
be conveyed in its conventionally written text form (postal addressing
also contains the problem of naming persons, which in itself is a
culturally diverse topic). In my project, I modeled addresses using
XML/XSD based on terms and concepts of UPU S42 (international postal
union standard, also an ISO affiliate organization). In particular, I
modeled it such that an address (in one of the supported formats)
could be stored/represented as tagged semistructured text, where the
line-oriented address (as would appear on a postal piece) was
optionally tagged with elements such as "postcode", "given name", etc.

Very true! And so the UBL model for the address includes a "catch all" unstructured address line, though unlike your approach it does not support mixed content: it is totally unstructured (not semi-structured). Here is UBL 2.1's CCTS model for a postal address (the address line is at the end):

http://docs.oasis-open.org/ubl/os-UBL-2.1/mod/summary/reports/UBL-AllDocuments-2.1.html#Table_Address.Details

When a user community deploys UBL, it decides *which* of the UBL items the community wishes to use for an address. UBL doesn't tell user communities what to do; it just makes semantic items available to the user community to choose from. With each new revision of UBL we solicit input from user communities to add to the UBL model the semantics they need but cannot already find. In a deployment users decide the UBL subset they agree upon. Here is an example of a deployment from Denmark:

http://oioubl.info/classes/en/index.html

In any other CCTS project, the user community can model the address any way they wish, provided it is element content.

I'm writing this because, even though we recently agreed here that XML
isn't designed as a data model format (or didn't we?), I've found
XML/markup (semistructured data) to be a proper representation format
for addresses specifically.

Personally, I'm growing to accept that an XML schema composed entirely of element content and no mixed content doesn't have benefits over a JSON schema. CCTS defines only element content and no mixed content, and so I saw the opportunity to expand the NDRs to be able to synthesize JSON schemas from CCTS.
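
To make the element-content versus mixed-content distinction concrete, here is a minimal Python sketch that tests whether an XML fragment contains mixed content. The function name and both sample fragments are hypothetical; in particular, the tagged address line is an invented example of the semi-structured style, not UBL markup.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(elem):
    """True if any element holds both child elements and non-whitespace text."""
    for node in elem.iter():
        children = list(node)
        if not children:
            continue  # a leaf with text only is plain element content
        if (node.text or "").strip():
            return True  # text before the first child
        if any((child.tail or "").strip() for child in children):
            return True  # text between or after children
    return False


# Element-only content: every piece of text sits in its own leaf element.
element_only = ET.fromstring(
    "<Address><StreetName>1 Main St</StreetName>"
    "<CityName>Ottawa</CityName></Address>"
)

# Mixed content: tagged items interleaved with free text (hypothetical markup).
mixed = ET.fromstring(
    "<AddressLine><GivenName>Ann</GivenName> Lee, "
    "<Postcode>K1A 0B1</Postcode> Ottawa</AddressLine>"
)

print(has_mixed_content(element_only))
print(has_mixed_content(mixed))
```

A fragment for which the check is false maps naturally onto nested JSON objects; one for which it is true has interleaved text with no obvious JSON counterpart.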

The proposed approach out for public review is specifically tuned for backward-compatibility in future revisions of the CCTS model ... for example, cardinality is on every information item and I've implemented cardinality using JSON arrays. That way if the user community increases the upper bound of cardinality in a future version of their specification, their legacy JSON instances will still validate even if their original cardinality was "1" because items of cardinality "1" are implemented with 1-sized arrays indexed with "[0]". This may appear wasteful at first blush, but the orthogonality on all items (including the document element) is intended to simplify implementations. It is the committee's intention that JSON users will ingest our JSON document interchange format and create their customized big-data representations or database schemas tuned to their applications' requirements.
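
The backward-compatibility argument can be sketched in a few lines of Python. This is not the published JSON schema: the `check_cardinality` helper, the bounds tables and the sample item names are assumptions for illustration, and the "_" key for an item's content is likewise assumed here rather than quoted from the specification.

```python
def check_cardinality(instance, bounds):
    """Return names whose array length falls outside their (min, max) bounds.

    A max of None means unbounded. Every item must be a JSON array,
    including items whose cardinality is exactly 1.
    """
    problems = []
    for name, (low, high) in bounds.items():
        values = instance.get(name, [])
        if not isinstance(values, list):
            problems.append(name)
            continue
        count = len(values)
        if count < low or (high is not None and count > high):
            problems.append(name)
    return problems


# Hypothetical version 1 of a specification: at most one Note.
bounds_v1 = {"ID": (1, 1), "Note": (0, 1)}
# Hypothetical version 2: the upper bound on Note is raised to unbounded.
bounds_v2 = {"ID": (1, 1), "Note": (0, None)}

# A legacy instance written against version 1, using 1-sized arrays.
legacy_instance = {"ID": [{"_": "A-1"}], "Note": [{"_": "Rush order"}]}
# A newer instance that uses the raised upper bound.
new_instance = {"ID": [{"_": "A-2"}],
                "Note": [{"_": "Rush order"}, {"_": "Fragile"}]}

print(check_cardinality(legacy_instance, bounds_v1))
print(check_cardinality(legacy_instance, bounds_v2))
print(check_cardinality(new_instance, bounds_v2))
print(legacy_instance["Note"][0]["_"])  # single items are still read via [0]
```

Because the legacy instance already stores its single Note in an array, raising the upper bound changes nothing about how that instance is written or read.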

Again, the committee is only dictating information interchange between parties, not what the parties *do* with the information once ingested. That has a different mindset than application development. The choice of syntax becomes only a choice of how to ingest the content from a trading partner.

The UBL committee hasn't (yet?) decided to make the JSON serialization normative, but other committees using the NDR may wish to do so. It is the syntax du jour and element-only CCTS content works just fine with it. And I've written XSLT to transliterate UBL XML to UBL JSON losslessly, so there are no benefits of one over the other. It is what the *community of users* wants to choose. The UBL committee was struck in 2001 with the goal of creating normative XML constraints and so JSON wasn't in the picture at the time.

Maybe Roger wants to add this problem (which favours SGML/XML over
other approaches for one) to his ongoing data modeling effort. More
generally, I think there's a place for document-oriented
representation of eg. business transactions; namely, when you receive
and send eg. orders as *documents* and want to store these as-is (for
digital signing, compliance, or other reasons);

I've come to believe the domain doesn't matter. As I've said above, it has come down for me to be a choice between all element-content or some mixed-content. All element-content can be either XML or JSON; any mixed-content dictates XML.

in this model, you
establish a business operational view (to use a term from UBL) by
extracting atomic values from documents.

The Business Operational View (BOV) is a term from the ISO/IEC 14662 Open-edi Reference Model, neither from UBL nor CCTS. It is the *semantic* view of electronic business. The Functional Services View (FSV) is the *implementation* view of electronic business. The committee overseeing Open-edi projects is ISO/IEC JTC 1/SC 32/WG 1 e-Business (which I volunteer on for Canada), and I am the editor of ISO/IEC 15944-20 that illustrates linking the BOV to the FSV.

The BOV establishes the existence of a semantic atomic value and its business component type in a document model, the FSV implements that as syntax. And I didn't just say "type" there because the core component types are business types, not XSD nor JSON types. For example, the "Amount" business type is a combination of a mandatory numeric value, a mandatory currency identifier and an optional currency code list version value. There is an XSD complex type and a set of JSON object properties defined accordingly.
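
As a rough sketch of what "business type" means in practice, the "Amount" type described above could be modeled in Python as follows. The class and method names are hypothetical. The property names `currencyID` and `currencyCodeListVersionID` follow the UBL XSD attribute names, while the "_" key for the numeric content is an assumption about the JSON binding.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Amount:
    """Business type: a number that is meaningless without its currency."""
    value: Decimal                                        # mandatory
    currency_id: str                                      # mandatory
    currency_code_list_version_id: Optional[str] = None   # optional

    def __post_init__(self):
        if not self.currency_id:
            raise ValueError("currency identifier is mandatory")

    def to_json_object(self):
        """Build the object properties; number serialization is left to the caller."""
        obj = {"_": self.value, "currencyID": self.currency_id}
        if self.currency_code_list_version_id is not None:
            obj["currencyCodeListVersionID"] = self.currency_code_list_version_id
        return obj


payable = Amount(Decimal("100.00"), "CAD")
print(payable.to_json_object())
```

The point of the sketch is that the currency travels with the number as part of one business value, rather than being a separate item that a receiver must remember to look up.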

Most of the members of the UBL committee are semantic business modelers who work only with the CCTS model and they never work directly with XML nor JSON. I'm the only one who has to look at the angle brackets and brace brackets.

I hope this has been helpful for all readers. Please let me know if there are any other questions.

. . . . . . . . . . . Ken

On Wed, Feb 22, 2017 at 3:43 AM, G. Ken Holman
<gkholman@cranesoftwrights.com> wrote:
> Not all XML-ers enjoy committee work. Imagine!
>
> It happens that I do, and I have had the privilege to volunteer in a number
> of standardization committees over the years related to SGML and XML
> projects of different kinds, from markup technology committees to markup
> user committees.
>
> A business document interchange specification governs the structure and
> expression of information to be exchanged between members of an industry or
> economic sector. I have come to believe that the burden of developing such
> a specification is on the users and not on the technical people supporting
> them. Accordingly, these user groups need processes, techniques and tools
> to enable subject matter experts to lead the development of open
> standardized work products.
>
> I've written an essay on how the Organization for the Advancement of
> Structured Information Standards (OASIS) technical committee process
> supports a group of members from an industry or economic sector in creating
> business exchange document specifications. The essay is an adaptation of a
> response I wrote last year for an RFP for the development of such an open
> document standard in XSD for XML. I ended up no-bidding the contract
> because the constraints were not loose enough to accommodate my proposal.
> But my proposal should be of interest to those just embarking on a project
> to develop exchange specifications without pre-conceived constraints.
>
> I illustrate my points using my experience with the OASIS Universal Business
> Language Technical Committee that produced the OASIS UBL 2.1 Standard that
> was subsequently ratified globally as ISO/IEC 19845:2015. The two normative
> components of the specification are the semantics of the information items
> and the XSD schemas for XML syntax. Non-normative deliverables include UML
> models, ASN.1 schemas and JSON schemas.
>
> Those not interested in committee work will find this essay an excellent
> treatment for insomnia. But for those of us XML-ers who are approached by
> their management or clients regarding the "big picture" of developing
> document exchange specifications, maybe even with the goal of developing an
> ISO standard for such, I hope you find this interesting:
>
> https://www.linkedin.com/pulse/developing-open-business-information-exchange-documents-ken-holman
>
> . . . . . . . . Ken
>
> cc: XML Dev, XML-L, W3C XML Schema, UBL Dev


--
UBL introduction lecture - Exchange Summit - Orlando, FL - 2017-04-24 |
Contact info, blog, articles, etc. http://www.CraneSoftwrights.com/x/ |
Check our site for free XML, XSLT, XSL-FO and UBL developer resources |
Streaming hands-on XSLT/XPath 2 training class @ US$45 (5 hours free) |

