Re: [xml-dev] XML vs JSON
- From: "G. Ken Holman" <gkholman@CraneSoftwrights.com>
- To: Mukul Gandhi <gandhi.mukul@gmail.com>,"xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Tue, 01 Aug 2017 10:12:34 -0400
I do not consider myself an "expert" in the comparison of JSON and
XML, but I thought I would offer my own observations. In my work
with UBL we have been pressured to support both syntaxes, much to my
chagrin and reluctance. Accordingly, I proposed the JSON
serialization of UBL that is being successfully adopted in our
community, but also criticized by others in its approach. This has
helped me to better understand my personal feelings about this
topic. Your mileage may vary.
I have not reached the same conclusion as you that "JSON's major use
case is in the use within REST". As the JavaScript Object Notation,
I understand JSON's major use case is in AJAX-like interactions
between programs and the JavaScript in browsers. In such
low-threshold and low-latency environments, the mating of in-memory
data structures between the server and the browser puts the burden on
the server to arrange the information in a way that spoon-feeds it to
the browser for quick action. There is a burden to format the
information as needed, but that burden is borne by the high-speed
server rather than the low-speed browser.
I worry that programmers have conflated their personal "ease of use"
with this targeted benefit that low-speed browsers enjoy by having
their memory structures quickly populated by such a syntax. The
browser becomes an extension of the server. There is incredibly
tight coupling in such a solution.
But, horses for courses, a syntax should be used as fit for purpose
and not be used as a convenience for programmers. A programmer's
task is to make their user's job easier, not to make their own job easier.
And I posit that JSON is fit for purpose only for quickly populating
browser memory structures for JavaScript manipulation in a tight
coupling between sender and receiver. I claim it is not fit for
purpose as a generalized information serialization format where
decoupling between sender and receiver is paramount. It may be a
convenience for programmers, but it is not fit for generalized
information processing data flows.
In my comparison I have ruled out JSON as a candidate for expressing
mixed content (sibling element and text content). I hope it is
accepted that generalized text processing of mixed content remains
the domain of XML and such content cannot be efficiently expressed in
JSON syntax. I think the only logical discussion is the comparison
of XML element content (all elements contain only element children
and possibly irrelevant indentation white space) and JSON structures.
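To make the mixed-content point concrete, here is a minimal Python
sketch. The paragraph markup and the helper name mixed_to_json are
invented for illustration and are not taken from any specification.
It shows one possible way to carry mixed content in JSON: the order
of text runs and child elements has to be preserved in an array.

```python
import json
import xml.etree.ElementTree as ET

def mixed_to_json(elem):
    """Flatten mixed content into an ordered list of text runs and child objects."""
    parts = []
    if elem.text:
        parts.append(elem.text)          # text before the first child
    for child in elem:
        parts.append({child.tag: mixed_to_json(child)})
        if child.tail:
            parts.append(child.tail)     # text following this child
    return parts

# Hypothetical sample: a sentence with inline emphasis.
para = ET.fromstring("<p>Pay <b>now</b> or <i>later</i>.</p>")
as_json = json.dumps({para.tag: mixed_to_json(para)})
print(as_json)
# {"p": ["Pay ", {"b": ["now"]}, " or ", {"i": ["later"]}, "."]}
```

The JSON is expressible, but every text run becomes an array member
interleaved with single-key objects. With element content only, no
such interleaving of text and children is needed.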
Such is the case for OASIS UBL - ISO/IEC 19845:2015 ... all content
is element content. The UBL XML [1] and the UBL JSON [2]
serializations are guaranteed isomorphic because the JSON is not
derived from the XML. Rather, both are derived from a common Core
Components Technical Specification (CCTS) Version 2.01 [3]. CCTS is a
modeling approach of business information entities published by the
United Nations trade facilitation group.
Key in my mind to the difference is the coupling of systems I
mentioned earlier. I was taught when learning SGML that generalized
information processing supports the decoupling of
systems. Maintaining independence of one processing system from
another partner processing system allows either processing system to
change without impacting its partner.
You cite REST and the REST interface is an implementation of such
decoupling between systems. So I feel XML very much has a role in
REST approaches, and it very much did before JSON came along. There
is nothing about REST that favours one syntax representation or the
other. In my mind it comes back to the features of the syntax used
over REST and the fit for purpose. So I think any discussion of REST
is orthogonal to a discussion of XML and JSON.
I see the purpose of an independent syntax between dissimilar systems
is the unambiguous representation of information. A recipient can
extract from the syntax all components of the data without using any
subject-matter awareness. Once so obtained, the recipient then
applies the subject-matter awareness (the semantics) to the
information independent of whatever syntax that was used.
There may be intermediaries between the sender and the recipient. In
the business circles where UBL is used, an auditor is an example of
an intermediary. It is critical that independent of both the sender
and the recipient that an auditor be able to inspect the
content. This, too, is a two-step process: teasing out the
information from the syntax in a manner independent of the semantics
of the information, and then applying whatever semantics is needed on
the extracted content. The auditor's semantics may differ from those of
the trading partners exchanging the information, and so the auditor
needs to unambiguously identify the content in order to apply their
own semantics.
XML achieves this semantic-free syntactic identification of content
by the unambiguous labelling using element names and attribute
names. The (dreaded) namespace concept is crucial in creating
world-wide unambiguous labels, but let's not get into that
discussion. I am a big fan of namespaces because they solve this
ambiguity issue, regardless of whether people like or dislike their syntax.
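As a small illustration of namespace-qualified labels, consider the
following Python sketch using the standard library's ElementTree. The
namespace URIs and element names are made up for this example. Two
elements share the local name ID, yet a recipient can tell them apart
from the syntax alone.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two parties each define their own "ID".
doc = ET.fromstring(
    '<order xmlns:a="urn:example:sender" xmlns:b="urn:example:auditor">'
    '<a:ID>INV-1</a:ID><b:ID>AUD-9</b:ID></order>'
)

# ElementTree reports each label as {namespace-uri}local-name.
labels = [child.tag for child in doc]
print(labels)
# ['{urn:example:sender}ID', '{urn:example:auditor}ID']
```

No knowledge of what either ID means is needed to distinguish them;
the expanded names are already different labels.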
It has been my experience that JSON users seek to abandon the
labeling of content by inferring the identification of content. The
in-memory representation of information in one system is instantly
conveyed to the in-memory representation of that information in the
other system. This can only be done by pre-agreement between sender
and recipient. Instantly this engages the need to interpret the
meaning of content *at the syntax level* and not only at the
semantics level. Instantly this disenfranchises third parties that
may, perhaps even for legal reasons, need to inspect the content and
determine what information was intended to be conveyed in the absence
of an unambiguous label. These third parties may not know all of the
implicit agreements on information representation short-cuts or
assumptions that lie in that in-memory structure.
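A minimal Python sketch of the difference, with invented payloads:
the first relies on a positional agreement between sender and
recipient, the second labels each value explicitly. The function
visible_labels is a hypothetical stand-in for a third party that has
only the syntax to go on.

```python
import json

# Hypothetical payloads carrying the same three values.
implicit = json.loads('["INV-1", 100.0, "EUR"]')
explicit = json.loads('{"ID": "INV-1", "Amount": 100.0, "Currency": "EUR"}')

def visible_labels(payload):
    """Return the labels a third party can read from the syntax alone."""
    return sorted(payload) if isinstance(payload, dict) else []

print(visible_labels(implicit))   # []
print(visible_labels(explicit))   # ['Amount', 'Currency', 'ID']
```

In the positional form, nothing in the syntax says which value is the
identifier and which is the amount; that knowledge lives only in the
prior agreement.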
The role of syntax is the identifying of the information being
conveyed, independent of the semantic interpretation done by the
programs that act on the information.
Accordingly, my approach to serializing CCTS-based data models in
JSON is wholly based on the explicit labelling of content. This
allows me to write transliteration applications between XML and JSON
syntax for all CCTS-based models (UBL or any other) without any
awareness whatsoever of the semantics behind the syntax. Such an
approach with JSON led me to the use of objects and keys for
distinguishing sibling content under different labels, and the use of
arrays for distinguishing sibling content under the same
label. Addressing the content becomes rote and semantic-free through
the dot and array notation working one's way through the unique key
labels and the counted like-named siblings. In effect, it is a
generalized syntax using JSON notation.
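The idea can be sketched in a few lines of Python. This is a
simplified illustration under stated assumptions, not the published
UBL JSON mapping: attributes and namespaces are ignored, the content
key "_" and the helper names to_labelled and address are choices made
for this example, and the invoice markup is invented.

```python
import re
import xml.etree.ElementTree as ET

def to_labelled(elem):
    """Map element content to JSON-ready data without knowing what any label means."""
    children = list(elem)
    if not children:
        # Leaf element: keep its text under a fixed content key.
        return {"_": elem.text or ""}
    obj = {}
    for child in children:
        # Every label maps to an array, so like-named siblings are counted.
        obj.setdefault(child.tag, []).append(to_labelled(child))
    return obj

# One path step: a key, optionally followed by an array index, e.g. Line[1].
STEP = re.compile(r"^([^\[\]]+)(?:\[(\d+)\])?$")

def address(data, path):
    """Follow a dot-and-array path such as 'Invoice[0].Line[1].ID[0]._'."""
    node = data
    for step in path.split("."):
        key, index = STEP.match(step).groups()
        node = node[key]
        if index is not None:
            node = node[int(index)]
    return node

# Hypothetical sample document with element content only.
xml = "<Invoice><ID>INV-1</ID><Line><ID>1</ID></Line><Line><ID>2</ID></Line></Invoice>"
root = ET.fromstring(xml)
data = {root.tag: [to_labelled(root)]}

print(address(data, "Invoice[0].ID[0]._"))          # prints INV-1
print(address(data, "Invoice[0].Line[1].ID[0]._"))  # prints 2
```

Neither function contains a single element name, which is the point:
the same code works for any document with element content only.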
This has led to criticism of the approach as being too verbose and
not in the spirit of JSON for programmers. The actual quote in [4]
is "processing code is a mess". I feel the response [5] to this
successfully defended the design decisions in the CCTS approach to
all of the points made in the analysis. Regardless, this attitude
towards JSON has led the group of programmers to promote an
alternative localized serialization of UBL in JSON that is based on
agreed short-cuts and associations. Thus, a third party inspecting
such serializations must know, a priori, what those assumptions are.
I acknowledge the same can be said for assumptions about the
semantics of information between the sender and recipient after the
information has been teased out of the syntax. But that is not my
point. My point is that the syntax itself should be a semantic-free
labeling of the content for the unambiguous transmission of the data
from the program of the sender to the program of the
recipient. There should be no semantic triggers in the syntax that
would impact on how the information in the syntax is to be
identified. Semantic interpretation should only happen after the
information is identified.
The tight coupling between the server and the browser that led to
developing JSON promotes a tight coupling between sender and
recipient when JSON is used for generalized information
interchange. This, in turn, promotes the need for prior agreement
and a tight coupling of representations in the computer systems of
sender and recipient. Independence between systems is lost. In
fact, the information interchange is no longer generalized. So JSON
is fine when the information interchange does not have to be
generalized, but it has to be coerced in order to be used in a
generalized fashion.
And many real-world scenarios of information interchange must be
generalized in order to decouple the systems involved in that
interchange. The international interchange of business information
is the other end of the wide spectrum to the server populating a
browser's in-memory data structures to speed up the browser's
response to user keyboard interaction.
So to summarize my overall personal perspective: if JSON has to be
coerced into having the same expressive generalized labelling
features as XML already has, what is the benefit of having used
JSON? It *can* work as I have shown it to work for UBL, but it was
not designed to work in the fashion needed to be generalized. How
does reinventing the XML processing stack for JSON advance the
objective of the interchange of data between dissimilar decoupled
systems? And XML already handles mixed content and element content.
I keep getting back to horses for courses: use XML for generalized
interchange and use JSON for the tight binding between systems
sharing identical in-memory representations.
I firmly believe the job of the programmer is to take on the burden
to make the user's task easier or better. It isn't responsible for a
programmer to take the easy way if it adds effort to the users or
takes away features from the users. My father, a programmer, taught
me this long ago.
I hope this perspective is helpful to your thoughts in this regard,
Mukul. No doubt others will feel differently.
. . . . . . . . Ken
[1] http://docs.oasis-open.org/ubl/UBL-2.1.html
[2] http://docs.oasis-open.org/ubl/UBL-2.1-JSON/v1.0/UBL-2.1-JSON-v1.0.html
[3] http://docs.oasis-open.org/ubl/Business-Document-NDR/v1.1/Business-Document-NDR-v1.1.html
[4] https://lists.oasis-open.org/archives/ubl-comment/201705/msg00007.html
[5] https://lists.oasis-open.org/archives/ubl-comment/201705/msg00011.html
At 2017-08-01 17:04 +0530, Mukul Gandhi wrote:
Hello,
I don't intend to spark a bitter debate between XML & JSON, where
the outcome of debate is win of one over other. Rather, I wish to
present in a friendly manner, according to me, where these two
technologies differ.
When talking about designing REST services, JSON seems to clearly
win. The whole software world seems to be biased in favour of JSON
it seems, for this criterion. Although I have read that in many
cases REST services can use XML instead of JSON. I think, JSON's
major use case is in the use within REST.
Having worked quite a bit with Android mobile apps, that framework
by default relies heavily on XML. I haven't seen JSON being used by
default in that area. Although, JSON is many times used in feeding
and fetching JSON data from various kinds of services (remote REST
services, local API calls etc), in Android apps.
XML when considering other technologies in combination, like XML
Schema and XML databases, have a scale close to RDBMSs. JSON is
nowhere near this.
Any other thoughts from the experts here?
--
Regards,
Mukul Gandhi
--
Contact info, blog, articles, etc. http://www.CraneSoftwrights.com/x/ |
Check our site for free XML, XSLT, XSL-FO and UBL developer resources |
Streaming hands-on XSLT/XPath 2 training class @ US$45 (5 hours free) |
- References:
- XML vs JSON
- From: Mukul Gandhi <gandhi.mukul@gmail.com>