As a developer, I've often been burned by wasting time investigating a tool that
supposedly supports some particular standard or protocol, only to find after
investing time and effort that the tool does not live up to expectations. I take
this very seriously. I'm too busy to waste time sifting through the claims of
dishonest vendors to determine if the tool really meets my needs. There are
vendors that I routinely ignore; I don't bother to investigate their tools
because I've been burned by them in the past. I wish more developers would do
the same. If developers would boycott vendors that misrepresent their products,
then we would have much more honest vendors and better interoperability in the
world.
Does it make sense to write specialized parsers that only deal with a specific
DTD/schema? Certainly it does. If I have a need to deal with SVG in a program, I
am going to try to find an SVG parser before I search for an XML parser, because
an SVG parser will probably give me much greater value. Likewise, if I need to
parse RDF, I probably will try to identify an RDF parser before resorting to a
generic XML parser. But if you are going to write an SVG parser, then call it an
"SVG parser", not an XML parser! If you write an RDF parser, then call it an RDF
parser, not an XML parser! For that matter, if you write a MinML parser, then
call it a MinML parser and not an XML parser. If you do that, then you've got no
argument from me. When you call it an XML parser, though, then we have an
argument, and I will boycott your products.
If I obtain an "XML parser" from someone, only to find that it only supports a subset
of XML or a specific document type, I will feel deceived and I will boycott that
vendor's or developer's products. If I need something that only supports a
specific document type or a specific subset of XML, then I will seek something
that only supports that document type or a specific subset. If I obtain an "XML
parser" it is because I am looking for an XML parser, and it better support XML!
Otherwise, my time has been wasted because the developer or vendor has
misrepresented their product, and I resent that.
I will also add that there is typically great value in implementing specialized parsers
as a layer on top of more generalized parsers. The value is in not reinventing
the wheel and in making sure that the subset of XML relevant to that specialized
parser is implemented properly.
I've been working with SOAP quite a bit lately. When I first started working with it and surveyed the
toolkits available at the time, I found the existing toolkits to be in a very
sorry state. Not only that, but implementors posting on the SOAP discussion list
often raised "issues" and proposed "solutions" for those issues, blissfully
ignorant of the fact that these issues had already been solved by available XML
technologies. One that kept surfacing, for instance, was how to encode Unicode
characters in a SOAP message (which became a particularly important issue
because the "XML" implementations in the SOAP libraries did not deal with
character encoding properly). I think I got a bit irate in some
of my postings in response to this, but I was frustrated over people who had
swept aside mature, proven XML technologies and decided to implement the
"relevant subset" themselves -- and they kept doing it wrong! I found it more
expedient to build my own SOAP implementation atop generic XML technologies
than to waste time with more specialized SOAP libraries that had
major deficiencies and interoperability issues because the implementors failed
to leverage proven, mature, general XML technologies that were readily
available. I think there are some lessons for folks to learn there.
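To illustrate the layering argument above, here is a minimal sketch in Python of a SOAP-specific reader built on top of a general-purpose XML parser. It is not the implementation described in the post; the function name `parse_soap_body` is hypothetical, and the only assumptions about SOAP are the SOAP 1.1 envelope namespace and the Envelope/Body structure. Character encoding and character references are left entirely to the underlying XML parser, which is the point being made.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def parse_soap_body(message: bytes) -> list:
    """Return the child elements of the SOAP Body.

    Hypothetical helper: all XML-level work (encoding declaration,
    character references, namespaces) is done by the generic parser.
    """
    root = ET.fromstring(message)
    if root.tag != f"{{{SOAP_ENV_NS}}}Envelope":
        raise ValueError("not a SOAP envelope")
    body = root.find(f"{{{SOAP_ENV_NS}}}Body")
    if body is None:
        raise ValueError("SOAP envelope has no Body")
    return list(body)


# A message declared as ISO-8859-1 that also uses a numeric character reference.
message = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<e:Envelope xmlns:e="http://schemas.xmlsoap.org/soap/envelope/">'
    "<e:Body><greet>caf\u00e9 &#233;</greet></e:Body>"
    "</e:Envelope>"
).encode("iso-8859-1")

payload = parse_soap_body(message)
print(payload[0].tag, payload[0].text)
```

The SOAP layer here never touches bytes or escapes directly, so there is nothing SOAP-specific to get wrong about Unicode.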
Okay, so here are the two basic camps:
1) XML is a standard that should be conformed to
100% in parsers. If it isn't going to be, then it should not be called
an "XML Parser". Certainly, this is an ideal goal if you do not know how
the parser will be used. As a result, it needs to stay generic and
expect to handle all combinations of needs.
2) XML is a recommendation that should be
implemented in parsers as is appropriate to the situation. It is still
an XML parser of sorts, however.
The second camp is the way I choose to see
XML. For any particular implementation of
XML, I define a DTD or Schema that fits my needs. In my case, I deal
exclusively with e-commerce markup and I use only a specific subset of the XML
specification. So I have two choices: use a parser specific to my
needs or use a general-purpose parser that will work for anyone. While
the latter will work, it is overkill (just like using an SGML parser would be
overkill when processing XML). My needs are fixed. I will never
receive XML that doesn't conform to the subset I use. As a result, a
parser that handles only that subset makes more sense. Is it an actual
XML parser? Yes, since it does process certain XML. No, since it
doesn't process all XML. But remember, I am only ever using certain XML,
so from my point of view, the answer is "yes".
Everyone can complain until they are blue in the
face about which is better. In the end, it's a moot point. If 1
million people use a special-purpose XML parser for a specific purpose, then
that is absolutely fine. It doesn't matter what the rest of the universe
is doing with XML because it's not within their problem domain. If
there is cross-over of domains, then those people will use a different parser
that fits their needs.
In the end, I would expect any development
process to go something like this:
1) Define XML DTD or Schema.
2) Choose parser that will handle the
application of this XML.
3) Implement XML using chosen
parser.
If it means that someone uses more than one
parser to get different jobs done, that's fine. If it means that someone
uses the "fully standard" parser to get different jobs done, that's fine as
well. In the end, deciding which parser to use for
development is every bit as important as deciding what subset of XML to
use in the DTD or Schema definition.
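As a rough illustration of the three steps listed above, the following Python sketch stands in a small dictionary for the DTD or Schema, picks a general-purpose parser, and then checks that a document stays inside the chosen vocabulary. Everything named here (`ORDER_VOCABULARY`, `check_vocabulary`, the `order`/`item`/`sku`/`quantity` elements) is hypothetical and not taken from the post, and the check is far weaker than real DTD or Schema validation.

```python
import xml.etree.ElementTree as ET

# Step 1 (hypothetical stand-in for a DTD/Schema):
# element name -> names of the child elements it may contain.
ORDER_VOCABULARY = {
    "order": {"item"},
    "item": {"sku", "quantity"},
    "sku": set(),
    "quantity": set(),
}


def check_vocabulary(xml_text: str, vocabulary: dict) -> list:
    """Steps 2 and 3: parse with a generic parser, then report
    elements that fall outside the application's vocabulary."""
    problems = []
    root = ET.fromstring(xml_text)
    stack = [(None, root)]
    while stack:
        parent, elem = stack.pop()
        if elem.tag not in vocabulary:
            problems.append(f"unknown element <{elem.tag}>")
            continue  # do not descend into unknown elements
        if parent is not None and elem.tag not in vocabulary[parent.tag]:
            problems.append(f"<{elem.tag}> not allowed inside <{parent.tag}>")
        stack.extend((elem, child) for child in elem)
    return problems


good = "<order><item><sku>A1</sku><quantity>2</quantity></item></order>"
bad = "<order><item><price>9</price></item></order>"
print(check_vocabulary(good, ORDER_VOCABULARY))
print(check_vocabulary(bad, ORDER_VOCABULARY))
```

Swapping in a different parser at step 2 would only change the `ET.fromstring` call, which is one way to keep the parser choice independent of the vocabulary definition.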