   Re: [xml-dev] Note from the Troll


It seems rather odd to be discussing HR XML problems on this list
when the whole topic area has its own standards body, the HR-XML
Consortium, http://www.hr-xml.org

Of course, you do have to pitch your pennies into their cup, but they 
have a whole bucketful of standards described already, and they are 
going to straighten out their namespace in Jan., they say, to make it 
more useful. That doesn't change the nature of the problem, but it
does put it, and others of a similar nature, into a kind of perspective.
You pay your dime and you take your chances.

You can either dig in and try to get a standard more to your liking, 
if you are motivated sufficiently, or you just take whatever de facto 
standard is left standing when the dust settles. Given that we have
neither a Y2K bogeyman to scare us into a frenzy of IT restructuring
nor a dot-com pipe dream bubbling along with enough brainless cash to
fund it, the market is being dictated by inertia.
So it looks like XML out to the horizon.


At 11:15 AM -0500 10/27/02, Amelia A Lewis wrote:
>Thanks for clarifying.  I hope you don't mind if I respond to some of
>the points.  I'm in agreement with many of them, but not with the
>overall conclusion that XML has no value.
>On Sun, 2002-10-27 at 06:49, tblanchard@mac.com wrote:
>>  Lately, I'm working for a company that is exchanging HR information
>>  with job boards (like monster and hot jobs) - which has its own working
>>  groups trying to define HRXML.
>That's an interesting problem space, especially for someone strongly
>convinced of the value of the relational model.  In the past ten to
>fifteen years, in my experience, many of the largest firms moved
>personnel information out of databases and into LDAP.  LDAP is all sorts
>of things, but it isn't very relational.  It's very easy to model as a
>hierarchical database.  Or as XML.
>On the other hand, LDAP has some significant limitations.  It isn't very
>relational, for one.  *laugh*  The problem, in the HR space, is that
>information fits neatly into hierarchies, except when it doesn't.  And
>relations do a nice job, except for the rigor of column definitions.  So
>that area, in particular, is a hard problem, one that probably requires
>a synergy of technologies.  For all the hype surrounding the XMLization
>of the current leading relational database products, we aren't there
>>  1) XML Tools suck - they're little more than syntax coloring editors.
>Hmm.  My favorites are, but I work for a company that has produced
>strongly graphical editors.  Available on a Mac, no less (the graphics
>used to describe various schemata have a tendency to appear in a variety
>of books; we pass them around at work, when found).
>On the whole, I tend to agree that tools aren't up to par as yet.  But
>.... different people need to do different things.  Cue rant from ERH on
>the uselessness of tree view as an editing model.
>>  2) The Hype is at the same level as the hype was for AI and it can't
>>  possibly live up to it.  It should be written <genuflect>XML</genuflect>
>All too true.  Are the hypesters identical to the developers?  I think
>that that was true for the AI model.  I'm doubtful that it is the case
>for XML.  The most outrageous claims seem to be made by PHBs.
>>  3) The weight of the processing model is really really heavy.  As an
>>  example, using URLs to reference DTDs causes all sorts of problems for
>>  computers when they're off the network.  XML parsing simply halts.
>>  This is especially annoying when running something like BEA WebLogic on
>>  your machine because you're doing a web app.  BEA stores config info in
>>  XML which references some DTD at BEA and the server simply won't start
>>  if that server isn't available.  You can argue this is a misuse of XML
>>  - I think so - but it's one of those things that's going to hurt people's
>>  impressions of XML.
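
For what it's worth, the offline halt in point 3 only happens when the parser is told to fetch the external DTD. Below is a minimal sketch in Python, assuming a made-up config document, a made-up element name (listen-port), and a DTD URL on the reserved .invalid domain; none of it is BEA's actual file. The standard library's ElementTree parser does not retrieve the external subset, so the document still parses with no network at all:

```python
import xml.etree.ElementTree as ET

# Hypothetical config document; the DTD URL is deliberately unreachable.
CONFIG = """<?xml version="1.0"?>
<!DOCTYPE server SYSTEM "http://dtd.example.invalid/server.dtd">
<server>
  <listen-port>7001</listen-port>
</server>
"""


def read_port(xml_text):
    """Parse the config without fetching the DTD and return the port."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("listen-port"))


print(read_port(CONFIG))
```

The trade-off is that attribute defaults and entities declared in the DTD are never applied, so this suits config reading rather than anything that depends on validation.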
>Hmm.  I think that much more can be said here, and that some of it can
>tie in with points 5 and 6 below.  It no longer surprises me that W3C
>recs tend to show the adverse effects of prolonged URI abuse.  One of
>the canons is that URIs are the perfect addressing mechanism.  No, wait,
>the perfect identification mechanism.  No, wait.  Oh, and sometimes URIs
>are not URIs.  Massive confusion is created over whether the use of
>URIs, in a particular context, is for identification, for comparison,
>for location, or for decoupling.
>In fact, I would come to the same conclusion here, but argue that the
>problem is an *inadequate* processing model, not one that's too
>heavyweight.  It isn't, on the whole, clear when you should retrieve a
>URI, and when you ought to compare it to something else by the rules
>governing URIs, and when you ought to compare it as a string, ignoring
>the fact that it has the form of a URI.
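
The gap between "compare it as a string" and "compare it by the rules governing URIs" can be sketched in a few lines of Python. The example URIs and both helper names are made up for illustration, and the URI-style comparison is only a partial normalization (scheme and host case), not the full set of equivalence rules:

```python
from urllib.parse import urlsplit


def same_as_strings(a, b):
    """Namespace-style comparison: character for character."""
    return a == b


def same_as_uris(a, b):
    """Looser comparison: scheme and host are treated as case-insensitive."""
    def key(uri):
        p = urlsplit(uri)
        return (p.scheme.lower(), (p.hostname or "").lower(), p.port,
                p.path, p.query, p.fragment)
    return key(a) == key(b)


A = "http://www.example.org/ns/hr"
B = "HTTP://WWW.Example.ORG/ns/hr"

print(same_as_strings(A, B))  # False
print(same_as_uris(A, B))     # True
```

Neither answer is wrong on its own; the trouble is that a spec has to say which one applies, and a third option (actually retrieving the thing) is not shown here at all.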
>Add linking (the XML version of relations, if you will), and life
>becomes most unpleasant.  See the discussion between ERH, SSL, and UU on
>order of processing of XInclude during XPath processing.  Eric van der
>Vlist has proposed a model for specifying processing order, to address
>the issue.
>>  4) Schema is really an insane spec.  I mentioned just the data types -
>>  too many too complicated.  Do we have to specify the number of bytes
>>  for the ints?  That's a physical issue and for this Smalltalker it
>>  doesn't even make sense (Smalltalk handles arbitrary sized numbers).
>Umm.  See [me] for rants on schema type cluelessness.  Argh.  In fact, I
>don't much care about the existence of the various register-based
>integers.  They're among the twenty-five derived base types.  The
>derivations even make a certain amount of sense.  On the other hand,
>there are *nineteen* primitive types.  That's string, boolean, number
>.... uh, another number.  And oh, yeah, another number.  Only different.
>And, umm, date, time, dateTime, and duration.  Oh, and another five or
>so things that have to do with time or duration.  Not related to one
>another.  Oh, and remember we were talking about how URIs are sometimes
>just strings?  Well, in schema, they *aren't* strings.  No, I said I
>wouldn't go on this rant.  Sorry.  This is a place of deep flaw, that
>needs very serious, very careful work.  What's worse is that the spec
>was so delayed, and so anticipated, that most folks really, really want
>to overlook its enormous hairy dangling warts and just get on with the
>>  5) Like C++, the average developer can't cope with the excessive
>>  complexity XML introduces relative to its value.  The average
>>  programmer doesn't really know the difference between nonnegative int
>>  and positive int.  In fact, the schemas I'm getting from biz partners
>>  (the couple that want to use XML because everyone is using it - and it's
>>  less than half) are AWFUL.
>Ugh.  There is a solution to this problem; it's called RELAX NG.  But it
>isn't a solution to the problem of primitive types, which hasn't been
>addressed, as yet.
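
For anyone unsure of the distinction raised in point 5: in WXS, nonNegativeInteger includes zero and positiveInteger does not, and both are unbounded, while int is the 32-bit one. Below is a rough Python sketch, assuming the input has already had surrounding whitespace removed; the function names are made up and this is not a schema validator:

```python
import re

# Lexical form of an integer: optional sign, then decimal digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_integer(text):
    """Return the value of an integer literal, or raise ValueError."""
    if not _INTEGER.fullmatch(text):
        raise ValueError("not an integer literal: %r" % text)
    return int(text)  # Python ints are arbitrary precision


def is_non_negative_integer(text):
    return parse_integer(text) >= 0


def is_positive_integer(text):
    return parse_integer(text) > 0


def is_int(text):
    """The 32-bit, register-sized type."""
    return -2**31 <= parse_integer(text) <= 2**31 - 1


print(is_non_negative_integer("0"), is_positive_integer("0"))
print(is_positive_integer("12345678901234567890"), is_int("12345678901234567890"))
```

Zero is the only value that separates the first two checks, which is probably why the difference is so easy to miss.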
>>  XSLT files are maintainable with the same level of ease as densely
>>  written perl.  Developers asked to modify them routinely rewrite them
>>  because they can't figure out what the last guy was doing.
>Hmm.  I haven't encountered that one.  I think some of the more
>web-oriented here have reported similar things, though (my work doesn't
>bring me into contact with that segment).  So the transform sets that
>I've seen in use tend to be much more maintainable.
>Come to think of it, though, it wouldn't surprise me to find that
>quickie XSLT transforms would share the ease of confusion of quickie
>perl hacks.  Similar problem spaces; perl addresses text, XSLT XML.
>>  loved it because the thing itself is the damnedest puzzle.  It
>>  entertained and challenged their intellect to work with it.  I'm
>>  beginning to suspect the same about XML.
>*laugh*  I, for one, *hate* having the nasty little corners crop up.  I
>end up having to explain (for instance) that attributes are *not* in the
>default namespace when unadorned, even though similarly unadorned
>elements are.  I'm not looking forward to having to explain why 0x0D is
>in the whitespace production, even though it can't appear in an XML
>stream ... I'll have to refer folks to the post describing that
>particular Stupid Entity Trick, because I can't remember it.  And when
>they ask why any spec would support something so extremely obscure, I'll
>just have to shrug.  Smiling and wincing.
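
The attribute corner mentioned above is easy to see with any namespace-aware parser. Below is a small sketch using Python's standard ElementTree, with a made-up namespace URI and made-up element and attribute names:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one default namespace, no prefixes anywhere.
DOC = '<order xmlns="urn:example:shop" id="42"><item sku="A1"/></order>'

root = ET.fromstring(DOC)

# Unprefixed elements pick up the default namespace...
print(root.tag)
print(root[0].tag)

# ...but unprefixed attributes stay in no namespace at all.
print(list(root.attrib))
print(list(root[0].attrib))
```

ElementTree writes namespaced names as {uri}local, so the element tags come back qualified while the attribute names come back bare.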
>>  6) From the stand point of business process and enterprise architecture
>>  - XML is an evolutionary step backwards.  Hierarchical databases were
>>  abandoned for relational models long ago and systems made out of lots
>Err.  There's been one objection to this already.  I can add that there
>are a number of spaces where hierarchical organization is taking share
>from relational models (LDAP is one).  For long term storage of
>information, XML actually does make sense, because it's easy to
>untangle, in comparison to a number of alternatives.  Note that it is
>not necessarily better than plists, or S-expressions, except that it's
>in wider use, and that means that the knowledge of how to tease
>information out has a better chance of remaining over the long term.
>This is important for governments, for instance.  Last I heard, the 1960
>US census is stored in a format that can be read by only two working
>machines in the world (one of which is on display at the Smithsonian).
>Having trained as an historian in the long ago, I can say that that is
>an unmitigated disaster.  Cost of transformation, if undertaken, will be
>enormous.  If not undertaken, loss of information will be enormous.
>Folks (including, from an apocryphal source, the IRS) store proprietary
>formats into databases as BLOBs.  Better if that's XML.
>XML's looser restrictions provide an anodyne to the limitations of
>databases, and one that ought to, at some point, be taken up as a
>synthesis.  Common databases do far better with primitive types, but
>have far less ability to handle semi-structured information.  This, in
>fact, is probably the HR problem.  There's personnel information that
>ought to be loosely structured, the sorts of things that are well
>described by DTD and RELAX NG without types.  There's other information
>that has strong typing, and there's a real need to get at things fast,
>to be able to index and look up in a variety of ways, and to avoid
>redundancy of information.  Somewhere in a synthesis of hierarchy and
>relation may lie the answer.  It isn't currently available, though, to
>the best of my knowledge.
>>  I need to write the follow on to this piece but it will focus on point
>>  6 above plus the assertion that XML fails as both a markup language
>>  (markup shouldn't require well-formedness) and as a serialization
>*ABSOLUTELY* disagreed.
>Well-formedness is the thing that makes XML worthwhile.  SGML, intended
>for writing in text editors, has all sorts of cute little tricks called
>"markup minimization".  You can end a tag with </>.  If SGML had
>primitive types for elements, you'd see minimization like <price/2.99/.
>And other Stupid Tag Minimization Tricks.  These are great for typing,
>until you reach the point in the document that looks like this:
>Hope you don't have to extend one of those sections.  Whatever they
>are.  Still worse if you have full minimization, where the end tag is
>merely implied.  Does this element imply the end of the last three open
>tags?  Or just of the last two?
>Well-formedness removes ambiguity.  It does so at the expense of
>terseness.  It's a good choice for a markup language.  It's a poor
>choice for a data exchange format, where all the data is in very tiny
>chunks, and the ambiguity does not arise.
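
To make the ambiguity point concrete, here is a minimal Python sketch, with made-up element names, of how an XML parser treats an omitted end tag. Where minimization would leave the reader to work out which elements the closing tag implies, a well-formedness check simply refuses the document:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Every element explicitly closed.
print(is_well_formed("<doc><sec><p>text</p></sec></doc>"))

# The </p> end tag is omitted, so </sec> no longer matches the open element.
print(is_well_formed("<doc><sec><p>text</sec></doc>"))
```

The cost is exactly the one named above: the first form is longer to type.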
>>  format (too hierarchically oriented, verbose, and weirdly structured
>>  relative to ER models).
>*shrug*  ER models have their limitations as well.  More loosely
>structured data than tables can easily model are commonly encountered.
>It's a good place to use XML.
>>  7) While there is lots of heavyweight support for reading XML, there
>>  isn't any help for writing it from various other data structures.
>Hmmm.  I kinda need to think about that one, I guess.  We've been having
>go-rounds on the subject of "serialization", at work.
>>  So there's my viewpoint.  Take it for what it's worth.  Am I trolling?
>Well, I thought so.  Even I can claim to have "used" DTDs for eight
>years, since I was writing HTML with doctype decls at the top in '94.
>Clearly, I was wrong to think so, but it was hard for me to see what the
>needs were, behind the very strongly expressed antagonism.
>>  Because it runs against what I've found to be true in my own work. 
>>  Centralize business logic and ER/OO modeling to model your business
>>  entities and processes.  This works well when the company has the
>>  discipline to pick an implementation language and stick with it and
>>  focuses all developers on this goal.
>I was a grunt programmer at some of those companies.  Speaking from the
>trenches (RPC, DCOM, CORBA), I don't think those models work.  In fact,
>I think that they're seriously flawed by adopting Sun's marketing
>slogan.  The network is not a computer.
>>  But of course, XML philosophy says the opposite.  Where is the business
>>  rule repository in XML?
>Where do you want it to be?  XML is data, not objects.  It isn't even
>very good at linking, just at the moment, even though it was always
>*supposed* to be.
>>  Have you considered that you're breathing your own exhaust?  You don't
>*laugh*  Such an image.  No, I'm being a hairy nuisance to various
>committees, trying to convince them *not* to make any more nasty little
>complexities that I'll end up having to explain.  Unconvincingly, since
>there really isn't any excuse for monstrosities like gHorribleKludge.
>However, I think that there are some underlying principles that are very
>valuable, and that address a number of problems hard to deal with using
>other tools.  You may not agree, if you see no value in
>well-formedness.  Alaric doesn't agree, 'cause he doesn't see value in
>text-only formats.  XML well-formedness and text-only format are very
>important for easing exchange and encouraging long-term preservation of
>information.  The looser-than-database-columns structure of schemas (not
>meaning WXS, here, but DTD and RNG, primarily) provides a useful match
>for many kinds of information that fit awkwardly into relational models
>(this information often ends up as unindexable blobs).  Localization of
>information is important when information is transient; XML documents
>are often better solutions for the delivery of information than are
>rowsets (particularly when the information may be loosely structured, as
>noted in the previous point).  As a format for exchange of strongly
>typed data, it's currently a mess and a failure (in my opinion), because
>there is no type system (what's in WXS isn't a system).
>Amelia A. Lewis       amyzing@talsever.com      alicorn@mindspring.com
>To be whole is to be part; true voyage is return.
>                 -- Laia Asieo Odo (Ursula K. LeGuin, "The Dispossessed")
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://lists.xml.org/ob/adm.pl>

Rex Brooks
Starbourne Communications Design
1361-A Addison, Berkeley, CA 94702 *510-849-2309
http://www.starbourne.com * rexb@starbourne.com

