   RE: [xml-dev] Please no UOM (was Re: maps)

(Not that anyone here is advocating this, but just in case...)
 
Regular expressions that enforce syntactic patterns for particular units of measure are fine, but I would strongly caution against trying to assign schema constraints based on the actual values (other than range constraints to specify valid values, which I would classify as regular expressions anyway).  Even worse are co-occurrence constraints on the values, and worst of all are constraints that involve UOM translations.
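
To make the distinction concrete, here is a rough sketch of a purely syntactic check, using the latitude string from the quoted message below. The pattern and the name parse_dms are illustrative assumptions, not something from any schema language; the check looks only at the shape of the text and says nothing about whether the value is sensible for a business purpose.

```python
import re

# Hypothetical pattern for a degrees/minutes/seconds coordinate such as 75°15'00" N.
# It checks the lexical shape only: digits, separators, and a hemisphere letter.
DMS = re.compile(r"""^(?P<deg>\d{1,3})°(?P<min>\d{2})'(?P<sec>\d{2})"\s*(?P<hem>[NSEW])$""")


def parse_dms(text):
    """Return the named parts if text matches the syntactic pattern, else None."""
    match = DMS.match(text)
    return match.groupdict() if match else None


print(parse_dms('75°15\'00" N'))
print(parse_dms("75 degrees"))
```

Anything beyond this, such as comparing the parsed value against a limit that depends on another field, is where the trouble described in the rest of this message starts.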
 
These types of constraints tend to fall into the realm of "business rules", and in no way belong in a data schema language.  I am aware that some people advocate this, but I think anyone who has experience with UOM processing in systems like EAP, BOM, Financials, and so on will agree that to do so is just begging for sorrow.  It's one of those "nice in theory, if you don't know the theory very well" things.
 
If you allow flexible UOM on a value, how do you set a constraint?  Suppose that you permit Euros and Dollars, and you set the constraint that the amount must not exceed some particular amount in dollars.  What does that mean for values entered in Euros?  No matter what you do, you will either be committing to a fixed exchange rate (set the constraint on Euros to a fixed amount as well), or saying that the constraint on Euros depends on the exchange rate at whichever moment in time you happen to check the constraint.  Both are pretty useless to any company that actually cares about exchange rates.  Again, it's fine to store values as both Euros and Dollars, but the point is that nobody in the world is going to let their exchange rates be tied to schema processing, and any value-based constraints will lead to this unacceptable situation.
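
A small sketch of the problem, with made-up numbers: the limit, the amount, and both exchange rates below are hypothetical, as is the function name within_limit. The same Euro amount passes or fails the dollar constraint depending only on which rate is in hand when the check runs.

```python
from decimal import Decimal

# Hypothetical constraint: the amount must not exceed this many dollars.
LIMIT_USD = Decimal("10000")


def within_limit(amount, currency, usd_per_eur):
    """Check the dollar limit; for Euros the answer depends on the rate supplied."""
    if currency == "USD":
        return amount <= LIMIT_USD
    if currency == "EUR":
        return amount * usd_per_eur <= LIMIT_USD
    raise ValueError(f"unsupported currency: {currency}")


amount_eur = Decimal("10100")
# Same document, same value, two different (invented) rates.
print(within_limit(amount_eur, "EUR", Decimal("0.98")))
print(within_limit(amount_eur, "EUR", Decimal("1.02")))
```

Hard-coding usd_per_eur is the fixed-rate option; passing in the current rate is the check-time option. Neither makes the validity of the document a property of the document alone.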
 
Or how about a manufacturer who builds and sells product in multiple UOMs (in other words, the typical manufacturer).  They sell by unit, box (defined in terms of unit), carton (defined in terms of box), they package boxes inside other SKUs that are measured in unit, and so on.  So "a unit of SKU x can be sold in quantity as a box, but a box of SKU x can be sold as a unit of SKU y, and a unit of SKU y cannot be sold in quantity as a box".  This example of UOM isn't contrived; it is so common that I would say it's ubiquitous.  And unfortunately, the business practices are based on what works best for the business rather than what works best for the business rules markup system.  As soon as SAP and Baan and PeopleSoft and so on agree on a schema language that is sufficient to exchange all of the UOM constraints that their clients need, we'll see a standard that stands a chance of being useful for this purpose -- but not before (and it won't be in XSD or Relax).
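
For what it's worth, the SKU example might look something like this as plain data. The table, the count of 12, and the function names are all assumptions for illustration; the point is only that what a "box" or "unit" means depends on the SKU, which is application data rather than something a type definition knows.

```python
# Hypothetical catalog: for each SKU, the selling UOMs it allows and what each contains.
# None marks the base unit; otherwise (inner_uom, inner_sku, count).
PACKAGING = {
    "x": {"unit": None, "box": ("unit", "x", 12)},  # a box of x holds 12 units of x
    "y": {"unit": ("box", "x", 1)},                 # a unit of y is one box of x
}


def can_sell(sku, uom):
    """True if this SKU may be sold in this unit of measure."""
    return uom in PACKAGING.get(sku, {})


def base_units(sku, uom, qty=1):
    """Expand a quantity down to base units by following the packaging chain."""
    contents = PACKAGING[sku][uom]
    if contents is None:
        return qty
    inner_uom, inner_sku, count = contents
    return base_units(inner_sku, inner_uom, qty * count)


print(can_sell("x", "box"), can_sell("y", "box"))
print(base_units("y", "unit", 2))
```

Every rule here lives in a lookup table that changes whenever the business repackages a product, which is exactly the kind of thing that does not belong in a schema.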
 
Another example could be a system that allows people to store temperatures as centigrade, fahrenheit, or kelvin.  In this example, some would argue that the conversion formula is simple enough, deterministic enough, and not dependent on external factors, so it is OK.  But a conversion that is so acceptably deterministic begs the question "why store it in different units of measure in the first place?"  Why not convert everything to centigrade and store that way?  If you care what unit of measure was used when the measurement was taken, presumably you care because you want the option to apply a different conversion at some point in the future when you have other information merged from another source (such as elevation, which could make a difference in temperature conversions).  If you are keeping the data in the original UOM in order to preserve fidelity of the data, you are implicitly acknowledging that the data isn't otherwise capable of having full fidelity, and any constraints you assign are approximate constraints.  There is nothing wrong with approximate constraints, but it's tempting for people to convince themselves that they have something precise, especially when it is a nice formula like (5/9) * (f - 32).
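
As a quick illustration of the fidelity point, here is a sketch that normalizes a fahrenheit reading to centigrade at one decimal place and then converts it back. The one-decimal storage precision and the function names are assumptions; the formula is the one above.

```python
def f_to_c(f):
    """Fahrenheit to centigrade, using the formula from the text."""
    return (5 / 9) * (f - 32)


def c_to_f(c):
    """Centigrade back to fahrenheit."""
    return c * 9 / 5 + 32


reading_f = 100.0                       # the value as originally measured
stored_c = round(f_to_c(reading_f), 1)  # assumed: normalized store keeps one decimal
recovered_f = c_to_f(stored_c)

print(stored_c, recovered_f, recovered_f == reading_f)
```

The round trip does not give back the original reading, so a constraint checked against the normalized value is only an approximation of the same constraint checked against the measurement.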
 

	-----Original Message----- 
	From: Simon St.Laurent [mailto:simonstl@simonstl.com] 
	Sent: Sun 8/4/2002 1:50 PM 
	To: Liam Quin 
	Cc: xml-dev@lists.xml.org 
	Subject: [xml-dev] Re: maps
	
	

	At 03:51 PM 8/4/2002 -0400, Liam Quin wrote:
	>[I have cc'd postmaster@xml.org because none of my xml-dev posts in the past
	>  year or two seem to have made it to the list]
	
	This seems to be a continuing problem for some folks. (I'll happily forward
	messages to xml-dev if that helps.)
	
	>On Sun, Aug 04, 2002 at 12:43:36PM -0400, Simon St.Laurent wrote:
	> > There's no type in WXS for locations.  I can't use the built-in types to
	> > express something like:
	> >
	> > <zoo>
	> >    <name>Utica Zoo</name>
	> >    <lat>75°15'00" N</lat>
	> >    <long>43°05'00" W</long>
	> > </zoo>
	> >
	>
	>It sounds like you're hankering after some of the minimisation features
	>in SGML -data tag, shortref, etc.
	
	No, I'm looking for a cleaner approach to mapping between the lexical and
	value spaces that lets me map the lexical space to my own value spaces -
	value spaces which can be represented explicitly (if verbosely) in markup,
	at that.
	
	>The XML way to do this, I claim, is
	><zoo>
	>   <name>Utica Zoo</name>
	>     <lat>
	>       <deg>75</deg>
	>       <min>15</min>
	>     </lat>
	>    <long>
	>       <deg>43</deg>
	>       <min>05</min>
	>    </long>
	></zoo>
	>
	>Now you can represent both lat and long in W3C XML Schema.
	
	But that's simply a bizarre way to make me type more.  A smarter approach
	is to let humans (or GPS gadgets) type 75°15'00" N and provide a mapping
	from that to a value structure.  Using Regular Fragmentations, I can use
	regular expressions to break that down into:
	
	   <lat><deg>75</deg><min>15</min><sec>00</sec><hem>N</hem></lat>
	
	If the normalized form you present above is so obviously good, then how
	exactly did we get stuck with gHorribleKluge and a variety of other nastiness?
	
	I'd guess we got there by focusing on notions of strong typing from other
	aspects of computing - which, I've argued previously, are simply a bad fit
	for markup.
	
	>XML is a way to represent structure. You are saying that it is flawed
	>because it doesn't natively understand arbitrary other ways to
	>represent structure.  But that is not a goal of XML. Neither was
	>terseness, of course :-)
	
	Unfortunately, that's a goal that W3C XML Schema expressly sets out to
	achieve and then falls flat on its face for any but a tiny number of
	mappings from lexical to value.  It is both too much (a huge pile of spec
	to implement) and too little (for cartographers, anyone who cares about
	units, etc.)
	
	
	Simon St.Laurent
	"Every day in every way I'm getting better and better." - Emile Coue
	
	
	





 


Copyright 2001 XML.org. This site is hosted by OASIS